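Now let's repeat the sampling 1000 times: each time we draw a sample of size 10 from a normal distribution with mean 5 and SD 3, and compute the sample mean and the sample SD.
means = replicate(1000, mean(rnorm(10, mean=5, sd=3)))  # means of 1000 samples of size 10
sds = replicate(1000, sd(rnorm(10, mean=5, sd=3)))      # SDs of 1000 further samples (drawn separately from those above)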
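Notice the difference between the rep() and replicate() functions in R: rep() repeats a value that has already been computed, whereas replicate() re-evaluates the expression on every repetition.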
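A small sketch of that difference, using the same rnorm() call as above. The seed and the variable names same and fresh are illustrative choices, not part of the original notes.

```r
set.seed(42)  # arbitrary seed, only so the run is reproducible

# rep(): the expression is evaluated once, then that single result is copied
same = rep(mean(rnorm(10, mean=5, sd=3)), 5)

# replicate(): the expression is re-evaluated on every repetition
fresh = replicate(5, mean(rnorm(10, mean=5, sd=3)))

length(unique(same))   # all five entries are the same number
length(unique(fresh))  # five different sample means
```

This is why replicate(), not rep(), is the right tool for repeated sampling.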
Limit Laws:
Rules of thumb for the sampling distribution of the mean:
- For most distributions, n > 30 gives a sampling distribution that is nearly normal
- For fairly symmetric distributions, n > 15 is usually enough
- For normal population distributions, the sampling distribution of the mean is always normal, whatever the sample size
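One way to see these rules of thumb at work is to simulate sample means from a clearly skewed population. The sketch below assumes an exponential population with mean 1 and compares n = 5 with n = 30; the population, sample sizes, and seed are illustrative choices only.

```r
set.seed(1)  # arbitrary seed for reproducibility

# 1000 sample means from a skewed population (exponential with mean 1)
means_n5  = replicate(1000, mean(rexp(5,  rate=1)))
means_n30 = replicate(1000, mean(rexp(30, rate=1)))

# Side-by-side histograms: the n = 30 panel should look closer to a bell curve
par(mfrow=c(1,2))
hist(means_n5,  main="n = 5",  xlab="sample mean")
hist(means_n30, main="n = 30", xlab="sample mean")

# The spread of the sample means also shrinks as n grows
sd(means_n5)
sd(means_n30)
```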
A point estimate is a single number; a confidence interval provides additional information about the variability of the estimate.
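As a brief illustration, the sketch below computes both for one simulated sample. It assumes a 95% t-based interval from t.test(), which is one common choice and not necessarily the one used in the slides; the seed and the names x and ci are illustrative.

```r
set.seed(7)  # arbitrary seed for reproducibility
x = rnorm(10, mean=5, sd=3)   # one sample of size 10

mean(x)                                    # point estimate of the population mean
ci = t.test(x, conf.level=0.95)$conf.int   # 95% confidence interval for the mean
ci                                         # lower and upper limits
```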
There are two tasks.
Task 1 duration follows a normal distribution with a mean of 5 days and a standard deviation of 1 day.
Task 2 duration follows a uniform distribution between 3 and 7 days.
What is the probability that both tasks are completed within 10 days if they are done sequentially?
What is the probability if they are done in parallel?
What is the median completion time in each case?
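- Simulate times:
T1 = rnorm(1000,5,1)  # Task 1: normal, mean 5, SD 1
T2 = runif(1000,min=3,max=7)  # Task 2: uniform on [3, 7]
- Completion time for the sequential project (one task after the other, so the durations add):
S = T1 + T2
- Completion time for the parallel project (finished when the slower task finishes):
P = pmax(T1,T2)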
pmax function example (pmax returns the element-wise maximum across the vectors):
> x <- c(3, 26, 122, 6)
> y <- c(43,2,54,8)
> z <- c(9,32,1,9)
> pmax(x, y, z)
[1]  43  32 122   9
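With the simulated completion times in hand, the probabilities and medians asked for in the two-task problem can be estimated as proportions and sample medians. A minimal sketch, repeating the simulation so that it runs on its own (the seed is an arbitrary addition):

```r
set.seed(123)  # arbitrary seed for reproducibility

T1 = rnorm(1000, 5, 1)            # Task 1 durations
T2 = runif(1000, min=3, max=7)    # Task 2 durations
S = T1 + T2                       # sequential completion times
P = pmax(T1, T2)                  # parallel completion times

mean(S <= 10)   # estimated probability the sequential project finishes within 10 days
mean(P <= 10)   # estimated probability the parallel project finishes within 10 days
median(S)       # estimated median completion time, sequential
median(P)       # estimated median completion time, parallel
```

Since both durations are symmetric around 5 days, the sequential estimate should land near 0.5 and the sequential median near 10 days, while the parallel project almost always finishes within 10 days.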
Phoncessories manufactures several customized accessories for smart phones and packages them into boxes.
Each box consists of 20 units.
Processing each unit in a box takes 2 minutes (constant).
The company classifies the boxes into “simple” (ordered 60% of the time) and “complex” (ordered 40% of the time):
- Simple box setup time is exponentially distributed with mean = 1 hour
- Complex box setup time is exponentially distributed with mean = 1.5 hours
The firm would like to study the overall time to process a random order for a box.
This is the code to generate a sample of production times for 1,000 boxes:
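A minimal sketch of such a simulation, under the assumptions that all times are measured in minutes and that the total time for a box is its setup time plus 20 units at 2 minutes each. The variable names and the seed are illustrative and not taken from the course material.

```r
set.seed(2024)  # arbitrary seed for reproducibility
n_boxes = 1000

# Box type: TRUE = simple (60% of orders), FALSE = complex (40% of orders)
is_simple = runif(n_boxes) < 0.6

# Setup time in minutes: exponential with mean 60 (simple) or mean 90 (complex)
setup = ifelse(is_simple,
               rexp(n_boxes, rate=1/60),
               rexp(n_boxes, rate=1/90))

# Processing time is constant: 20 units at 2 minutes each
processing = 20 * 2

total = setup + processing   # production time per box, in minutes
summary(total)
```

Under these assumptions the expected total is 0.6*60 + 0.4*90 + 40 = 112 minutes, so mean(total) should come out near that value.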
Binomial distribution formula: PPT page 54
Example: PPT page 56
R binomial functions:
- dbinom: density or point probability (d)
- pbinom: cumulative probability distribution (p)
- rbinom: random generator (r)
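A short sketch of how the three functions relate to the binomial formula P(X = k) = choose(n, k) * p^k * (1 - p)^(n - k). The values n = 10, p = 0.3, k = 2 and the seed are arbitrary illustrative choices.

```r
n = 10; p = 0.3; k = 2   # arbitrary illustrative values

# d: point probability P(X = k), compared with the formula written out by hand
by_hand = choose(n, k) * p^k * (1 - p)^(n - k)
point   = dbinom(k, size=n, prob=p)
all.equal(by_hand, point)

# p: cumulative probability P(X <= k), i.e. the sum of point probabilities 0..k
cumul = pbinom(k, size=n, prob=p)
all.equal(cumul, sum(dbinom(0:k, size=n, prob=p)))

# r: five random draws of X
set.seed(10)  # arbitrary seed for reproducibility
draws = rbinom(5, size=n, prob=p)
draws
```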
Practice 1: Past records show a 0.08 probability that an online retail order is fraudulent. Suppose we have 20 orders. What is the probability that no order is fraudulent?
Analyze the problem: trial -> checking an order (fraudulent or not?); success -> the order is fraudulent; n = 20 trials; pi = probability of success = 0.08; X = the number of fraudulent orders among the 20
Solution: P(X = a) -> d; P(X <= a) = F(a) -> p; P(X = 0) is what we want, so we use d
R coding: PPT page 63
dbinom(0, size = 20, prob = 0.08) # P( X = 0 )
Practice 2: What is the probability of 2 or more fraudulent orders among the 20?
Solution: P(X >= 2) = P(X > 1) = 1 - P(X <= 1) = 1 - F(1)
1 - pbinom(1, size = 20, prob = 0.08) # P( X >= 2 )
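Mean & standard deviation: PPT page 60
EV.Fraud = 20 * 0.08 # mean = n * pi = 1.6
SD.Fraud = sqrt(20*0.08*(1-0.08)) # SD = sqrt(n * pi * (1 - pi)), about 1.21
Chebyshev's rule: at least 75% of the time the count falls within 2 SDs of the mean, i.e. we will get 0 to 4 fraudulent orders
EV.Fraud - 2*SD.Fraud # lower bound, about -0.83 (negative, so effectively 0)
EV.Fraud + 2*SD.Fraud # upper bound, about 4.03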
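Chebyshev's rule only gives a lower bound, so it can be checked against the exact binomial probability of the same range. A minimal sketch; the names lower, upper, and exact are illustrative additions.

```r
n = 20; p = 0.08
EV.Fraud = n * p
SD.Fraud = sqrt(n * p * (1 - p))

# Whole-number counts inside mean +/- 2 SD (a count cannot be negative)
lower = max(0, ceiling(EV.Fraud - 2*SD.Fraud))
upper = floor(EV.Fraud + 2*SD.Fraud)

# Exact P(lower <= X <= upper) from the binomial distribution
exact = pbinom(upper, size=n, prob=p) - pbinom(lower - 1, size=n, prob=p)
exact
exact >= 0.75   # Chebyshev guarantees at least 75%
```

For this binomial the exact probability is well above the 75% guarantee, which is typical: Chebyshev's bound holds for any distribution and is therefore conservative.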