Sampling Distribution
Generate Samples:
Now let's draw 1000 samples, each of size 10, and compute the mean and SD of each sample (as written, the two lines below draw separate sets of 1000 samples for the means and for the SDs).
means = replicate(1000,mean(rnorm(10,mean=5,sd=3)))
sds = replicate(1000,sd(rnorm(10,mean=5,sd=3)))
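Note the difference between the rep() and replicate() functions in R: rep() evaluates its argument once and repeats the resulting value, whereas replicate() re-evaluates the expression on every repetition, so each repetition draws a fresh random sample.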
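A minimal sketch of the rep() versus replicate() difference, using the same population as above. The seed and the variable names a and b are arbitrary choices made only for this illustration.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# rep(): mean(rnorm(...)) is evaluated ONCE, then that single value is copied 5 times
a <- rep(mean(rnorm(10, mean = 5, sd = 3)), 5)

# replicate(): the expression is re-evaluated 5 times, giving 5 different sample means
b <- replicate(5, mean(rnorm(10, mean = 5, sd = 3)))

length(unique(a))  # one distinct value
length(unique(b))  # five distinct values
```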
Limit Laws:
Rules of Thumb 
For most distributions, a sample size of n > 30 gives a sampling distribution of the mean that is nearly normal
For fairly symmetric distributions, n > 15 is usually enough
For normal population distributions, the sampling distribution of the mean is normal for any sample size
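The rules of thumb can be checked by simulation. The sketch below assumes a skewed exponential population with rate 1 (my choice, not from the notes) and compares the skewness of the sample means for n = 5 and n = 40. The skewness() helper is a simple moment-based estimate written for this example, not a base R function.

```r
set.seed(42)  # arbitrary seed

# Hypothetical helper: moment-based skewness (0 for a symmetric distribution)
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

# Sampling distribution of the mean for a small and a larger sample size
means_n5  <- replicate(10000, mean(rexp(5,  rate = 1)))
means_n40 <- replicate(10000, mean(rexp(40, rate = 1)))

skew_n5  <- skewness(means_n5)
skew_n40 <- skewness(means_n40)

skew_n5   # clearly positive: still right-skewed
skew_n40  # much closer to 0: nearly normal
```

Plotting hist(means_n5) next to hist(means_n40) shows the same effect visually.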
Confidence Intervals
A point estimate is a single number;
a confidence interval provides additional information about the variability of the estimate.
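As an illustration, here is a sketch of a point estimate and a 95% confidence interval for a mean. It assumes a t-based interval and one sample of size 10 from the normal(5, 3) population used earlier; the 95% level and all variable names are choices made for this example.

```r
set.seed(123)  # arbitrary seed

x <- rnorm(10, mean = 5, sd = 3)   # one sample of size 10
n <- length(x)

point_est <- mean(x)               # point estimate: a single number
se        <- sd(x) / sqrt(n)       # estimated standard error of the mean
t_crit    <- qt(0.975, df = n - 1) # critical value for a 95% interval

ci <- c(point_est - t_crit * se, point_est + t_crit * se)
ci

# The built-in t.test() reports the same interval
t.test(x)$conf.int
```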
Example 1:
There are two tasks.
Task 1's duration follows a normal distribution with a mean of 5 days and a standard deviation of 1 day.
Task 2's duration follows a uniform distribution between 3 and 7 days.
What is the probability that both tasks are completed within 10 days if they are done sequentially?
What is that probability if they are done in parallel?
What is the median completion time in each case?
Solution:
Simulate times:
T1 = rnorm(1000,5,1)
T2 = runif(1000,min=3,max=7)

Compute probability for sequential project
S = T1 + T2
sum(S<10)/1000

Compute probability for parallel project
P = pmax(T1,T2)
sum(P<10)/1000

Compute medians:
median(S)
median(P)
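The simulated answers can be cross-checked numerically. This sketch assumes, as the simulation implicitly does, that the two task durations are independent; the names p_seq, p_par and med_par are introduced here for illustration.

```r
# Sequential: P(T1 + T2 < 10), integrating over the uniform duration of task 2
p_seq <- integrate(function(u) pnorm(10 - u, mean = 5, sd = 1) * dunif(u, min = 3, max = 7),
                   lower = 3, upper = 7)$value

# Parallel: both tasks must finish, so P(max < 10) = P(T1 < 10) * P(T2 < 10)
p_par <- pnorm(10, mean = 5, sd = 1) * punif(10, min = 3, max = 7)

# Median of the parallel time: solve P(T1 < m) * P(T2 < m) = 0.5 for m
med_par <- uniroot(function(m) pnorm(m, 5, 1) * punif(m, 3, 7) - 0.5,
                   interval = c(3, 10))$root

p_seq    # 0.5 by symmetry of T1 + T2 around 10, which is also the sequential median
p_par    # very close to 1
med_par  # between 5 and 6 days
```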
pmax function example (pmax returns the element-wise maximum of its arguments):
> x <- c(3, 26, 122, 6)
> y <- c(43, 2, 54, 8)
> z <- c(9, 32, 1, 9)
> pmax(x,y,z)
[1] 43 32 122 9
Example 2:
Phoncessories manufactures several customized accessories for smart phones and packages them into boxes.
Each box consists of 20 units.
Processing each unit in a box takes 2 minutes (constant).
The company classifies boxes as “simple” (ordered 60% of the time) or “complex” (ordered 40% of the time):
Simple box setup time is exponentially distributed with mean = 1 hour
Complex box setup time is exponentially distributed with mean = 1.5 hours
The firm would like to study the overall time to process a random order for a box.
This is the code to generate a sample of production times, in minutes, for 1,000 boxes (setup time plus 20 units x 2 minutes = 40 minutes of processing):
TSimple=rexp(1000,rate=1/60)
TComplex=rexp(1000,rate=1/90)
OrderType=sample(c(0,1),size=1000,replace=TRUE,prob=c(0.6,0.4))
ProdTime=ifelse(OrderType==0,TSimple,TComplex)+40
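One way to study the overall processing time is to summarize the simulated sample and compare its mean with the theoretical expected value. The sketch below repeats the simulation so that it is self-contained; the seed, the name expected and the chosen quantiles are assumptions made for illustration.

```r
set.seed(2024)  # arbitrary seed

TSimple   <- rexp(1000, rate = 1/60)   # setup time in minutes, mean 60
TComplex  <- rexp(1000, rate = 1/90)   # setup time in minutes, mean 90
OrderType <- sample(c(0, 1), size = 1000, replace = TRUE, prob = c(0.6, 0.4))
ProdTime  <- ifelse(OrderType == 0, TSimple, TComplex) + 40

# Theoretical mean: 60% simple + 40% complex setup, plus 40 minutes of processing
expected <- 0.6 * 60 + 0.4 * 90 + 40   # 112 minutes

mean(ProdTime)                      # should be close to 112
quantile(ProdTime, c(0.5, 0.9))     # median and 90th percentile, in minutes
```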
Binomial Distribution Formula: PPT page 54
Example: PPT page 56
R Binomial Functions
dbinom: density or point probability (d), P(X = x)
pbinom: cumulative probability distribution (p), P(X <= x)
qbinom: quantiles (q)
rbinom: random number generator (r)
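A short sketch of the four functions in use. The values n = 10 and p = 0.3 are illustrative only and do not come from the slides.

```r
n <- 10; p <- 0.3   # illustrative parameters (assumed for this example)

# d: point probability P(X = 3); matches the binomial formula computed by hand
d_val  <- dbinom(3, size = n, prob = p)
manual <- choose(n, 3) * p^3 * (1 - p)^(n - 3)

# p: cumulative probability P(X <= 3), i.e. the sum of the point probabilities 0..3
p_val <- pbinom(3, size = n, prob = p)

# q: smallest x with P(X <= x) >= 0.5 (the median)
q_val <- qbinom(0.5, size = n, prob = p)

# r: five random draws from the distribution
set.seed(7)  # arbitrary seed
r_val <- rbinom(5, size = n, prob = p)
```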
Practice 1: Based on past records, there is a 0.08 probability that an online retail order is fraudulent. Suppose we have 20 orders. What is the probability that no order is fraudulent?
Analyze the problem: Trial -> checking an order; outcome -> fraudulent or not; success -> the order is fraudulent; n = 20 trials; pi = probability of success = 0.08; X = the number of fraudulent orders among the 20
Solution: P(X = x) -> d function; P(X <= a) = F(a) -> p function; P(X = 0) is what we want, so we use the d function.
R coding: PPT Page 63
dbinom(0, size = 20, prob = 0.08) # P( X = 0 )
Practice 2: What is the probability of 2 or more fraudulent orders among the 20?
Solution: P(X >= 2) = P(X > 1) = 1 - F(1)
R Coding:
1 - pbinom(1, size = 20, prob = 0.08) # P( X >= 2 )
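The complement rule in Practice 2 can be verified in two other ways. This sketch only uses base R; the variable names are introduced for illustration.

```r
# Complement rule: P(X >= 2) = 1 - P(X <= 1)
p_complement <- 1 - pbinom(1, size = 20, prob = 0.08)

# Same probability via the upper tail: P(X > 1)
p_upper <- pbinom(1, size = 20, prob = 0.08, lower.tail = FALSE)

# Same probability by summing the point probabilities for 2, 3, ..., 20
p_summed <- sum(dbinom(2:20, size = 20, prob = 0.08))

c(p_complement, p_upper, p_summed)  # all three agree (roughly 0.48)
```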
Mean & Standard Deviation (PPT page 60): mean = n*pi, SD = sqrt(n*pi*(1 - pi))
EV.Fraud = 20* 0.08
EV.Fraud
SD.Fraud = sqrt(20*0.08*(1-0.08))
SD.Fraud
Chebyshev's Rule: at least 75% of the time the number of fraudulent orders falls within 2 standard deviations of the mean, i.e. between 0 and 4 fraudulent orders
EV.Fraud - 2*SD.Fraud
EV.Fraud + 2*SD.Fraud
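Chebyshev's rule only gives a lower bound, so it can be compared with the exact binomial probability. The sketch below recomputes the mean and SD so it runs on its own; the names lower, upper and p_exact are introduced for illustration.

```r
EV.Fraud <- 20 * 0.08
SD.Fraud <- sqrt(20 * 0.08 * (1 - 0.08))

lower <- EV.Fraud - 2 * SD.Fraud   # negative, so the practical lower limit is 0
upper <- EV.Fraud + 2 * SD.Fraud   # just above 4

# Exact probability of 0 to 4 fraudulent orders under the binomial model
p_exact <- pbinom(floor(upper), size = 20, prob = 0.08)

p_exact   # well above the 75% guaranteed by Chebyshev's rule
```

The gap between the two numbers is expected: Chebyshev's bound holds for any distribution, so it is conservative for a specific one.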