## Sampling Distribution

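Generate samples: draw 1000 samples, each of size 10, from a normal distribution with mean 5 and standard deviation 3, and compute the mean and the SD of each sample. Note that the two calls below draw separate sets of samples, so the i-th mean and the i-th SD do not come from the same sample.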

means = replicate(1000,mean(rnorm(10,mean=5,sd=3)))
sds = replicate(1000,sd(rnorm(10,mean=5,sd=3)))
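As a rough check on the simulated sampling distribution, the spread of the sample means can be compared with the theoretical standard error, sigma / sqrt(n). The sketch below assumes the same population as above (mean 5, SD 3, n = 10); the seed and the names `n` and `sigma` are illustrative choices, not part of the original code.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

n <- 10      # sample size used above
sigma <- 3   # population SD used above

# Draw 1000 samples of size n and keep each sample mean
means <- replicate(1000, mean(rnorm(n, mean = 5, sd = sigma)))

mean(means)        # should be close to the population mean, 5
sd(means)          # empirical SD of the sample means
sigma / sqrt(n)    # theoretical standard error of the mean
```

The last two numbers should be close to each other, though not identical, because the empirical value is itself a simulation estimate.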

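Note the difference between the rep() and replicate() functions in R: rep() evaluates its argument once and repeats the resulting value, whereas replicate() re-evaluates the expression on every repetition, so each repetition draws a fresh sample.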
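A small sketch of the rep() versus replicate() behaviour, using five repetitions instead of 1000 so the result is easy to inspect. The seed and the names `r1` and `r2` are illustrative.

```r
set.seed(1)  # arbitrary seed

# rep() evaluates its argument once and repeats the resulting value
r1 <- rep(mean(rnorm(10, mean = 5, sd = 3)), 5)

# replicate() re-evaluates the expression on every repetition
r2 <- replicate(5, mean(rnorm(10, mean = 5, sd = 3)))

length(unique(r1))  # 1: the same number five times
length(unique(r2))  # 5: five different sample means
```

Using rep() here by mistake would produce a "sampling distribution" with zero spread.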

Limit laws: rules of thumb

• For most population distributions, n > 30 gives a sampling distribution of the mean that is nearly normal.
• For fairly symmetric population distributions, n > 15 is usually enough.
• For normal population distributions, the sampling distribution of the mean is normal for any sample size.
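The rules of thumb above can be illustrated with a skewed population. The sketch below assumes an exponential population with mean 1 (an example chosen here, not taken from the notes) and compares the skewness of the sample means for n = 5 and n = 40. The helper `skew` is a hypothetical function defined just for this illustration.

```r
set.seed(1)  # arbitrary seed

# Hypothetical helper: sample skewness (0 for a perfectly symmetric sample)
skew <- function(x) mean((x - mean(x))^3) / sd(x)^3

# Sample means from a right-skewed population: exponential with mean 1
means_small <- replicate(1000, mean(rexp(5, rate = 1)))   # n = 5
means_large <- replicate(1000, mean(rexp(40, rate = 1)))  # n = 40 (> 30)

skew(means_small)  # clearly positive: still right-skewed
skew(means_large)  # closer to 0: nearer to a normal shape
```

Calling hist() on each vector gives a visual version of the same comparison.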

Confidence intervals: a point estimate is a single number; a confidence interval adds information about the variability of that estimate by giving a range of plausible values for the parameter.
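As an illustration, here is a minimal sketch of a 95% confidence interval for a mean based on the t distribution. It assumes a single sample of size 10 from the same normal population used earlier; the 95% level, the seed, and the variable names are choices made for this example.

```r
set.seed(1)  # arbitrary seed

x <- rnorm(10, mean = 5, sd = 3)  # one sample of size 10

xbar  <- mean(x)                         # point estimate of the population mean
se    <- sd(x) / sqrt(length(x))         # estimated standard error
tcrit <- qt(0.975, df = length(x) - 1)   # t critical value for 95% confidence

ci <- c(lower = xbar - tcrit * se, upper = xbar + tcrit * se)
ci

# t.test() reports the same interval
t.test(x)$conf.int
```

The point estimate `xbar` sits in the middle of the interval, and the width reflects both the sample SD and the sample size.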

## Monte-Carlo Distributions Practical Codes

Example 1: There are two tasks. The duration of Task 1 follows a normal distribution with mean 5 days and standard deviation 1 day. The duration of Task 2 follows a uniform distribution between 3 and 7 days. What is the probability that both tasks are completed within 10 days if they are done sequentially? What if they are done in parallel? What is the median completion time in each case?

Solution:

• Simulate times:

T1 = rnorm(1000,5,1)
T2 = runif(1000,min=3,max=7)
• Compute probability for sequential project

S = T1 + T2
sum(S<10)/1000

• Compute probability for parallel project

P = pmax(T1,T2)
sum(P<10)/1000

• Compute medians:

median(S)
median(P)
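For this example the simulated probabilities can be checked against exact values. Both task durations are symmetric about 5 days, so the sequential total is symmetric about 10 and P(S < 10) is exactly 0.5. In the parallel case, Task 2 never exceeds 7 days, so the probability reduces to P(T1 < 10). The sketch below is self-contained; the seed, `N`, and the `p_*` and `se_seq` names are illustrative.

```r
set.seed(1)  # arbitrary seed
N <- 1000

T1 <- rnorm(N, 5, 1)
T2 <- runif(N, min = 3, max = 7)

S <- T1 + T2         # sequential: durations add
P <- pmax(T1, T2)    # parallel: the slower task decides

p_seq_sim <- mean(S < 10)   # same as sum(S < 10) / N
p_par_sim <- mean(P < 10)

# Exact values under the stated distributions
p_seq_exact <- 0.5               # S is symmetric about 5 + 5 = 10
p_par_exact <- pnorm(10, 5, 1)   # P(T1 < 10); P(T2 < 10) = 1

# Approximate Monte Carlo standard error of the sequential estimate
se_seq <- sqrt(p_seq_sim * (1 - p_seq_sim) / N)

c(simulated = p_seq_sim, exact = p_seq_exact, se = se_seq)
c(simulated = p_par_sim, exact = p_par_exact)
```

The simulated sequential probability should fall within a few standard errors of 0.5; increasing `N` tightens the estimate.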

pmax function example (pmax returns the element-wise maximum across vectors):

x <- c(3, 26, 122, 6)
y <- c(43,2,54,8)
z <- c(9,32,1,9)
pmax(x,y,z)   # returns 43 32 122 9

Example 2: Phoncessories manufactures several customized accessories for smartphones and packages them into boxes. Each box consists of 20 units. Processing each unit in a box takes a constant 2 minutes. The company classifies boxes as “simple” (ordered 60% of the time) or “complex” (ordered 40% of the time):

• Simple box: setup time is exponentially distributed with mean 1 hour.
• Complex box: setup time is exponentially distributed with mean 1.5 hours.

The firm would like to study the overall time to process a random order for a box.

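This code generates a sample of production times, in minutes, for 1,000 boxes. The rates 1/60 and 1/90 correspond to mean setup times of 60 and 90 minutes, and the added 40 is the processing time for 20 units at 2 minutes each:

TSimple=rexp(1000,rate=1/60)
TComplex=rexp(1000,rate=1/90)
OrderType=sample(c(0,1),size=1000,replace=TRUE,prob=c(0.6,0.4))
ProdTime=ifelse(OrderType==0,TSimple,TComplex)+40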
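One possible way to study the simulated production times is to summarize them and compare the simulated mean with the theoretical one, 0.6 * 60 + 0.4 * 90 + 40 = 112 minutes. The sketch below repeats the simulation so it runs on its own; the seed, `N`, and the 180-minute threshold are illustrative choices, not part of the original exercise.

```r
set.seed(1)  # arbitrary seed
N <- 1000

TSimple   <- rexp(N, rate = 1/60)   # setup time in minutes, mean 60
TComplex  <- rexp(N, rate = 1/90)   # setup time in minutes, mean 90
OrderType <- sample(c(0, 1), size = N, replace = TRUE, prob = c(0.6, 0.4))
ProdTime  <- ifelse(OrderType == 0, TSimple, TComplex) + 40  # + 20 units * 2 min

mean(ProdTime)                 # simulated mean production time
0.6 * 60 + 0.4 * 90 + 40       # theoretical mean: 112 minutes
median(ProdTime)               # below the mean, since the distribution is right-skewed
quantile(ProdTime, 0.95)       # 95th percentile
mean(ProdTime > 180)           # share of boxes taking more than 3 hours (illustrative threshold)
```

A histogram, hist(ProdTime), would show the long right tail that the exponential setup times produce.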

# R coding: PPT Page 63

dbinom(0, size = 20, prob = 0.08) # P( X = 0 )

# R code:

1-pbinom(1, size = 20, prob = 0.08) # P( X >= 2 ) = 1 - P( X <= 1 )
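The two binomial calls above can be cross-checked against each other. The sketch assumes, as the later lines suggest, that X counts fraudulent orders among 20 orders, each fraudulent with probability 0.08; the variable names are illustrative.

```r
n <- 20     # number of orders (assumed interpretation of size = 20)
p <- 0.08   # probability that an order is fraudulent

p0 <- dbinom(0, size = n, prob = p)   # P(X = 0)
p1 <- dbinom(1, size = n, prob = p)   # P(X = 1)

# P(X >= 2) computed three equivalent ways
via_cdf   <- 1 - pbinom(1, size = n, prob = p)
via_pmf   <- 1 - (p0 + p1)
via_upper <- pbinom(1, size = n, prob = p, lower.tail = FALSE)

c(via_cdf, via_pmf, via_upper)

# P(X = 0) also follows directly from the formula (1 - p)^n
all.equal(p0, (1 - p)^n)
```

The `lower.tail = FALSE` form avoids the explicit subtraction and is slightly more accurate when the tail probability is very small.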

## Mean & Standard Deviation in PPT page 60

EV.Fraud = 20* 0.08
EV.Fraud

SD.Fraud = sqrt(20*0.08*(1-0.08))
SD.Fraud

## Chebyshev Rule: at least 75% of the time the number of fraudulent orders is within 2 SDs of the mean, i.e. between 0 and 4

EV.Fraud - 2*SD.Fraud
EV.Fraud + 2*SD.Fraud
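Chebyshev's rule only gives a lower bound, so it can be compared with the exact binomial probability of landing within 2 SDs of the mean. The sketch below assumes X ~ Binomial(20, 0.08) as in the calls above; `lower`, `upper`, and `exact` are illustrative names.

```r
n <- 20
p <- 0.08

EV.Fraud <- n * p
SD.Fraud <- sqrt(n * p * (1 - p))

lower <- EV.Fraud - 2 * SD.Fraud   # negative, so effectively 0
upper <- EV.Fraud + 2 * SD.Fraud   # just above 4

# Exact probability that X lies within 2 SDs of the mean
# (X is a count, so the interval covers 0, 1, ..., floor(upper))
exact <- pbinom(floor(upper), size = n, prob = p) -
  pbinom(ceiling(lower) - 1, size = n, prob = p)

exact          # exact binomial probability
1 - 1 / 2^2    # Chebyshev lower bound for k = 2: 0.75
```

The exact probability is well above 0.75, which is expected: Chebyshev's bound holds for any distribution with finite variance and is therefore conservative.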

tonyleidong