Poisson Distribution Practical Codes

Poisson Distribution

Formula: PPT page 67

Practice: Cookie quality. The average number of chocolate chips per cookie is 6.

1) Pick a cookie at random. What is the probability that it has fewer than 5 chips?

Success: finding a chip; X: the number of successes; there is no a priori upper limit for X (unlike the binomial distribution, where X cannot exceed the size n); lambda = 6 chocolate chips per cookie.

Solution: P(X<5) = P(X <= 4) = F(4)

R Coding:

ppois(4, lambda = 6)
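
As a quick sanity check, the cumulative probability F(4) should equal the sum of the individual probabilities P(X=0) through P(X=4). The sketch below compares the two; the variable names are illustrative only.

```r
lambda <- 6

# Cumulative probability P(X <= 4) from the Poisson CDF
p_cdf <- ppois(4, lambda = lambda)

# Same quantity built by summing the point probabilities P(X = 0), ..., P(X = 4)
p_sum <- sum(dpois(0:4, lambda = lambda))

print(round(c(cdf = p_cdf, sum = p_sum), 4))  # both are about 0.2851
```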

2) Pick a cookie at random. What is the probability that it has exactly 5 chips?

R Coding:

dpois(5, lambda = 6)
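
For comparison, dpois() can be checked against the standard Poisson probability formula P(X=k) = exp(-lambda) * lambda^k / k!. This is only a hand-computed cross-check, not a different method.

```r
lambda <- 6
k <- 5

# Built-in Poisson point probability P(X = 5)
p_builtin <- dpois(k, lambda = lambda)

# Same value from the Poisson formula exp(-lambda) * lambda^k / k!
p_formula <- exp(-lambda) * lambda^k / factorial(k)

print(round(c(builtin = p_builtin, formula = p_formula), 4))  # both are about 0.1606
```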

3) 80% of the cookies will have at most what number of chips?

We want the smallest "b" such that P(X <= b) >= 0.80 (X is discrete, so the cumulative probability may not hit 0.80 exactly).

R Coding:

qpois(0.8, lambda = 6)

The result is 8. That means at least 80% of the cookies will have 8 or fewer chips, i.e. at most 8 chips.
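
To see why the quantile is 8, it can help to look at the cumulative probabilities on either side of it. A minimal sketch, with illustrative variable names:

```r
lambda <- 6

# Smallest b with P(X <= b) >= 0.80
b <- qpois(0.8, lambda = lambda)

# The CDF one step below b is still under 0.80; at b it is at or above 0.80
below <- ppois(b - 1, lambda = lambda)  # P(X <= 7), about 0.744
at_b  <- ppois(b, lambda = lambda)      # P(X <= 8), about 0.847

print(b)
print(round(c(below = below, at_b = at_b), 3))
```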

4) 20% of the cookies will have at least what number of chips?

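Solution: P(X>=c) = P(X>c-1) = 1 - F(c-1) = 0.2, so F(c-1) = 0.8. That makes c-1 the 80% quantile, which is 8, so c = 9.

R Coding:

qpois(0.8, lambda = 6)

This returns 8 = c-1, so roughly the top 20% of the cookies will have at least 9 chips. Because X is discrete, the tail probability is not exactly 0.20: P(X>=9) = 1 - F(8) is about 0.153.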
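
The upper-tail probabilities around the cutoff can be checked directly with the lower.tail argument of ppois(). The name `cutoff` is just an illustrative label for c.

```r
lambda <- 6

# c is one more than the 80% quantile
cutoff <- qpois(0.8, lambda = lambda) + 1

# P(X >= cutoff) = P(X > cutoff - 1), using the upper tail of the CDF
tail_at_cutoff <- ppois(cutoff - 1, lambda = lambda, lower.tail = FALSE)

# One chip lower, for comparison: P(X >= cutoff - 1)
tail_one_lower <- ppois(cutoff - 2, lambda = lambda, lower.tail = FALSE)

print(cutoff)
print(round(c(at_cutoff = tail_at_cutoff, one_lower = tail_one_lower), 3))
```

The two tail probabilities (about 0.153 and 0.256) sit on either side of 0.20, which is why no whole number of chips gives exactly 20%.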

5) A cookie is considered high quality if it has at least 5 chips; otherwise, it will be discarded.

In a batch of 100 cookies, how many are expected to be discarded?

Analysis: Y = number of discarded cookies out of the 100 in the batch; the maximum for Y is 100; success -> cookie with fewer than 5 chips -> discarded.

Y is Binomial with n = 100, and we need pi to calculate its mean. pi = P(X<5) = P(X<=4) = F(4) = 0.285 (X is the number of chips on one cookie).

R Coding:

ppois(4,lambda = 6)

We get pi = 0.285, so the expected number of discarded cookies is E[Y] = n*pi = 100(0.285) = 28.5.
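
The expected value can also be cross-checked in two ways: from the binomial distribution itself, and by simulating batches. This is only a sketch; the number of simulated batches (10000) and the seed are arbitrary choices, not part of the original exercise.

```r
n <- 100
pi_discard <- ppois(4, lambda = 6)   # P(a cookie has fewer than 5 chips)

# Closed form: E[Y] = n * pi
expected <- n * pi_discard

# Same mean from the binomial probabilities: sum of y * P(Y = y)
expected_binom <- sum(0:n * dbinom(0:n, size = n, prob = pi_discard))

# Simulation: 10000 batches of 100 cookies, counting cookies with < 5 chips
set.seed(1)
chips <- matrix(rpois(n * 10000, lambda = 6), nrow = 10000)
discards <- rowSums(chips < 5)

print(round(c(formula = expected, binomial = expected_binom), 2))  # both 28.51
print(mean(discards))  # simulated average, should be close to the values above
```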

Basic Probability

Grouping and Aggregating Data


Sometimes, you may want to split a quantitative variable into groups to get summarized information. The cut() function is used to convert a quantitative variable to a grouping factor.

Agegroups = cut(Age,c(20,50,70,80))
Agegroups
x1 = table(Sex,Treatment,Agegroups,Improved)
ftable(x1)
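
To see how cut() assigns values to intervals, here is a small self-contained sketch. The `age` vector is made-up data for illustration, not the data set used above. By default each interval is open on the left and closed on the right, so 50 falls in (20,50] and 70 falls in (50,70].

```r
# Hypothetical ages, for illustration only
age <- c(23, 35, 50, 51, 64, 70, 74)

# Same break points as above: intervals (20,50], (50,70], (70,80]
groups <- cut(age, breaks = c(20, 50, 70, 80))

print(levels(groups))
print(table(groups))  # counts per interval: 3, 3, 1
```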

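You can also use the quantile() function to break the “Age” variable into roughly equal-sized groups. The quantile() function returns the quartiles of the data by default (see next section). Add include.lowest=TRUE so that the minimum age is not left out of the first interval.

Agegroups = cut(Age,quantile(Age),include.lowest=TRUE)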
table(Agegroups)
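
One detail worth checking when the breaks come from quantile(): the lowest break equals the minimum of the data, and cut() intervals are open on the left by default, so the minimum value becomes NA unless include.lowest = TRUE is set. The sketch below shows this on made-up ages (the `age` vector is hypothetical).

```r
# Hypothetical ages, for illustration only
age <- c(23, 35, 50, 51, 64, 70, 74, 29)

# Breaks at the quartiles: min, 25%, 50%, 75%, max
breaks <- quantile(age)

# Default: the minimum (23) falls outside the first interval and becomes NA
g_default <- cut(age, breaks)

# With include.lowest = TRUE the first interval is closed on the left
g_fixed <- cut(age, breaks, include.lowest = TRUE)

print(sum(is.na(g_default)))  # 1
print(sum(is.na(g_fixed)))    # 0
print(table(g_fixed))         # 2 ages in each quartile group
```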

Aggregating Numerical Data:

The function tapply(numeric variable, grouping variable, function to aggregate) is useful for computing summary statistics (length, mean, range, quantile, sd, etc.) within groups.

In tapply(), the first argument is numeric and the second is a grouping variable, which is generally a factor. The results are returned in tabular form. For example, x1 = tapply(Gas,Insul,mean) will compute the mean of “Gas” within each level of “Insul” and store it in x1. If there is missing data, add na.rm=TRUE, which tapply() passes on to the aggregating function.

If you have multiple grouping variables, then you can use the list argument as follows:

tapply(Salaries$salary,list(Salaries$sex,Salaries$rank),mean)
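
A small self-contained sketch of both forms of tapply(), including na.rm = TRUE. The data frame `df` and its values are invented for illustration and are not the Salaries data.

```r
# Hypothetical data, for illustration only (salary in thousands, one missing)
df <- data.frame(
  salary = c(50, 60, NA, 80, 90, 75),
  sex    = c("F", "M", "F", "M", "F", "M"),
  rank   = c("Asst", "Asst", "Prof", "Prof", "Prof", "Asst")
)

# One grouping variable: mean salary by sex, ignoring the missing value
by_sex <- tapply(df$salary, df$sex, mean, na.rm = TRUE)

# Two grouping variables: a sex-by-rank table of mean salaries
by_both <- tapply(df$salary, list(df$sex, df$rank), mean, na.rm = TRUE)

print(by_sex)
print(by_both)
```

Without na.rm = TRUE, any group containing the missing salary would come back as NA.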

Histogram Command:

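To get a traditional frequency (count) histogram, use the “freq” option. Note that with unequal bin widths, as in the breaks below, R warns that the bar areas are misleading and suggests freq=FALSE instead.

hist(weight,breaks=c(50,60,80,100),freq=TRUE)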
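
To inspect what the histogram would draw without opening a plot, hist() can be called with plot = FALSE and the counts and densities read from the returned object. The `weight` vector below is made-up data for illustration.

```r
# Hypothetical weights, for illustration only
weight <- c(52, 55, 61, 67, 72, 79, 83, 95)

# Compute the bins without drawing anything
h <- hist(weight, breaks = c(50, 60, 80, 100), plot = FALSE)

print(h$counts)   # frequencies per bin: 2 4 2
print(h$density)  # counts / (n * bin width): 0.025 0.025 0.0125
```

The first two bins have different counts but the same density, because the second bin is twice as wide. That is the reason a density scale is usually preferred when bin widths differ.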

GitHub – tonyleidong
