Basic Math/Probability
Expert: Josh - 12/1/2004
Question
Hi Josh, thanks so much for your help! I went through the problem on my own again and got the same answer as you did. I was just wondering if you could point me in the right direction for another question, as I'm just not sure where to start:
Q: There are 40 students in a statistics class. The instructor knows that the time needed to grade a randomly chosen midterm paper is a random variable with an expected value of 6 min and a standard deviation of 6 min. If grading times are independent and the instructor begins grading at 5:50 P.M. and grades continuously, what's the probability that she is through grading before the 10 P.M. news?
Thanks for all your help,
Anne
-------------------------
Followup To
Question -
Hi, I'm trying to work out a problem, but I can't seem to get anywhere. If you could provide some help, it'd be greatly appreciated.
Q: The tonnage of freight handled per month is normally distributed w/ an average tonnage of 225 tonnes per month, and a population standard deviation of 30 tonnes. What are the chances that in a random sample of 12 months the sample mean tonnage will be between 218 and 232 tonnes?
Thanks for your help,
Anne
Answer -
Hello Anne,
This question is about the Central Limit Theorem.
Its formal definitions are well documented, so I'll give you a more intuitive explanation and leave it up to you whether you want to do further research using a search engine on this topic.
In this example, we are taking a random sample of N=12 months. The monthly tonnages X_i are independent and identically distributed Gaussian random variables, each with mean "mu_X" and variance "sigma_X"^2. To consider their average behavior, we define their average in terms of a new random variable,
Y=(1/N)*SUM_i X_i, where i=1,2,...N.
The central limit theorem is a statement about the statistical properties of Y. The mean of Y and variance of Y are given by,
"mu_Y" = "mu_X",
"sigma_Y" = "sigma_X"/sqrt(N).
That is, the mean of the average of X1 to X12 remains the same. But the uncertainty about this average (the standard deviation) shrinks in inverse proportion to the square root of the sample size, sqrt(N).
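If you would like to convince yourself of the sqrt(N) rule, a small simulation can help. The following is only a sketch: the function name simulate_sample_means, the number of trials, and the seed are my own illustrative choices, not part of the problem. It draws many samples of 12 monthly tonnages and compares the spread of their averages with sigma_X/sqrt(N).

```python
import math
import random
import statistics


def simulate_sample_means(mu_x, sigma_x, n, trials, seed=0):
    """Return `trials` sample means, each averaged over n Gaussian draws.

    The name, the seed and the trial count are illustrative assumptions.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.gauss(mu_x, sigma_x) for _ in range(n)]
        means.append(sum(sample) / n)
    return means


# Figures from the freight question: mu_X = 225, sigma_X = 30, N = 12.
means = simulate_sample_means(225.0, 30.0, 12, trials=20000)

empirical_mean = statistics.mean(means)
empirical_sd = statistics.stdev(means)
theoretical_sd = 30.0 / math.sqrt(12)  # sigma_Y = sigma_X / sqrt(N)

print(f"mean of Y: simulated {empirical_mean:.2f}, theory 225.00")
print(f"S.D. of Y: simulated {empirical_sd:.2f}, theory {theoretical_sd:.2f}")
```

The simulated figures will differ slightly from the theoretical ones and will change with the seed, but they should agree to within a few percent.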
In order to use the standard tables in your statistics textbook, you need to convert the samples drawn from a Gaussian distribution (i.e., an arbitrary Normal distribution) into the standard Normal distribution N(0,1). The former has mean "mu" and variance "sigma"^2, while the latter has zero-mean and one unit of standard deviation. Please refer to illustrations at [http://mathworld.wolfram.com/NormalDistribution.html].
Before transformation:
Mean for X: mu_X=225
S.D. for X: sigma_X=30
Mean for Y: mu_Y=225
S.D. for Y: sigma_Y=sigma_X/sqrt(N)=30/sqrt(12)=8.66025403
After transformation:
The normalized Z score is Z=(Y-mu_Y)/sigma_Y.
The probability of Y falling between Y_lower=218 and Y_upper=232 is found by integrating the area under the Normal curve between Z_lower=(Y_lower-mu_Y)/sigma_Y and Z_upper=(Y_upper-mu_Y)/sigma_Y.
Here, Z_lower=-0.808290, Z_upper=0.808290.
Referring to the Z-table at [http://www.statsoft.com/textbook/sttable.html], you'll see that the value corresponding to Z_upper (rounded to 0.81) is about 0.2910 (if you integrate from z=0 to z=Z_upper).
Since the bell curve is symmetrical, the area between Z_lower and Z_upper is twice this value, so we conclude that
the chance of Y falling between 218 and 232 is about 58%.
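If you have Python at hand, the table lookup can be cross-checked numerically. This is a minimal sketch, not the only way to do it: the helper names standard_normal_cdf and prob_sample_mean_between are made up for illustration, and the CDF is built from math.erf in the standard library using Phi(z) = 0.5*(1 + erf(z/sqrt(2))).

```python
import math


def standard_normal_cdf(z):
    """Area under N(0,1) from -infinity to z, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_sample_mean_between(lower, upper, mu_x, sigma_x, n):
    """P(lower < Y < upper), where Y is the mean of n i.i.d. Normal draws.

    Hypothetical helper written for this example only.
    """
    sigma_y = sigma_x / math.sqrt(n)       # S.D. of the sample mean
    z_lower = (lower - mu_x) / sigma_y     # normalized lower bound
    z_upper = (upper - mu_x) / sigma_y     # normalized upper bound
    return standard_normal_cdf(z_upper) - standard_normal_cdf(z_lower)


# Freight question: mu_X = 225, sigma_X = 30, N = 12, bounds 218 and 232.
p_freight = prob_sample_mean_between(218.0, 232.0, 225.0, 30.0, 12)
print(f"P(218 < Y < 232) = {p_freight:.4f}")
```

Taking the difference of two CDF values avoids the "area from 0 to z" convention of the printed table, so there is no need to double anything by hand.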
Cheers.
Answer
Hi Anne,
I tried to reply to your question last Friday, but the web server has rejected my submissions on five separate occasions. Eventually, I posted the answer in the message section of my profile [http://www.allexperts.com/displayExpert.asp?Expert=46653]. Not sure if you had noticed.
Although the question does not state the distribution of each random variable X_i, Y=(X1+...+XN)/N tends towards a Gaussian distribution if N is sufficiently large. As a rule of thumb, we need N>=20.
Here, N=40. Appealing to the central limit theorem again, we have
mu_Y = mu_X_i = 6 ...[#1]
sigma_Y=sigma_X_i/sqrt(N)=6/sqrt(40) ...[#2]
The time elapsed between 5:50PM and 10:00PM is 250 minutes. This imposes an upper bound on the average time that the instructor can spend grading each paper.
Y_upper=250/40=6.25 ...[#3]
So, the question is asking what is the probability that Y<=Y_upper. To use the standard normal cumulative distribution function (CDF), let Z=(Y-mu_Y)/sigma_Y.
Note: Last time, the figure I quoted corresponded to the area integrated between 0 and Z_upper. This time, you can either add 0.5 to the lookup value OR integrate from -infinity to Z_upper by using the normal CDF.
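Once you have worked it through by hand, you can check your figure with a few lines of Python. This is again just a sketch under the normal approximation described above. The variable names are my own, and the CDF is computed from math.erf rather than read from a table.

```python
import math


def standard_normal_cdf(z):
    """Area under N(0,1) from -infinity to z, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


n_papers = 40              # N, number of midterm papers
mu_x = 6.0                 # expected grading time per paper (min)
sigma_x = 6.0              # S.D. of grading time per paper (min)
minutes_available = 250.0  # 5:50PM to 10:00PM

y_upper = minutes_available / n_papers   # equation [#3]
sigma_y = sigma_x / math.sqrt(n_papers)  # equation [#2]
z_upper = (y_upper - mu_x) / sigma_y     # normalized Z score

# One-sided probability P(Y <= Y_upper), integrated from -infinity.
p_done = standard_normal_cdf(z_upper)

print(f"Y_upper = {y_upper:.2f} min, Z_upper = {z_upper:.4f}")
print(f"P(finished before 10PM) = {p_done:.4f}")
```

Because the CDF already integrates from -infinity, there is no 0.5 to add here. That step is only needed when reading a table that starts at z=0.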
I think you can now retrace the steps given last time to finish the problem. Let me know how you go.
Cheers.