Basic Math/Mean and Standard Deviation
Expert: Josh - 12/11/2007
QUESTION: Let n be any odd whole number.
a) Which has the greater mean: the odd whole numbers from 1 to n, or the even whole numbers from 2 to n-1?
b) Which has the greater standard deviation: the odd whole numbers from 1 to n, or the even whole numbers from 2 to n-1?
ANSWER: Hi Jamie
Part a) describes two arithmetic sequences, each with a difference of 2 between consecutive terms. To decide which has the greater mean, we can apply the arithmetic sum formula. [Do a search on Google if you are interested. The main observation is that the first and last terms can be added together, then the second and second-last terms can be added together, and so forth; every such pair has the same value A+L, and there are N/2 pairs.]
FORMULA:
Sum = (A+L)*N/2, where N represents the number of terms in the sum, A and L denote the first and last term, respectively.
i) Let S=1+3+...+n. We find N=(1+n)/2, A=1, L=n.
Using the formula, S adds up to (1+n)*[(1+n)/2]/2 = (1+n)^2/4.
Thus, the average is [(1+n)^2/4]/[(1+n)/2] = (1+n)/2.
ii) Let T=2+4+...+(n-1). Here A=2, L=(n-1), and there are N=(1+n)/2-1 terms (one less than before).
Applying the arithmetic sum formula, we get
T = (2+n-1)*[(1+n)/2-1]/2 ......[#1]
= (1+n)*[(1+n)-2]/4
= (1+n)^2/4 -(1+n)/2.
Although the sum is less than before, dividing [#1] by the number of terms N=[(1+n)/2-1] shows that the average is the SAME as before: (2+n-1)/2 = (1+n)/2.
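The result of part a) can be checked numerically. The following is only a minimal sketch: the helper names `arithmetic_sum` and `means` are made up for this illustration, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

def arithmetic_sum(first, last, count):
    # Sum = (A + L) * N / 2. The product (A + L) * N is twice an
    # integer sum, so integer division by 2 is exact here.
    return (first + last) * count // 2

def means(n):
    """Return (mean of odds 1..n, mean of evens 2..n-1) for odd n >= 3."""
    odd_count = (n + 1) // 2   # N = (1+n)/2
    even_count = (n - 1) // 2  # one less than before
    odd_sum = arithmetic_sum(1, n, odd_count)
    even_sum = arithmetic_sum(2, n - 1, even_count)
    return Fraction(odd_sum, odd_count), Fraction(even_sum, even_count)

for n in (3, 7, 71):
    odd_mean, even_mean = means(n)
    print(n, odd_mean, even_mean)  # both means equal (1+n)/2
```

For n=71 both means come out as 36, in line with (1+n)/2.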
b) To answer this question properly, we need formal analysis which involves a fair bit of algebra. If you want to get to the bottom of this, you need to be patient.
In statistics, the standard deviation is simply the square root of the variance. So if the variance of one set of numbers is greater than the variance of another set, the same is true of their respective standard deviations.
Notation:
Let X and Y denote the sets of numbers {x[1],x[2],x[3],...x[N]}={1,3,5,...,n} and {y[1],y[2],y[3],...y[M]}={2,4,6,...,n-1}, respectively. There are N=(n+1)/2 terms in set X, and M=(n-1)/2 terms in set Y.
Definitions:
Variance of X = Mean of X^2 - [Mean of X]^2
Variance of Y = Mean of Y^2 - [Mean of Y]^2.
Since we have established in part a) that the mean of X is equivalent to the mean of Y, we can ignore the terms [Mean of X]^2 and [Mean of Y]^2 from the above equations in our comparison. Whether the variance is greater for the odd numbers (in X), or even numbers (in Y) will be determined based on the magnitude of "Mean of X^2" versus "Mean of Y^2". This is what we will focus on: finding the average of the squared odd numbers X[i]^2, and the average of the squared even numbers Y[i]^2.
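The definition "variance = mean of squares minus squared mean" gives the population variance (dividing by the number of terms). As a hedged illustration, the sketch below compares it with `pvariance` from Python's standard `statistics` module; the function name `variance_via_mean_of_squares` is invented for this example.

```python
from fractions import Fraction
from statistics import pvariance

def variance_via_mean_of_squares(values):
    """Population variance computed as Mean of X^2 - [Mean of X]^2."""
    count = len(values)
    mean = Fraction(sum(values), count)
    mean_of_squares = Fraction(sum(v * v for v in values), count)
    return mean_of_squares - mean ** 2

X = list(range(1, 12, 2))  # odd numbers 1..11
Y = list(range(2, 11, 2))  # even numbers 2..10

for name, values in (("X", X), ("Y", Y)):
    by_definition = variance_via_mean_of_squares(values)
    # Passing Fractions keeps the library result exact as well.
    by_library = pvariance([Fraction(v) for v in values])
    print(name, by_definition, by_library)
```

Because X and Y share the same mean, the two variances differ exactly by the difference of the two means of squares.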
Let S = {S[1],S[2],......,S[N]} = {1,9,25,49,...,n^2}
T = {T[1],T[2],...,T[M]} = {4,16,36,...,(n-1)^2}.
These are the squared values for the corresponding odd and even numbers in X and Y.
If T[i]=k^2, then, T[i]-S[i] = k^2 -(k-1)^2 = 2k-1.
T[i+1]-S[i+1] = (k+2)^2 - (k+1)^2 = 2k+3.
In fact, it can be shown that T[i+j]-S[i+j] = (k+2j)^2-(k+2j-1)^2 = 2k+4j-1.
Example: When i=1, T[1]=4, S[1]=1. The value of k=2 since 2^2=4. As predicted, 2k-1=3 is the difference between T[1] and S[1]. With i=1, j can take any value from 0 to M-1. If we choose j=1, with k=2 for illustration, we can see once again that T[2]-S[2] = 2k+4j-1 = 7 is the difference between the two terms T[2]=16, S[2]=9.
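To see the pattern of differences concretely, one can list T[i]-S[i] for a small n. This is only an illustrative sketch (the function name `square_differences` is made up here); starting from T[1]=2^2, i.e. k=2, the i-th difference works out to 2k+4(i-1)-1 = 4i-1.

```python
def square_differences(n):
    """Differences T[i] - S[i] between the i-th even square and the
    i-th odd square, for i = 1..M, where n is odd and M = (n-1)/2."""
    m = (n - 1) // 2
    return [(2 * i) ** 2 - (2 * i - 1) ** 2 for i in range(1, m + 1)]

diffs = square_differences(11)
print(diffs)  # [3, 7, 11, 15, 19]

# Each difference equals 4i - 1, so consecutive differences grow by 4.
print(all(d == 4 * i - 1 for i, d in enumerate(diffs, start=1)))  # True
```

The differences themselves form an arithmetic sequence, which is why the arithmetic sum formula can be applied to them.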
This enables us to ultimately compare the sum of the numbers in S with the sum of the numbers in T.
Specifically, if we let the difference d[i]= T[i]-S[i]= 2k+[4(i-1)-1], where k=2 because T[1]=2^2,
[4+16+...+(n-1)^2]-[1+9+...+(n-2)^2] = SUM d[i] from i=1 to i=M, where M is the number of terms in T (one less than N, the number of terms in S).
Applying the arithmetic sum formula,
[4+16+...+(n-1)^2]-[1+9+...+(n-2)^2]
= 2kM + [-1+4(M-1)-1]*M/2 ......simplify using k=2, M=(n-1)/2
= 2(n-1)+(n-4)(n-1)/2.
Note: This is the accumulated difference between the squared odd and even numbers, i.e., the S[i] terms and T[i] terms. The summation goes from 1 to M, neglecting the last term S[N] in the set S.
Thus, subtracting the sum of squared even numbers from the sum of ALL squared odd numbers, and noting the extra n^2 term now incorporated into this expression,
[1+9+...+(n-2)^2+n^2]-[4+16+...+(n-1)^2]
= n^2 -[2(n-1)+(n-4)(n-1)/2]
= n^2 -[2n-2+(n^2-5n+4)/2]
=(n^2+n)/2
In other words, [1+9+...+(n-2)^2+n^2] is greater than [4+16+...+(n-1)^2] by (n^2+n)/2 ...[#2]
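Result [#2] is easy to test by brute force. A small sketch, with a function name (`odd_minus_even_squares`) chosen just for this illustration:

```python
def odd_minus_even_squares(n):
    """(1^2 + 3^2 + ... + n^2) - (2^2 + 4^2 + ... + (n-1)^2) for odd n."""
    odd_squares = sum(k * k for k in range(1, n + 1, 2))
    even_squares = sum(k * k for k in range(2, n, 2))
    return odd_squares - even_squares

for n in (3, 9, 71):
    # The second and third columns should agree: brute force vs (n^2+n)/2.
    print(n, odd_minus_even_squares(n), (n * n + n) // 2)
```

For n=3 this gives 1+9-4 = 6, which matches (3^2+3)/2 = 6.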
Next, we use a well known result regarding the sum of squares:
1 + 2^2 + 3^2 + 4^2 +....+ n^2 = (n/6)(n+1)(2n+1) ...[#3]
Observe that the left hand side can be split into the sum of squared odd numbers, and the sum of squared even numbers. i.e.,
1 + 2^2 + 3^2 + 4^2 +....+ n^2
= [1+9+...+n^2] + [4+16+...+(n-1)^2]
= 2 * [4+16+...+(n-1)^2] + (n^2+n)/2 ......using the result from [#2]
Thus, according to [#3],
[4+16+...+(n-1)^2] = { (n/6)(n+1)(2n+1) - (n^2+n)/2 }/2 ...[#4]
We now have a direct formula for sum of squared even numbers from 2 to (n-1). After some algebra, it can be shown that the expression in [#4] simplifies to
[4+16+...+(n-1)^2] = (n^3-n)/6 .....[#5]
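The algebra from [#3] to [#5] can be double-checked numerically. The sketch below is an illustration only; the three function names are assumptions introduced here, each mirroring one of the numbered results.

```python
def sum_of_squares(n):
    # [#3]: 1 + 2^2 + ... + n^2 = (n/6)(n+1)(2n+1)
    return n * (n + 1) * (2 * n + 1) // 6

def even_squares_via_4(n):
    # [#4]: { [#3] - (n^2+n)/2 } / 2
    return (sum_of_squares(n) - (n * n + n) // 2) // 2

def even_squares_via_5(n):
    # [#5]: (n^3 - n)/6
    return (n ** 3 - n) // 6

n = 71
brute_force = sum(k * k for k in range(2, n, 2))  # 2^2 + 4^2 + ... + 70^2
print(sum_of_squares(n), even_squares_via_4(n), even_squares_via_5(n), brute_force)
```

All integer divisions here are exact: n^3-n is a product of three consecutive integers and so is divisible by 6, and n^2+n = n(n+1) is always even.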
Last step: Since the two means are equal, the following statements are all equivalent:
- The mean of S is greater than the mean of T;
- The average value of the X[i]^2 terms in the odd sequence is greater than the average value of the Y[i]^2 terms in the even sequence;
- The variance of X (set of odd integers from 1 to n) is greater than the variance of Y (set of even integers from 2 to n-1);
- The standard deviation of X (set of odd integers from 1 to n) is greater than the standard deviation of Y (set of even integers from 2 to n-1).
All of them hold IF WE CAN SHOW THAT
The average of [1,9,...,(n-2)^2,n^2] is greater than the average of [4,16,...,(n-1)^2],
i.e., we need to check that
[(n^3-n)/6 +(n^2+n)/2 ]/N > [(n^3-n)/6]/M , where N=(n+1)/2, M=(n-1)/2
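....substitute N and M, and cancel the common factor of 2
[(n^3-n)/6 +(n^2+n)/2 ]/(n+1) > [(n^3-n)/6]/(n-1)
....cross-multiply (both n+1 and n-1 are positive for n>1) and collect the terms in the left bracket
[(1/6)n^3+(1/2)n^2+(1/3)n]*(n-1) > [(n^3-n)/6]*(n+1)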
....after some messy algebra
(1/6)n^4+(1/2-1/6)n^3+(1/3-1/2)n^2-(1/3)n > (1/6)n^4+(1/6)n^3-(1/6)n^2-(1/6)n
....multiply both sides by 6
n^4+2n^3-n^2-2n > n^4+n^3-n^2-n
....finally we arrive at
n^3-n > 0
....which is always true for n>1
Conclusion: Provided we haven't made any mistake, this result shows that for every odd n>1 the standard deviation of [1,3,...,(n-2),n] is greater than the standard deviation of [2,4,...,(n-1)].
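As a sanity check on the conclusion, the comparison can be run directly for many odd values of n. This is a rough sketch rather than a proof; it uses `pstdev` (population standard deviation) from Python's standard `statistics` module, and the function name `odd_has_greater_std` is invented for the illustration.

```python
from statistics import pstdev

def odd_has_greater_std(n):
    """True if the odd numbers 1..n have a larger population standard
    deviation than the even numbers 2..n-1 (n odd, n >= 3)."""
    odds = list(range(1, n + 1, 2))
    evens = list(range(2, n, 2))
    return pstdev(odds) > pstdev(evens)

# Check every odd n from 3 up to 999.
print(all(odd_has_greater_std(n) for n in range(3, 1001, 2)))  # True
```

The smallest case n=3 compares {1,3} with the single number {2}, whose standard deviation is 0.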
Numerical Example:
n=71,
Prediction:
i) 1^2 + 2^2 + 3^2 +...+71^2 = (n/6)(n+1)(2n+1) = 121836
ii) From [#2], [1+9+...+(n-2)^2+n^2]-[4+16+...+(n-1)^2] = (n^2+n)/2 = 2556
iii) From [#5], sum of squared even integers: 2^2+4^2+...+70^2 = (n^3-n)/6 = 59640
Verification (from computer program):
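SUM X[i]^2 = 1+3^2+...+71^2 = 62196
SUM Y[i]^2 = 2^2+4^2+...+70^2 = 59640
Mean {1,3,5,......,71} = 36
Mean {2,4,6,...,70} = 36
Mean of squares {1,9,25,......,71^2} = 62196/36 = 1727.666666
Mean of squares {4,16,36,...,70^2} = 59640/35 = 1704
Variance {1,3,5,......,71} = 1727.666666 - 36^2 = 431.666666
Variance {2,4,6,...,70} = 1704 - 36^2 = 408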
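For anyone who wants to reproduce the n=71 figures, here is one possible way to do it. The original program is not shown, so this is only an assumed reconstruction; `summary` is a hypothetical helper, and the variance is the population variance (mean of squares minus squared mean).

```python
from fractions import Fraction

def summary(values):
    """Return (sum of squares, mean, mean of squares, population variance)."""
    values = list(values)
    count = len(values)
    sum_sq = sum(v * v for v in values)
    mean = Fraction(sum(values), count)
    mean_sq = Fraction(sum_sq, count)
    return sum_sq, mean, mean_sq, mean_sq - mean ** 2

n = 71
for label, values in (("odd ", range(1, n + 1, 2)), ("even", range(2, n, 2))):
    sum_sq, mean, mean_sq, variance = summary(values)
    print(label, sum_sq, mean, float(mean_sq), float(variance))
```

The sums of squares are 62196 and 59640, both means are 36, and the odd sequence has the larger mean of squares and therefore the larger variance.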
---------- FOLLOW-UP ----------
QUESTION: So for part a), you are saying that the odd whole numbers from 1 to n have the greater mean, and for part b) you are saying that the even whole numbers from 2 to n-1 have the greater standard deviation?
ANSWER: Hi Jamie,
a) No, actually both the odd sequence 1 to n and the even sequence 2 to n-1 have the SAME mean, (1+n)/2. You may have missed something; if you go over my response to part (a) again, you will see why this is. The example given at the end also verifies this for the case n=71.
b) The bottom line is that the standard deviation for the odd sequence {1,3,...,n} is GREATER than the standard deviation for the even sequence {2,4,...,n-1}. Have a look at the example given at the end of my last reply to convince yourself.
I am aware that the solution offered for part b) may at first look very complicated. But I cannot see any alternative or obvious way of proving this result beyond any doubt. If you try following the arguments, and ask questions to clarify anything that you don't understand along the way, you may start to see its beauty. The reasoning is more important than the mechanics, which are just a messy algebra exercise. You have come so far now that it's important you don't give up; ask plenty of questions to make sure the procedure makes sense to you at every point. Knowing the correct answer is one thing, but it means very little if you cannot justify how you arrived at the solution. I'm here to offer support, so don't hesitate to ask questions.
Cheers:)