Statistics: comparing two population proportions or means

QUESTION: I have been in a debate with some friends about the process of comparing means/proportions in two populations. For example, consider the question of whether the same proportion of males and of females like ice cream. A typical way of dealing with this question seems to be to take an independent random sample from each of the two populations and then perform some statistical test, usually either a z or a t test, on the two samples.

The question I have is: what is the appropriate null hypothesis to test? Typically the null hypothesis used seems to be that "the two means/percents are the same". Here is my issue. If the two populations are large, or if the random variable is continuous, then the likelihood (in the colloquial sense) that the two populations have the same mean/percent is either extremely close to zero, or zero. One doesn't need a statistical test to argue this. It is a priori obvious. If one does perform a stat test and doesn't detect a difference, all that means is the power of the test was too weak to detect what we already know. However, it seems to me that one would be a fool to be agnostic as to whether the two means/proportions are in fact different.

I think that what this points to is that "the means/percents of ... in the 2 populations are the same" is the wrong null hypothesis to test. Let's say we are comparing some percent in two populations, and let's say that we detect a difference at the 5 sigma significance level. Do we really care when the two percents differ at the 9th decimal place? It depends on the question we are trying to answer, but probably not. I argue that instead of this null hypothesis, what should really be tested is the null hypothesis which states "the difference in the percent of ... between the two populations is greater than 5%" or something like that. Then if you get a statistically significant result, you also have a practically significant result.

Is my logic sound, or am I missing something?

ANSWER: No -- your logic is not sound.

You are really running up against three issues, which I can illustrate by example.

Human height is a random variable. It is affected by many factors, and is a very complex thing, but for our purposes, we can assume human height is determined by some random process that (in theory) leads to a normal distribution.

However, there are not infinitely many humans, nor could such humans fill in a continuous distribution since height is measured only to a particular refinement (inches, mm, whatever).

So there is a difference between the actual distribution of humans, which is literally the set of all heights of all humans, and the ideal or theoretical distribution that we assert would happen if the number of humans approached infinity (that is, the limit distribution).

There is

The null hypothesis (and its rejection) is

In particular, this part of your argument is flawed:

"If the two populations are large, or if the random variable is continuous, then the likelihood (in the colloquial sense) that the two populations have the same mean/percent is either extremely close to zero, or zero. One doesn't need a statistical test to argue this. It is a priori obvious. If one does perform a stat test and doesn't detect a difference, all that means is the power of the test was too weak to detect what we already know."

A random variable has a probability distribution. The parameters of that distribution are

Further, if you have two populations and you take a sample, you could observe that one has mean 7.01 and the other population has mean 7.00. There are two

Whether a difference of 0.01 is

---------- FOLLOW-UP ----------

QUESTION: Thank you for your response.

It appears that I wasn't being very clear in my description. Let me address some of the points that you made with the aim of clearing up the position I am arguing. Let's ignore measurement resolution/errors in heights etc. to keep things simple and focused on the statistical aspects of the problem we are discussing. If you like, we can consider all height measurements to be accurate to, let us say, an inch for the purposes of this discussion.

First off, I don't see how your discussion on limit distributions relates to the question. I am talking about two finite populations. Regardless of how they were created (deterministic, random, ...), both populations have means which can in principle be measured exactly by simply measuring every person in the two populations and doing some arithmetic.

I understand that the null hypothesis is a statement about the entire population, and not the samples taken. I also understand that the sample mean and population mean will, in general, be different. I also understand that the sample mean gives a point estimator of the population mean, and that the sample standard deviation can be used as an estimator of the population standard deviation. One can then use the sample mean and sample standard deviation to construct approximate p values and confidence intervals, which can be used to determine if a certain null hypothesis can be rejected at a given alpha.

I also agree that human height is affected "by some random process that (in theory) leads to a normal distribution." It might not be normal, but that is beside the point. My point is that if one were to measure the heights of EVERY person (not a sub-sample) in two disjoint populations, and then calculate the two POPULATION means, they would almost certainly be different. This is precisely BECAUSE heights are determined, at least partially, by random processes.

Consider the following example:

Consider two groups of men with 100,000 men each, selected, via a random process, from the total population of men on earth. Let's label these groups A and B. In this example, the POPULATIONS under consideration are group A and group B. We are NOT considering group A and group B to be sub-samples of the total population of men on earth (though they of course are) and are NOT using them to estimate any parameters of the total population of men on the earth. We are considering them to be two complete populations in and of themselves (which they also are).

Now we can ask some questions about these two populations. One question is whether the mean heights of the two populations are the same or not. If we have access to all the men in both populations, we can directly measure the means, without resorting to any statistics. We simply measure the heights of all 200,000 men and do a little bit of math. We then know the two population means EXACTLY.

Now suppose we are only allowed to measure 40 men from each group. Let's go do a stat test on these two populations to compare their means. We take a random sample of 40 people from each of the two populations of 100,000. Then we calculate the sample means and standard deviations of the two samples we have collected, and use these values in a Z or T test against the hypothesis "The mean height of the men in group A is the same as the mean height of the men in group B".
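The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `two_sample_z`, the normal model for heights, the means of 69.1 and 69.0 inches, and the standard deviation of 3 inches are all made up for the example, and the p-value uses the large-sample normal approximation rather than a t distribution.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def two_sample_z(sample_a, sample_b):
    """Return (z, two-sided p-value) for H0: the two population means are equal.

    Uses the normal approximation with unpooled sample variances.
    """
    se = math.sqrt(stdev(sample_a) ** 2 / len(sample_a)
                   + stdev(sample_b) ** 2 / len(sample_b))
    z = (mean(sample_a) - mean(sample_b)) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical populations A and B: heights in inches, generated with
# true means that differ by only 0.1 inch (an assumption for illustration).
group_a = [rng.gauss(69.1, 3.0) for _ in range(100_000)]
group_b = [rng.gauss(69.0, 3.0) for _ in range(100_000)]

# Measure only 40 men from each group, as in the example.
z, p = two_sample_z(rng.sample(group_a, 40), rng.sample(group_b, 40))
print(f"z = {z:.2f}, p = {p:.3f}")
```

With a standard error of roughly 0.67 inches at this sample size, a true difference of 0.1 inch will usually not produce a small p-value, which is the situation the question goes on to discuss.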

Since our sample size is pretty small, it is likely that we will NOT be able to reject the null hypothesis, but this is just due to the fact that we have a weak test due to a small sample size. We know that the means in populations A and B almost certainly have to be different, without even measuring them, simply BECAUSE the heights of the individuals in the two populations are, in part, determined by random events. If I flip a coin 200,000 times, the probability that the number of heads in the first 100,000 flips is the same as the number of heads in the second 100,000 flips is almost zero. More specific to this example, we could calculate the probability that two random samples of 100,000 men, taken from the population of the earth, would have the same sample means. We can do this because we know that our populations A and B are in fact random samples taken from the population of all men on earth. I haven't done this calculation, but I am certain that the probability is exceedingly small.
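The coin-flip probability mentioned above can be computed directly. For two independent counts X and Y, each Binomial(n, 1/2), the identity sum_k C(n, k)^2 = C(2n, n) gives P(X = Y) = C(2n, n) / 4^n. The sketch below (the function name `prob_equal_heads` is just an illustrative choice) evaluates this in log space to avoid overflow:

```python
import math


def prob_equal_heads(n):
    """P(X == Y) for independent X, Y ~ Binomial(n, 1/2).

    Equals C(2n, n) / 4^n, computed via log-gamma to avoid huge integers.
    """
    log_p = (math.lgamma(2 * n + 1) - 2 * math.lgamma(n + 1)
             - 2 * n * math.log(2))
    return math.exp(log_p)


n = 100_000
p_equal = prob_equal_heads(n)
approx = 1 / math.sqrt(math.pi * n)  # Stirling approximation to the same value

print(f"P(equal head counts) = {p_equal:.6f}")
print(f"1 / sqrt(pi * n)     = {approx:.6f}")
```

For n = 100,000 this comes out to about 0.0018, or roughly 1 chance in 560: small, though not vanishingly so for a discrete count. For a truly continuous variable the probability of an exact tie is zero.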

So my question then is: What is the point of testing the null hypothesis that "The mean height of the men in population A is the same as the mean height of the men in population B"? We know with near certainty that it is false, even if we can't reject it based on our statistical test on our two samples of 40 people. The reason we know is that we know that the populations were constructed via a random process.

In the above example, the random process which created our two populations was that they were both random sub-populations of the total population of men on earth. This is admittedly contrived. A more realistic example would be comparing the mean height of men in California to that of men in Ohio. Here the random processes affecting our two populations would be primarily genetic (copy errors, etc.) with some input from random environmental factors (cosmic rays, etc.).

While the above example is contrived, I hope that it illustrates my point that if two large populations are constructed, at least partially, via random processes, that the null hypothesis "the two populations have the same mean" will almost always be false, and hence is the wrong hypothesis to test. Much more useful, I argue, is the hypothesis "The means of population A and B differ by more than ...".

Also, note that if I am actually interested in the question "the mean heights of the two groups differ by more than 4 inches", knowing that I can reject "The mean heights of the two groups are the same" at the .05 level doesn't tell me if I can be 95% confident that the two means differ by more than 4 inches. To determine that, I need to reject the null hypothesis "the population means in the two groups differ by less than 4 inches" at the .05 level. Thus, to me, it DOES seem to make sense to incorporate the "practical" difference resolution one is interested in into the null hypothesis being tested.
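A minimal sketch of the distinction drawn above, assuming a one-sided large-sample z test and made-up summary statistics (sample means of 74 and 69 inches, standard deviations of 3 inches, 40 men per group). The function name `z_test_difference_exceeds` and the margin parameter `delta` are illustrative, and for simplicity only one direction of the difference (A taller than B) is tested:

```python
import math
from statistics import NormalDist


def z_test_difference_exceeds(mean_a, mean_b, sd_a, sd_b, n_a, n_b, delta):
    """One-sided z test of H0: mu_a - mu_b <= delta vs H1: mu_a - mu_b > delta.

    Returns (z, p-value) under the normal approximation.
    """
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = (mean_a - mean_b - delta) / se
    p = 1 - NormalDist().cdf(z)
    return z, p


# Hypothetical summary statistics: observed difference of 5 inches.
z_zero, p_zero = z_test_difference_exceeds(74.0, 69.0, 3.0, 3.0, 40, 40, delta=0.0)
z_four, p_four = z_test_difference_exceeds(74.0, 69.0, 3.0, 3.0, 40, 40, delta=4.0)

print(f"H0: difference <= 0 inches: z = {z_zero:.2f}, p = {p_zero:.4f}")
print(f"H0: difference <= 4 inches: z = {z_four:.2f}, p = {p_four:.4f}")
```

With these numbers the "no difference" null is rejected decisively, while the "at most 4 inches" null is not rejected at the .05 level, even though the observed difference is 5 inches. The two tests answer different questions.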

ANSWER: You are viewing the problem in the wrong way.

The means of these populations

The sample means are random variables, but the null hypothesis is

What you test is whether the difference between the sample's means is

Keep in mind, if you literally just took two population means and said "these means differ by 2 inches (or 0.002 inches)," then you have indeed proved that the two population means are different. However, this is

As for the "random process" creating the sample, that's not just some "random process," it's the sampling of the two populations. The

If your samples are small (40 is pretty small), the samples' means may be very different, because samples that small are too noisy to give statistically significant results. If your samples are larger, each sample mean will be closer to its population mean, so the two sample means will be close whenever the population means are. They may not be equal, but they will be close to equal. (Or, if the means are far apart and you have a large enough sample, you'd reject the null hypothesis.)
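The effect of sample size described above can be made concrete with the standard error of the difference between two sample means, sqrt(sd_a^2/n + sd_b^2/n). The sketch below assumes a standard deviation of 3 inches in both groups (a made-up figure) and equal sample sizes; the function name `se_of_difference` is illustrative:

```python
import math


def se_of_difference(sd_a, sd_b, n):
    """Standard error of (sample mean A - sample mean B) with n per group."""
    return math.sqrt(sd_a ** 2 / n + sd_b ** 2 / n)


# Each 100-fold increase in sample size cuts the standard error by a factor of 10.
for n in (40, 4_000, 400_000):
    print(n, round(se_of_difference(3.0, 3.0, n), 4))
```

At 40 per group the standard error is about 0.67 inches, so sample means can easily differ by an inch by chance alone; at 4,000 per group it is about 0.067 inches.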

As for your four inches question, the null hypothesis is not what you think it is. To quote Wikipedia, which sums it up well enough:

The null hypothesis is "These are not different." It

I can answer all questions up to, and including, graduate level mathematics. I do not have expertise in statistics (I can answer questions about the mathematical foundations of statistics). I am very much proficient in probability. I am not inclined to answer questions that appear to be homework, nor questions that are not meaningful or advanced in any way.

I am a PhD educated mathematician working in research at a major university.

**Organizations**

AMS

**Publications**

Various research journals of mathematics. Various talks & presentations (some short, some long), about either interesting classical material or about research work.

**Education/Credentials**

BA mathematics & physics, PhD mathematics from a top 20 US school.

**Awards and Honors**

Various honors related to grades, various fellowships & scholarships, awards for contributions to mathematics and education at my schools, etc.

**Past/Present Clients**

In the past, and as my career progresses, I have worked and continue to work as an educator and mentor to students of varying age levels, skill levels, and educational levels.