QUESTION: Hello, I have a very large bucket of several thousand balls. All I know is that almost all the balls are black and a small percentage are red. If I take out one ball at a time and note the color, what is the minimum number of balls I need to take out to know the ratio of black to red balls with almost certain accuracy?
Im looking for an answer will sound something like: you will need X number of red balls to be 90% sure of the ratio and Y number of red balls to be 99% sure etc.
Thanks !
ANSWER: mike -
the sort of answer you want is not clear. in particular, the sentence you wrote:
"Im looking for an answer will sound something like: you will need X number of red balls to be 90% sure of the ratio and Y number of red balls to be 99% sure etc."
seems garbled.
if you rephrase the question a bit more carefully, i might get a better grasp of what you are looking for.
ronny
---------- FOLLOW-UP ----------
QUESTION: ronny, to be clearer, I'll give you an example of my sampling.
I am trying to determine the ratio of red to black balls as early as possible. The first red ball comes out after taking 99 black balls. At that stage I have a very rough idea that the ratio is 1 red in 100. Then after the next 79 black balls, a red one comes. The average is now 180/2, so the ratio is 1 in 90. Then after 119 more black balls, a red one comes. At that stage I have taken 300 balls and 3 of them are red, so the ratio is 1 in 100. How many more samples do I need to take to be 95% or 99% sure of having the correct ratio for all the balls?
ANSWER: mike -
let's be clear about one thing straightaway: no sampling scheme will let you assert, with 95% (or whatever) confidence, that you know the true fraction of red balls exactly. any statistical sampling scheme can only result in a statement like:
"i am 95% confident that the true fraction of red balls is between certain limits - such as between .04 and .06, say."
this can also be restated as: the true proportion of red balls is .05, with a margin of error of ±.01.
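to make that kind of statement concrete, here is a minimal sketch that turns the counts from your example (3 red in 300 draws) into a 95% interval. it assumes the draws behave like independent trials (reasonable only if the bucket is large compared with the sample) and it uses the wilson score interval, which is one common choice when the proportion is small - other interval methods exist and give somewhat different limits. the function name is just illustrative.

```python
from statistics import NormalDist


def wilson_interval(reds, draws, confidence=0.95):
    """Wilson score interval for a proportion; returns (low, high)."""
    # Two-sided critical value of the standard normal, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = reds / draws
    denom = 1 + z * z / draws
    center = (p_hat + z * z / (2 * draws)) / denom
    spread = (p_hat * (1 - p_hat) / draws + z * z / (4 * draws ** 2)) ** 0.5
    half_width = z * spread / denom
    return center - half_width, center + half_width


# Counts taken from the asker's example: 3 red balls in 300 draws.
low, high = wilson_interval(reds=3, draws=300)
print(f"95% interval for the red fraction: {low:.4f} to {high:.4f}")
```

with only 3 red balls seen, the limits come out at roughly 0.3% and 2.9% - so "1 in 100" is still a very loose estimate at that stage.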
so if you are willing to accept this limitation on what can be learned from statistical sampling, we can proceed from there.
in that case, you need to decide on what degree of confidence you want (such as 95%) and how large a margin of error you can live with. the smaller the margin of error - or the higher the degree of confidence desired - the larger will be the amount of sampling required.
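the trade-off can be sketched for direct sampling with the usual normal-approximation formula n = z^2 * p * (1 - p) / E^2, where p is a rough guess at the red fraction and E is the margin of error. this is only a sketch under assumptions: the guess p = 0.01 comes from your example, the margin of ±0.005 is a hypothetical choice, the draws are treated as independent, and the normal approximation is known to be rough when p is this small.

```python
from math import ceil
from statistics import NormalDist


def required_draws(guessed_fraction, margin, confidence=0.95):
    """Approximate direct-sample size for a given margin of error."""
    # Two-sided critical value of the standard normal for this confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    variance = guessed_fraction * (1 - guessed_fraction)
    return ceil(z * z * variance / margin ** 2)


# Hypothetical target: red fraction near 0.01, margin of error 0.005.
for confidence in (0.90, 0.95, 0.99):
    n = required_draws(guessed_fraction=0.01, margin=0.005, confidence=confidence)
    print(f"{confidence:.0%} confidence: about {n} draws")
```

because the margin is squared in the denominator, halving the margin of error roughly quadruples the number of draws.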
there is more that has to be specified:
the sampling scheme you describe - drawing until some number of red balls is obtained - is called inverse sampling. you don't know in advance the total number of balls you will have to draw.
one can also do direct sampling - saying, for example, "i will draw 500 balls and see how many of them are red."
either scheme will give an answer of the same sort - an estimate of the true proportion and a margin of error.
which sampling scheme to use may be based on the particulars of the situation one is dealing with.
for example, with a fixed budget one might be able to afford a direct sample of 500, say. an inverse scheme might require more than 500 draws - and one might not be in a position to let that happen. so one would use a direct scheme in that case.
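the budget risk of an inverse scheme can be illustrated with a small simulation. this is a sketch under assumptions, not a recommendation: the true red fraction of 0.01 and the target of 5 red balls are hypothetical, the draws are treated as independent, and the function names are made up for the illustration. on average an inverse scheme needs (reds wanted) / (red fraction) draws, but any single run can need noticeably more.

```python
import random


def expected_draws(reds_wanted, red_fraction):
    """Average number of draws needed to see the wanted number of reds."""
    return reds_wanted / red_fraction


def simulate_inverse_draws(reds_wanted, red_fraction, rng):
    """Draw until the wanted number of reds appears; return the draw count."""
    draws = 0
    reds = 0
    while reds < reds_wanted:
        draws += 1
        if rng.random() < red_fraction:  # this draw came up red
            reds += 1
    return draws


rng = random.Random(0)  # fixed seed so the run is repeatable
budget = 500
runs = [simulate_inverse_draws(5, 0.01, rng) for _ in range(1000)]
over_budget = sum(1 for draws in runs if draws > budget)
print(f"expected draws: {expected_draws(5, 0.01):.0f}")
print(f"runs exceeding {budget} draws: {over_budget} of {len(runs)}")
```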
there might also be mathematical reasons to prefer one or the other scheme (if money is no object).
these issues must be clarified before one can come up with an answer of whether to use direct or inverse sampling - and then how large a sample is required.
so if you want to pursue the matter - the ball is now in your court.