Probability & Statistics/False positive
I am having trouble comprehending this question.
It is known that 0.001% of the general population has a certain type of cancer. A patient visits a doctor complaining of symptoms that might indicate the presence of this cancer. The doctor performs a blood test that will confirm the cancer with a probability of 0.99 if the patient does indeed have cancer. However, the test also produces false positives or says a person has cancer when he does not. This occurs with a probability of 0.2. If the test comes back positive, what is the probability that the person has cancer?
I have looked at your previous answers but still cannot figure it out.
There are really four things that can happen:
Case 1: The person has cancer, and the test is positive.
Case 2: The person has cancer, but the test is negative.
Case 3: The person doesn't have cancer, but the test is positive.
Case 4: The person doesn't have cancer, and the test is negative.
The probability of each one can be computed somewhat easily. There are three probabilities given:
pc = P(C) = probability of cancer = 0.00001
ptp = P(Pos|C) = probability of a true positive reading = 0.99
That is a conditional probability. Assuming the person has cancer, the probability of a positive test is 99%. This doesn't tell you anything about the case when the person does not have cancer. However, that information is also given:
pfp = P(Pos|notC) = probability of a false positive = 0.2
Case 1: pc * ptp = (0.00001) * (0.99) = 0.0000099
Case 2: pc * (1-ptp) = (0.00001) * (0.01) = 0.0000001
Case 3: (1-pc) * (pfp) = (0.99999) * (0.2) = 0.199998
Case 4: (1-pc) * (1-pfp) = (0.99999) * (0.8) = 0.799992
We can give these quantities names too:
P(Pos+C) = 0.0000099
P(Neg+C) = 0.0000001
P(Pos+notC) = 0.199998
P(Neg+notC) = 0.799992
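As a quick check on the arithmetic, here is a minimal Python sketch of the four cases. The variable names (p_c, p_tp, p_fp, joint) are just illustrative labels mirroring pc, ptp and pfp above, not anything from the original problem.

```python
# Given quantities from the problem statement.
p_c = 0.00001   # P(C): 0.001% of the population has the cancer
p_tp = 0.99     # P(Pos|C): probability of a true positive
p_fp = 0.2      # P(Pos|notC): probability of a false positive

# Joint probability of each of the four cases.
joint = {
    ("C", "Pos"): p_c * p_tp,                  # Case 1
    ("C", "Neg"): p_c * (1 - p_tp),            # Case 2
    ("notC", "Pos"): (1 - p_c) * p_fp,         # Case 3
    ("notC", "Neg"): (1 - p_c) * (1 - p_fp),   # Case 4
}

for (state, result), p in joint.items():
    print(f"P({result}+{state}) = {p:.7f}")

# The four cases cover every possibility, so they should sum to 1
# (up to floating-point rounding).
total = sum(joint.values())
print(f"total = {total:.7f}")
```

Since the four cases are the only things that can happen, their probabilities adding up to 1 is a handy sanity check.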
Now, what the question asks for is a conditional probability. You already know two of them: assuming the person has cancer, the test gives the right answer 99% of the time, and assuming not, it gives the right answer 80% of the time.
However, you are using a cancer test. You don't know whether the person has cancer, presumably. You have probabilities that rely on (or "condition on") knowing whether the person has cancer. What you want are the more useful ones -- assuming you actually see a test result like positive or negative, what are the chances the person really has cancer?
For that, you can use the four values above, which are the probabilities of each combined case, to compute any conditional probability you like. For example, the effectiveness of the test can be recovered by simply considering:
P(Pos|C) = P(Pos+C) / P(C) = 0.0000099 / 0.00001 = 0.99
That's something you already knew, but it can be re-determined from the data in these four cases. The numerator is the probability "has cancer and the test is positive" and the denominator is "has cancer". That gives the probability -- assuming someone has cancer -- that the test will come out positive.
But you can do the opposite, which is what you want!
P(C|Pos) = P(Pos+C) / P(Pos)
P(Pos) is not given, but we know P(Pos+C) and P(Pos+notC) so we can add them up:
P(Pos) = P(Pos+C) + P(Pos+notC) = 0.0000099 + 0.199998 = 0.200008
Then you compute:
P(C|Pos) = P(Pos+C) / P(Pos) = 0.0000099 / 0.200008 = 0.000049498
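That's right: if a test comes out positive, the probability that the person has cancer is still very low, about 0.005%, or roughly 1 in 20,000. This is actually a common problem with medical tests for rare conditions: when the disease is far rarer than the false positive rate, most positive results come from healthy people.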
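The same calculation can be wrapped in a small function so you can try other numbers. This is only a sketch; the function name posterior_given_positive and its parameter names are made up for illustration.

```python
def posterior_given_positive(prior, true_positive_rate, false_positive_rate):
    """Return P(C|Pos) = P(Pos+C) / (P(Pos+C) + P(Pos+notC))."""
    pos_and_c = prior * true_positive_rate            # P(Pos+C)
    pos_and_not_c = (1 - prior) * false_positive_rate  # P(Pos+notC)
    return pos_and_c / (pos_and_c + pos_and_not_c)

# Numbers from the problem: P(C) = 0.00001, P(Pos|C) = 0.99, P(Pos|notC) = 0.2
p_c_given_pos = posterior_given_positive(0.00001, 0.99, 0.2)
print(f"P(C|Pos) = {p_c_given_pos:.9f}")
```

Try making the prior larger (say 0.01) while leaving the test unchanged, and you will see the answer rise sharply. That shows how much the rarity of the disease drives the result.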
But that's okay. A positive test is a cause for concern, and requires more testing or screenings to verify the diagnosis. The real problem is which case? False negatives! That means a person has cancer, but the test was negative.
Let's go a step further and see whether a person who has a negative test is likely to have cancer by computing the following:
P(C|Neg) = P(Neg+C) / P(Neg) = P(Neg+C) / ( P(Neg+C) + P(Neg+notC) )
= 0.0000001 / ( 0.0000001 + 0.799992 ) = 0.0000001 / 0.7999921 = 0.000000125001
That is very small. Among people who test negative, only about one in eight million actually has cancer and slips through this test, being diagnosed healthy when he/she is not.
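If you want to check the negative-test calculation yourself, here is a minimal Python sketch of it. The function name posterior_given_negative is just an illustrative label.

```python
def posterior_given_negative(prior, true_positive_rate, false_positive_rate):
    """Return P(C|Neg) = P(Neg+C) / (P(Neg+C) + P(Neg+notC))."""
    neg_and_c = prior * (1 - true_positive_rate)             # P(Neg+C)
    neg_and_not_c = (1 - prior) * (1 - false_positive_rate)  # P(Neg+notC)
    return neg_and_c / (neg_and_c + neg_and_not_c)

# Numbers from the problem: P(C) = 0.00001, P(Pos|C) = 0.99, P(Pos|notC) = 0.2
p_c_given_neg = posterior_given_negative(0.00001, 0.99, 0.2)
print(f"P(C|Neg) = {p_c_given_neg:.12f}")
print(f"about 1 in {round(1 / p_c_given_neg):,}")
```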