# Probability & Statistics/False positive

Question
Sir,

I am having trouble comprehending this question:
It is known that 0.001% of the general population has a certain type of cancer. A patient visits a doctor complaining of symptoms that might indicate the presence of this cancer. The doctor performs a blood test that will confirm the cancer with a probability of 0.99 if the patient does indeed have cancer. However, the test also produces false positives or says a person has cancer when he does not. This occurs with a probability of 0.2. If the test comes back positive, what is the probability that the person has cancer?

I have looked at your previous answers but still cannot figure it out.

Answer

There are really four things that can happen:

Case 1: The person has cancer, and the test is positive.

Case 2: The person has cancer, but the test is negative.

Case 3: The person doesn't have cancer, but the test is positive.

Case 4: The person doesn't have cancer, and the test is negative.

The probability of each one can be computed somewhat easily. There are three probabilities given:

pc = P(C) = probability of cancer = 0.00001

ptp = P(Pos|C) = probability of a true positive reading = 0.99

That is a conditional probability. Assuming the person has cancer, the probability of a positive test is 99%. This doesn't tell you anything about the case when the person does not have cancer. However, that information is also given:

pfp = P(Pos|notC) = probability of a false positive = 0.2

Case 1: pc * ptp = (0.00001) * (0.99) = 0.0000099

Case 2: pc * (1-ptp) = (0.00001) * (0.01) = 0.0000001

Case 3: (1-pc) * (pfp) = (0.99999) * (0.2) = 0.199998

Case 4: (1-pc) * (1-pfp) = (0.99999) * (0.8) = 0.799992

We can give these quantities names too, writing "+" for "and":

P(Pos+C) = 0.0000099

P(Neg+C) = 0.0000001

P(Pos+notC) = 0.199998

P(Neg+notC) = 0.799992
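The four cases can be reproduced with a few lines of Python. This is only a sketch for checking the arithmetic; the variable names pc, ptp and pfp follow the notation above, while the dictionary `joint` is a name introduced here for illustration.

```python
# Given values from the question (names follow the answer's notation).
pc = 0.00001   # P(C): probability of cancer (0.001%)
ptp = 0.99     # P(Pos|C): probability of a true positive
pfp = 0.2      # P(Pos|notC): probability of a false positive

# Joint probability of each of the four cases.
joint = {
    ("Pos", "C"): pc * ptp,
    ("Neg", "C"): pc * (1 - ptp),
    ("Pos", "notC"): (1 - pc) * pfp,
    ("Neg", "notC"): (1 - pc) * (1 - pfp),
}

for (test, status), p in joint.items():
    print(f"P({test}+{status}) = {p:.7f}")

# The four cases are mutually exclusive and cover everything,
# so their probabilities should add up to 1.
total = sum(joint.values())
print(f"total = {total:.7f}")
```

Checking that the four values add up to 1 is a cheap way to catch a slip in any one of them.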

Now, what the question asks for is a conditional probability. Assuming the person has cancer, you know how often the test is correct (99%). And assuming the person does not, you know how often it is correct (80%).

However, you are using a cancer test. You don't know whether the person has cancer, presumably. You have probabilities that rely on (or "condition on") knowing whether the person has cancer. What you want are the more useful ones -- assuming you actually see a test result like positive or negative, what are the chances the person really has cancer?

For that, you can use the four values above, the joint probabilities of the four cases, to compute any conditional probability you like. For example, the true positive rate of the test can be recovered by simply considering:

P(Pos|C) = P(Pos+C) / P(C) = 0.0000099 / 0.00001 = 0.99

That's something you already knew, but it can be re-determined from the data in these four cases. The numerator is the probability "has cancer and the test is positive" and the denominator is "has cancer". That gives the probability -- assuming someone has cancer -- that the test will come out positive.

But you can do the opposite, which is what you want!

P(C|Pos) = P(Pos+C) / P(Pos)

P(Pos) is not given, but we know P(Pos+C) and P(Pos+notC) so we can add them up:

P(Pos) = P(Pos+C) + P(Pos+notC) = 0.0000099 + 0.199998 = 0.2000079, which rounds to 0.200008

Then you compute:

P(C|Pos) = P(Pos+C) / P(Pos) = 0.0000099 / 0.200008 = 0.000049498
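The same calculation, written as a small Python function, makes it easy to try other priors or error rates. The function name and its parameter names are made up for this sketch; the formula is the one used above.

```python
def posterior_given_positive(prior, true_pos_rate, false_pos_rate):
    """Return P(C|Pos) = P(Pos+C) / (P(Pos+C) + P(Pos+notC))."""
    pos_and_c = prior * true_pos_rate                # P(Pos+C)
    pos_and_not_c = (1 - prior) * false_pos_rate     # P(Pos+notC)
    return pos_and_c / (pos_and_c + pos_and_not_c)


# Values from the question: P(C) = 0.00001, P(Pos|C) = 0.99, P(Pos|notC) = 0.2
p_c_given_pos = posterior_given_positive(0.00001, 0.99, 0.2)
print(f"P(C|Pos) = {p_c_given_pos:.9f}")
```

With the numbers from the question this agrees with the 0.000049498 computed by hand.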

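That's actually right: about 0.005%, or roughly 1 in 20,000. If a test comes out positive, the probability that the person has cancer is still very low. This is a common issue with medical tests for rare conditions: when almost nobody has the disease, even a modest false positive rate means the false positives far outnumber the true positives.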
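One way to see why the number is so low is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical population of 10,000,000 (a size chosen only so that every count comes out as a whole number) and applies the rates from the question.

```python
population = 10_000_000  # hypothetical cohort size, chosen so counts are whole numbers

with_cancer = population // 100_000         # 0.001% of the population
without_cancer = population - with_cancer

true_positives = with_cancer * 99 // 100      # 99% of people with cancer test positive
false_positives = without_cancer * 20 // 100  # 20% of people without cancer test positive

all_positives = true_positives + false_positives
share = true_positives / all_positives        # fraction of positives that are real cases

print(f"people with cancer:  {with_cancer}")
print(f"true positives:      {true_positives}")
print(f"false positives:     {false_positives}")
print(f"share of real cases: {share:.9f}")
```

Only 99 of the roughly two million positive results belong to people who actually have cancer, which is the same ratio as P(C|Pos).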

But that's okay. A positive test is a cause for concern, and requires more testing or screenings to verify the diagnosis. The case that is the real problem is the false negative: a person has cancer, but the test was negative.

Let's go a step further and see whether a person who has a negative test is likely to have cancer by computing the following:

P(C|Neg) = P(Neg+C) / P(Neg) = P(Neg+C) / ( P(Neg+C) + P(Neg+notC) )

= 0.0000001 / ( 0.0000001 + 0.799992 ) = 0.0000001 / 0.7999921 = 0.000000125001

That is very small: about 1.25 in ten million, or roughly one in eight million. Among people who test negative, it is extremely rare for someone to be sick and slip through this cancer test, being diagnosed healthy when he/she is not.
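For a quick numeric check of P(C|Neg), the same approach works with the two "negative" cases. As before, the function and parameter names are illustrative only.

```python
def posterior_given_negative(prior, true_pos_rate, false_pos_rate):
    """Return P(C|Neg) = P(Neg+C) / (P(Neg+C) + P(Neg+notC))."""
    neg_and_c = prior * (1 - true_pos_rate)               # P(Neg+C)
    neg_and_not_c = (1 - prior) * (1 - false_pos_rate)    # P(Neg+notC)
    return neg_and_c / (neg_and_c + neg_and_not_c)


# Values from the question: P(C) = 0.00001, P(Pos|C) = 0.99, P(Pos|notC) = 0.2
p_c_given_neg = posterior_given_negative(0.00001, 0.99, 0.2)
print(f"P(C|Neg) = {p_c_given_neg:.4e}")
```

Printing in scientific notation avoids miscounting zeros, which is easy to do with probabilities this small.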

Volunteer

#### Clyde Oliver
