About Vivian
Expertise: I can answer questions on probability, distributions, statistical inference, statistical estimation, hypothesis testing, analysis of categorical data, linear regression, generalized linear regression, ANOVA, and linear mixed models. I cannot answer questions on stochastic processes.
Experience: I have worked as a research assistant at the University of Michigan, Ann Arbor for two years.
Organizations: American Statistical Association
Education/Credentials: Master of Science, University of Michigan, Ann Arbor
Question
1. Let A and B be two events. Suppose that P(A) = 0.4, P(B) = p, and P(A U B) = 0.8.
a) For what value of p will A and B be mutually exclusive?
b) For what value of p will A and B be independent?
2. In a survey, 1000 adults were asked whether they favored an increase in the state income tax if the additional revenues went to education. In addition it was noted whether the person lived in a city, suburb, or rural part of the state. Do the results indicate that the place of residence and the opinion about tax increase are independent?
            Increase income tax
            Yes    No    Total
City        100    300     400
Suburb      250    150     400
Country      50    150     200
Total       400    600    1000
3. The accuracy of a medical diagnostic test, in which a positive result indicates the presence of a disease, is often stated in terms of its sensitivity, the proportion of diseased people who test positive or P (+ │Disease), and its specificity, the proportion of people without the disease who test negative or P (- │No Disease). Suppose that 10% of the population has the disease (called the prevalence rate). A diagnostic test for the disease has 99% sensitivity and 98% specificity. Therefore,
P (+ │Disease) = 0.99, P (- │No Disease) = 0.98
P (- │Disease) = 0.01, P (+ │No Disease) = 0.02
a) A person’s test result is positive. What is the probability that the person actually has the disease?
b) A person’s test result is negative. What is the probability that the person actually does not have the disease? Considering this result and the result from (a), would you say that this diagnostic test is reliable? Why or why not?
c) Now suppose that the disease is rare with a prevalence rate of 0.1%. Using the same diagnostic test, what is the probability that the person who tests positive actually has the disease?
d) The results from (a) and (c) are based on the same diagnostic test applied to populations with very different prevalence rates. Does this suggest any reason why mass screening programs should not be recommended for a rare disease? Explain.
Answer
1. Let A and B be two events. Suppose that P(A) = 0.4, P(B) = p, and P(A U B) = 0.8.
a) For what value of p will A and B be mutually exclusive?
b) For what value of p will A and B be independent?
Always, P (the intersection of A and B) = P (A) + P (B) - P (A U B)
If A and B are mutually exclusive, then P (the intersection of A and B) =0
If A and B are independent, then P (the intersection of A and B) = P(A)P(B)
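As a worked sketch of the three identities above, substituting the given values P(A) = 0.4, P(B) = p, and P(A U B) = 0.8 (the set symbols are notation added here for compactness):

```latex
\begin{align*}
P(A \cap B) &= 0.4 + p - 0.8 = p - 0.4 \\
\text{(a)}\quad p - 0.4 &= 0 \quad\Rightarrow\quad p = 0.4 \\
\text{(b)}\quad p - 0.4 &= 0.4\,p \quad\Rightarrow\quad 0.6\,p = 0.4 \quad\Rightarrow\quad p = \tfrac{2}{3} \approx 0.667
\end{align*}
```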
2. In a survey, 1000 adults were asked whether they favored an increase in the state income tax if the additional revenues went to education. In addition it was noted whether the person lived in a city, suburb, or rural part of the state. Do the results indicate that the place of residence and the opinion about tax increase are independent?
            Increase income tax
            Yes    No    Total
City        100    300     400
Suburb      250    150     400
Country      50    150     200
Total       400    600    1000
Code the place of residence and the opinion about the tax increase as categorical variables; then we may use Fisher's Exact Test or Pearson's Chi-square Test to test for association between them.
As its name indicates, Fisher's Exact Test is more accurate for small samples, since it uses the exact small-sample distribution rather than a large-sample approximation. But when the sample size becomes large, Fisher's Exact Test becomes computationally hard, and the chi-square approximation is the practical choice.
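With 1000 respondents, the Pearson chi-square route is the easier one to carry out. The following is a minimal sketch in plain Python, not a definitive analysis: the function name `pearson_chi_square` is made up for illustration, and the p-value line relies on the fact that a chi-square variable with exactly 2 degrees of freedom has upper tail exp(-x/2), which happens to apply to this 3 x 2 table.

```python
import math

# Observed counts from the survey table
# (rows: City, Suburb, Country; columns: Yes, No).
observed = [
    [100, 300],
    [250, 150],
    [50, 150],
]


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom, expected counts) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


statistic, dof, expected = pearson_chi_square(observed)

# Closed-form upper tail, valid ONLY when dof == 2 (as it is here).
p_value = math.exp(-statistic / 2)

print(f"chi-square = {statistic:.3f}, dof = {dof}, p-value = {p_value:.3g}")
```

The statistic comes out at 140.625 on 2 degrees of freedom, far above the 5% critical value of about 5.99, so the data give strong evidence that opinion and place of residence are not independent.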
3. The accuracy of a medical diagnostic test, in which a positive result indicates the presence of a disease, is often stated in terms of its sensitivity, the proportion of diseased people who test positive or P (+ │Disease), and its specificity, the proportion of people without the disease who test negative or P (- │No Disease). Suppose that 10% of the population has the disease (called the prevalence rate). A diagnostic test for the disease has 99% sensitivity and 98% specificity. Therefore,
P (+ │Disease) = 0.99, P (- │No Disease) = 0.98
P (- │Disease) = 0.01, P (+ │No Disease) = 0.02
a) A person’s test result is positive. What is the probability that the person actually has the disease?
P (Disease │+) = P (Disease and +) /P(+)
Where P (+) = P (Disease and +) + P (No Disease and +) since a person can either have the disease or not.
P (Disease and +) = P (+│Disease)P(Disease) and P (No Disease and +) = P(+│No Disease)P(No Disease)
Now, just plug in the values. Note: P (Disease) =0.1
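The plug-in step can be sketched in a few lines of Python. The function name `prob_disease_given_positive` is hypothetical, chosen only for this illustration; the numbers are the ones stated in the question.

```python
def prob_disease_given_positive(prevalence, sensitivity, specificity):
    """Bayes' rule: P(Disease | +) from prevalence, sensitivity, and specificity."""
    # P(Disease and +) = P(+ | Disease) * P(Disease)
    true_pos = sensitivity * prevalence
    # P(No Disease and +) = P(+ | No Disease) * P(No Disease)
    false_pos = (1 - specificity) * (1 - prevalence)
    # P(+) is the sum of the two ways of testing positive.
    return true_pos / (true_pos + false_pos)


ppv = prob_disease_given_positive(prevalence=0.10, sensitivity=0.99, specificity=0.98)
print(round(ppv, 4))  # 0.8462
```

That is 0.099 / (0.099 + 0.018), so about 85% of people who test positive actually have the disease when the prevalence is 10%.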
b) A person’s test result is negative. What is the probability that the person actually does not have the disease? Considering this result and the result from (a), would you say that this diagnostic test is reliable? Why or why not?
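This can be solved in the same way, with the roles of the test results swapped: P (No Disease │-) = P (No Disease and -) / P (-), where P (-) = P (-│No Disease)P(No Disease) + P (-│Disease)P(Disease) and P (No Disease) = 0.9.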
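For part (b), the same Bayes calculation with the stated values would look roughly like this (a worked sketch; the shorthand D for "Disease" and the overline for "No Disease" are notation introduced here):

```latex
\begin{align*}
P(\overline{D} \mid -)
  &= \frac{P(- \mid \overline{D})\,P(\overline{D})}
          {P(- \mid \overline{D})\,P(\overline{D}) + P(- \mid D)\,P(D)} \\
  &= \frac{0.98 \times 0.9}{0.98 \times 0.9 + 0.01 \times 0.1}
   = \frac{0.882}{0.883} \approx 0.9989
\end{align*}
```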
c) Now suppose that the disease is rare with a prevalence rate of 0.1%. Using the same diagnostic test, what is the probability that the person who tests positive actually has the disease?
This can be solved with the same formula as in (a), now using P (Disease) = 0.001 and P (No Disease) = 0.999.
d) The results from (a) and (c) are based on the same diagnostic test applied to populations with very different prevalence rates. Does this suggest any reason why mass screening programs should not be recommended for a rare disease? Explain.
If the disease is rare, P(Disease│+) is much lower than when the disease is common, because most positive results then come from the large group of people without the disease (false positives). So a mass screening program for a rare disease may not be efficient.
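One way to see the effect is to count expected true and false positives in a screened population. The sketch below assumes a hypothetical population of 1,000,000 people (a number chosen only for illustration, not part of the question) and compares the two prevalence rates from (a) and (c); `screening_counts` is a made-up helper name.

```python
def screening_counts(population, prevalence, sensitivity, specificity):
    """Expected numbers of true and false positives when everyone is tested."""
    diseased = population * prevalence
    healthy = population - diseased
    true_positives = diseased * sensitivity
    false_positives = healthy * (1 - specificity)
    return true_positives, false_positives


POPULATION = 1_000_000  # hypothetical population size, for illustration only

results = {}
for prevalence in (0.10, 0.001):
    tp, fp = screening_counts(POPULATION, prevalence, sensitivity=0.99, specificity=0.98)
    share = tp / (tp + fp)  # P(Disease | +)
    results[prevalence] = (tp, fp, share)
    print(
        f"prevalence {prevalence:.1%}: "
        f"{tp:,.0f} true positives, {fp:,.0f} false positives, "
        f"P(Disease | +) = {share:.3f}"
    )
```

At 0.1% prevalence there are roughly 990 true positives against roughly 19,980 false positives, so only about 4.7% of positive results correspond to real cases, compared with about 84.6% at 10% prevalence.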