# Math and Science Solutions for Businesses/expansion of sample

Question
Hi Randy:

I am working on an audit engagement and encountered the following scenario.
One item in my sample of 25, drawn from a population of 250, deviates from the expected behavior. My sample was selected with 0 deviations expected. Our firm guidance says the sample should be expanded by 15, to 40, to evaluate the control as effective. In other words, 1 error out of 40 appears to be acceptable. It also suggests that 2 errors out of 60 samples is acceptable to conclude the control is effective. I am trying to understand the underlying concept and rationale. The confidence interval is 90 percent. I would appreciate it if you could explain the inputs/rationale driving these numbers. Thanks a lot

Hi Mark, Here's what I think is going on.

Recommending that the sample grow from its original size of 25 to 40 in order to achieve a confidence interval of 90 percent involves a so-called z-score:

z90 = ∆p/{p(1-p)/n}^(1/2) = 1.645

where

1.645 = value of z-score (= number of standard deviations) that gives a one-sided probability of 95% of values less than z
- this z-score gives a 2-sided (central) probability of 90%, as desired
- this assumes the sampling distribution of the estimated defect rate is close to normal, which is a common working assumption, though a rough one when the expected number of defects is this small
- this statistic is used all the time
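
As a quick check on the 1.645 figure, here is a minimal Python sketch that uses only the standard library. The variable names are mine, chosen for illustration; nothing here comes from the firm guidance.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
std_normal = NormalDist()

# z-score with 95% of the probability below it (one-sided).
z90 = std_normal.inv_cdf(0.95)

# Probability contained in the central band from -z90 to +z90 (two-sided).
central = std_normal.cdf(z90) - std_normal.cdf(-z90)

print(f"z90 = {z90:.3f}, central probability = {central:.2f}")
```

The same z-score therefore corresponds to 95% one-sided and 90% two-sided coverage.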

p = probability of a sample being defective
- your question states that errors (defects?) of 1-out-of-40, or 1/40 = 0.025, is OK, so I'll take p = 0.025

∆p = deviation of measured probability from expected probability
- here, I assume the measured probability is the one from your first sampling, p = 1/25 = 0.04.
- I'm not sure this is strictly kosher since the expected number of deviations is so small (i.e., 1), but whatever
- your question says that the expected number of deviations = 0, therefore ∆p = 0.04
- I'm not sure an expected value of precisely 0 is legal but I suppose assuming it is << 0.04 works

{p(1-p)/n}^(1/2) = standard deviation of the estimate of p
- from assuming a binomial distribution with mean = p

n = number of elements in the sample (sample size)

The goal here is to use the z90 equation to specify a sample size that achieves the 90% level. So, rearranging:

n = (z90/∆p)^2 · p(1-p)

= (1.645/0.04)^2 · (0.025) · (0.975) = 41.2 ≈ 40.

So this is probably where the sample size of 40 comes from, although using p = 1/40 and then solving for n = 40 is a little disconcerting.
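
The arithmetic can be reproduced with a short Python sketch. The function name `required_sample_size` is a hypothetical helper of mine, and the inputs are the values assumed in the discussion (z90 = 1.645, ∆p = 1/25, p = 1/40), not numbers taken from the firm guidance.

```python
import math

def required_sample_size(z, delta_p, p):
    """Solve z = delta_p / sqrt(p(1-p)/n) for n."""
    return (z / delta_p) ** 2 * p * (1.0 - p)

z90 = 1.645        # z-score for a 90% two-sided level
delta_p = 1 / 25   # observed rate in the first sample minus the expected rate of 0
p = 1 / 40         # assumed acceptable defect rate

n = required_sample_size(z90, delta_p, p)

# Plug n back into the original z-score equation as a consistency check.
z_check = delta_p / math.sqrt(p * (1.0 - p) / n)

print(f"n = {n:.1f}, z recovered from n = {z_check:.3f}")
```

Note how sensitive the result is to the assumed p: the same call with p = 0.04 gives a sample size of about 65, so the match with 40 depends on taking p = 1/40.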

Your question also states that 2-out-of-60 errors are acceptable. If we try to calculate p by assuming a binomial distribution and setting the probability of 2-out-of-60, P(2|60), equal to the probability of 1-out-of-40, P(1|40), then we get

P(1|40) = {40!/(39!1!)} · p(1-p)^39 = probability of getting 1 error in 40 samples for given p

P(2|60) = {60!/(58!2!)} · p^2 · (1-p)^58 = probability of getting 2 errors in 60 samples for given p.

Setting these equal to each other gives

(40) · p(1-p)^39 = (1770) · p^2 · (1-p)^58

40/1770 = p · (1-p)^19

ln(40/1770) - ln(p) = 19 · ln(1-p)

This is an implicit equation for p which has to be "solved" graphically by looking at the intersection of the left and right hand sides. This, unfortunately, does not give a clear result: the 2 curves approach each other near 0.05 but do not cross at a well defined point => inconclusive. So I'm not sure where this information fits in. If you can get some information from your "firm guidance" regarding this, please pass it on.
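
The graphical result can be checked numerically. The sketch below works with the equivalent form 40/1770 = p · (1-p)^19 rather than the logarithmic one, and scans a grid of p values; the grid spacing and the names `lhs`, `rhs` and `p_star` are my own choices for illustration.

```python
from math import comb

# Left-hand side: ratio of the binomial coefficients, 40/1770.
lhs = comb(40, 1) / comb(60, 2)

def rhs(p):
    """Right-hand side of 40/1770 = p * (1-p)^19."""
    return p * (1.0 - p) ** 19

# Scan p from 0.0001 to 0.9999 and find where the right-hand side peaks.
grid = [i / 10000 for i in range(1, 10000)]
p_star = max(grid, key=rhs)

print(f"lhs = {lhs:.4f}")
print(f"rhs peaks at p = {p_star:.3f} with value {rhs(p_star):.4f}")
print("solution exists" if rhs(p_star) >= lhs else "no solution: the curves never cross")
```

Setting the derivative of p · (1-p)^19 to zero gives a peak at p = 1/20, where the right-hand side is about 0.0189, below the left-hand side of about 0.0226. That matches the picture of two curves that come closest near 0.05 without crossing.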

Hope this helps.


#### Randy Patton

##### Expertise

Questions regarding application of mathematical techniques and knowledge of physics and engineering principles to product and services design, optimization, prediction, feasibility and implementation. Examples include sales and product performance projections based on math/physics models in addition to standard regression; practical and cost effective sensor design and component configuration; optimal resource allocation using common tools (e.g., MS Office); advanced data analysis techniques and implementation; simulation and "what if" analysis; and innovative applications of remote sensing.

##### Experience

26 years as professional physical scientist and project manager for elite research company providing academic quality basic and applied research for government and defense industry clients (currently retired). Projects I have been involved in include:
- Notional sensor performance predictions for detecting underwater phenomena
- Designing and testing guidance algorithms for multi-component system
- Statistical analysis of ship tracking data and development of anomaly detector
- Deployed vibration sensors in Arctic ice floes; analysis of data
- Developed and tested ocean optical instrument to measure particles
- Field testing of prototype sonar system
- Analysis of synthetic aperture radar system data for ocean surface measurements
- Redesigned dust shelters for greeters at Burning Man Festival

Project management with responsibility for allocation and monitoring of staff and equipment resources.

Publications
“A Numerical Model for Low-Frequency Equatorial Dynamics” (with Mark A. Cane), J. of Phys. Oceanogr., 14, No. 12, pp. 1853-1863, December 1984.

Education/Credentials
MIT, MS Physical Oceanography, 1981
UC Berkeley, BS Applied Math, 1976

Past/Present Clients
Am also an Expert in Advanced Math and Oceanography