Chemistry (including Biochemistry)/Statistical Error in scale measurements


QUESTION: Hello Dr. Robichaud:

I am sure that you have had to deal with this type of problem at some point.  I have a manufacturing background, so I am not very confident in my knowledge of balance scales.

At work we are trying to accurately measure to ~1.2 mg or less. Our scale is capable of measuring to 0.1 mg.  This did not immediately seem to be a problem until I started thinking about it further.

Here is the issue.  The scale has a repeatability of 0.1 mg (standard deviation) per the manual.  First of all, is my understanding of repeatability correct?  6 st dev x 0.1 mg = 0.6 mg, resulting in a ~99% confidence that the actual measured weight is in this range?  Is this your interpretation as well?

So to measure the chemical residue at 1.2 mg or better I need a pre and post weight for a control beaker and then a pre and post weight for the test beaker.  With each weighing the repeatability of the scale comes into play.  So error is introduced each time I weigh a beaker and my final value could be off by as much as 4 * 0.6 mg or 2.4 mg.  How can I accurately weigh to 1.2 mg or less if my potential error is so large compared to my desired result?

Are there any methods used in chemistry, pharmaceuticals, and/or biology that would help to reduce or at least understand this type of error?

We have other sources of error (human interaction, temp, calibration etc.) as well, but this was a good example.  I suppose that we could change to a higher resolution scale but this would be cost prohibitive.  I am looking for other solutions or at least to understand our capability.

Thank you,


ANSWER: Hello Allen!

Hoo boy, statistics for reproducibility and reliability and confidence intervals. I'll give you my best shot, but honestly if there's an active Statistics expert on this site they may be more help than me, and make more sense to boot.

Long story short: I don't think your error is as big a problem as you do.
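One reason, sketched here under the assumption that the four weighings are independent and each carries the manual's 0.1 mg standard deviation: random errors combine as the root of the sum of squares rather than by straight addition. The function names are only illustrative.

```python
import math

def combined_sd(sd_per_weighing, n_weighings):
    """Standard deviation of a sum or difference of n independent weighings."""
    return math.sqrt(n_weighings) * sd_per_weighing

def coverage(k):
    """Fraction of a normal distribution lying within +/- k standard deviations."""
    return math.erf(k / math.sqrt(2))

sd_scale = 0.1                         # mg, repeatability quoted in the manual
sd_result = combined_sd(sd_scale, 4)   # pre/post for control and test beakers

print(f"combined sd of the net residue: {sd_result:.2f} mg")
print(f"+/- 3 sd half-width: {3 * sd_result:.2f} mg")
print(f"coverage of +/- 3 sd: {coverage(3):.4f}")
```

If those assumptions hold, the net residue has a standard deviation near 0.2 mg, so plus or minus three standard deviations is about 0.6 mg rather than 2.4 mg.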

To chemists, as long as you calibrate the scale once a day using clean, dry standard weights (wear gloves!) and the method in the book, the scale may be considered accurate to 0.1mg. Ideally you'd calibrate before using, but once a day should be good.

What you want to ask the statistician is how to calculate equipment variation (EV) and appraiser variation (AV) for your scale.
(I found the following link to be a useful explanation of these concepts.

Here is an experiment you can try yourself. Most of the calculations were taken from the SAS/QC 9.2 User's Guide, Volumes 1-4, which you can find by putting "d2 (Duncan 1974, Table M)" in a Google search; you then calculate reproducibility and reliability by hand.

Get ten new pennies from this mint year (time to hit the bank), and clean them with isopropyl alcohol. Label them one through ten with a Sharpie. Have three patient people go through and weigh each penny three times. (Ideally, go 1 through 10 three times, not putting the same penny on and taking it off 3 times in a row.)

Calculate the range of your data for this population as denoted here:

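To get PennyRange, compute the mean (average), compute the difference between each piece of data and the mean, square each difference (to remove the sign), add all the squared differences, divide by n of 10 (to get a variance, a kind of spread), and take the square root. (Strictly speaking this is a standard deviation; the range in a textbook GR&R study is the largest reading minus the smallest.)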
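The recipe above can be written out as a few lines of Python. This is a minimal sketch: the function name and the ten penny masses are made up for illustration, and the divisor is n as the text describes.

```python
import math

def penny_spread(masses):
    """Spread of the readings, following the steps described in the text."""
    n = len(masses)
    mean = sum(masses) / n
    squared_diffs = [(m - mean) ** 2 for m in masses]  # squaring removes the sign
    variance = sum(squared_diffs) / n                  # divide by n, not n - 1
    return math.sqrt(variance)

# Hypothetical masses of ten pennies, in mg
masses_mg = [2501.3, 2498.7, 2500.2, 2499.9, 2502.1,
             2497.8, 2500.6, 2499.4, 2501.0, 2499.0]

print(f"PennyRange: {penny_spread(masses_mg):.3f} mg")
```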

To get your EV, you put those numbers in this formula:

standard deviation for repeatability (EV) = (PennyRange) / d2

(We used this table to get d2 of 1.693 based on number of operators and pennies.)

We know it should be at 0.1mg. Does it come close?

To get your AV, you will need to find the overall average range for each person, then subtract the smallest range from the largest. Call the result RUsers.

AV = { [ (RUsers/d2*) - (PennyRange/d2*) ]^2 - [ 1/(TestNum*PennyNum) ] * (EV)^2 }^(1/2)

d2* is from the table and is 1.716.

What's our R&R?

R&R = [ (AV)^2 + (EV)^2 ]^(1/2)

To get total variation, we then take a side trip to compute part (penny) variation, which can be approximated by

PV = (PennyRange / d2*)

Now total variation, which combines the R&R with the part variation:

TV = [ (R&R)^2 + (PV)^2 ]^(1/2)
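If you would rather not do all of that by hand, here is a rough Python sketch that transcribes the EV, AV, R&R and PV formulas literally. Two things are my assumptions and not part of the recipe: a negative value under the AV square root is treated as zero, and TV is built from R&R and PV, which is the usual GR&R convention. The function names and input numbers are made up.

```python
import math

def gauge_rr(penny_range, r_users, test_num, penny_num,
             d2=1.693, d2_star=1.716):
    """Transcribe the EV, AV, R&R, PV and TV formulas from the text."""
    ev = penny_range / d2
    av_squared = ((r_users / d2_star) - (penny_range / d2_star)) ** 2 \
        - ev ** 2 / (test_num * penny_num)
    # Assumption: a negative value under the root is treated as AV = 0.
    av = math.sqrt(max(av_squared, 0.0))
    rr = math.sqrt(av ** 2 + ev ** 2)
    pv = penny_range / d2_star
    # Assumption: TV combines R&R with PV (the usual GR&R convention).
    tv = math.sqrt(rr ** 2 + pv ** 2)
    return {"EV": ev, "AV": av, "RR": rr, "PV": pv, "TV": tv}

def as_percent_of_tv(results):
    """Express each component as a percentage of total variation."""
    tv = results["TV"]
    return {name: 100.0 * value / tv
            for name, value in results.items() if name != "TV"}

# Made-up inputs in mg: three weighings per penny, ten pennies
results = gauge_rr(penny_range=0.17, r_users=0.25, test_num=3, penny_num=10)
for name, pct in as_percent_of_tv(results).items():
    print(f"%{name}: {pct:.1f}")
```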

To quote the siliconfareast web site,
"In a GR&R report, the final results are often expressed as %EV, %AV, %R&R, and %PV, which are simply the ratios of EV, AV, R&R, and PV to TV expressed in %.  Thus, %EV=(EV/TV)x100%; %AV=(AV/TV)x100%; %R&R=(R&R/TV)x100%; and %PV=(PV/TV)x100%. The gage is good if its %R&R is less than 10%.  A %R&R between 10% to 30% may also be acceptable, depending on what it would take to improve the R&R.  A %R&R of more than 30%, however, should prompt the process owner to investigate how the R&R of the gage can be further improved."

So let's say your scale is good. What's the 99% confidence interval for the penny mass? (Strictly, the formula below bounds the mean mass of the pennies, not the mass of an individual eleventh penny.)

99% CI = (PennyMeanMass) +/- 2.575 * [ PennyStandardDeviation / (TotalObservationNumber)^(1/2) ]
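The same interval as a short Python sketch. The readings are made up, the function name is only illustrative, and I am assuming the sample standard deviation (dividing by n - 1) is the one wanted here.

```python
import math

def ci99_mean(values, z=2.575):
    """99% confidence interval for the mean, using the z value from the text."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: sample standard deviation (divide by n - 1).
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

# Hypothetical repeat readings of penny mass, in mg
readings_mg = [2500.1, 2500.3, 2499.9, 2500.2, 2500.0,
               2500.1, 2499.8, 2500.2, 2500.0, 2500.1]

low, high = ci99_mean(readings_mg)
print(f"99% CI for the mean: {low:.3f} mg to {high:.3f} mg")
```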

You'll get a distribution. Is this distribution within 0.1 mg? Do you find it acceptable, after all that coffee and scribbling?

Good luck. You might want to talk to a real statistician still.

EDIT: Can you target your process to be within the accuracy of the scale? Usually I have to scale up, then aliquot, then dry samples to get very small amounts.  For example, if I wanted to sell vials of product at 0.1 mg each, I'd weigh out 10 mg (scale accurate here), dilute it in 1 ml of something nonreactive, use an accurate pipet to take 0.01 ml out, put it in a separate tube, and then dry those tubes under vacuum to get 0.1 mg. Realize you can't weigh that small an amount, and work around it?
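The arithmetic behind that workaround, as a small Python sketch. It only tracks the weighing error and assumes the pipetting and dilution volumes are exact, which they will not be in practice; the function names are made up.

```python
def aliquot_mass_mg(weighed_mg, solvent_ml, aliquot_ml):
    """Mass of product carried in one aliquot of the diluted stock."""
    concentration_mg_per_ml = weighed_mg / solvent_ml
    return concentration_mg_per_ml * aliquot_ml

def aliquot_weighing_sd_mg(weighed_mg, sd_scale_mg, solvent_ml, aliquot_ml):
    """Weighing error scaled down by the same dilution factor (volumes assumed exact)."""
    relative_sd = sd_scale_mg / weighed_mg
    return relative_sd * aliquot_mass_mg(weighed_mg, solvent_ml, aliquot_ml)

mass = aliquot_mass_mg(10.0, 1.0, 0.01)
sd = aliquot_weighing_sd_mg(10.0, 0.1, 1.0, 0.01)
print(f"mass per tube: {mass:.3f} mg")
print(f"weighing contribution to its sd: {sd:.4f} mg")
```

The 0.1 mg scale error is 1% of the 10 mg you weigh out, so it stays at 1% of each 0.1 mg aliquot.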

---------- FOLLOW-UP ----------

QUESTION: Hello Dr. Robichaud,

Thank you very much for your advice.  This is great information and it confirms that I am on the right track.  Unfortunately, I was already one step ahead of you.  The reason that I was asking this question was because I am in the process of evaluating the results of a Gauge R&R study.  I performed this study using control beakers instead of pennies.  I had created 10 beakers with different levels of simulated residue by proportioning out a known contaminated solvent over the 10 test beakers.  As a result of the study, I found that most of the error came from the scale.  We had a repeatability variation (EV) of 26% vs a 5% contribution from the appraiser variation (AV), with a total GR&R variation (%R&R) of 27%.  This tells me, much like you have shown above, that the scale is adequate but could use improvement. (I had actually tested 11 beakers, using one as a control beaker for the other 10 in this study.)

My intention was to evaluate the scale to better understand the error that is inherent in the use of the gauge itself.  I am looking for ways to improve the results from the scale by changing techniques or procedures and/or eliminating mistakes.  Since I was only evaluating control beakers, and these are only half of the measurement process, the variation may increase with the addition of the test beakers.

Again, thank you very much for helping me evaluate this problem.

ANSWER: Hello Allen!

I've got some experimental design questions for you.

1) Did you mass the clean beakers before putting the residue in them, and then remass the beakers to get the residue mass?
2) If you're not able to do 1, can you mass the residue beakers and then rinse out the residue, and re-mass the beakers?

There will be significant beaker variance uncontrolled for by using an 11th empty beaker as your tare. Right now the 27% is likely a combination of variability of glass thickness in beakers, residue variability, and scale variability. (I routinely mass 1 dram glass vials in my work. I have to mass each one wearing gloves while it's clean, and then re-mass it after my unknown is added and dried.)
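To put rough numbers on that, here is a sketch comparing the two ways of taring. It assumes independent errors that add in quadrature, and the 5 mg beaker-to-beaker figure is a made-up placeholder, not a measured value; the function names are only illustrative.

```python
import math

def residue_sd_own_tare(sd_scale):
    """Residue = (beaker + residue) - (same beaker empty): two weighings."""
    return math.sqrt(2) * sd_scale

def residue_sd_shared_tare(sd_scale, sd_beaker_diff):
    """Residue = (beaker + residue) - (a different empty beaker).

    sd_beaker_diff is the standard deviation of the empty-mass difference
    between the test beaker and the tare beaker.
    """
    return math.sqrt(2 * sd_scale ** 2 + sd_beaker_diff ** 2)

sd_scale = 0.1         # mg, from the scale manual
sd_beaker_diff = 5.0   # mg, hypothetical glass-to-glass variation

print(f"own tare:    {residue_sd_own_tare(sd_scale):.3f} mg")
print(f"shared tare: {residue_sd_shared_tare(sd_scale, sd_beaker_diff):.3f} mg")
```

With any realistic beaker-to-beaker difference, the shared-tare number is dominated by the glass, not the scale.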

The best way to test the scale variability is by repeatedly testing something which has only one unknown variable. I picked new mint pennies because I thought they'd be cheap, easy, and likely varying in the 50 mg range one from another, so within the measurable area of the scale. In a perfect world you'd have 7 to 10 precisely machined 1 g calibration weights to do this with, but we cope.

I'd try your experiment again, but make sure you have each person weigh each empty, spot-free, clean, completely dry beaker before filling it. Fingerprints will also add mass, so make sure your personnel are gloved up.

Good luck!

Trista Robichaud, PhD



I hold a PhD in Biomedical Science from the University of Massachusetts Medical School in Worcester. I specialize in Biochemistry, with a focus on protein chemistry. My thesis work involved the structure and functions of the human glucose transporter 1. (hGLUT1) Currently I am a postdoc working in peptide (mini-protein) design and enzymology at the University of Texas Health Science Center in San Antonio, Texas. I am in Bjorn Steffensen's lab (PhD, DDS), studying gelatinase A and oral carcinoma.
