Probability & Statistics/Average Annual Confusion?
QUESTION: Ok, I am a librarian and I like to run analysis of the print books in my library annually to determine what sections are being used well and what sections may need improvement. One of the things I look at is Average Annual Circulation as opposed to just looking at “raw” absolute total circulation. The average annual circulation is the number of circulations (i.e., checkouts) that a given item or collection averages per year.
Here is an example of what I mean:
Book A has circulated 20 times (raw circulation) in ten years on the shelf, which means it has an Average Annual Circulation of 2 (20 circs / 10 years).
Now where I run into trouble is doing this calculation for items that have been on the shelf for less than a complete year.
Book B has circulated 3 times in a quarter of a year on the shelf. When I run the same calculation for Book B for Average Annual Circulation I get 16 (4 circs/ .25 years).
Now how can the average be higher than the absolute total? What am I missing here? Any assistance would be greatly appreciated.
ANSWER: Your calculation makes sense -- it is not a paradox.
In 3 months (0.25 years), the book was checked out 3 times (or was it 4?). Let's say it's 3.
That means you have an average annual circulation of 3/(0.25) = 12 times per year.
It is okay that the average is higher than the total, because the average is over an entire year, while the current count covers only a fraction of a year (hence, it is smaller).
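To make the arithmetic concrete, here is a minimal Python sketch of the same calculation. The function name `average_annual_circulation` is just an illustrative choice, and Book B is assumed to have 3 checkouts, as in the answer above.

```python
def average_annual_circulation(checkouts, years_on_shelf):
    """Annualized rate: total checkouts divided by years on the shelf."""
    if years_on_shelf <= 0:
        raise ValueError("years_on_shelf must be positive")
    return checkouts / years_on_shelf


# Book A: 20 checkouts over 10 years on the shelf
book_a_rate = average_annual_circulation(20, 10)

# Book B: 3 checkouts over a quarter of a year (assuming 3, not 4)
book_b_rate = average_annual_circulation(3, 0.25)

print(book_a_rate)  # 2.0
print(book_b_rate)  # 12.0
```

The rate for Book B exceeds its raw count simply because it is divided by a number smaller than 1.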
---------- FOLLOW-UP ----------
QUESTION: Thanks. So now that the calculation is correct, is it fair to compare average annual circulation for items that have a shelf-life of less than a year against items that have a shelf-life over a year? Can I group them all together or should I separate them? Will the "inflated" values for the items newer than a year throw off totals?
Because realistically very few, if any, books older than a year ever have average annual circulations as high as the ones for the books that are less than a year old.
I just fear that these "spikes" in the data for the new items throw things off, as they are not sustainable for longer-lived items.
ANSWER: Generally speaking, no, you do not need to separate them: these figures are all comparable. They all represent annual circulation rates, regardless of shelf-life. Depending on how you use the data, though, you ought to keep the shelf-life in mind.
If you are doing long-term projections, for example, you would not want to include the short-life items. Any results you get might be misleading if you were to predict, for example, that items that normally get checked out 20 times in one month and are then removed from circulation will somehow be checked out hundreds of times in the next decade (they won't! they were removed from circulation!).
But they are not "spikes" or "inflated" -- these are the true annual circulation rates. It is important to understand that this rate is only one important quantity associated with a particular item. The shelf-life is another important quantity to consider. Depending on your analysis, in some cases you may need to include the short-life items, and in other cases you may not.
Here are two examples, one good, one bad:
Imagine you are trying to count how many check-outs total there will be in the next year. If you have 5000 long-life items that average 2 circulations per year, that is 10000 check-outs you will see. You may also have 600 short-life items that average 20 circulations per year, but if these items do not last the full year, you should NOT be counting 12000 check-outs for this group!!
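One way to sketch this projection in Python is to scale each group's annual rate by the fraction of the coming year its items are expected to stay on the shelf. The 0.25-year shelf time for the short-life group is purely an assumed figure for illustration (the example above does not give one), and `projected_checkouts` is a hypothetical helper name.

```python
def projected_checkouts(item_count, annual_rate, years_on_shelf_next_year):
    """Expected checkouts next year for a group of similar items.

    The time on the shelf is clamped to the range 0..1 years, since
    only the coming year is being projected.
    """
    fraction_of_year = min(max(years_on_shelf_next_year, 0.0), 1.0)
    return item_count * annual_rate * fraction_of_year


# 5000 long-life items at 2 circulations per year, on the shelf all year
long_life = projected_checkouts(5000, 2, 1.0)

# 600 short-life items at 20 circulations per year,
# ASSUMED to stay on the shelf for only a quarter of the year
short_life = projected_checkouts(600, 20, 0.25)

print(long_life)               # 10000.0
print(short_life)              # 3000.0
print(long_life + short_life)  # 13000.0
```

Under that assumption the short-life group contributes 3000 check-outs rather than 12000, which is why the shelf-life has to enter the projection alongside the rate.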
On the other hand, if you were trying to determine which items are the most popular, something like average annual circulation may be a good measurement. If an obscure book is checked out once per year, while a short-life item is checked out 30 times per year (that might be 10 times over 4 months), the second item is clearly much more popular.
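For the popularity comparison, a rough Python sketch might rank items by their annualized rate. The item labels and the helper `annual_rate_from_months` are made up for illustration; the numbers are the ones from the example above (1 checkout in 12 months versus 10 checkouts in 4 months).

```python
def annual_rate_from_months(checkouts, months_on_shelf):
    """Annualized circulation rate from a shelf time given in months."""
    if months_on_shelf <= 0:
        raise ValueError("months_on_shelf must be positive")
    return checkouts * 12 / months_on_shelf


rates = {
    "obscure book": annual_rate_from_months(1, 12),     # 1 per year
    "short-life item": annual_rate_from_months(10, 4),  # 30 per year
}

# Most popular first, judged by annualized rate rather than raw count
ranked = sorted(rates, key=rates.get, reverse=True)

for title in ranked:
    print(title, rates[title])
```

Here the short-life item ranks first even though it spent only a third of a year on the shelf.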
If your analysis is highly statistical, you would also need a statistically significant sample size, but if you are at a library or any such environment, you presumably have a large number of items in circulation.