Probability & Statistics/Testing a population with playing cards

QUESTION: Hi Clyde,

I have a statistics question comparing inherent probability and the output of a computer program. Basically, I want to test the randomness of a specific program in relation to playing poker. Using data from a few hundred hands per day over a month with this program, I want to see if it is skewed towards giving premium hands like Aces or Kings, or if it's truly random. I have been keeping track of the number of hands and the number of times I get selected hands. It is also easy to figure out the probability of getting each selected hand. After I have all of my data (tracked in Excel), what test should I run, and how would I interpret that data?

ANSWER: First, let's try to understand the quantity of data you have.

Say you have 250 hands per day for 30 days -- that's 7500 hands.

There are 52 choose 5 = 2,598,960 possible five-card hands.

So you are taking a fair number of hands, but far fewer than the total number of possible hands.

However, you can still analyze the data in a number of ways. Since the focus is on poker, you can consider each type of poker hand -- are you getting a fair number of each such hand?

The numbers of ways to get every possible hand are as follows:

Royal Flush: 4
Straight Flush: 36
Four of a Kind: 624
Full House: 3,744
Flush: 5,108
Straight: 10,200
Three of a Kind: 54,912
Two Pairs: 123,552
One Pair: 1,098,240
High Card: 1,302,540

As usual, the lower possibilities do not include the higher possibilities, so the category "two pair" does not include a full house (even though technically, a full house is two pair plus one more matched card). Likewise, the "flush" category does not count any straight flush, since those are counted separately.
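
Several of these counts can be checked directly with binomial coefficients. The following is only a sketch in Python (standard library; the variable names are my own, not part of the original answer) that reproduces a few entries of the list and the total of 52 choose 5:

```python
from math import comb

total_hands = comb(52, 5)  # every possible five-card hand

# Rank of the four matching cards (13 choices), then any of the other 48 cards.
four_of_a_kind = 13 * 48

# Rank and suits of the triple, then rank and suits of the pair.
full_house = 13 * comb(4, 3) * 12 * comb(4, 2)

# Any five cards of one suit, minus the 10 straight/royal flushes in that suit.
flush = 4 * (comb(13, 5) - 10)

# Ten possible runs (A-5 up to 10-A), any suits, minus the 40 straight/royal flushes.
straight = 10 * 4**5 - 40

# Rank and suits of the pair, then three other distinct ranks in any suits.
one_pair = 13 * comb(4, 2) * comb(12, 3) * 4**3

print(total_hands, four_of_a_kind, full_house, flush, straight, one_pair)
```

The subtractions are what keep the categories from overlapping: a straight flush is removed from both the flush count and the straight count.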

Based on that, you should expect to see approximately this many of each out of 7500 (rounded to the nearest whole hand):

Royal Flush: 0
Straight Flush: 0
Four of a Kind: 2
Full House: 11
Flush: 15
Straight: 29
Three of a Kind: 158
Two Pairs: 356
One Pair: 3169
High Card: 3759
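
The expected counts are just the 7500 hands split in proportion to the number of ways each hand can occur. A minimal sketch, assuming Python and using names of my own choosing:

```python
# Number of five-card hands in each category (from the list above).
HAND_COUNTS = {
    "Royal Flush": 4,
    "Straight Flush": 36,
    "Four of a Kind": 624,
    "Full House": 3744,
    "Flush": 5108,
    "Straight": 10200,
    "Three of a Kind": 54912,
    "Two Pairs": 123552,
    "One Pair": 1098240,
    "High Card": 1302540,
}

n_hands = 7500  # 250 hands per day for 30 days
total_ways = sum(HAND_COUNTS.values())

# Expected count = (number of hands dealt) * (probability of the category).
expected = {name: n_hands * ways / total_ways for name, ways in HAND_COUNTS.items()}

for name, value in expected.items():
    print(f"{name:16s} {value:9.2f}")
```

Keeping the unrounded values matters for the test that follows, since the rarest categories have expected counts well below 1.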

Now of course, this may not be exactly what you saw -- and that's okay. Maybe you were lucky and got a straight flush. Maybe you got lots of one-pair hands, something like 3300. That's not impossible or even unlikely, on its own.

The standard way to test this is a Chi-Squared goodness-of-fit test, which compares the observed count in each category with the expected count.

For example, if you obtained the results:

Royal Flush: 0
Straight Flush: 0
Four of a Kind: 2
Full House: 10
Flush: 17
Straight: 26
Three of a Kind: 160
Two Pairs: 355
One Pair: 3200
High Card: 3800

You would simply run a Chi-Squared test on those counts to obtain a "p value" of nearly 100% (a standard Chi-Squared table converts the score into a p value). That value is the probability that a fair deal would produce counts at least this far from the expected ones, so a value near 100% means the data are entirely consistent with fair dealing.
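
For anyone who wants to reproduce this outside Excel, here is a sketch of the calculation in plain Python. Two things are my own assumptions rather than part of the answer: the helper chi2_sf (a hand-rolled upper-tail probability, so no statistics package is needed), and the choice to scale the expected counts to the observed total, because the example tallies add up to 7570 rather than 7500. SciPy's scipy.stats.chisquare does the same job if you have it installed.

```python
import math

HAND_COUNTS = {
    "Royal Flush": 4, "Straight Flush": 36, "Four of a Kind": 624,
    "Full House": 3744, "Flush": 5108, "Straight": 10200,
    "Three of a Kind": 54912, "Two Pairs": 123552,
    "One Pair": 1098240, "High Card": 1302540,
}

# Observed tallies from the first example.
observed = {
    "Royal Flush": 0, "Straight Flush": 0, "Four of a Kind": 2,
    "Full House": 10, "Flush": 17, "Straight": 26,
    "Three of a Kind": 160, "Two Pairs": 355,
    "One Pair": 3200, "High Card": 3800,
}

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting point: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        q, a = math.exp(-half), 1.0
    else:
        q, a = math.erfc(math.sqrt(half)), 0.5
    # Step up two degrees of freedom at a time using
    # Q(a + 1, h) = Q(a, h) + h^a * exp(-h) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q

n_observed = sum(observed.values())
total_ways = sum(HAND_COUNTS.values())
expected = {k: n_observed * w / total_ways for k, w in HAND_COUNTS.items()}

# Chi-squared score: sum of (observed - expected)^2 / expected over all categories.
chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in HAND_COUNTS)
df = len(HAND_COUNTS) - 1  # ten categories, so nine degrees of freedom
p_value = chi2_sf(chi2, df)

print(f"chi2 = {chi2:.2f}, df = {df}, p = {p_value:.4f}")
```

The score is small relative to the nine degrees of freedom, which is why the p value comes out close to 1.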

On the other hand, if your results are:

Royal Flush: 1
Straight Flush: 0
Four of a Kind: 3
Full House: 10
Flush: 17
Straight: 24
Three of a Kind: 160
Two Pairs: 355
One Pair: 3200
High Card: 3800

the p value is around 10^(-14). A fair deal would almost never produce counts this far from the expected ones -- there are too many good hands.
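
It helps to see which category is responsible for such a tiny p value. The sketch below (Python, my own variable names, expected counts again scaled to the observed total) breaks the Chi-Squared score for the second set of results into its per-category pieces:

```python
HAND_COUNTS = {
    "Royal Flush": 4, "Straight Flush": 36, "Four of a Kind": 624,
    "Full House": 3744, "Flush": 5108, "Straight": 10200,
    "Three of a Kind": 54912, "Two Pairs": 123552,
    "One Pair": 1098240, "High Card": 1302540,
}

# Observed tallies from the second example.
observed = {
    "Royal Flush": 1, "Straight Flush": 0, "Four of a Kind": 3,
    "Full House": 10, "Flush": 17, "Straight": 24,
    "Three of a Kind": 160, "Two Pairs": 355,
    "One Pair": 3200, "High Card": 3800,
}

n_observed = sum(observed.values())
total_ways = sum(HAND_COUNTS.values())

# Each category contributes (observed - expected)^2 / expected to the score.
contributions = {}
for name, ways in HAND_COUNTS.items():
    expected_count = n_observed * ways / total_ways
    contributions[name] = (observed[name] - expected_count) ** 2 / expected_count

chi2 = sum(contributions.values())
largest = max(contributions, key=contributions.get)

for name, value in sorted(contributions.items(), key=lambda kv: -kv[1]):
    print(f"{name:16s} {value:8.3f}  ({100 * value / chi2:5.1f}% of the score)")
```

Nearly all of the score comes from the single royal flush, whose expected count is about 0.01. The Chi-Squared approximation is generally considered unreliable when expected counts are that small (a common rule of thumb asks for at least about 5 per category), so in practice the rarest hands are often merged into one category before testing.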

---------- FOLLOW-UP ----------

QUESTION: Thanks for the thorough response, Clyde,

I will refresh myself on how to do a Chi-Squared test (it's been a few years), but I have one follow-up question. How do I account for the difference in the number of hands per night? Let's say one night I have 200 hands, but on another I have 275; how do I account for those differences? I understand I can use the percentage and use that as the observation, but I want those differences to count. I want the 275-hand night to weigh that much more heavily than the 200-hand night.

This is an actual experiment I am starting. Since it is too hard to track all the examples you listed, and not every hand is seen to the end, I am keeping track of starting hands in Texas hold'em, tracking each time I get dealt a pair of Aces, Kings, Queens or Jacks as a starting hand. Every night I see a different number of hands, but I still want an accurate number. The differences can sometimes be big, especially on the rare night I see over 400 hands.

ANSWER: It doesn't matter how many hands you have per night -- just the total. Adding up the raw counts already gives a 275-hand night more weight than a 200-hand night. (Unless you suspect that the random dealer algorithm is changing day-to-day.)
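
To see why the totals are enough, here is a small sketch in Python. The nightly tallies are made up for illustration; the point is that the pooled rate is exactly the hands-weighted average of the nightly rates, so no separate weighting step is needed.

```python
import math

# Hypothetical nightly tallies: (hands dealt, premium pairs seen).
sessions = [(200, 3), (275, 6), (410, 7), (240, 5)]

total_hands = sum(hands for hands, _ in sessions)
total_pairs = sum(pairs for _, pairs in sessions)

# Pooled rate: simply total premium pairs over total hands.
pooled_rate = total_pairs / total_hands

# Average of nightly rates, each night weighted by its number of hands.
weighted_rate = sum(hands * (pairs / hands) for hands, pairs in sessions) / total_hands

# Plain average of nightly rates, which treats a 200-hand night like a 410-hand night.
unweighted_rate = sum(pairs / hands for hands, pairs in sessions) / len(sessions)

print(f"pooled     = {pooled_rate:.4f}")
print(f"weighted   = {weighted_rate:.4f}")
print(f"unweighted = {unweighted_rate:.4f}")
```

Only the unweighted average of percentages loses the information about session length, which is the situation the question was worried about.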

As for the hands, I assumed these were normal five-card poker hands. You can adjust for the two-card hold'em hand (it's actually much easier).

Total: 52 choose 2 = 1326

Pair J-A: 24
Pair 2-10: 54
Non-Pair, both high (J-A): 96
Non-Pair otherwise: 1152

So as long as your tallies use categories like these, the same test works.
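
With only 1326 two-card hands, the category sizes can be checked by brute force. This sketch (Python, standard library; the card encoding and the classify helper are my own choices) deals every possible starting hand once and tallies it:

```python
from collections import Counter
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"
HIGH_RANKS = set("JQKA")

# A card is a (rank, suit) pair; the deck has 13 * 4 = 52 of them.
deck = [(rank, suit) for rank in RANKS for suit in SUITS]

def classify(card_a, card_b):
    """Put a two-card starting hand into one of the four categories."""
    rank_a, rank_b = card_a[0], card_b[0]
    if rank_a == rank_b:
        return "Pair J-A" if rank_a in HIGH_RANKS else "Pair 2-10"
    if rank_a in HIGH_RANKS and rank_b in HIGH_RANKS:
        return "Non-Pair, both high"
    return "Non-Pair otherwise"

# combinations() yields each unordered pair of cards exactly once.
tally = Counter(classify(a, b) for a, b in combinations(deck, 2))

for category, count in tally.items():
    print(f"{category:22s} {count:5d}")
print("Total:", sum(tally.values()))
```

Counting unordered pairs matters here: counting ordered deals instead would double every number.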

If you're literally only counting high pairs vs. everything else, then the outcomes are only:

Pair J-A: 24
Others: 1302

The probability of such a pair is very low, but you can run a Chi-Squared test on this too. You expect about 1.8% of the hands to be these high pairs -- if it's way more, you might see a very low p value from your Chi-Squared test, meaning indeed, there may be something wrong with the dealing algorithm.
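
With just two outcomes the test has one degree of freedom, and the p value can be written with the complementary error function. The sketch below is Python with invented numbers (2000 hands and 50 high pairs are placeholders, not anyone's real data):

```python
import math

P_HIGH_PAIR = 24 / 1326   # chance of being dealt JJ, QQ, KK or AA

n_hands = 2000            # hypothetical number of hands tracked
observed_high = 50        # hypothetical number of high pairs seen
observed_other = n_hands - observed_high

expected_high = n_hands * P_HIGH_PAIR
expected_other = n_hands - expected_high

# Chi-squared score over the two categories.
chi2 = ((observed_high - expected_high) ** 2 / expected_high
        + (observed_other - expected_other) ** 2 / expected_other)

# For one degree of freedom the upper tail is erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2.0))

print(f"expected high pairs = {expected_high:.1f}")
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Note that this test is two-sided: it flags too few high pairs just as readily as too many.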

---------- FOLLOW-UP ----------

QUESTION: Great, Thanks.

Sorry, but this has me thinking about another problem in the same study I realize I will run into. My overall theory is that this algorithm is changing, not day to day, but period to period. My theory is that it's geared to instigate drama, causing long periods of hands that match what you expect, periods of significantly above average premium pairs, and periods of significantly less than expected premium hands, and results during those times match the trend in some way. These periods are about 1000-1500 hands (I am over 2000 hands into the experiment).  Since these instances come in waves, I can do the p-value for a select time period and the hands involved in that.

What I have found raises another issue, because I am not sure a p-value will help given my data. I have found that in periods of craziness and losing on my part, day-to-day, I get the expected amount of premium pairs, with little variation. In periods I have won, the amount of the premium pairs is either grossly above average, or grossly below average, so the net average also turns out normal. That wouldn't seem so unusual normally, and may trigger what to look for to be successful in this version of poker, but every loss has been in a row, and every win has been in a row, which is suspicious. I was thinking showing the standard deviation in losing sessions vs standard deviation in winning sessions, or even over set time periods, may show the difference. As I mentioned each session is a different length, so how do I adjust to be able to calculate the standard deviation? Should they be weighted based on the size of the session? Should I use the percentage of premium pair instead of number to balance it out? Will that distort my findings if some sessions are more than twice as long as others?

ANSWER: The number of trials in each "session" is important to the p value you will obtain, assuming you compute the p value corresponding to your chi-squared score correctly.

What you will find is that a low number of trials tends to give a high p value (i.e., insignificant results from which no conclusion of bias can be drawn), because a small sample can only reveal a very large bias.

If you believe the results are changing over time, you can analyze each session or period separately. Keep in mind, though, that any single period is less likely to give a p value indicating bias, because the number of trials in a day, a two-day session, or a week is smaller than the monthly total.
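
The effect of sample size is easy to demonstrate. In this sketch (Python; the 25% excess and the three sample sizes are arbitrary choices of mine) the dealer always gives 25% more high pairs than expected, and only the number of hands changes:

```python
import math

P_HIGH_PAIR = 24 / 1326   # chance of JJ, QQ, KK or AA as a starting hand
EXCESS = 1.25             # hypothetical bias: 25% more high pairs than expected

def p_value_two_category(n_hands, observed_high):
    """Chi-squared p value (one degree of freedom) for high pairs vs. everything else."""
    expected_high = n_hands * P_HIGH_PAIR
    expected_other = n_hands - expected_high
    diff = observed_high - expected_high  # the other category is off by -diff
    chi2 = diff ** 2 / expected_high + diff ** 2 / expected_other
    return math.erfc(math.sqrt(chi2 / 2.0))

p_values = {}
for n_hands in (250, 1500, 7500):
    # A fractional "observed" count is fine for illustration purposes.
    observed_high = EXCESS * n_hands * P_HIGH_PAIR
    p_values[n_hands] = p_value_two_category(n_hands, observed_high)
    print(f"{n_hands:5d} hands: p = {p_values[n_hands]:.4f}")
```

The same proportional bias is invisible in a single night, borderline over a 1500-hand period, and hard to miss over the full month.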


You should note, however, that it is perfectly normal to have long strings of good or bad hands. This is a "paradox" of probability that is, in fact, totally reasonable. If you were to never see a long string of bad or good hands, that would actually indicate that there is something wrong with the random algorithm for dealing cards.
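
A quick simulation shows how long the dry spells get under perfectly fair dealing. This is a sketch in Python; the 2000-hand sample, the 200 repetitions and the seed are arbitrary choices of mine, and each hand is treated as an independent draw with probability 24/1326 of being a high pair.

```python
import random

P_HIGH_PAIR = 24 / 1326   # chance of JJ, QQ, KK or AA on any one hand
N_HANDS = 2000            # hands per simulated experiment
N_TRIALS = 200            # number of simulated experiments

def longest_drought(rng, n_hands):
    """Longest run of consecutive hands without a high pair."""
    longest = current = 0
    for _ in range(n_hands):
        if rng.random() < P_HIGH_PAIR:
            current = 0          # a high pair ends the drought
        else:
            current += 1
            longest = max(longest, current)
    return longest

rng = random.Random(12345)  # fixed seed so the run is repeatable
droughts = [longest_drought(rng, N_HANDS) for _ in range(N_TRIALS)]

average_gap = 1 / P_HIGH_PAIR
mean_drought = sum(droughts) / N_TRIALS

print(f"average wait between high pairs: about {average_gap:.0f} hands")
print(f"mean longest drought in {N_HANDS} hands: {mean_drought:.0f} hands")
print(f"worst drought seen in {N_TRIALS} runs: {max(droughts)} hands")
```

Even though a high pair turns up about once every 55 hands on average, the longest gap in a 2000-hand sample is typically several times that, with no tampering involved.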

Clyde Oliver

Expertise

I can answer all questions up to, and including, graduate level mathematics. I do not have expertise in statistics (I can answer questions about the mathematical foundations of statistics). I am very much proficient in probability. I am not inclined to answer questions that appear to be homework, nor questions that are not meaningful or advanced in any way.

Experience

I am a PhD educated mathematician working in research at a major university.

Organizations
AMS

Publications
Various research journals of mathematics. Various talks & presentations (some short, some long), about either interesting classical material or about research work.

Education/Credentials
BA mathematics & physics, PhD mathematics from a top 20 US school.

Awards and Honors
Various honors related to grades, various fellowships & scholarships, awards for contributions to mathematics and education at my schools, etc.

Past/Present Clients
In the past, and as my career progresses, I have worked and continue to work as an educator and mentor to students of varying age levels, skill levels, and educational levels.

©2016 About.com. All rights reserved.