Probability & Statistics/p-value/standard deviation
This continues from another post (which I have included below). I am still wondering about the standard deviation question from my last post. I understand it is normal to have groups of good and bad hands. Time will tell if there is a pattern to it.
My previous question and your answer:
Sorry, but this has me thinking about another problem in the same study that I realize I will run into. My overall theory is that this algorithm is changing, not day to day, but period to period. I think it is geared to instigate drama, causing long periods of hands that match what you expect, periods of significantly above-average premium pairs, and periods of significantly fewer premium hands than expected, with results during those times matching the trend in some way. These periods last about 1000-1500 hands (I am over 2000 hands into the experiment). Since these instances come in waves, I can compute the p-value for a selected time period and the hands involved in it.
What I have found raises another issue, because I am not sure a p-value will help given my data. In periods of craziness and losing on my part, I get the expected number of premium pairs day to day, with little variation. In periods when I have won, the number of premium pairs is either grossly above average or grossly below average, so the net average also turns out normal. That wouldn't seem so unusual normally, and may point to what to look for to be successful in this version of poker, but every loss has been in a row and every win has been in a row, which is suspicious. I was thinking that comparing the standard deviation in losing sessions with the standard deviation in winning sessions, or even over set time periods, may show the difference. As I mentioned, each session is a different length, so how do I adjust for that when calculating the standard deviation? Should the sessions be weighted by size? Should I use the percentage of premium pairs instead of the count to balance it out? Will that distort my findings if some sessions are more than twice as long as others?
The number of trials in each "session" matters for the p-value you obtain, assuming you correctly compute the p-value corresponding to your chi-squared score.
With a low number of trials, you will tend to get a high p-value (i.e., insignificant results, from which no conclusion of bias can be drawn) unless the bias is very large.
If you believe the results are changing over time, the analysis could be run on each session separately, but in any single session the p-value is less likely to indicate bias, because the number of trials per day, 2-day session, week-long session, etc. will be smaller than the monthly total.
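To make the sample-size point concrete, here is a rough sketch in Python. The 2% chance of a premium pair (P0) and the hand counts are made-up numbers for illustration only, not values from your data, and the function name is my own. It applies the same observed rate (3%) to a short session and to a long one and computes a one-degree-of-freedom chi-squared p-value.

```python
import math

# Hypothetical probability of being dealt a premium pair (an assumption for
# illustration; substitute the true value for your game and definition).
P0 = 0.02

def one_sample_chi2(hands, pairs, p0=P0):
    """Chi-squared goodness-of-fit test with two categories (pair / no pair).

    Returns (chi2, p_value). With two categories there is 1 degree of
    freedom, so the p-value is erfc(sqrt(chi2 / 2)).
    """
    expected_pairs = hands * p0
    expected_other = hands * (1 - p0)
    other = hands - pairs
    chi2 = ((pairs - expected_pairs) ** 2 / expected_pairs
            + (other - expected_other) ** 2 / expected_other)
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Same observed rate (3%) in a short session and in a long one.
chi2_small, p_small = one_sample_chi2(hands=100, pairs=3)
chi2_large, p_large = one_sample_chi2(hands=2000, pairs=60)

print(f"100 hands:  chi2 = {chi2_small:.3f}, p = {p_small:.3f}")
print(f"2000 hands: chi2 = {chi2_large:.3f}, p = {p_large:.4f}")
```

The short session gives a p-value well above 0.05 and the long one a p-value well below it, even though the observed rate is identical. Note also that the chi-squared approximation is itself rough when an expected count is below about 5, which is one more reason to be careful with short sessions.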
You should note, however, that it is perfectly normal to have long strings of good or bad hands. This is a "paradox" of probability that is, in fact, totally reasonable. If you were to never see a long string of bad or good hands, that would actually indicate that there is something wrong with the random algorithm for dealing cards.
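If you want to see this for yourself, a small simulation helps. The sketch below assumes a fair 50/50 win-or-lose outcome on each of 2000 independent trials, which is a simplification I am introducing for illustration and not a model of your game. It reports the longest streak of identical outcomes.

```python
import random

def longest_run(outcomes):
    """Return the length of the longest streak of identical consecutive outcomes."""
    best = 0
    current = 0
    previous = None
    for outcome in outcomes:
        # Extend the streak if the outcome repeats, otherwise start a new one.
        current = current + 1 if outcome == previous else 1
        best = max(best, current)
        previous = outcome
    return best

# Hypothetical setup: 2000 independent trials, each won with probability 0.5.
rng = random.Random(0)  # fixed seed so the run is repeatable
results = [rng.random() < 0.5 for _ in range(2000)]

print("Longest streak in 2000 fair trials:", longest_run(results))
```

Try a few different seeds. Streaks that look suspiciously long to the eye turn up routinely in purely random sequences of this length.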
Sorry, I did not address that part of the question because it isn't really relevant. The correct test here, regardless, will be a chi-squared test. The standard deviations won't tell you anything particularly useful here, at least not for the question you want answered.
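On the question of sessions of different lengths: a chi-squared test handles this without any extra weighting, because the expected count in each session is just that session's number of hands times the probability of a premium pair. Here is a minimal sketch of that idea. The session data and the 2% probability (P0) are invented for illustration, and the helper chi2_sf_even is my own; it uses a closed form that is valid only for an even number of degrees of freedom, so a statistics library would be the better choice in practice.

```python
import math

# Hypothetical probability of a premium pair (an assumption, not from the data).
P0 = 0.02

# Made-up sessions of unequal length: (hands played, premium pairs seen).
sessions = [(300, 8), (800, 14), (450, 9), (650, 15)]

def chi2_sf_even(x, df):
    """Upper-tail probability of a chi-squared variable, for EVEN df only.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this helper only supports positive even df")
    half = x / 2
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

chi2 = 0.0
for hands, pairs in sessions:
    expected_pairs = hands * P0          # scales with session length
    expected_other = hands - expected_pairs
    other = hands - pairs
    chi2 += ((pairs - expected_pairs) ** 2 / expected_pairs
             + (other - expected_other) ** 2 / expected_other)

# P0 is treated as known, so each session contributes one degree of freedom.
df = len(sessions)
p_value = chi2_sf_even(chi2, df)
print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

Longer sessions contribute larger expected counts automatically, so there is no need to convert to percentages or to weight the sessions by hand.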