Advanced Math/Time Series and Conditional Probability
QUESTION: Hi Scott,
I recently stumbled across a discussion on a financial radio show that raised some questions about the probability of consecutive up/down days in the S&P. One of the speakers had tallied the occurrences of consecutive up/down days in the market from 1950 to the present and noted that the data set was normally distributed. Using the total number of 3 consecutive up-days divided by the total number of points, he estimated the probability of 3 consecutive up-days at approximately 0.162. He did this for each of the discrete streak lengths to calculate the probability of a streak of any recorded size appearing in the time series.
Using Bayes' theorem, he wanted to show, for example, that the probability of 4 consecutive up days given 3 consecutive up-days was not significantly different from the probability of a single up-day (which was 0.53). In other words, given the existence of a streak in consecutive up-days, the probability of another up-day was no different than the probability of a single up-day given no prior streak. He did the following calculation using Bayes' theorem:
p(4 up-days | 3 up-days) = p(4 up-days) / p(3 up-days)
= 0.086 / 0.162 ≈ 0.53
He did this calculation for each of the discrete values for up/down-days. For example, p(8 up-days | 7 up-days), p(8 down-days | 7 down-days), etc. Using Bayes' theorem, he demonstrated that the probability for a continuation of each of the streaks was approximately 0.53.
My question is, does this negate the original distribution that assigned a probability of 4 consecutive up-days as 0.086? I am struggling to understand the use of conditional probability in this context. Given the fact that the original data set of consecutive up/down-days was normally distributed, can one claim that a streak of 8 consecutive up-days, for example, is less probable than a new streak, which would be a down-day? If so, given a streak of 7 up-days, would it not be less probable for the series to take on a streak of 8 up-days rather than reverting to the mean, that is, produce a down-day? Does the fact that, according to the previous calculations, the probability of an up-day does not change despite the existence of a streak of any size negate the application of the '68-95-99.7% Rule' with a normally distributed data set? In other words, according to the application of Bayes' theorem above, is the probability of the continuation of a 7 up-day streak really 0.53 despite the low probability of an 8-day streak?
Thanks for the insight.
ANSWER: The chance of getting N up-days in a row gets smaller as N gets larger.
The chance of the next day being an up-day, however, does not depend on how many up-days have occurred so far.
As an example, consider drawing a face card out of a deck, and then putting the card back in and reshuffling after each draw. Since, in each suit, there are three face cards and ten cards that are not, the chance of drawing a face card is 3/13. It makes no difference how many have been drawn in the past - the chance is still 3/13. The chance of getting two in a row is (3/13)(3/13) = 9/169, which is a lot smaller. However, given that one has already been drawn, the chance of the next one being a face card is still 3/13, independent of the last draw, since each card is reshuffled back into the deck after each draw.
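The card arithmetic above can be checked with exact fractions. This is only a small sketch; the variable names are made up for illustration, and it assumes a standard 52-card deck with replacement after every draw, as described.

```python
from fractions import Fraction

# Chance of a face card on any single draw (3 face cards per 13-card suit).
p_face = Fraction(3, 13)

# Chance of two face cards in a row, with the card put back and reshuffled.
p_two_in_a_row = p_face * p_face

# Chance of a second face card GIVEN the first was a face card:
# p(two in a row) / p(first is a face card).
p_second_given_first = p_two_in_a_row / p_face

print(p_two_in_a_row)        # the small "streak" probability
print(p_second_given_first)  # the same as a single draw
```

The streak probability shrinks, but dividing it by the probability of the shorter streak gives back the single-draw chance.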
That is similar to the problem given. The chance of a streak may be low, but the chance of having an up-day is independent of how many up-days have occurred previously.
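A rough way to see this is to simulate it. The sketch below is not market data: it assumes every day is an independent coin flip with a 0.53 chance of an up-day (the figure quoted in the question), and the function names `simulate_days` and `continuation_rate` are hypothetical.

```python
import random

def simulate_days(n_days, p_up=0.53, seed=0):
    """Return a list of booleans, True for an up-day (independent days assumed)."""
    rng = random.Random(seed)
    return [rng.random() < p_up for _ in range(n_days)]

def continuation_rate(days, k):
    """Fraction of up-days among the days that follow at least k straight up-days."""
    followed = 0   # days preceded by k or more consecutive up-days
    continued = 0  # how many of those days were up-days themselves
    run = 0        # length of the run of up-days ending on the previous day
    for up in days:
        if run >= k:
            followed += 1
            continued += up
        run = run + 1 if up else 0
    return continued / followed if followed else float("nan")

days = simulate_days(200_000)
for k in (0, 1, 3, 5):
    print(k, round(continuation_rate(days, k), 3))
```

Under that independence assumption, each printed rate should come out close to 0.53 no matter how long the preceding streak is, even though long streaks themselves are rare.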
---------- FOLLOW-UP ----------
QUESTION: Thanks for the clarification. As a follow-up question, wouldn't the fact that a streak of 3 up-days MUST precede a streak of 4 up-days imply that, given that today's close completed a 3-day streak, the probability of tomorrow closing as an up-day is less than the probability of tomorrow closing as a down-day, given the normal distribution of streaks? Is the idea that the continuation of streaks is random despite the clearly defined probability of a streak of any particular size occurring? It seems that if the probability of a streak of 4 up-days is 0.086, then given that today's close completed a 3-day streak, shouldn't the probability that tomorrow closes as an up-day still be 0.086?
This is an added comment I just remembered from somewhere in my past.
If there are 10 people in the room, the chance of all of them having different birthdays is 88%.
If there are 20 people in the room, the chance of all of them having different birthdays is only 59%.
If there are 30 people in the room, the chance of all of them having different birthdays is only 29%.
If there are 40 people in the room, the chance of all of them having different birthdays is only 11%.
That is because for only 2 people, there is only one pair to look at.
If there are 3 people, A, B, and C, there are 3 pairs to look at - AB, AC, or BC.
If there are 4 people, A, B, C, and D, there are 6 pairs to look at - AB, AC, AD, BC, BD, and CD.
To get the number of pairs to look at, it is n(n-1)/2.
That is, 2*1/2 = 1; 3*2/2 = 3; 4*3/2 = 6.
Now when we get up to 10 people, there are 10*9/2 = 45 pairs.
Even though there are only 10 people in the room, there are far more than 10 pairs.
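The pair counts for 2, 3, and 4 people listed above can be checked by simply enumerating the pairs. A quick sketch follows; the helper name `count_pairs` is made up here, and it compares the enumeration with n(n-1)/2, which is the same as "n choose 2".

```python
from itertools import combinations
from math import comb

def count_pairs(n):
    """Count the distinct pairs among n people by listing every pair."""
    return sum(1 for _ in combinations(range(n), 2))

for n in (2, 3, 4, 10):
    # The enumeration, "n choose 2", and n(n-1)/2 should all agree.
    assert count_pairs(n) == comb(n, 2) == n * (n - 1) // 2
    print(n, "people:", count_pairs(n), "pairs")
```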
To actually compute the chance of all the birthdays being different, for two people it is 364/365; the chance that they share a birthday is only 1/365.
See, the 1st person is born on some day, and whatever day that is, that leaves 364 days for the 2nd person, 363 days for the 3rd, and so on. For the kth person we multiply by 366-k on top and divide by 365, and when we keep doing that, our number quickly becomes small. Granted, for the first 36 people each of these factors is at least 90%, but when we multiply all of them together, the product becomes small quickly.
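The running product described above can be written out in a few lines. This sketch assumes 365 equally likely birthdays and ignores leap years; the function name `prob_all_different` is hypothetical.

```python
def prob_all_different(n, days=365):
    """Chance that n people all have different birthdays (equally likely days assumed)."""
    p = 1.0
    for k in range(1, n + 1):
        # The kth person must avoid the k-1 birthdays already taken.
        p *= (days + 1 - k) / days
    return p

for n in (10, 20, 30, 40):
    print(n, "people:", round(100 * prob_all_different(n)), "%")
```

Rounded to whole percentages, these should match the 88%, 59%, 29%, and 11% figures quoted for 10, 20, 30, and 40 people.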