Advanced Math/compound interest, follow up
QUESTION: Hi Randy,
I have one more follow up question but the website didn't allow any more replies.
Here's your response to the original question and my question is at the bottom of the page:
[[[[ ANSWER: I understand this better now. You have an initial amount of $10000 which gets compounded each day, but the interest rate at which the return (or balance) is calculated varies from compounding period to period. Because of the fluctuations in the interest rate, the actual return after n periods is different than what you would calculate using the average of the interest rates over the same number of periods (total time). You want an estimate of the magnitude of this deviation as a function of the mean and variance of the interest rate and the total length of time (total number of accounting periods). This is an interesting problem.
To start, define the amount accrued over the total time consisting of n compounding periods as Am. Using the exponential version of the return calculation I introduced last time, this amount will be
Am = P･∏ (j=1->n) exp(Rj),
where ∏ is the symbol for multiplying elements over the range of indices 1 ≤ j ≤ n. The Rj are the interest rates, which will be assumed to be normally distributed with mean M and variance V. Furthermore, let Rj = M + R'j, where the mean of R'j is zero (but still with variance V). We can then write
Am = P･exp(nM)･∏ (j=1->n)exp(R'j).
Note that nM = interest rate over the total time and exp(nM) = total return.
Now what we need to derive is the expected value of Am, or A'm = Am/P for convenience
<A'm> = exp(nM) <∏ (j=1->n)exp(R'j)>.
Note that if the interest rates are constant and equal to M, we get A'm = exp(nM) as we should. Assuming R'j << 1, we can write
<∏ (j=1->n)exp(R'j)> = <∏ (j=1->n)(1+R'j+R'j^2/2 + ...)>, using the Taylor expansion of the exponential. This can also be written
<(1+R'1+(R'1^2)/2)(1+R'2+(R'2^2)/2) ... (1+R'n+(R'n^2)/2)>.
Multiplying this out and keeping only quadratic terms <R'j･R'j> = V (remember R'j has mean = 0) and noting that the random terms are uncorrelated so that <R'j･R'k> = 0 for j≠k, we end up with
<∏ (j=1->n)exp(R'j)> = 1 + (n/2)V so that
<A'm> = exp(nM)･(1+nV/2).
This shows that, for interest rates Rj with a variance V, the expected value of the return is biased upward relative to exp(nM). For your example with P = 10000, n = 5, M = 0.005 and V = 3x10^-4, the expected value is
<Am> = 10260, which is slightly greater than the 10254 you calculated using the average M. As can be seen, the bias in Am increases with the number of periods n.
Because we are dealing with random numbers and a limited number of samples, the actual value of Am will be variable, but this estimate of the bias should help in predicting errors. There is also a little bias in using the exponential approximation, but it is on the order of the approximations made above. All this can be redone using the actual, discrete formulation you are used to.
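As a quick numerical check of the example above, here is a minimal Python sketch that evaluates the bias formula. The function name expected_return is only illustrative. The last line is an added comparison, not part of the derivation: for normally distributed rates the exact expectation is exp(nM + nV/2), which agrees with the approximation to first order in V.

```python
import math

def expected_return(principal, n_periods, mean_rate, variance):
    """Approximate expected amount: P * exp(n*M) * (1 + n*V/2)."""
    growth = math.exp(n_periods * mean_rate)
    return principal * growth * (1 + n_periods * variance / 2)

# Values from the example in the answer
P, n, M, V = 10000, 5, 0.005, 3e-4

no_variance = P * math.exp(n * M)                   # constant rate M, no bias term
with_bias = expected_return(P, n, M, V)             # includes the (1 + n*V/2) factor
exact_lognormal = P * math.exp(n * M + n * V / 2)   # added comparison (assumption: normal rates)

print(round(no_variance, 2), round(with_bias, 2), round(exact_lognormal, 2))
```

Changing n or V in this sketch is one way to play the what-if game with different variances and numbers of compounding periods.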
1) Is V = 3x10^-4 a constant measure of the variance, or must this value be calculated for each scenario (different time periods, different mean return)?
ANSWER: The value of V should be pre-determined (pre-specified) in order for the calculation to be a true prediction. After all, if you can only calculate it after the final return is realized, then it doesn't do much good. I used a value given by the data in your example, but this was just in order to get a representative value. In a real situation, you would need to have an estimate of V based on previous data or other information.
You can use the formula to play a what-if game where you specify different variances to see what errors in the final result would be for different compounding periods, etc.
I didn't realize the website limited responses. I think if you just give your question a different name it should always work (which is what you did). As I said, I'd be glad to help you with any other questions.
---------- FOLLOW-UP ----------
QUESTION: I understand: the specified variance must represent the data in order to account for the observed variance.
With that, I have 2 questions:
1.) How would I go about calculating this variance, provided I have a set of data (returns)?
2.) Would the specified variance need to change for predicting Am among different compounding periods?
If I have 2 years of actual data to derive a measure of variance, would I use a year's worth of that data and the variance calculated from it to estimate Am 1 year in the future?
Or if I only want to estimate Am after a month, would I use a month of actual data to determine the variance, or would I use the entire series (2 years) of data to arrive at the best measure of variance for all calculations of Am regardless of the periods used?
ANSWER: 1. a) If you have data to analyze, I hope you have the interest rates as well as the returns. If so then the variance is easily calculated in the usual way. For an interest rate of rj corresponding to a compounding period of length Tc and a total time of Tt we have
m = Tt/Tc = number of compounding periods in the total period over which the return is to be calculated (e.g., if Tt = 1 year and Tc = 1 week, then m = 52), and
V = variance of r = (1/m)∑(j=1->m)(rj-M)^2
where ∑ denotes the sum over the j terms. M = average of r = (1/m)∑(j=1->m)(rj).
Just to review, we have 2 time periods to consider
1. short period Tc corresponding to a compounding period, for which the interest rate r and its statistics M and V are defined
2. longer period Tt corresponding to multiple, sequential compounding periods.
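The mean and variance formulas above can be sketched in a few lines of Python. The function name and the sample rates are made up for illustration; the rates were chosen only so that they reproduce M = 0.005 and V = 3x10^-4 and are not real data.

```python
def mean_and_variance(rates):
    """Return (M, V) using the 1/m forms given above."""
    m = len(rates)                                   # number of compounding periods
    mean = sum(rates) / m                            # M = (1/m) * sum(rj)
    variance = sum((r - mean) ** 2 for r in rates) / m   # V = (1/m) * sum((rj - M)^2)
    return mean, variance

# Hypothetical per-period interest rates (one per compounding period Tc)
rates = [0.02, -0.01, 0.03, 0.00, -0.015]

M, V = mean_and_variance(rates)
print(M, V)
```

Python's statistics.pvariance uses the same 1/m (population) form, so it can serve as a cross-check.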
b) If you only have the returns and not the time series of interest rates, then you have to estimate the underlying variance (and mean) fitting a model to the data. This is more complicated and less accurate, but possible (and interesting).
2. The length of data (time) over which to average depends on the behavior of the data (or at least assumptions made about the data). The simplest situation is where the variance (and mean) does not change with time (the data is stationary). In this case, the longer the data set you can use the better. This is because the estimates of the (population) statistics (mean and variance), which are the so-called sample statistics, become more accurate with more samples. Keep in mind that M and V represent the mean and variance of r for a period Tc, even though they are calculated over a longer period Tt.
If the statistics change over time (non-stationary), then you need to consider an intermediate time, Ti, to average over. The easiest way to tell if the data is non-stationary is to see if the mean is changing significantly over time, i.e., if there is a trend in the mean. If the mean calculated for segments of time Ti seems to be rising or falling significantly over the entire time Tt, then the data may need to be detrended (which is easy). If the means calculated over Ti wobble around a constant value, then the data can be considered stationary. The effect of the wobbles will be taken into account in the variance calculation. This whole topic gets into more complicated time series analysis which is probably overkill for your problem.
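A rough way to look for a trend in the mean is to compute the mean over consecutive segments of length Ti and compare them. The sketch below does only that; the function name and both sample series are hypothetical, and deciding whether a change is "significant" remains a judgement call that the code does not make.

```python
def segment_means(rates, segment_length):
    """Mean of each consecutive, non-overlapping segment of the given length (Ti)."""
    means = []
    # Step through full segments only; a short leftover tail is ignored
    for start in range(0, len(rates) - segment_length + 1, segment_length):
        segment = rates[start:start + segment_length]
        means.append(sum(segment) / segment_length)
    return means

# Made-up series: one wobbling around 0.005, one drifting upward
flat = [0.004, 0.006, 0.005, 0.005, 0.006, 0.004]
trending = [0.001, 0.002, 0.004, 0.005, 0.007, 0.008]

flat_means = segment_means(flat, 2)
trending_means = segment_means(trending, 2)
print(flat_means)
print(trending_means)
```

Segment means that stay near one value suggest stationary data; segment means that rise or fall steadily suggest a trend that may need to be removed first.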
As you have no doubt surmised, some judgement is required here. Let me summarize the most straightforward, and likely the most appropriate, situation:
- you have the interest rates r (and returns) for short time periods Tc (a day, a week, etc)
- you want to incorporate estimates of the variability of r for returns over a longer time, Tt
- the data is stationary
- calculate the variance over Tt (or longer period if available)
- use the formula I derived to estimate the bias in the mean return, Am, for the period Tt.
I'm curious how this interesting problem arose and why you are concerned about the variability in the interest rates (other than to just know if it matters or not). Also, you can send data for me to look at if you want.
---------- FOLLOW-UP ----------
QUESTION: I'll take you up on your offer to look at the data. At some point in the future, I'll have some data that I'll need to use to find a realistic measure of variance. I'm putting together the data now and I'll shoot you another question when I'm ready to do some calculations.
Until then I have one more question.
I want to solve for r (rate), and take into account the variance (we can use 0.0003 for now). What would it look like to incorporate the variance in that equation?
ANSWER: Variance in the interest rate has a large effect on the variance of the total return but only a small effect on its bias (expected value), which is what the formula I derived describes. This is illustrated in the images I've attached, which I'll discuss in a moment. First, here's the formula I gave you for the expected value of the return
<A'm> = exp(nM)･(1+nV/2)
where, for simplicity, I'm using n instead of m for the number of compounding periods, and M = <r> = mean and V = <(r-M)^2> = variance of the random interest rates. Your goal is to estimate r from this formula, given <A'm>. There is good news and bad news. The bad news is that M and V are both unknown and there is only one equation involving them, so they cannot be estimated independently. The good news is that the multiplicative term with V is very small (the bias is small) and so
<A'm> ≈ exp(nM), where n is known. This is fine for estimating M. However, what you are really interested in is the variance of A'm, Var(A'm) = <(A'm-<A'm>)^2>, which I haven't derived but whose effect is shown in the images.
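To show what solving for the mean rate might look like, here is a minimal Python sketch that inverts the expected-value formula for M. It assumes V is supplied from outside (0.0003, as suggested in the question), since M and V cannot both be recovered from the one equation. The function name is hypothetical, and the growth factor is generated from the formula itself purely for illustration.

```python
import math

def mean_rate_from_return(growth_factor, n_periods, variance=0.0):
    """Invert <A'm> = exp(n*M) * (1 + n*V/2) for M.

    With variance=0 this reduces to the approximation M = ln(<A'm>) / n.
    """
    return math.log(growth_factor / (1 + n_periods * variance / 2)) / n_periods

n, V = 5, 0.0003
# Illustrative growth factor built from M = 0.005 and the assumed V
growth = math.exp(n * 0.005) * (1 + n * V / 2)

M_with_V = mean_rate_from_return(growth, n, V)   # accounts for the bias term
M_without_V = mean_rate_from_return(growth, n)   # ignores the variance
print(M_with_V, M_without_V)
```

The two estimates differ only slightly, which is consistent with the bias term being small.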
Given M = 0.005 and V = 0.0003 for the interest rates, I generated 5 sets of realizations of random interest rates, each representing 100 compounding periods. The realizations were from a normal (Gaussian) distribution and uncorrelated. I then calculated the returns at each compounding step using the usual iterative formula R(j+1) = R(j)･(1+rj), where rj = jth random interest rate. I also calculated the returns based on exp(nM) = return using the average value of the interest rate and exp(nM)･(1+nV/2) = average value including the bias due to the variance of the interest rate. These are shown in the first image. The 5 realizations of the returns for the random interest rates are shown overlaid by the mean and mean+bias curves (predictions) and the return averaged over the 5 realizations.
Note that the mean and mean+bias curves are very close together while the random realizations differ significantly. This is not unexpected since the standard deviation sigma = sqrt(V) = 0.017, which is significantly greater than the mean of 0.005 (a factor of 3.4). However, the average of the realizations is close to the predictions, which is good.
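The experiment described above can be sketched roughly as follows in Python. This is not the code used to produce the images; the function name, the seed, and the starting balance of 1 are assumptions made for illustration, and a different seed gives different realizations.

```python
import math
import random

def simulate_returns(n_periods, mean_rate, variance, rng):
    """One realization: compound a balance of 1 with uncorrelated normal rates."""
    sigma = math.sqrt(variance)
    balance = 1.0
    path = []
    for _ in range(n_periods):
        r = rng.gauss(mean_rate, sigma)   # random interest rate for this period
        balance *= 1 + r                  # R(j+1) = R(j) * (1 + rj)
        path.append(balance)
    return path

rng = random.Random(0)                    # fixed seed (arbitrary) for repeatability
n, M, V = 100, 0.005, 0.0003

paths = [simulate_returns(n, M, V, rng) for _ in range(5)]
finals = [p[-1] for p in paths]
average_final = sum(finals) / len(finals)

mean_curve = math.exp(n * M)              # prediction from the average rate
bias_curve = mean_curve * (1 + n * V / 2) # prediction including the bias term

print([round(f, 3) for f in finals])
print(round(average_final, 3), round(mean_curve, 3), round(bias_curve, 3))
```

With only 5 realizations the average can still sit noticeably away from the predicted curves, so a larger number of realizations gives a fairer comparison.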
The 2nd image summarizes everything and shows the average of the realizations +/- their standard deviations. It also shows the predicted returns +/- their standard deviations (SD = sqrt(1+nM/2)). The results show that the mean and standard deviations are closely predicted by the formulas.
Again, we have good news and bad news. The good news is that the predictions of the STATISTICS are accurate. The bad news is that the variability in the REALIZATIONS is significant and predicting a single realization has a significant error (it also shows that the bias is not that important, as mentioned above).
All this is interesting and informative but what I think I need to do now is be able to estimate V given a single realization of the compounding returns in order to provide you (and your clients) with an estimate of the range of the possible values of the returns based on this single sequence. Does that sound right?
Keep me posted.