Probability & Statistics/Linear Regression and Probability
The first part of the problem was to run a linear regression using mileage and sales price, with price being the dependent variable. I've completed this step and I now have a function that accepts a mileage value and returns a predicted sales price.
The second part of the problem is where I'm stuck. It says "Also on the same dataset, and using your results from the linear regression, compute an expected time to sell given 3 prices - [P, P+10%, P-10%] and assuming a 50% sell probability." Not sure what P is in this case. My data set has 2500 cars in it and not all of them are sold, so I suppose P could be the asking price of unsold cars? Perhaps it's asking to predict how long it will take to sell the remaining cars at the current asking price as well as +/- 10%?
ANSWER: Your dependent variable is P. According to the regression, you can express P as linearly related to M and T (mileage and time to sell). So if you have your entire data set (i.e. the inventory) and the problem wants the expected time to sell at price P, you just average the time to sell (which is already in your data as T for each datum).
Next, assume the prices are adjusted by +10%. In this case, you have to use your regression to decide how changing P would change T for your dataset: solve the fitted equation for T, with 1.1P in place of P, for each datum. Then you average all of the adjusted values for T. (Notice that your regression includes M. Although the M values do not change, you should be using M and 1.1P to deduce the T value for each datum.)
Finally, repeat for -10%.
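As a rough illustration of the steps above, here is a minimal Python sketch. It assumes the data set has a time-to-sell value T for each car and that the regression is fit as P = a + b*M + c*T; the function names and the synthetic numbers are made up for the example and are not from your data or your assignment.

```python
import numpy as np

def fit_price_model(price, mileage, days):
    """Least-squares fit of P = a + b*M + c*T; returns the array [a, b, c]."""
    X = np.column_stack([np.ones(len(price)), mileage, days])
    coef, *_ = np.linalg.lstsq(X, price, rcond=None)
    return coef

def mean_time_to_sell(price, mileage, coef, factor=1.0):
    """Average T implied by the fit when every price is scaled by `factor`."""
    a, b, c = coef
    # Solve P = a + b*M + c*T for T, datum by datum, using the adjusted price.
    implied_days = (factor * price - a - b * mileage) / c
    return implied_days.mean()

# Synthetic stand-in data (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
n = 2500
mileage = rng.uniform(10_000, 120_000, n)
days = rng.uniform(5, 90, n)
price = 18_000 - 0.05 * mileage + 100 * days + rng.normal(0, 500, n)

coef = fit_price_model(price, mileage, days)
for factor in (1.0, 1.1, 0.9):  # P, P+10%, P-10%
    print(factor, mean_time_to_sell(price, mileage, coef, factor))
```

With factor 1.0 the result matches the plain average of T, because least-squares residuals average to zero when the fit includes an intercept. Scaling every price by 1.1 or 0.9 then shifts that average by 0.1 times the mean price divided by c, up or down respectively.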
Please follow up if you have additional questions.
---------- FOLLOW-UP ----------
QUESTION: Hi Clyde, thanks for the quick reply. How does the 50% probability of sale factor into this? I thought maybe it had something to do with Z tables, but I'm not sure how to account for it.
ANSWER: To be frank, I skipped over that entirely because I believe it is extraneous. You have been given raw data with certain quantities, none of which is a probability of sale or an event corresponding to some possible sale/non-sale (i.e. a customer interaction).
You can solve the problem without this information, according to my reading of the prompt. You perform a linear regression on the data, then take an average, then translate it according to that regression, then take another average. This does not require any "sell rate." Is there some other part of the problem connecting this rate to the data in some way beyond what you have quoted? If not, I don't think it's relevant.
If there is some interpretation of the question that requires this "50% sell rate," I would be interested to hear it, but I do not believe this figure is salient (or even applicable) to the data in question. You should be able to work through the problem as I have outlined, and you can present your solution to your instructor or consult the solutions manual to your textbook. Hopefully that will clear up the prompt.
[Note that there is no assumption that these data are normally distributed or that sales occur as some Poisson process -- so there is no z-table or other reference in which you could look up this 50% figure in the first place.]