Basic Math/PRecAl
Expert: Josh - 10/9/2003
Question: Could you also help with this one? I know how to begin, but I get stuck once I reach the middle of part a.
A photo processing lab charges 30 cents for a 3*5 print, 45 cents for a 4*7 print and 65 cents for a 5*9 print.
a) Find a quadratic function of the form f(x)=ax^2+bx+c that models the cost (in cents) per print in terms of the smaller dimension of the print size.
b) Using this model, determine the approximate price of an 8*10 print.
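Answer: Okay, one way of solving this problem is to use a linear model and a statistical method called multiple regression (least squares). This ALWAYS works for a single independent variable, x, as long as you have at least as many distinct x values as unknown coefficients.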
Firstly, tabulate the data.
Use x to denote the smaller dimension of the print size,
use y to denote the actual price.
x y
x1=3 30
x2=4 45
x3=5 65
Aim: To find the least-squares solution, i.e., the coefficients "a", "b", "c" such that the mean squared error (the average squared deviation from the actual cost) is minimized.
Definitions:
*Error as a function of the photo size, e(x)=y(x)-f(x)
*Squared Error or Residual, [e(x)]^2=[y(x)-f(x)]^2
*Mean Squared Error, Average{[e(x)]^2}={[y(x1)-f(x1)]^2+[y(x2)-f(x2)]^2+[y(x3)-f(x3)]^2}/3
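To make these definitions concrete, here is a minimal Python sketch that evaluates the mean squared error for any trial choice of a, b, c. The names `DATA`, `f` and `mean_squared_error` are my own, and the trial coefficients are just an illustration (a straight line through the first and last data points), not part of the solution.

```python
# Observed data from the question: (smaller print dimension, price in cents).
DATA = [(3, 30), (4, 45), (5, 65)]

def f(x, a, b, c):
    """Quadratic model f(x) = a*x^2 + b*x + c."""
    return a * x * x + b * x + c

def mean_squared_error(a, b, c, data=DATA):
    """Average of [y(x) - f(x)]^2 over all observations."""
    residuals = [y - f(x, a, b, c) for x, y in data]
    return sum(e * e for e in residuals) / len(data)

# Illustrative trial only: the straight line through (3, 30) and (5, 65).
# It misses the middle point by 2.5 cents, so the error is not zero.
trial_mse = mean_squared_error(0, 17.5, -22.5)
print(trial_mse)
```

The aim stated above is to pick a, b, c so that this number is as small as possible.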
Basic Formulation: ...[#A]
e(x1)=y(3)-f(3)=30-(9a+3b+c),
e(x2)=y(4)-f(4)=45-(16a+4b+c),
e(x3)=y(5)-f(5)=65-(25a+5b+c),
At this point, we could set each error to zero and solve the resulting set of three equations by Gaussian elimination (either by eliminating variables or using matrix algebra). But I won't bother with this, because in practice an exact solution generally does not exist: real data usually has more observations than unknown coefficients, plus measurement error, so no single quadratic passes through every point. A perfect fit is something you mostly find in textbooks, because the author wants you to succeed in finding the answer they are looking for. In any case, I think you know how to solve this given some patience. Here is the alternative (more powerful) method that I'm going to show you.
Firstly, convert this into matrix form:
Y=Xp+e,
where
*Coefficient vector, p=[a,b,c]', is 3 rows-by-1 column;
*Error vector (residual), e=[e(x1),e(x2),e(x3)]', is also 3 rows-by-1 column;
*Matrix X=[[9,3,1];[16,4,1];[25,5,1]] is stacked 3 rows-by-3 columns;
*Cost vector, Y=[30,45,65]', is 3 rows-by-1 column.
There is a notional correspondence between X & x and between Y & y: each row of X is [x^2,x,1] for one observed x, and the matching entry of Y is the observed price y.
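If it helps to see how X and Y are built from the table, here is a small Python sketch. The helper name `design_matrix` is my own; it just writes out one row [x^2, x, 1] per observation.

```python
def design_matrix(xs):
    """One row [x^2, x, 1] per observed x, matching f(x) = a*x^2 + b*x + c."""
    return [[x * x, x, 1] for x in xs]

xs = [3, 4, 5]          # smaller print dimensions from the table
ys = [30, 45, 65]       # prices in cents

X = design_matrix(xs)
Y = [[y] for y in ys]   # column vector stored as a list of one-element rows

print(X)  # [[9, 3, 1], [16, 4, 1], [25, 5, 1]]
print(Y)  # [[30], [45], [65]]
```

With more observations you would simply get more rows in X and Y, while p keeps its three entries.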
The well-known solution to the posed problem is given by the normal equations, which amount to using a generalized inverse (pseudoinverse) of X. Mathematically, we do not expect the system Y=Xp to have an exact solution in general: with more observations than coefficients and some measurement error, the simultaneous equations contradict one another. All this means is that we do not expect to be so lucky that our observations allow us to work out an exact model that fits the data perfectly. What we do need is for the columns of X to be linearly independent, so that X'X can be inverted.
By the orthogonality principle, the squared error is smallest when the residual e=Y-Xp is orthogonal to every column of X:
X'(Y-Xp)=0
where X' is the transpose of X, i.e., the columns of X are written as rows and vice versa. Rearranging,
X'Xp=X'Y
p=(X'X)^(-1) *X'Y
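For what it's worth, the same equations also fall out of minimizing the sum of squared errors directly. Here is a short derivation sketch in LaTeX using the symbols above; S is my own name for the sum of squared errors and does not appear elsewhere in this answer.

```latex
\begin{aligned}
S(p) &= e^{\top} e = (Y - Xp)^{\top}(Y - Xp) \\
\frac{\partial S}{\partial p} &= -2\,X^{\top}(Y - Xp) = 0 \\
X^{\top}X\,p &= X^{\top}Y \\
p &= (X^{\top}X)^{-1}X^{\top}Y \quad \text{(when } X^{\top}X \text{ is invertible)}
\end{aligned}
```

Minimizing S is the same as minimizing the mean squared error, since the two differ only by the constant factor 1/3.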
Numerical solution:
X=[[9,3,1];[16,4,1];[25,5,1]]
X'=[[9,16,25];[3,4,5];[1,1,1]]
X'*X=[[962,216,50];[216,50,12];[50,12,3]]
inv(X'*X)=[[1.5,-12,23];[-12,96.5,-186];[23,-186,361]]
inv(X'*X)*X'=[[0.5,-1,0.5];[-4.5,8,-3.5];[10,-15,6]]
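p=inv(X'*X)*X'*Y=[2.5,-2.5,15]'
So a=2.5, b=-2.5, c=15, i.e., f(x)=2.5x^2-2.5x+15.
Check the residual! It's zero: f(3)=30, f(4)=45 and f(5)=65. In this case, the problem was contrived in such a way that the data fits a quadratic equation perfectly.
For part b, the smaller dimension of an 8*10 print is 8, so the model gives f(8)=2.5(64)-2.5(8)+15=155 cents.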
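If you would rather not do the matrix arithmetic by hand, the numerical steps above can be reproduced with a short Python sketch. This is only one way to do it: the helper names (`transpose`, `matmul`, `inverse`) are my own, and I use exact fractions from the standard library so that rounding does not muddy the check. It also evaluates the model at x=8 for part b.

```python
from fractions import Fraction

def transpose(M):
    """Swap rows and columns."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Matrix product A*B for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse(A):
    """Invert a square matrix by Gauss-Jordan elimination on [A | I]."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Pick a row with a nonzero entry in this column and move it up.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

X = [[9, 3, 1], [16, 4, 1], [25, 5, 1]]
Y = [[30], [45], [65]]

Xt = transpose(X)
XtX = matmul(Xt, X)
pinv = matmul(inverse(XtX), Xt)      # inv(X'*X)*X'
p = matmul(pinv, Y)                  # coefficient column [a, b, c]'
a, b, c = (row[0] for row in p)

# Residual e = Y - Xp, one entry per observation.
residuals = [y[0] - fx[0] for y, fx in zip(Y, matmul(X, p))]
price_8x10 = a * 8**2 + b * 8 + c    # part b: the smaller dimension is 8

print("a, b, c =", a, b, c)
print("residuals =", residuals)
print("f(8) =", price_8x10, "cents")
```

The same code works unchanged if X and Y have more than three rows, which is the situation where least squares really earns its keep.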
We've done more than what was required of us. We could have solved the set of equations [#A] simultaneously (with each error set to zero) to obtain these very nice results. But we cannot expect to be so lucky in the real world. The statistical method advocated here is the more general-purpose technique, and the practical one once you have more data points than coefficients. With such data, it only takes a little measurement error for the equations to have no exact solution; Gaussian elimination then gives you no answer at all, whereas least squares still gives the optimal one.
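For comparison, here is a minimal Python sketch of the direct route: set each error in [#A] to zero and solve the three equations by Gaussian (Gauss-Jordan) elimination. The function name `gaussian_solve` is my own, and this only applies when the system is square and has an exact solution, as it does here.

```python
from fractions import Fraction

def gaussian_solve(A, y):
    """Solve the square system A p = y by Gauss-Jordan elimination (exact fractions)."""
    n = len(A)
    # Augmented matrix [A | y].
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, y)]
    for col in range(n):
        # Find a usable pivot at or below the diagonal.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular; no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        # Eliminate this variable from every other equation.
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]

# 9a+3b+c=30, 16a+4b+c=45, 25a+5b+c=65
X = [[9, 3, 1], [16, 4, 1], [25, 5, 1]]
Y = [30, 45, 65]
a, b, c = gaussian_solve(X, Y)
print(a, b, c)
```

It gives the same a, b, c as the normal equations for this data, but it has nothing to offer once there are more equations than unknowns.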
Cheers,
Josh