The Least Squares Method, Illustrated

1. Definition

Given samples (x_1,y_1),(x_2,y_2),\dots,(x_n,y_n) and a prediction function \hat{y}=f(x,\theta), the value of \theta that minimizes the sum of squared errors

J_{LS}(\theta)=0.5\sum_{i=1}^{n}(\hat{y}_i-y_i)^2

is the least-squares solution, written:

\theta_{LS}=\underset{\theta}{\operatorname{argmin}}\,J_{LS}(\theta)

2. One-Dimensional Linear Model

Take the one-dimensional linear model \hat{y}=f(x,\theta)=\theta_1 x+\theta_2. Then

J_{LS}(\theta)=0.5\sum_{i=1}^{n}(\hat{y}_i-y_i)^2=0.5\sum_{i=1}^{n}(\theta_1 x_i+\theta_2-y_i)^2

Setting the partial derivatives with respect to \theta_1 and \theta_2 to zero gives the normal equations:

\sum_{i=1}^{n}x_i(\theta_1 x_i+\theta_2-y_i)=\theta_1\sum x_i^2+\theta_2\sum x_i-\sum x_i y_i=0

\sum_{i=1}^{n}(\theta_1 x_i+\theta_2-y_i)=\theta_1\sum x_i+n\theta_2-\sum y_i=0

Solving this system:

\theta_1=\dfrac{n\sum x_i y_i-\sum x_i\sum y_i}{n\sum x_i^2-(\sum x_i)^2}\qquad\theta_2=\dfrac{\sum y_i-\theta_1\sum x_i}{n}
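The closed-form solution above can be checked numerically. A minimal sketch, using hypothetical sample data drawn from a made-up true model y = 3x + 1 with small Gaussian noise:

```python
import random

random.seed(0)
# Hypothetical sample data: true model y = 3x + 1 plus small noise.
xs = [random.uniform(0, 10) for _ in range(50)]
ys = [3 * x + 1 + random.gauss(0, 0.1) for x in xs]

n = len(xs)
sx = sum(xs)                                   # sum of x_i
sy = sum(ys)                                   # sum of y_i
sxx = sum(x * x for x in xs)                   # sum of x_i^2
sxy = sum(x * y for x, y in zip(xs, ys))       # sum of x_i * y_i

# Closed-form least-squares estimates from the derivation above:
theta1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
theta2 = (sy - theta1 * sx) / n
print(theta1, theta2)  # should be close to 3 and 1
```

With 50 samples and noise of standard deviation 0.1, the recovered slope and intercept land very close to the true values 3 and 1.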

3. General Linear Model

Write the basis-function vector as \phi(x)=(\phi_1(x),\phi_2(x),\dots,\phi_b(x)) and the parameter vector as \theta=(\theta_1,\theta_2,\dots,\theta_b). For example, taking \phi(x)=(1,x,\dots,x^{b-1}) yields the polynomial model f(x)=\phi(x)\cdot\theta=\theta_1+\theta_2 x+\theta_3 x^2+\dots+\theta_b x^{b-1}. The sum of squared errors over the training samples is J_{LS}(\theta)=0.5\|\Phi\theta-y\|^2, where the design matrix is

\Phi=\begin{pmatrix}\phi(x_1)\\\phi(x_2)\\\vdots\\\phi(x_n)\end{pmatrix}=\begin{pmatrix}\phi_1(x_1)&\cdots&\phi_b(x_1)\\\vdots&\ddots&\vdots\\\phi_1(x_n)&\cdots&\phi_b(x_n)\end{pmatrix}

The solution is \theta_{LS}=\Phi^\dagger y. Since \Phi is generally not square, \Phi^\dagger here is the generalized (Moore-Penrose) inverse, \Phi^\dagger=(\Phi'\Phi)^{-1}\Phi', where \Phi' denotes the transpose of \Phi.
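The matrix solution \theta_{LS}=\Phi^\dagger y can be sketched directly with NumPy. Here the basis is the polynomial basis \phi(x)=(1,x,x^2) (the b=3 case of the example above), and the data are hypothetical, generated without noise from y=2-x+0.5x^2:

```python
import numpy as np

# Hypothetical noiseless data from y = 2 - x + 0.5 x^2.
x = np.linspace(-3, 3, 20)
y = 2 - x + 0.5 * x**2

# Design matrix Phi: row i is phi(x_i) = (1, x_i, x_i^2).
Phi = np.column_stack([np.ones_like(x), x, x**2])

# theta_LS = Phi^† y, using the Moore-Penrose pseudo-inverse.
theta = np.linalg.pinv(Phi) @ y
print(theta)  # recovers (2, -1, 0.5)
```

`np.linalg.pinv` computes the generalized inverse via SVD, which is numerically more stable than forming (\Phi'\Phi)^{-1}\Phi' explicitly; for noiseless data it recovers the true coefficients to machine precision.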


Two examples. First, true model: y=5x+2, basis: \phi(x)=(x,1)
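A sketch of this first example, fitting hypothetical noiseless samples of y=5x+2 with the basis \phi(x)=(x,1):

```python
import numpy as np

# Noiseless samples of the true model y = 5x + 2.
x = np.linspace(0, 5, 30)
y = 5 * x + 2

# Design matrix for the basis phi(x) = (x, 1).
Phi = np.column_stack([x, np.ones_like(x)])
theta = np.linalg.pinv(Phi) @ y
print(theta)  # recovers (5, 2)
```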

Second, true model: y=5\sin(x)+2x, basis: \phi(x)=(1,x,\sin(x),\cos(x),\sin(x/2),\cos(x/2))
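The same pseudo-inverse recipe handles this richer basis; least squares automatically assigns weight 5 to \sin(x), weight 2 to x, and (near) zero to the unused basis functions. A sketch with hypothetical noiseless samples:

```python
import numpy as np

# Noiseless samples of the true model y = 5 sin(x) + 2x.
x = np.linspace(0, 10, 100)
y = 5 * np.sin(x) + 2 * x

# Design matrix for phi(x) = (1, x, sin x, cos x, sin(x/2), cos(x/2)).
Phi = np.column_stack([np.ones_like(x), x, np.sin(x), np.cos(x),
                       np.sin(x / 2), np.cos(x / 2)])
theta = np.linalg.pinv(Phi) @ y
print(theta)  # approximately (0, 2, 5, 0, 0, 0)
```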