最小二乘法与投影
"最小二乘法的本质是希尔伯特空间的投影。"
Suppose we want to project some higher-dimensional vector to a lower-dimensional space(say plane).
The projection is \(\vec{p}\), original one is \(\vec{b}\), the vector that is vertical(or say orthogonal) to the space should be \[\vec{e} = \vec{b} - \vec{p}\]
for the vector \(\vec{p}\) that is within the plane(say n-dimensional space) \[\vec{p} = Ax \] where A is the matrix where each column \(a_i\) is a basis.
since \(e\perp a_i\) \[a_i^T(b-Ax) = 0\] \[A^T(b-Ax) = 0\] \[A^TAx = A^Tb\]
If \(A^TA\) is invertible \[x = (A^TA)^{-1}A^Tb\] \[p = Ax = A(A^TA)^{-1}A^Tb\]
The best scenario is A is composed of orthogonal basis. A is standard orthogonal matrix.
Question: when is \(A^TA\) invertible?
Answer: when the column vectors of \(A(m\times n)\) are linearly independent
Proof(Contradiction):
If column vectors are linearly independent and \(A^TA\) is invertible,
\(A^TAx=0\) has non-zero solution (because \(\det{A^TA} = 0\))
\[x^TA^TAx = 0\]
\[(Ax)^TAx = 0\]
\[\therefore Ax = 0\]
\(\because\) A's columns are linearlly independent \[\therefore x=0\] Contradicted, so when column vectors are linearly independent, \(A^TA\) is invertible
格拉姆-施密特正交化 Gram-Schmidt Orthogonalization
标准正交阵 \(Q\) 满足:
\[ Q^TQ = I\] \[ Q^T = Q^{-1}\]
投影矩阵 \(P\) 满足: \[ P = A(A^TA)^{-1}A^T \]
当\(A\)为正交阵\(Q\)时, \[P = Q(Q^TQ)^{-1}Q^T = QQ^T\]
如果有一个矩阵 \[q = [a, b, c]\] 经过正交化 \[Q = [A, B, C]\] 其中\(A, B, C\)正交。令\(A=a\)(第一个列向量不变,调整其他向量使正交)
研究\(b\)在\(a(A)\)上的投影\(p\),利用向量投影公式 \[p = \frac{A^Tb}{A^TA}A\]
\(B\)就是\(b\)到\(A\)垂线的向量 \[B = b - p = b - \frac{A^Tb}{A^TA}A\]
已经有了前两个维度,\(C\)就是\(c\)减去在\(A,B\)平面上的投影,也可以认为是减去在前两个方向上的投影(因为平面上的投影也是两个方向上投影的叠加) \[C = c - \frac{A^Tc}{A^TA}A - \frac{B^Tc}{B^TB}B\]
对\(u_i = A,B,C\)模数归一化得到\(e_i\),就得到正交矩阵\(Q\)
QR分解
\(Ax = b, A\in \mathbb{R}^{m\times n}\) 的可能情况有: 1. \(m=n\) 2. \(m>n\), over-determined 3. \(m<n\), under-determined
定义
\(A\in \mathbb{R}^{m\times n}, m\geq n\)可以被分解成 \(A=QR\), 其中: * \(Q\in \mathbb{R}^{m\times m}\) 是正交矩阵 * \(R = \begin{bmatrix}\hat{R}\\0\end{bmatrix}\in \mathbb{R}^{m\times n}\) * \(\hat{R}\in \mathbb{R}^{n\times n}\)是上三角阵
正交矩阵性质
- \(Q^TQ = QQ^T = I\)
- 左乘正交矩阵不影响欧式范数
计算QR分解的方法
- Gram-Schmidt orthogonalization
- Householder Triangularization
- Givens Rotations
Gram-Schmidt Orthogonalization
Reduced QR分解
GSO从矩阵\(A\in \mathbb{R}^{m\times n}\)的第一个列向量\(A_{:,0}\)开始构建互相正交的基。 对于原矩阵\(A\) \[A_{:,0} = r_{0,0}q_0\] \[A_{:,1} = r_{0,1}q_0 + r_{1,1}q_1\] \[...\] \[A_{:,n-1} = r_{0,n-1}q_0 + r_{1,n-1}q_1 ... + r_{n-1,n-1}q_{n-1}\]
\[ A = \hat{Q} \hat{R} \] \[ [A_{:,0}, ... A_{:,n-1}] = [q_0, ..., q_{n-1}] \begin{bmatrix} r_{0,0} \ r_{0,1} \ ... r_{0,n-1}\\ ~~~~~~~ r_{1,1} \ ... r_{1,n-1} \\ ~~~~... \\ ~~~~~~~~~~~~~r_{n-1, n-1}\end{bmatrix} \]
\(\hat{Q} \in \mathbb{R}^{m\times n}, \hat{R} \in \mathbb{R}^{n\times n}\) 称为 Reduced-QR,\(\hat{Q}\)不为方阵。 \(\hat{Q}^T\hat{Q} = I\),因为\(m>n\),不满足\(\hat{Q}\hat{Q}^T = I\) (满秩方阵才能两侧逆)
Full-QR分解
\(Q \in \mathbb{R}^{m\times m}, R \in \mathbb{R}^{m\times n}\)
若\(m>n\),需要把\(\hat{Q}\)的n个基拓展到m个基。多出的\(m-n\)个基对应到\(R = \begin{bmatrix} \hat{R} \\ 0 \end{bmatrix}\) 的空白部分。
Python numpy代码
1 | q, r = np.linalg.qr(A) # Reduced-QR |
Classic GSO (CGSO)
\[ [A_{:,0}, ... A_{:,n-1}] = [q_0, ..., q_{n-1}] \begin{bmatrix} r_{0,0} \ r_{0,1} \ ... r_{0,n-1}\\ ~~~~~~~ r_{1,1} \ ... r_{1,n-1} \\ ~~~~... \\ ~~~~~~~~~~~~~r_{n-1, n-1}\end{bmatrix} \]
每一次迭代\(q_0, ..., q_{j-1}\)已知,\(r_{j,j}\)未知,\(r_{0,j}, ..., r_{j-1,j}\)满足 \[r_{i,j} = q_i^TA_{:,~j}\]
\[r_{j,j}q_j = v_j = A_{:,j} - \sum_{i=0}^{j-1}{r_{i,j}q_i} = A_{:,j} - \sum_{i=0}^{j-1}{q_i^TA_{:,~j}q_i}\]
对\(v_i\)按模数归一化, \[q_j = \frac{v_j}{||v_j||_2}\] \[r_{j,j} = ||v_j||_2\]
Modified GSO (MGSO)
...
Givens Rotations
矩阵形式
矩阵作用
\(A\in \mathbb{R}^{m\times n}, m \geq n, A_{i,j}, A_{i,k}, j < k\), \[\begin{bmatrix}c & s\\-s & c \end{bmatrix} \begin{bmatrix}A_{i,j}\\A_{i,k}\end{bmatrix} = \begin{bmatrix} \alpha \\ 0 \end{bmatrix}\] \[\alpha = \sqrt{A_{i,j}^2 + A_{i,k}^2}\] \[c=\frac{A_{i,j}}{\alpha}\] \[s=\frac{A_{i,k}}{\alpha}\]