LSQR算法是计算大型稀疏线性方程组的算法,由Paige和Saunders于1982年提出。该方法主要为求解以下线性方程组(A为m*n的矩阵,m>n),且保持二阶残差范数最小:
A x = b M i n ∥ A x − b ∥ 2 Ax=b\\ Min\parallel Ax-b \parallel_{2}Ax=bMin∥Ax−b∥2
对于传统的最小二乘问题,可以直接求出方程的解,但是对于大型稀疏矩阵,往往是病态问题(超定方程组或欠定方程组),因此,LSQR方法不是直接求解x,而是遵守最小二乘的原则,在法方程限定的空间中寻求最佳解。
寻找最佳解的过程如下,可以基于该过程编程:
初始条件
(a)β 1 u 1 = b α 1 v 1 = A T u 1 \beta_1u_1=b \quad \alpha_1v_1=A^Tu_1β1u1=bα1v1=ATu1
(b)w 1 = v 1 x 0 = 0 w_1=v_1 \quad \; x_0=0w1=v1x0=0
(c)ϕ ˉ 1 = β 1 ρ 1 = α 1 \bar\phi_1=\beta_1 \quad \; \rho_1=\alpha_1ϕˉ1=β1ρ1=α1对i=1,2,3,…重复步骤(3)-(6)
对角化系数矩阵
对角化的目的是将系数矩阵双对角化,以便进行QR分解,具体原理我会在另一篇博客说明。分为两个步骤,第一是把 A v i − α i u i Av_i-\alpha_iu_iAvi−αiui 进行标准化,系数是β i + 1 \beta_{i+1}βi+1。第二是把A T u i + 1 − β i + 1 v i A^Tu_{i+1}-\beta_{i+1}v_iATui+1−βi+1vi进行标准化,系数是α i + 1 \alpha_{i+1}αi+1。
(a)β i + 1 u i + 1 = A v i − α i u i \beta_{i+1}u_{i+1}=Av_i-\alpha_iu_iβi+1ui+1=Avi−αiui
(b)α i + 1 v i + 1 = A T u i + 1 − β i + 1 v i \alpha_{i+1}v_{i+1}=A^Tu_{i+1}-\beta_{i+1}v_iαi+1vi+1=ATui+1−βi+1vi计算QR分解的中间变量
带阻尼的LSQR算法和无阻尼的LSQR算法的区别主要在该步骤中,关于QR分解的原理我会在另一篇博客详细说明。
无阻尼LSQR算法的QR分解中间变量赋值过程:
( a ) ρ i = ρ ˉ i 2 + β i + 1 2 ( b ) c i = ρ ˉ i / ρ i ( c ) s i = β i + 1 / ρ i ( d ) θ i + 1 = s i α i + 1 ( e ) ρ ˉ i + 1 = − c i α i + 1 ( f ) ϕ i = c i ϕ ˉ i ( g ) ϕ ˉ i + 1 = s i ϕ ˉ i \begin{array}{lllll} (a) \quad \rho_{i} =\sqrt{\bar\rho_{i}^2+\beta_{i+1}^2} \\ (b) \quad c_{i} ={\bar\rho_{i}} / {\rho_{i}} \\ (c) \quad s_{i} ={\beta_{i+1}} / {\rho_{i}} \\ (d) \quad \theta_{i+1} =s_{i}\alpha_{i+1} \\ (e) \quad \bar\rho_{i+1}=-c_{i}\alpha_{i+1} \\ (f) \quad \phi_{i} =c_{i}\bar\phi_{i} \\ (g) \quad \bar\phi_{i+1}=s_{i}\bar\phi_{i} \end{array}(a)ρi=ρˉi2+βi+12(b)ci=ρˉi/ρi(c)si=βi+1/ρi(d)θi+1=siαi+1(e)ρˉi+1=−ciαi+1(f)ϕi=ciϕˉi(g)ϕˉi+1=siϕˉi
带阻尼LSQR算法的QR分解中间变量的赋值过程:
( a ) ρ ˉ i + 1 = ρ ˉ i 2 + λ 2 ( b ) c i = ρ ˉ i / ρ ˉ i + 1 ( c ) s i = λ / ρ ˉ i + 1 ( d ) ψ i = s i ϕ ˉ i ( e ) ϕ ˉ i + 1 = c i ϕ i ( f ) ρ i = ρ ˉ i 2 + λ 2 + β i + 1 2 ( g ) c i + 1 = ρ ˉ i + 1 / ρ i ( h ) s i + 1 = β i + 1 / ρ i ( i ) θ i + 1 = s i + 1 α i + 1 ( j ) ρ ˉ i + 1 o = − c i + 1 α i + 1 ( k ) ϕ i + 1 = c i + 1 ϕ ˉ i + 1 ( l ) ϕ ˉ i + 1 = s i + 1 ϕ i + 1 \begin{array}{lllllll} (a) & \bar\rho_{i+1}=\sqrt{\bar\rho_{i}^2+\lambda^2} \\ (b)& c_{i} ={\bar\rho_{i}} / {\bar\rho_{i+1}} \\ (c)& s_{i} ={\lambda} / {\bar\rho_{i+1}} \\ (d) & \psi_{i} =s_{i}\bar\phi_{i} \\ (e) & \bar\phi_{i+1}=c_{i}\phi_{i} \\ (f)& \rho_{i} =\sqrt{\bar\rho_{i}^2+\lambda^2+\beta_{i+1}^2} \\ (g)& c_{i+1} ={\bar\rho_{i+1}} / {\rho_{i}} \\ (h) & s_{i+1} ={\beta_{i+1}} / {\rho_{i}} \\ (i)& \theta_{i+1} =s_{i+1}\alpha_{i+1} \\ (j)& \bar\rho_{i+1}^o=-c_{i+1}\alpha_{i+1} \\ (k) & \phi_{i+1} =c_{i+1}\bar\phi_{i+1} \\ (l) & \bar\phi_{i+1} =s_{i+1}\phi_{i+1} \\ \end{array}(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)(l)ρˉi+1=ρˉi2+λ2ci=ρˉi/ρˉi+1si=λ/ρˉi+1ψi=siϕˉiϕˉi+1=ciϕiρi=ρˉi2+λ2+βi+12ci+1=ρˉi+1/ρisi+1=βi+1/ρiθi+1=si+1αi+1ρˉi+1o=−ci+1αi+1ϕi+1=ci+1ϕˉi+1ϕˉi+1=si+1ϕi+1
更新x和w,到此步骤为止,x i x_ixi和w i w_iwi能够计算出来了
(a)x i + 1 = x i + w i x_{i+1}=x_i+w_ixi+1=xi+wi
(b)w i + 1 = v i + 1 − w i w_{i+1}=v_{i+1}-w_iwi+1=vi+1−wi判断迭代条件,即停机准则
停机准则有以下三条,涉及到范数的计算,关于范数的计算可以参考我的另一篇博客。
(a)∥ r k ∥ ≤ B T O L ∥ b ∥ + A T O L ∥ A ∥ ∥ x ∥ \parallel r_k \parallel \leq BTOL \parallel b \parallel +ATOL \parallel A \parallel \parallel x \parallel∥rk∥≤BTOL∥b∥+ATOL∥A∥∥x∥
(b)∥ A T r k ∥ ∥ A ∥ ∥ r k ∥ ≤ A T O L \frac{\parallel A^Tr_k \parallel}{\parallel A \parallel \parallel r_k \parallel} \leq ATOL∥A∥∥rk∥∥ATrk∥≤ATOL
(c)c o n d ( A ) > C O N L I M cond(A)>CONLIMcond(A)>CONLIM