【求解变分不等式】
求解变分不等式
介绍
设  
     
      
       
       
         θ 
        
       
         : 
        
        
        
          R 
         
        
          n 
         
        
       
         → 
        
       
         ( 
        
       
         ? 
        
       
         ∞ 
        
       
         , 
        
       
         ∞ 
        
       
         ] 
        
       
      
        \theta:R^n → (-\infty,\infty] 
       
      
    θ:Rn→(?∞,∞] 为闭正常凸函数, 
     
      
       
       
         F 
        
       
         : 
        
        
        
          R 
         
        
          n 
         
        
       
         → 
        
        
        
          R 
         
        
          n 
         
        
       
      
        F:R^n → R^n 
       
      
    F:Rn→Rn为向量
 值和连续映射。广义变分不等式 
     
      
       
       
         ( 
        
       
         G 
        
       
         V 
        
       
         I 
        
       
         ) 
        
       
      
        (GVI) 
       
      
    (GVI)[9,19]形式如下 
      
       
        
         
         
           x 
          
         
           ? 
          
         
        
          ∈ 
         
         
         
           R 
          
         
           n 
          
         
        
          , 
         
        
          θ 
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          ? 
         
        
          θ 
         
        
          ( 
         
         
         
           x 
          
         
           ? 
          
         
        
          ) 
         
        
          + 
         
        
          ( 
         
        
          x 
         
        
          ? 
         
         
         
           x 
          
         
           ? 
          
         
         
         
           ) 
          
         
           T 
          
         
        
          F 
         
        
          ( 
         
         
         
           x 
          
         
           ? 
          
         
        
          ) 
         
        
          ≥ 
         
        
          0 
         
        
          , 
         
        
          ? 
         
        
          x 
         
        
          ∈ 
         
         
         
           R 
          
         
           n 
          
         
        
          . 
         
        
       
         x^*\in R^n,\theta(x)-\theta(x^*)+(x-x^*)^TF(x^*)\ge 0,\forall x\in R^n. 
        
       
     x?∈Rn,θ(x)?θ(x?)+(x?x?)TF(x?)≥0,?x∈Rn.
 保留 
     
      
       
       
         ∞ 
        
       
         ? 
        
       
         ∞ 
        
       
         = 
        
       
         ∞ 
        
       
      
        \infty -\infty=\infty 
       
      
    ∞?∞=∞的算术运算。在本文中,我们关注 
     
      
       
       
         F 
        
       
      
        F 
       
      
    F是单调的情况,即 
     
      
       
       
         ( 
        
       
         F 
        
       
         ( 
        
       
         x 
        
       
         ) 
        
       
         ? 
        
       
         F 
        
       
         ( 
        
       
         y 
        
       
         ) 
        
        
        
          ) 
         
        
          T 
         
        
       
         ( 
        
       
         x 
        
       
         ? 
        
       
         y 
        
       
         ) 
        
       
         ≥ 
        
       
         0 
        
       
      
        (F(x) -F(y))^T(x -y)\ge 0 
       
      
    (F(x)?F(y))T(x?y)≥0,对于所有 
     
      
       
       
         x 
        
       
         , 
        
       
         y 
        
       
         ∈ 
        
        
        
          R 
         
        
          n 
         
        
       
      
        x, y \in R^n 
       
      
    x,y∈Rn。在本文中,我们对 
     
      
       
       
         M 
        
       
         G 
        
       
         V 
        
       
         I 
        
       
         ( 
        
       
         θ 
        
       
         , 
        
       
         F 
        
       
         ) 
        
       
      
        MGVI(\theta, F) 
       
      
    MGVI(θ,F)做了以下额外的假设:
  
     
      
       
       
         ( 
        
       
         a 
        
       
         ) 
        
       
      
        (a) 
       
      
    (a)单调算子 
     
      
       
       
         F 
        
       
      
        F 
       
      
    F是 
     
      
       
       
         L 
        
       
         i 
        
       
         p 
        
       
         s 
        
       
         c 
        
       
         h 
        
       
         i 
        
       
         t 
        
       
         z 
        
       
      
        Lipschitz 
       
      
    Lipschitz连续的,其中 
     
      
       
       
         L 
        
       
         i 
        
       
         p 
        
       
         s 
        
       
         c 
        
       
         h 
        
       
         i 
        
       
         t 
        
       
         z 
        
       
      
        Lipschitz 
       
      
    Lipschitz 常数  
     
      
       
       
         L 
        
       
         > 
        
       
         0 
        
       
      
        L > 0 
       
      
    L>0。
  
     
      
       
       
         ( 
        
       
         b 
        
       
         ) 
        
       
      
        (b) 
       
      
    (b) 
     
      
       
       
         M 
        
       
         G 
        
       
         V 
        
       
         I 
        
       
         ( 
        
       
         θ 
        
       
         , 
        
       
         F 
        
       
         ) 
        
       
      
        MGVI(\theta ,F) 
       
      
    MGVI(θ,F)的解集,表示为 
     
      
       
        
        
          Ω 
         
        
          ? 
         
        
       
      
        \Omega^* 
       
      
    Ω?,非空。
各种凸优化问题可以表述为 
     
      
       
       
         M 
        
       
         G 
        
       
         V 
        
       
         I 
        
       
         s 
        
       
      
        MGVIs 
       
      
    MGVIs,例如最小绝对收缩和选择算子(LASSO,least absolute shrinkage and selection operator/ 
     
      
       
        
        
          l 
         
        
          1 
         
        
       
      
        l_1 
       
      
    l1?正则化最小二乘)问题、基追问题[7]、基追去噪问题[5],和  
     
      
       
       
         D 
        
       
         a 
        
       
         n 
        
       
         t 
        
       
         z 
        
       
         i 
        
       
         g 
        
       
      
        Dantzig 
       
      
    Dantzig 选择器 [6],等等。此外,单调广义变分不等式( 
     
      
       
       
         M 
        
       
         G 
        
       
         V 
        
       
         I 
        
       
      
        MGVI 
       
      
    MGVI,monotone generalized variational inequality)包含经典单调变分不等式 ( 
     
      
       
       
         M 
        
       
         V 
        
       
         I 
        
       
      
        MVI 
       
      
    MVI,monotone variational inequality)。通过设置 
     
      
       
       
         θ 
        
       
         = 
        
        
        
          δ 
         
        
          C 
         
        
       
      
        \theta = \delta _C 
       
      
    θ=δC?(  
     
      
       
        
        
          δ 
         
        
          C 
         
        
       
      
        \delta _C 
       
      
    δC?是 
     
      
       
       
         C 
        
       
      
        C 
       
      
    C的指标函数,如果是 
     
      
       
       
         x 
        
       
         ∈ 
        
       
         C 
        
       
      
        x \in C 
       
      
    x∈C,则等于 
     
      
       
       
         0 
        
       
      
        0 
       
      
    0,否则等于 
     
      
       
       
         ∞ 
        
       
      
        \infty 
       
      
    ∞。),其中 
     
      
       
       
         C 
        
       
         ? 
        
        
        
          R 
         
        
          n 
         
        
       
      
        C \subseteq R^n 
       
      
    C?Rn是一个非空闭凸集,则  
     
      
       
       
         M 
        
       
         G 
        
       
         V 
        
       
         I 
        
       
      
        MGVI 
       
      
    MGVI 就以下列形式简化为 
     
      
       
       
         M 
        
       
         V 
        
       
         I 
        
       
      
        MVI 
       
      
    MVI
  
      
       
        
         
         
           x 
          
         
           ? 
          
         
        
          ∈ 
         
        
          C 
         
        
          , 
         
        
          ( 
         
        
          x 
         
        
          ? 
         
         
         
           x 
          
         
           ? 
          
         
         
         
           ) 
          
         
           T 
          
         
        
          F 
         
        
          ( 
         
         
         
           x 
          
         
           ? 
          
         
        
          ) 
         
        
          ≥ 
         
        
          0 
         
        
          , 
         
        
          ? 
         
        
          x 
         
        
          ∈ 
         
        
          C 
         
        
          . 
         
        
       
         x^*\in C,(x-x^*)^TF(x^*)\ge 0,\forall x\in C. 
        
       
     x?∈C,(x?x?)TF(x?)≥0,?x∈C.
 设 
     
      
       
       
         f 
        
       
         : 
        
        
        
          R 
         
        
          n 
         
        
       
         → 
        
       
         ( 
        
       
         ? 
        
       
         ∞ 
        
       
         , 
        
       
         ∞ 
        
       
         ] 
        
       
      
        f: R^n → (?∞, ∞] 
       
      
    f:Rn→(?∞,∞]为闭正常凸函数, 
     
      
       
       
         f 
        
       
      
        f 
       
      
    f的邻近算子定义如下
  
      
       
        
        
          P 
         
        
          r 
         
        
          o 
         
         
         
           x 
          
         
           f 
          
         
        
          : 
         
        
          x 
         
        
          ? 
         
        
          a 
         
        
          r 
         
        
          g 
         
        
          m 
         
        
          i 
         
        
          n 
         
        
          { 
         
        
          f 
         
        
          ( 
         
        
          y 
         
        
          ) 
         
        
          + 
         
         
         
           1 
          
         
           2 
          
         
        
          ∥ 
         
        
          x 
         
        
          ? 
         
        
          y 
         
         
         
           ∥ 
          
         
           2 
          
         
        
          : 
         
        
          y 
         
        
          ∈ 
         
         
         
           R 
          
         
           n 
          
         
        
          } 
         
        
       
         Prox_f:x \mapsto argmin\{f(y)+\frac{1}{2}\|x-y\|^2 : y \in R^n\} 
        
       
     Proxf?:x?argmin{f(y)+21?∥x?y∥2:y∈Rn}
设 
     
      
       
       
         f 
        
       
         : 
        
        
        
          R 
         
        
          n 
         
        
       
         → 
        
       
         ( 
        
       
         ? 
        
       
         ∞ 
        
       
         , 
        
       
         ∞ 
        
       
         ] 
        
       
      
        f: R^n \rightarrow (-\infty, \infty] 
       
      
    f:Rn→(?∞,∞]为闭正常凸函数。那么接下来的三个条件是等价的:
  
     
      
       
       
         1. 
        
       
         p 
        
       
         = 
        
       
         P 
        
       
         r 
        
       
         o 
        
        
        
          x 
         
        
          f 
         
        
       
         ( 
        
       
         x 
        
       
         ) 
        
       
         . 
        
       
      
        1. p = Prox_f (x). 
       
      
    1.p=Proxf?(x).
  
     
      
       
       
         2. 
        
       
         x 
        
       
         ? 
        
       
         p 
        
       
         ∈ 
        
       
         ? 
        
       
         f 
        
       
         ( 
        
       
         p 
        
       
         ) 
        
       
         . 
        
       
      
        2. x - p \in \partial f(p). 
       
      
    2.x?p∈?f(p).
  
     
      
       
       
         3. 
        
       
      
        3. 
       
      
    3. 对于所有 
     
      
       
       
         w 
        
       
         ∈ 
        
        
        
          R 
         
        
          n 
         
        
       
      
        w\in R^n 
       
      
    w∈Rn,都有 
     
      
       
       
         ( 
        
       
         x 
        
       
         ? 
        
       
         p 
        
        
        
          ) 
         
        
          T 
         
        
       
         ( 
        
       
         p 
        
       
         ? 
        
       
         w 
        
       
         ) 
        
       
         ≥ 
        
       
         f 
        
       
         ( 
        
       
         p 
        
       
         ) 
        
       
         ? 
        
       
         f 
        
       
         ( 
        
       
         w 
        
       
         ) 
        
       
         . 
        
       
      
        (x-p)^T(p-w) \ge f(p) - f(w). 
       
      
    (x?p)T(p?w)≥f(p)?f(w).
l 1 l_1 l1?正则化最小二乘问题
对线性逆问题:
  
     
      
       
       
         b 
        
       
         = 
        
       
         A 
        
       
         x 
        
       
         + 
        
       
         w 
        
       
      
        b=Ax+w 
       
      
    b=Ax+w,其中 
     
      
       
       
         b 
        
       
         ∈ 
        
        
        
          R 
         
        
          m 
         
        
       
         , 
        
       
         A 
        
       
         ∈ 
        
        
        
          R 
         
         
         
           m 
          
         
           × 
          
         
           n 
          
         
        
       
      
        b\in R^m,A\in R^{m\times n} 
       
      
    b∈Rm,A∈Rm×n已知, 
     
      
       
       
         w 
        
       
      
        w 
       
      
    w为噪音,求解 
     
      
       
       
         x 
        
       
      
        x 
       
      
    x。
让噪声最小,最小二乘法求解:
  
     
      
       
        
        
          x 
         
        
          ~ 
         
        
       
         = 
        
        
         
         
           arg 
          
         
           ? 
          
         
           min 
          
         
           ? 
          
         
        
          x 
         
        
       
         ∥ 
        
       
         A 
        
       
         x 
        
       
         ? 
        
       
         b 
        
        
        
          ∥ 
         
        
          2 
         
        
       
      
        \widetilde{x}=\mathop{\arg\min}\limits_{x}\|Ax-b\|^2 
       
      
    x 
           =xargmin?∥Ax?b∥2
 若 
     
      
       
       
         m 
        
       
         = 
        
       
         n 
        
       
      
        m=n 
       
      
    m=n且 
     
      
       
       
         A 
        
       
      
        A 
       
      
    A非奇异, 
     
      
       
       
         x 
        
       
         = 
        
        
        
          A 
         
         
         
           ? 
          
         
           1 
          
         
        
       
         y 
        
       
      
        x=A^{-1}y 
       
      
    x=A?1y。但是很多情况下, 
     
      
       
       
         A 
        
       
      
        A 
       
      
    A病态,用最小二乘法求解时,系统微小的扰动都会导致结果差别很大。为了求解病态线性系统的逆问题,可以使用 
     
      
       
        
        
          l 
         
        
          2 
         
        
       
      
        l_2 
       
      
    l2?正则化最小二乘(也叫Tikhonov regularization,Ridge regression)
  
     
      
       
        
        
          x 
         
        
          ~ 
         
        
       
         = 
        
        
         
         
           arg 
          
         
           ? 
          
         
           min 
          
         
           ? 
          
         
        
          x 
         
        
       
         { 
        
        
        
          1 
         
        
          2 
         
        
       
         ∥ 
        
       
         A 
        
       
         x 
        
       
         ? 
        
       
         b 
        
        
        
          ∥ 
         
        
          2 
         
        
       
         + 
        
       
         λ 
        
       
         ∥ 
        
       
         x 
        
        
        
          ∥ 
         
        
          2 
         
        
       
         } 
        
       
      
        \widetilde{x}=\mathop{\arg\min}\limits_{x}\{\frac{1}{2}\|Ax-b\|^2+\lambda\|x\|_2\} 
       
      
    x 
           =xargmin?{21?∥Ax?b∥2+λ∥x∥2?},
 即 
     
      
       
        
         
         
           m 
          
         
           i 
          
         
           n 
          
         
         
         
           x 
          
         
           ∈ 
          
          
          
            R 
           
          
            n 
           
          
         
        
       
         f 
        
       
         ( 
        
       
         x 
        
       
         ) 
        
       
         : 
        
       
         = 
        
       
         { 
        
        
        
          1 
         
        
          2 
         
        
       
         ∥ 
        
       
         A 
        
       
         x 
        
       
         ? 
        
       
         b 
        
        
        
          ∥ 
         
        
          2 
         
        
       
         + 
        
       
         λ 
        
       
         ∥ 
        
       
         x 
        
        
        
          ∥ 
         
        
          2 
         
        
       
         } 
        
       
      
        \mathop{min}\limits_{x\in R^n}f(x):=\{\frac{1}{2}\|Ax-b\|^2+\lambda\|x\|_2\} 
       
      
    x∈Rnmin?f(x):={21?∥Ax?b∥2+λ∥x∥2?}
而 
     
      
       
        
        
          l 
         
        
          1 
         
        
       
      
        l_1 
       
      
    l1?正则化最小二乘(Lasso,least absolute shrinkage and selection operator,最小绝对值收敛和选择算子)问题。
  
      
       
        
         
          
          
            m 
           
          
            i 
           
          
            n 
           
          
          
          
            x 
           
          
            ∈ 
           
           
           
             R 
            
           
             n 
            
           
          
         
        
          f 
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          : 
         
        
          = 
         
        
          { 
         
         
         
           1 
          
         
           2 
          
         
        
          ∥ 
         
        
          A 
         
        
          x 
         
        
          ? 
         
        
          b 
         
         
         
           ∥ 
          
         
           2 
          
         
        
          + 
         
        
          λ 
         
        
          ∥ 
         
        
          x 
         
         
         
           ∥ 
          
         
           1 
          
         
        
          } 
         
        
       
         \mathop{min}\limits_{x\in R^n}f(x):=\{\frac{1}{2}\|Ax-b\|^2+\lambda\|x\|_1\} 
        
       
     x∈Rnmin?f(x):={21?∥Ax?b∥2+λ∥x∥1?}
 对无约束问题 
     
      
       
       
         m 
        
       
         i 
        
       
         n 
        
       
         f 
        
       
         ( 
        
       
         x 
        
       
         ) 
        
       
         , 
        
       
         x 
        
       
         ∈ 
        
        
        
          R 
         
        
          n 
         
        
       
      
        minf(x),x\in R^n 
       
      
    minf(x),x∈Rn
 梯度下降法 
      
       
        
         
         
           x 
          
         
           k 
          
         
        
          = 
         
         
         
           x 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          ? 
         
         
         
           t 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          ? 
         
        
          f 
         
        
          ( 
         
         
         
           x 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          ) 
         
        
       
         x_k=x_{k-1}-t_{k-1}\nabla f(x_{k-1}) 
        
       
     xk?=xk?1??tk?1??f(xk?1?)
 梯度下降法对可微函数可直接求微分,对不可微函数引入了次梯度(subgradient),次微分(subdifferential)概念
 投影算子(projection,Pc)
  
      
       
        
        
          P 
         
        
          c 
         
        
          = 
         
        
          P 
         
        
          r 
         
        
          o 
         
         
         
           x 
          
          
           
           
             t 
            
           
             k 
            
           
           
           
             δ 
            
           
             C 
            
           
          
         
        
       
         Pc=Prox_{t_k\delta _C} 
        
       
     Pc=Proxtk?δC??
  
      
       
        
         
         
           x 
          
         
           k 
          
         
        
          = 
         
        
          P 
         
        
          c 
         
        
          ( 
         
         
         
           x 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          ? 
         
         
         
           t 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          ? 
         
        
          f 
         
        
          ( 
         
         
         
           x 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          ) 
         
        
          ) 
         
        
       
         x_k=Pc(x_{k-1}-t_{k-1}\nabla f(x_{k-1})) 
        
       
     xk?=Pc(xk?1??tk?1??f(xk?1?))
 近端线性 
      
       
        
         
         
           x 
          
         
           k 
          
         
        
          = 
         
         
          
          
            arg 
           
          
            ? 
           
          
            min 
           
          
            ? 
           
          
         
           x 
          
         
        
          { 
         
        
          f 
         
        
          ( 
         
         
         
           x 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          ) 
         
        
          + 
         
        
          < 
         
        
          x 
         
        
          ? 
         
         
         
           x 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          , 
         
        
          ? 
         
        
          f 
         
        
          ( 
         
         
         
           x 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          ) 
         
        
          > 
         
        
          + 
         
         
         
           1 
          
          
          
            2 
           
           
           
             t 
            
            
            
              k 
             
            
              ? 
             
            
              1 
             
            
           
          
         
        
          ∥ 
         
        
          x 
         
        
          ? 
         
         
         
           x 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
         
         
           ∥ 
          
         
           2 
          
         
        
          } 
         
        
       
         x_k=\mathop{\arg\min}\limits_{x}\{f(x_{k-1})+<x-x_{k-1},\nabla f(x_{k-1})>+\frac{1}{2t_{k-1}}\|x-x_{k-1}\|^2\} 
        
       
     xk?=xargmin?{f(xk?1?)+<x?xk?1?,?f(xk?1?)>+2tk?1?1?∥x?xk?1?∥2}
  
      
       
        
        
          f 
         
        
          ( 
         
         
         
           x 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          ) 
         
        
          + 
         
        
          < 
         
        
          x 
         
        
          ? 
         
         
         
           x 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          , 
         
        
          ? 
         
        
          f 
         
        
          ( 
         
         
         
           x 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          ) 
         
        
          > 
         
        
          + 
         
         
         
           1 
          
          
          
            2 
           
           
           
             t 
            
            
            
              k 
             
            
              ? 
             
            
              1 
             
            
           
          
         
        
          ∥ 
         
        
          x 
         
        
          ? 
         
         
         
           x 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
         
         
           ∥ 
          
         
           2 
          
         
        
          = 
         
         
         
           1 
          
          
          
            2 
           
           
           
             t 
            
            
            
              k 
             
            
              ? 
             
            
              1 
             
            
           
          
         
        
          ∥ 
         
        
          x 
         
        
          ? 
         
        
          ( 
         
         
         
           x 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          ? 
         
         
         
           t 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          ? 
         
        
          f 
         
        
          ( 
         
         
         
           x 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          ) 
         
        
          ) 
         
         
         
           ∥ 
          
         
           2 
          
         
        
          + 
         
        
          D 
         
        
       
         f(x_{k-1})+<x-x_{k-1},\nabla f(x_{k-1})>+\frac{1}{2t_{k-1}}\|x-x_{k-1}\|^2=\frac{1}{2t_{k-1}}\|x-(x_{k-1}-t_{k-1}\nabla f(x_{k-1}))\|^2+D 
        
       
     f(xk?1?)+<x?xk?1?,?f(xk?1?)>+2tk?1?1?∥x?xk?1?∥2=2tk?1?1?∥x?(xk?1??tk?1??f(xk?1?))∥2+D
 D为常数
  
      
       
        
         
         
           x 
          
         
           k 
          
         
        
          = 
         
         
          
          
            arg 
           
          
            ? 
           
          
            min 
           
          
            ? 
           
          
         
           x 
          
         
        
          { 
         
         
         
           1 
          
          
          
            2 
           
           
           
             t 
            
           
             k 
            
           
          
         
        
          ∥ 
         
        
          x 
         
        
          ? 
         
        
          ( 
         
         
         
           x 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          ? 
         
         
         
           t 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          ? 
         
        
          f 
         
        
          ( 
         
         
         
           x 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          ) 
         
        
          ) 
         
         
         
           ∥ 
          
         
           2 
          
         
        
          } 
         
        
       
         x_k=\mathop{\arg\min}\limits_{x}\{\frac{1}{2t_k}\|x-(x_{k-1}-t_{k-1}\nabla f(x_{k-1}))\|^2\} 
        
       
     xk?=xargmin?{2tk?1?∥x?(xk?1??tk?1??f(xk?1?))∥2}
 对无约束问题 
     
      
       
       
         m 
        
       
         i 
        
       
         n 
        
       
         f 
        
       
         ( 
        
       
         x 
        
       
         ) 
        
       
         + 
        
       
         g 
        
       
         ( 
        
       
         x 
        
       
         ) 
        
       
         , 
        
       
         x 
        
       
         ∈ 
        
        
        
          R 
         
        
          n 
         
        
       
      
        minf(x)+g(x),x\in R^n 
       
      
    minf(x)+g(x),x∈Rn
  
      
       
        
         
         
           x 
          
          
          
            k 
           
          
            + 
           
          
            1 
           
          
         
        
          = 
         
         
          
          
            arg 
           
          
            ? 
           
          
            min 
           
          
            ? 
           
          
         
           x 
          
         
        
          { 
         
        
          f 
         
        
          ( 
         
         
         
           x 
          
         
           k 
          
         
        
          ) 
         
        
          + 
         
        
          < 
         
        
          x 
         
        
          ? 
         
         
         
           x 
          
         
           k 
          
         
        
          , 
         
        
          ? 
         
        
          f 
         
        
          ( 
         
         
         
           x 
          
         
           k 
          
         
        
          ) 
         
        
          > 
         
        
          + 
         
         
         
           1 
          
          
          
            2 
           
           
           
             t 
            
           
             k 
            
           
          
         
        
          ∥ 
         
        
          x 
         
        
          ? 
         
         
         
           x 
          
         
           k 
          
         
         
         
           ∥ 
          
         
           2 
          
         
        
          + 
         
        
          g 
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          } 
         
        
       
         x_{k+1}=\mathop{\arg\min}\limits_{x}\{f(x_k)+<x-x_k,\nabla f(x_k)>+\frac{1}{2t_k}\|x-x_k\|^2+g(x)\} 
        
       
     xk+1?=xargmin?{f(xk?)+<x?xk?,?f(xk?)>+2tk?1?∥x?xk?∥2+g(x)}
  
      
       
        
         
         
           x 
          
          
          
            k 
           
          
            + 
           
          
            1 
           
          
         
        
          = 
         
         
          
          
            arg 
           
          
            ? 
           
          
            min 
           
          
            ? 
           
          
         
           x 
          
         
        
          { 
         
         
         
           1 
          
         
           2 
          
         
        
          ∥ 
         
        
          x 
         
        
          ? 
         
        
          ( 
         
         
         
           x 
          
         
           k 
          
         
        
          ? 
         
         
         
           t 
          
         
           k 
          
         
        
          ? 
         
        
          f 
         
        
          ( 
         
         
         
           x 
          
         
           k 
          
         
        
          ) 
         
        
          ) 
         
         
         
           ∥ 
          
         
           2 
          
         
        
          + 
         
         
         
           t 
          
         
           k 
          
         
        
          g 
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          } 
         
        
       
         x_{k+1}=\mathop{\arg\min}\limits_{x}\{\frac{1}{2}\|x-(x_k-t_k\nabla f(x_k))\|^2+t_kg(x)\} 
        
       
     xk+1?=xargmin?{21?∥x?(xk??tk??f(xk?))∥2+tk?g(x)}
 即 
      
       
        
         
         
           x 
          
          
          
            k 
           
          
            + 
           
          
            1 
           
          
         
        
          = 
         
        
          P 
         
        
          r 
         
        
          o 
         
         
         
           x 
          
          
           
           
             t 
            
           
             k 
            
           
          
            g 
           
          
         
        
          ( 
         
         
         
           x 
          
         
           k 
          
         
        
          ? 
         
         
         
           t 
          
         
           k 
          
         
        
          ? 
         
        
          f 
         
        
          ( 
         
         
         
           x 
          
         
           k 
          
         
        
          ) 
         
        
          ) 
         
        
       
         x_{k+1}=Prox_{t_kg}(x_k-t_k\nabla f(x_k)) 
        
       
     xk+1?=Proxtk?g?(xk??tk??f(xk?))
基追问题
 
      
       
        
         
         
           m 
          
         
           i 
          
         
           n 
          
         
         
        
          ∥ 
         
        
          x 
         
         
         
           ∥ 
          
         
           1 
          
         
         
        
          s 
         
        
          . 
         
        
          t 
         
        
          . 
         
         
        
          A 
         
        
          x 
         
        
          = 
         
        
          b 
         
        
       
         \mathop{min}\quad \|x\|_1 \\s.t. \quad Ax=b 
        
       
     min∥x∥1?s.t.Ax=b
 其中 
     
      
       
       
         A 
        
       
         ∈ 
        
        
        
          R 
         
         
         
           m 
          
         
           × 
          
         
           n 
          
         
        
       
         , 
        
       
         b 
        
       
         ∈ 
        
        
        
          R 
         
        
          n 
         
        
       
      
        A\in R^{m\times n},b\in R^{n} 
       
      
    A∈Rm×n,b∈Rn
可分离两块凸优化模型
 
      
       
        
         
         
           m 
          
         
           i 
          
         
           n 
          
         
         
         
         
           θ 
          
         
           1 
          
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          + 
         
         
         
           θ 
          
         
           2 
          
         
        
          ( 
         
        
          x 
         
        
          ) 
         
         
        
          s 
         
        
          . 
         
        
          t 
         
        
          . 
         
         
        
          A 
         
        
          x 
         
        
          + 
         
        
          B 
         
        
          y 
         
        
          = 
         
        
          c 
         
        
       
         \mathop{min}\quad\theta_1(x)+\theta_2(x)\\ s.t. \quad Ax+By=c 
        
       
     minθ1?(x)+θ2?(x)s.t.Ax+By=c
 其中 
     
      
       
       
         A 
        
       
         ∈ 
        
        
        
          R 
         
         
         
           m 
          
         
           × 
          
         
           n 
          
         
        
       
         , 
        
       
         B 
        
       
         ∈ 
        
        
        
          R 
         
         
         
           m 
          
         
           × 
          
         
           q 
          
         
        
       
         , 
        
       
         c 
        
       
         ∈ 
        
        
        
          R 
         
        
          m 
         
        
       
      
        A\in R^{m\times n},B\in R^{m\times q},c\in R^m 
       
      
    A∈Rm×n,B∈Rm×q,c∈Rm,
  
     
      
       
        
        
          θ 
         
        
          1 
         
        
       
         : 
        
        
        
          R 
         
        
          n 
         
        
       
         → 
        
       
         ( 
        
       
         ? 
        
       
         ∞ 
        
       
         , 
        
       
         ∞ 
        
       
         ] 
        
       
      
        \theta_1:R^n\rightarrow(-\infty,\infty] 
       
      
    θ1?:Rn→(?∞,∞]和 
     
      
       
        
        
          θ 
         
        
          2 
         
        
       
         : 
        
        
        
          R 
         
        
          q 
         
        
       
         → 
        
       
         ( 
        
       
         ? 
        
       
         ∞ 
        
       
         , 
        
       
         ∞ 
        
       
         ] 
        
       
      
        \theta_2:R^q\rightarrow(-\infty,\infty] 
       
      
    θ2?:Rq→(?∞,∞]为两个正常闭和凸函数
迭代收缩阈值算法(ISTA,Iterative Shrinkage-Thresholding Algorithm)
ISTA
取 
      
       
        
        
          f 
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          = 
         
         
         
           1 
          
         
           2 
          
         
        
          ∥ 
         
        
          A 
         
        
          x 
         
        
          ? 
         
        
          b 
         
         
         
           ∥ 
          
         
           2 
          
         
        
          , 
         
        
          g 
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          = 
         
        
          λ 
         
        
          ∥ 
         
        
          x 
         
         
         
           ∥ 
          
         
           1 
          
         
        
       
         f(x)=\frac{1}{2}\|Ax-b\|^2,g(x)=\lambda \|x\|_1 
        
       
     f(x)=21?∥Ax?b∥2,g(x)=λ∥x∥1?
  
      
       
        
        
          ? 
         
        
          f 
         
        
          ( 
         
         
         
           x 
          
         
           k 
          
         
        
          ) 
         
        
          = 
         
         
         
           A 
          
         
           T 
          
         
        
          ( 
         
        
          A 
         
        
          x 
         
        
          ? 
         
        
          b 
         
        
          ) 
         
        
          , 
         
         
         
           t 
          
         
           k 
          
         
        
          = 
         
         
         
           1 
          
          
          
            L 
           
          
            k 
           
          
         
        
       
         \nabla f(x_k)=A^T(Ax-b),t_k=\frac{1}{L_k} 
        
       
     ?f(xk?)=AT(Ax?b),tk?=Lk?1?
 得
  
      
       
        
         
         
           x 
          
          
          
            k 
           
          
            + 
           
          
            1 
           
          
         
        
          = 
         
        
          P 
         
        
          r 
         
        
          o 
         
         
         
           x 
          
          
           
           
             1 
            
            
            
              L 
             
            
              k 
             
            
           
          
            λ 
           
          
            ∥ 
           
           
           
             ∥ 
            
           
             1 
            
           
          
         
        
          ( 
         
         
         
           x 
          
         
           k 
          
         
        
          ? 
         
         
         
           1 
          
          
          
            L 
           
          
            k 
           
          
         
         
         
           A 
          
         
           T 
          
         
        
          ( 
         
        
          A 
         
         
         
           x 
          
         
           k 
          
         
        
          ? 
         
        
          b 
         
        
          ) 
         
        
          ) 
         
        
       
         x_{k+1}=Prox_{\frac{1}{L_k}\lambda \|\|_1}(x_k-\frac{1}{L_k}A^T(Ax_k-b)) 
        
       
     xk+1?=ProxLk?1?λ∥∥1??(xk??Lk?1?AT(Axk??b))
  
     
      
       
       
         L 
        
       
      
        L 
       
      
    L是 
     
      
       
       
         ? 
        
       
         f 
        
       
         ( 
        
       
         x 
        
       
         ) 
        
       
      
        \nabla f(x) 
       
      
    ?f(x)的 
     
      
       
       
         L 
        
       
         i 
        
       
         p 
        
       
         s 
        
       
         c 
        
       
         h 
        
       
         i 
        
       
         t 
        
       
         z 
        
       
      
        Lipschitz 
       
      
    Lipschitz常数, 
     
      
       
       
         L 
        
       
         > 
        
        
        
          λ 
         
         
         
           m 
          
         
           a 
          
         
           x 
          
         
        
       
         ( 
        
        
        
          A 
         
        
          T 
         
        
       
         A 
        
       
         ) 
        
       
      
        L>\lambda_{max}(A^TA) 
       
      
    L>λmax?(ATA)
 由 
     
      
       
        
        
          T 
         
         
         
           λ 
          
          
          
            t 
           
          
            k 
           
          
         
        
       
         = 
        
       
         P 
        
       
         r 
        
       
         o 
        
        
        
          x 
         
         
          
          
            t 
           
          
            k 
           
          
         
           λ 
          
         
           ∥ 
          
          
          
            ∥ 
           
          
            1 
           
          
         
        
       
         = 
        
       
         P 
        
       
         r 
        
       
         o 
        
        
        
          x 
         
         
         
           λ 
          
          
          
            t 
           
          
            k 
           
          
         
        
       
      
        \Tau _{\lambda t_k}=Prox_{t_k\lambda\|\|_1}=Prox_{\lambda t_k} 
       
      
    Tλtk??=Proxtk?λ∥∥1??=Proxλtk??得
  
      
       
        
         
         
           x 
          
          
          
            k 
           
          
            + 
           
          
            1 
           
          
         
        
          = 
         
         
         
           T 
          
          
          
            λ 
           
           
           
             L 
            
           
             k 
            
           
          
         
        
          ( 
         
         
         
           x 
          
         
           k 
          
         
        
          ? 
         
         
         
           1 
          
          
          
            L 
           
          
            k 
           
          
         
         
         
           A 
          
         
           T 
          
         
        
          ( 
         
        
          A 
         
         
         
           x 
          
         
           k 
          
         
        
          ? 
         
        
          b 
         
        
          ) 
         
        
          ) 
         
        
       
         x_{k+1}=T_{\frac{\lambda}{L_k}}(x_k-\frac{1}{L_k}A^T(Ax_k-b)) 
        
       
     xk+1?=TLk?λ??(xk??Lk?1?AT(Axk??b))
 收缩阈值算子(Shrinkage-Thresholding operator )
  
      
       
        
         
         
           T 
          
         
           λ 
          
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          = 
         
        
          s 
         
        
          g 
         
        
          n 
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          m 
         
        
          a 
         
        
          x 
         
        
          ( 
         
        
          ∣ 
         
        
          x 
         
        
          ∣ 
         
        
          ? 
         
        
          λ 
         
        
          , 
         
        
          0 
         
        
          ) 
         
        
          = 
         
        
          m 
         
        
          a 
         
        
          x 
         
        
          ( 
         
        
          0 
         
        
          , 
         
        
          x 
         
        
          ? 
         
        
          λ 
         
        
          ) 
         
        
          + 
         
        
          m 
         
        
          i 
         
        
          n 
         
        
          ( 
         
        
          0 
         
        
          , 
         
        
          x 
         
        
          + 
         
        
          λ 
         
        
          ) 
         
        
       
         \Tau _{\lambda}(x)=sgn(x)max(|x|-\lambda,0)=max(0, x -\lambda) + min(0, x +\lambda) 
        
       
     Tλ?(x)=sgn(x)max(∣x∣?λ,0)=max(0,x?λ)+min(0,x+λ)
  
     
      
       
        
        
          x 
         
        
          k 
         
        
       
      
        x_k 
       
      
    xk?收敛速度 
     
      
       
       
         O 
        
       
         ( 
        
        
        
          1 
         
        
          t 
         
        
       
         ) 
        
       
      
        O(\frac{1}{t}) 
       
      
    O(t1?)
matlab核心代码
T = @(tau, x) max(0, x - tau) + min(0, x + tau);% 收缩阈值函数
x_new =  T(lambda/L0, x_old - (1/L0)*A'*(A*x_old-b));
FISTA改进:
快速迭代收缩阈值算法(FISTA,Fast Iterative Shrinkage-Thresholding Algorithm)
 
     
      
       
       
         s 
        
       
         t 
        
       
         e 
        
       
         p 
        
        
       
         0 
        
       
         : 
        
       
      
        step\quad 0: 
       
      
    step0:
  
      
       
        
         
         
           y 
          
         
           1 
          
         
        
          = 
         
         
         
           x 
          
         
           0 
          
         
        
          , 
         
         
         
           t 
          
         
           1 
          
         
        
          = 
         
        
          1 
         
        
       
         y_1=x_0,t_1=1 
        
       
     y1?=x0?,t1?=1
  
     
      
       
       
         s 
        
       
         t 
        
       
         e 
        
       
         p 
        
        
       
         k 
        
       
         ( 
        
       
         k 
        
       
         ≥ 
        
       
         1 
        
       
         ) 
        
       
         : 
        
       
      
        step\quad k(k\ge1): 
       
      
    stepk(k≥1):
  
      
       
        
         
         
           x 
          
          
          
            k 
           
          
            + 
           
          
            1 
           
          
         
        
          = 
         
         
         
           T 
          
          
          
            λ 
           
           
           
             L 
            
           
             k 
            
           
          
         
        
          ( 
         
         
         
           y 
          
         
           k 
          
         
        
          ? 
         
         
         
           1 
          
          
          
            L 
           
          
            k 
           
          
         
         
         
           A 
          
         
           T 
          
         
        
          ( 
         
        
          A 
         
         
         
           x 
          
         
           k 
          
         
        
          ? 
         
        
          b 
         
        
          ) 
         
        
          ) 
         
        
       
         x_{k+1}=T_{\frac{\lambda}{L_k}}(y_k-\frac{1}{L_k}A^T(Ax_k-b)) 
        
       
     xk+1?=TLk?λ??(yk??Lk?1?AT(Axk??b))
  
      
       
        
         
         
           t 
          
          
          
            k 
           
          
            + 
           
          
            1 
           
          
         
        
          = 
         
         
          
          
            1 
           
          
            + 
           
           
            
            
              1 
             
            
              + 
             
            
              4 
             
             
             
               t 
              
             
               k 
              
             
               2 
              
             
            
           
          
         
           2 
          
         
        
       
         t_{k+1}=\frac{1+\sqrt{1+4t_k^2}}{2} 
        
       
     tk+1?=21+1+4tk2???
  
      
       
        
         
         
           y 
          
          
          
            k 
           
          
            + 
           
          
            1 
           
          
         
        
          = 
         
         
         
           x 
          
         
           k 
          
         
        
          + 
         
        
          ( 
         
         
          
           
           
             t 
            
           
             k 
            
           
          
            ? 
           
          
            1 
           
          
          
          
            t 
           
           
           
             k 
            
           
             + 
            
           
             1 
            
           
          
         
        
          ) 
         
        
          ( 
         
         
         
           x 
          
         
           k 
          
         
        
          ? 
         
         
         
           x 
          
          
          
            k 
           
          
            ? 
           
          
            1 
           
          
         
        
          ) 
         
        
       
         y_{k+1}=x_k+(\frac{t_k-1}{t_{k+1}})(x_k-x_{k-1}) 
        
       
     yk+1?=xk?+(tk+1?tk??1?)(xk??xk?1?)
  
     
      
       
        
        
          x 
         
        
          k 
         
        
       
      
        x_k 
       
      
    xk?收敛速度 
     
      
       
       
         O 
        
       
         ( 
        
        
        
          1 
         
         
         
           t 
          
         
           2 
          
         
        
       
         ) 
        
       
      
        O(\frac{1}{t^2}) 
       
      
    O(t21?)
广义外梯度法(GEM,Generalized Extragradient Method)
外梯度法(EM,Extragradient Method)
广义外梯度法(Generalized Extragradient Method)
 由外梯度法(EM,Extragradient Method)推广得到,其中EM采用投影算子( 
     
      
       
       
         P 
        
       
         c 
        
       
      
        Pc 
       
      
    Pc,projection operator),适用于求解 
     
      
       
       
         M 
        
       
         V 
        
       
         I 
        
       
         ( 
        
       
         θ 
        
       
         , 
        
       
         F 
        
       
         , 
        
       
         C 
        
       
         ) 
        
       
      
        MVI(\theta,F,C) 
       
      
    MVI(θ,F,C);GEM采用邻近算子( 
     
      
       
       
         P 
        
       
         r 
        
       
         o 
        
       
         x 
        
       
      
        Prox 
       
      
    Prox,proximity operator),适用于求解 
     
      
       
       
         M 
        
       
         G 
        
       
         V 
        
       
         I 
        
       
         ( 
        
       
         θ 
        
       
         , 
        
       
         F 
        
       
         ) 
        
       
      
        MGVI(\theta,F) 
       
      
    MGVI(θ,F),其他步骤基本相同,都是预估校正算法
GEM
 
     
      
       
        
        
          l 
         
        
          1 
         
        
       
      
        l_1 
       
      
    l1?正则化最小二乘问题,由KKT条件,易知
  
      
       
        
        
          θ 
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          = 
         
        
          λ 
         
        
          ∥ 
         
        
          x 
         
         
         
           ∥ 
          
         
           1 
          
         
        
          , 
         
        
          F 
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          = 
         
         
         
           A 
          
         
           T 
          
         
        
          ( 
         
        
          A 
         
        
          x 
         
        
          ? 
         
        
          b 
         
        
          ) 
         
        
       
         \theta(x)=\lambda\|x\|_1,F(x)=A^T(Ax-b) 
        
       
     θ(x)=λ∥x∥1?,F(x)=AT(Ax?b)
  
     
      
       
        
        
          β 
         
        
          k 
         
        
       
         ≤ 
        
        
        
          ν 
         
        
          L 
         
        
       
         , 
        
       
         ν 
        
       
         ∈ 
        
       
         ( 
        
       
         0 
        
       
         , 
        
       
         1 
        
       
         ) 
        
       
         , 
        
       
         L 
        
       
      
        \beta^k\le\frac{\nu}{L},\nu\in(0,1),L 
       
      
    βk≤Lν?,ν∈(0,1),L为 
     
      
       
       
         F 
        
       
         ( 
        
       
         x 
        
       
         ) 
        
       
      
        F(x) 
       
      
    F(x)的 
     
      
       
       
         L 
        
       
         i 
        
       
         p 
        
       
         s 
        
       
         c 
        
       
         h 
        
       
         i 
        
       
         t 
        
       
         z 
        
       
      
        Lipschitz 
       
      
    Lipschitz常数
 GEM是预估校正算法
 预测: 
      
       
        
         
          
          
            x 
           
          
            ~ 
           
          
         
           k 
          
         
        
          = 
         
        
          P 
         
        
          r 
         
        
          o 
         
         
         
           x 
          
          
           
           
             β 
            
           
             k 
            
           
          
            θ 
           
          
         
        
          ( 
         
         
         
           x 
          
         
           k 
          
         
        
          ? 
         
         
         
           β 
          
         
           k 
          
         
        
          F 
         
        
          ( 
         
         
         
           x 
          
         
           k 
          
         
        
          ) 
         
        
          ) 
         
        
       
         \widetilde{x}_k=Prox_{\beta^k\theta}(x_k-\beta^kF(x_k)) 
        
       
     x 
             k?=Proxβkθ?(xk??βkF(xk?))
 校正: 
      
       
        
         
         
           x 
          
          
          
            k 
           
          
            + 
           
          
            1 
           
          
         
        
          = 
         
        
          P 
         
        
          r 
         
        
          o 
         
         
         
           x 
          
          
           
           
             β 
            
           
             k 
            
           
          
            θ 
           
          
         
        
          ( 
         
         
         
           x 
          
         
           k 
          
         
        
          ? 
         
         
         
           β 
          
         
           k 
          
         
        
          F 
         
        
          ( 
         
         
          
          
            x 
           
          
            ~ 
           
          
         
           k 
          
         
        
          ) 
         
        
          ) 
         
        
       
         x_{k+1}=Prox_{\beta^k\theta}(x_k-\beta^kF(\widetilde{x}_k)) 
        
       
     xk+1?=Proxβkθ?(xk??βkF(x 
             k?))
邻近收缩算法 (PCA,proximity and contraction algorithms)
收缩阈值算子( 
     
      
       
       
         T 
        
       
      
        \Tau 
       
      
    T),投影算子( 
     
      
       
        
        
          P 
         
        
          C 
         
        
       
      
        P_C 
       
      
    PC?)是特殊的邻近算子( 
     
      
       
       
         P 
        
       
         r 
        
       
         o 
        
       
         x 
        
       
      
        Prox 
       
      
    Prox)
 邻近算子具有收缩性质,也就是生成的序列 
     
      
       
       
         { 
        
        
        
          x 
         
        
          k 
         
        
       
         } 
        
       
      
        \{x_k\} 
       
      
    {xk?}会收敛
邻近点算法(PPA,proximity point algorithms)
邻近梯度算法(PGA,proximity gradient algorithms)
使用邻近算子(Prox)作为近似梯度
 变分不等式
  
      
       
        
        
          θ 
         
        
          ( 
         
         
          
          
            x 
           
          
            ~ 
           
          
         
           k 
          
         
        
          ) 
         
        
          ? 
         
        
          θ 
         
        
          ( 
         
         
         
           x 
          
         
           ? 
          
         
        
          ) 
         
        
          + 
         
        
          ( 
         
         
          
          
            x 
           
          
            ~ 
           
          
         
           k 
          
         
        
          ? 
         
         
         
           x 
          
         
           ? 
          
         
         
         
           ) 
          
         
           T 
          
         
        
          F 
         
        
          ( 
         
         
         
           x 
          
         
           ? 
          
         
        
          ) 
         
        
          ≥ 
         
        
          0 
         
        
       
         \theta(\tilde{x}^k) -\theta(x^*) +(\tilde{x}^k - x^*)^T F(x^*) \ge 0 
        
       
     θ(x~k)?θ(x?)+(x~k?x?)TF(x?)≥0
 单调
  
      
       
        
        
          ( 
         
         
          
          
            x 
           
          
            ~ 
           
          
         
           k 
          
         
        
          ? 
         
         
         
           x 
          
         
           ? 
          
         
         
         
           ) 
          
         
           T 
          
         
        
          ( 
         
        
          F 
         
        
          ( 
         
         
          
          
            x 
           
          
            ~ 
           
          
         
           k 
          
         
        
          ) 
         
        
          ? 
         
        
          F 
         
        
          ( 
         
         
         
           x 
          
         
           ? 
          
         
        
          ) 
         
        
          ) 
         
        
          ≥ 
         
        
          0 
         
        
       
         (\tilde{x}^k - x^*)^T(F(\tilde{x}^k) -F(x^*))\ge 0 
        
       
     (x~k?x?)T(F(x~k)?F(x?))≥0
P G A a 1 PGA_{a1} PGAa1?
 
      
       
        
        
          F 
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          = 
         
        
          M 
         
        
          x 
         
        
          + 
         
        
          q 
         
        
       
         F(x)=Mx+q 
        
       
     F(x)=Mx+q
 M半正定,不需要对称
  
     
      
       
        
        
          x 
         
        
          0 
         
        
       
         ∈ 
        
        
        
          R 
         
        
          n 
         
        
       
      
        x^0 \in R^n 
       
      
    x0∈Rn,并且 
     
      
       
       
         γ 
        
       
         ∈ 
        
       
         ( 
        
       
         0 
        
       
         , 
        
       
         2 
        
       
         ) 
        
       
         , 
        
        
        
          α 
         
        
          k 
         
        
          ? 
         
        
       
         ≥ 
        
        
        
          1 
         
         
         
           ∥ 
          
         
           I 
          
         
           + 
          
          
          
            β 
           
          
            k 
           
          
          
          
            M 
           
          
            T 
           
          
          
          
            ∥ 
           
          
            2 
           
          
            2 
           
          
         
        
       
      
        \gamma \in (0,2),\alpha ^*_k\ge\frac{1}{\|I+\beta ^kM^T\|^2_2} 
       
      
    γ∈(0,2),αk??≥∥I+βkMT∥22?1?
  
     
      
       
       
         预测器: 
        
       
      
        \textbf{预测器:} 
       
      
    预测器:选择 
     
      
       
        
        
          β 
         
        
          k 
         
        
       
         > 
        
       
         0 
        
       
      
        \beta ^k>0 
       
      
    βk>0,且 
     
      
       
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
         = 
        
       
         P 
        
       
         r 
        
       
         o 
        
        
        
          x 
         
         
          
          
            β 
           
          
            k 
           
          
         
           θ 
          
         
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         ? 
        
        
        
          β 
         
        
          k 
         
        
       
         F 
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         ) 
        
       
         ) 
        
       
      
        \widetilde{x}^k=Prox_{\beta ^k\theta}(x^k-\beta ^kF(x^k)) 
       
      
    x 
            k=Proxβkθ?(xk?βkF(xk));
  
     
      
       
       
         校正器: 
        
       
      
        \textbf{校正器:} 
       
      
    校正器:设 
     
      
       
       
         d 
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         , 
        
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
         , 
        
        
        
          β 
         
        
          k 
         
        
       
         ) 
        
       
         = 
        
       
         ( 
        
       
         I 
        
       
         + 
        
        
        
          β 
         
        
          k 
         
        
        
        
          M 
         
        
          T 
         
        
       
         ) 
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         ? 
        
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
         ) 
        
       
      
        d(x^k,\widetilde{x}^k,\beta ^k)=(I+\beta ^kM^T)(x^k-\widetilde{x}^k) 
       
      
    d(xk,x 
            k,βk)=(I+βkMT)(xk?x 
            k),并且计算 
     
      
       
        
        
          α 
         
        
          k 
         
        
          ? 
         
        
       
         = 
        
        
         
         
           ∥ 
          
          
          
            x 
           
          
            k 
           
          
         
           ? 
          
          
           
           
             x 
            
           
             ~ 
            
           
          
            k 
           
          
          
          
            ∥ 
           
          
            2 
           
          
         
         
         
           ∥ 
          
         
           d 
          
         
           ( 
          
          
          
            x 
           
          
            k 
           
          
         
           , 
          
          
           
           
             x 
            
           
             ~ 
            
           
          
            k 
           
          
         
           , 
          
          
          
            β 
           
          
            k 
           
          
         
           ) 
          
          
          
            ∥ 
           
          
            2 
           
          
         
        
       
      
        \alpha ^*_k=\frac{\|x^k-\widetilde{x}^k\|^2}{\|d(x^k,\widetilde{x}^k,\beta ^k)\|^2} 
       
      
    αk??=∥d(xk,x 
                    k,βk)∥2∥xk?x 
                    k∥2?。设置 
     
      
       
        
        
          x 
         
         
         
           k 
          
         
           + 
          
         
           1 
          
         
        
       
         = 
        
        
        
          x 
         
        
          k 
         
        
       
         ? 
        
       
         γ 
        
        
        
          α 
         
        
          k 
         
        
          ? 
         
        
       
         d 
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         , 
        
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
         , 
        
        
        
          β 
         
        
          k 
         
        
       
         ) 
        
       
      
        x^{k+1}=x^k-\gamma\alpha ^*_kd(x^k,\widetilde{x}^k,\beta ^k) 
       
      
    xk+1=xk?γαk??d(xk,x 
            k,βk)。
P G A a 2 PGA_{a2} PGAa2?
 
      
       
        
        
          F 
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          = 
         
        
          M 
         
        
          x 
         
        
          + 
         
        
          q 
         
        
       
         F(x)=Mx+q 
        
       
     F(x)=Mx+q
 M对称半正定, 
     
      
       
       
         G 
        
       
         = 
        
       
         I 
        
       
         + 
        
       
         β 
        
       
         M 
        
       
      
        G=I+\beta M 
       
      
    G=I+βM
  
     
      
       
        
        
          x 
         
        
          0 
         
        
       
         ∈ 
        
        
        
          R 
         
        
          n 
         
        
       
      
        x^0 \in R^n 
       
      
    x0∈Rn, 
     
      
       
       
         β 
        
       
         > 
        
       
         0 
        
       
      
        \beta >0 
       
      
    β>0,并且 
     
      
       
       
         γ 
        
       
         ∈ 
        
       
         ( 
        
       
         0 
        
       
         , 
        
       
         2 
        
       
         ) 
        
       
      
        \gamma \in (0,2) 
       
      
    γ∈(0,2), 
     
      
       
        
        
          α 
         
        
          k 
         
        
          ? 
         
        
       
         ≥ 
        
        
        
          1 
         
         
          
          
            λ 
           
           
           
             m 
            
           
             a 
            
           
             x 
            
           
          
         
           ( 
          
         
           G 
          
         
           ) 
          
         
        
       
      
        \alpha ^*_k \ge \frac{1}{\lambda_{max}(G)} 
       
      
    αk??≥λmax?(G)1?
  
     
      
       
       
         预测器: 
        
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
         = 
        
       
         P 
        
       
         r 
        
       
         o 
        
        
        
          x 
         
         
         
           β 
          
         
           θ 
          
         
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         ? 
        
       
         β 
        
       
         F 
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         ) 
        
       
         ) 
        
       
      
        \textbf{预测器:}\widetilde{x}^k=Prox_{\beta\theta}(x^k-\beta F(x^k)) 
       
      
    预测器:x 
            k=Proxβθ?(xk?βF(xk));
  
     
      
       
       
         校正器: 
        
       
      
        \textbf{校正器:} 
       
      
    校正器:设 
     
      
       
       
         d 
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         , 
        
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
         ) 
        
       
         = 
        
        
        
          x 
         
        
          k 
         
        
       
         ? 
        
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
      
        d(x^k,\widetilde{x}^k)=x^k-\widetilde{x}^k 
       
      
    d(xk,x 
            k)=xk?x 
            k,并且计算 
     
      
       
        
        
          α 
         
        
          k 
         
        
          ? 
         
        
       
         = 
        
        
         
         
           ∥ 
          
          
          
            x 
           
          
            k 
           
          
         
           ? 
          
          
           
           
             x 
            
           
             ~ 
            
           
          
            k 
           
          
          
          
            ∥ 
           
          
            2 
           
          
         
         
         
           ∥ 
          
         
           d 
          
         
           ( 
          
          
          
            x 
           
          
            k 
           
          
         
           , 
          
          
           
           
             x 
            
           
             ~ 
            
           
          
            k 
           
          
         
           ) 
          
          
          
            ∥ 
           
          
            G 
           
          
            2 
           
          
         
        
       
      
        \alpha ^*_k=\frac{\|x^k-\widetilde{x}^k\|^2}{\|d(x^k,\widetilde{x}^k)\|^2_G} 
       
      
    αk??=∥d(xk,x 
                    k)∥G2?∥xk?x 
                    k∥2?,设置 
     
      
       
        
        
          x 
         
         
         
           k 
          
         
           + 
          
         
           1 
          
         
        
       
         = 
        
        
        
          x 
         
        
          k 
         
        
       
         ? 
        
       
         γ 
        
        
        
          α 
         
        
          k 
         
        
          ? 
         
        
       
         d 
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         , 
        
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
         ) 
        
       
      
        x^{k+1}=x^k-\gamma\alpha ^*_kd(x^k,\widetilde{x}^k) 
       
      
    xk+1=xk?γαk??d(xk,x 
            k)。
P G A b 1 PGA_{b1} PGAb1?
 
     
      
       
       
         F 
        
       
         ( 
        
       
         x 
        
       
         ) 
        
       
      
        F(x) 
       
      
    F(x)单调
  
     
      
       
        
        
          x 
         
        
          0 
         
        
       
         ∈ 
        
        
        
          R 
         
        
          n 
         
        
       
      
        x^0 \in R^n 
       
      
    x0∈Rn,并且 
     
      
       
       
         γ 
        
       
         ∈ 
        
       
         ( 
        
       
         0 
        
       
         , 
        
       
         2 
        
       
         ) 
        
       
      
        \gamma \in (0,2) 
       
      
    γ∈(0,2) , 
     
      
       
        
        
          α 
         
        
          k 
         
        
          ? 
         
        
       
         ≥ 
        
        
        
          1 
         
        
          2 
         
        
       
      
        \alpha ^*_k \ge\frac{1}{2} 
       
      
    αk??≥21?
  
     
      
       
       
         预测器: 
        
       
      
        \textbf{预测器:} 
       
      
    预测器:选择 
     
      
       
        
        
          β 
         
        
          k 
         
        
       
         > 
        
       
         0 
        
       
      
        \beta ^k>0 
       
      
    βk>0,使得 
     
      
       
        
        
          β 
         
        
          k 
         
        
       
         ∥ 
        
       
         F 
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         ) 
        
       
         ? 
        
       
         F 
        
       
         ( 
        
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
         ) 
        
       
         ∥ 
        
       
         ≤ 
        
       
         ν 
        
       
         ∥ 
        
        
        
          x 
         
        
          k 
         
        
       
         ? 
        
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
         ∥ 
        
       
      
        \beta ^k\|F(x^k)-F(\widetilde{x}^k)\|\le\nu\|x^k-\widetilde{x}^k\| 
       
      
    βk∥F(xk)?F(x 
            k)∥≤ν∥xk?x 
            k∥,其中 
     
      
       
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
         = 
        
       
         P 
        
       
         r 
        
       
         o 
        
        
        
          x 
         
         
          
          
            β 
           
          
            k 
           
          
         
           θ 
          
         
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         ? 
        
        
        
          β 
         
        
          k 
         
        
       
         F 
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         ) 
        
       
         ) 
        
       
      
        \widetilde{x}^k=Prox_{\beta ^k\theta}(x^k-\beta ^kF(x^k)) 
       
      
    x 
            k=Proxβkθ?(xk?βkF(xk));
  
     
      
       
       
         校正器: 
        
       
      
        \textbf{校正器:} 
       
      
    校正器:设置 
     
      
       
       
         d 
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         , 
        
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
         , 
        
        
        
          β 
         
        
          k 
         
        
       
         ) 
        
       
         = 
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         ? 
        
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
         ) 
        
       
         ? 
        
        
        
          β 
         
        
          k 
         
        
       
         ( 
        
       
         F 
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         ) 
        
       
         ? 
        
       
         F 
        
       
         ( 
        
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
         ) 
        
       
         ) 
        
       
      
        d(x^k,\widetilde{x}^k,\beta ^k)=(x^k-\widetilde{x}^k)-\beta ^k(F(x^k)-F(\widetilde{x}^k)) 
       
      
    d(xk,x 
            k,βk)=(xk?x 
            k)?βk(F(xk)?F(x 
            k)),并且计算 
     
      
       
        
        
          α 
         
        
          k 
         
        
          ? 
         
        
       
         = 
        
        
         
         
           ( 
          
          
          
            x 
           
          
            k 
           
          
         
           ? 
          
          
           
           
             x 
            
           
             ~ 
            
           
          
            k 
           
          
          
          
            ) 
           
          
            T 
           
          
         
           d 
          
         
           ( 
          
          
          
            x 
           
          
            k 
           
          
         
           , 
          
          
           
           
             x 
            
           
             ~ 
            
           
          
            k 
           
          
         
           , 
          
          
          
            β 
           
          
            k 
           
          
         
           ) 
          
         
         
         
           ∥ 
          
         
           d 
          
         
           ( 
          
          
          
            x 
           
          
            k 
           
          
         
           , 
          
          
           
           
             x 
            
           
             ~ 
            
           
          
            k 
           
          
         
           , 
          
          
          
            β 
           
          
            k 
           
          
         
           ) 
          
          
          
            ∥ 
           
          
            2 
           
          
         
        
       
      
        \alpha^*_k=\frac{(x^k-\widetilde{x}^k)^Td(x^k,\widetilde{x}^k,\beta ^k)}{\|d(x^k,\widetilde{x}^k,\beta ^k)\|^2} 
       
      
    αk??=∥d(xk,x 
                    k,βk)∥2(xk?x 
                    k)Td(xk,x 
                    k,βk)?,设置 
     
      
       
        
        
          x 
         
         
         
           k 
          
         
           + 
          
         
           1 
          
         
        
       
         = 
        
        
        
          x 
         
        
          k 
         
        
       
         ? 
        
       
         γ 
        
        
        
          α 
         
        
          k 
         
        
          ? 
         
        
       
         d 
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         , 
        
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
         , 
        
        
        
          β 
         
        
          k 
         
        
       
         ) 
        
       
      
        x^{k+1}=x^k-\gamma\alpha^*_kd(x^k,\widetilde{x}^k,\beta ^k) 
       
      
    xk+1=xk?γαk??d(xk,x 
            k,βk)。
P G A b 2 PGA_{b2} PGAb2?
 
      
       
        
        
          F 
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          = 
         
        
          A 
         
        
          x 
         
        
          + 
         
        
          c 
         
        
       
         F(x)=Ax+c 
        
       
     F(x)=Ax+c
 A对称半正定, 
     
      
       
       
         G 
        
       
         = 
        
       
         I 
        
       
         ? 
        
       
         β 
        
       
         M 
        
       
      
        G=I-\beta M 
       
      
    G=I?βM,G对称正定
  
     
      
       
        
        
          x 
         
        
          0 
         
        
       
         ∈ 
        
        
        
          R 
         
        
          n 
         
        
       
      
        x^0 \in R^n 
       
      
    x0∈Rn, 
     
      
       
       
         0 
        
       
         < 
        
       
         β 
        
       
         < 
        
        
        
          1 
         
         
          
          
            λ 
           
           
           
             m 
            
           
             a 
            
           
             x 
            
           
          
         
           ( 
          
         
           A 
          
         
           ) 
          
         
        
       
      
        0<\beta <\frac{1}{\lambda_{max}(A)} 
       
      
    0<β<λmax?(A)1?,并且 
     
      
       
       
         γ 
        
       
         ∈ 
        
       
         ( 
        
       
         0 
        
       
         , 
        
       
         2 
        
       
         ) 
        
       
      
        \gamma \in (0,2) 
       
      
    γ∈(0,2)
  
     
      
       
       
         预测器: 
        
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
         = 
        
       
         P 
        
       
         r 
        
       
         o 
        
        
        
          x 
         
         
         
           β 
          
         
           θ 
          
         
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         ? 
        
       
         β 
        
       
         F 
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         ) 
        
       
         ) 
        
       
      
        \textbf{预测器:}\widetilde{x}^k=Prox_{\beta\theta}(x^k-\beta F(x^k)) 
       
      
    预测器:x 
            k=Proxβθ?(xk?βF(xk));
  
     
      
       
       
         校正器: 
        
       
      
        \textbf{校正器:} 
       
      
    校正器:设 
     
      
       
       
         d 
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         , 
        
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
         ) 
        
       
         = 
        
        
        
          x 
         
        
          k 
         
        
       
         ? 
        
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
      
        d(x^k,\widetilde{x}^k)=x^k-\widetilde{x}^k 
       
      
    d(xk,x 
            k)=xk?x 
            k,设置 
     
      
       
        
        
          x 
         
         
         
           k 
          
         
           + 
          
         
           1 
          
         
        
       
         = 
        
        
        
          x 
         
        
          k 
         
        
       
         ? 
        
       
         γ 
        
       
         d 
        
       
         ( 
        
        
        
          x 
         
        
          k 
         
        
       
         , 
        
        
         
         
           x 
          
         
           ~ 
          
         
        
          k 
         
        
       
         ) 
        
       
      
        x^{k+1}=x^k-\gamma d(x^k,\widetilde{x}^k) 
       
      
    xk+1=xk?γd(xk,x 
            k)。
交替方向乘子法(ADMM,alternating direction method of multipliers)
 
      
       
        
        
          m 
         
        
          i 
         
         
         
           n 
          
          
          
            x 
           
          
            , 
           
          
            y 
           
          
            , 
           
          
            z 
           
          
         
         
         
           θ 
          
         
           1 
          
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          + 
         
         
         
           θ 
          
         
           2 
          
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          + 
         
         
         
           θ 
          
         
           3 
          
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
       
         min_{x,y,z}\theta_1(x)+\theta_2(x)+\theta_3(x) 
        
       
     minx,y,z?θ1?(x)+θ2?(x)+θ3?(x)
  
      
       
        
        
          s 
         
        
          . 
         
        
          t 
         
        
          . 
         
        
          f 
         
        
          ( 
         
        
          x 
         
        
          , 
         
        
          y 
         
        
          , 
         
        
          z 
         
        
          ) 
         
        
          = 
         
        
          0 
         
        
       
         s.t. f(x,y,z)=0 
        
       
     s.t.f(x,y,z)=0
 拉格朗日函数 
      
       
        
        
          L 
         
        
          ( 
         
        
          x 
         
        
          , 
         
        
          y 
         
        
          , 
         
        
          z 
         
        
          , 
         
        
          λ 
         
        
          ) 
         
        
          = 
         
         
         
           θ 
          
         
           1 
          
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          + 
         
         
         
           θ 
          
         
           2 
          
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          + 
         
         
         
           θ 
          
         
           3 
          
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          + 
         
        
          λ 
         
        
          f 
         
        
          ( 
         
        
          x 
         
        
          , 
         
        
          y 
         
        
          , 
         
        
          z 
         
        
          ) 
         
        
       
         L(x,y,z,\lambda)=\theta_1(x)+\theta_2(x)+\theta_3(x)+\lambda f(x,y,z) 
        
       
     L(x,y,z,λ)=θ1?(x)+θ2?(x)+θ3?(x)+λf(x,y,z)
  
     
      
       
       
         λ 
        
       
      
        \lambda 
       
      
    λ为拉格朗日乘子
 增广拉格朗日函数 
      
       
        
        
          L 
         
        
          ( 
         
        
          x 
         
        
          , 
         
        
          y 
         
        
          , 
         
        
          z 
         
        
          , 
         
        
          λ 
         
        
          ) 
         
        
          = 
         
         
         
           θ 
          
         
           1 
          
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          + 
         
         
         
           θ 
          
         
           2 
          
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          + 
         
         
         
           θ 
          
         
           3 
          
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          + 
         
        
          λ 
         
        
          f 
         
        
          ( 
         
        
          x 
         
        
          , 
         
        
          y 
         
        
          , 
         
        
          z 
         
        
          ) 
         
        
          + 
         
         
         
           ρ 
          
         
           2 
          
         
        
          ∥ 
         
        
          f 
         
        
          ( 
         
        
          x 
         
        
          , 
         
        
          y 
         
        
          , 
         
        
          z 
         
        
          ) 
         
         
         
           ∥ 
          
         
           2 
          
         
           2 
          
         
        
       
         L(x,y,z,\lambda)=\theta_1(x)+\theta_2(x)+\theta_3(x)+\lambda f(x,y,z)+\frac{\rho}{2} \|f(x,y,z)\|_2^2 
        
       
     L(x,y,z,λ)=θ1?(x)+θ2?(x)+θ3?(x)+λf(x,y,z)+2ρ?∥f(x,y,z)∥22?
  
     
      
       
       
         λ 
        
       
      
        \lambda 
       
      
    λ为拉格朗日乘子, 
     
      
       
       
         ρ 
        
       
      
        \rho 
       
      
    ρ罚因子
ADMM
 
      
       
        
        
          m 
         
        
          i 
         
         
         
           n 
          
          
          
            x 
           
          
            , 
           
          
            y 
           
          
            , 
           
          
            z 
           
          
         
         
         
           θ 
          
         
           1 
          
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
          + 
         
         
         
           θ 
          
         
           2 
          
         
        
          ( 
         
        
          x 
         
        
          ) 
         
        
       
         min_{x,y,z}\theta_1(x)+\theta_2(x) 
        
       
     minx,y,z?θ1?(x)+θ2?(x)
  
      
       
        
        
          s 
         
        
          . 
         
        
          t 
         
        
          . 
         
        
          A 
         
        
          x 
         
        
          + 
         
        
          B 
         
        
          y 
         
        
          = 
         
        
          c 
         
        
       
         s.t. Ax+By=c 
        
       
     s.t.Ax+By=c
  
     
      
       
       
         初始化: 
        
        
        
          x 
         
        
          0 
         
        
       
         ∈ 
        
        
        
          R 
         
        
          n 
         
        
       
      
        \textbf{初始化:}x^0 \in R^n 
       
      
    初始化:x0∈Rn, 
     
      
       
        
        
          y 
         
        
          0 
         
        
       
         ∈ 
        
        
        
          R 
         
        
          q 
         
        
       
      
        y^0 \in R^q 
       
      
    y0∈Rq, 
     
      
       
        
        
          λ 
         
        
          0 
         
        
       
         ∈ 
        
        
        
          R 
         
        
          m 
         
        
       
      
        \lambda^0 \in R^m 
       
      
    λ0∈Rm,并且 
     
      
       
       
         ρ 
        
       
         > 
        
       
         0. 
        
       
      
        \rho>0. 
       
      
    ρ>0.
  
     
      
       
       
         一般步骤: 
        
       
      
        \textbf{一般步骤:} 
       
      
    一般步骤:对 
     
      
       
       
         k 
        
       
         = 
        
       
         0 
        
       
         , 
        
       
         1 
        
       
         , 
        
       
      
        k=0,1, 
       
      
    k=0,1,…执行以下步骤:
  
     
      
       
       
         ( 
        
       
         a 
        
       
         ) 
        
        
        
          x 
         
         
         
           k 
          
         
           + 
          
         
           1 
          
         
        
       
         ∈ 
        
        
        
          arg 
         
        
          ? 
         
        
          min 
         
        
          ? 
         
        
       
         { 
        
        
        
          θ 
         
        
          1 
         
        
       
         ( 
        
       
         x 
        
       
         ) 
        
       
         + 
        
        
        
          ρ 
         
        
          2 
         
        
       
         ∥ 
        
       
         A 
        
       
         x 
        
       
         + 
        
       
         B 
        
        
        
          y 
         
        
          k 
         
        
       
         ? 
        
       
         c 
        
       
         + 
        
        
        
          1 
         
        
          ρ 
         
        
        
        
          λ 
         
        
          k 
         
        
        
        
          ∥ 
         
        
          2 
         
        
       
         : 
        
       
         x 
        
       
         ∈ 
        
        
        
          R 
         
        
          n 
         
        
       
         } 
        
       
         ; 
        
       
      
        (a)x^{k+1}\in \mathop{\arg\min}\{\theta_1(x)+\frac{\rho}{2}\|Ax+By^k-c+\frac{1}{\rho}\lambda^k\|^2:x\in R^n\}; 
       
      
    (a)xk+1∈argmin{θ1?(x)+2ρ?∥Ax+Byk?c+ρ1?λk∥2:x∈Rn};
  
     
      
       
       
         ( 
        
       
         b 
        
       
         ) 
        
        
        
          y 
         
         
         
           k 
          
         
           + 
          
         
           1 
          
         
        
       
         ∈ 
        
        
        
          arg 
         
        
          ? 
         
        
          min 
         
        
          ? 
         
        
       
         { 
        
        
        
          θ 
         
        
          2 
         
        
       
         ( 
        
       
         y 
        
       
         ) 
        
       
         + 
        
        
        
          ρ 
         
        
          2 
         
        
       
         ∥ 
        
       
         A 
        
        
        
          x 
         
         
         
           k 
          
         
           + 
          
         
           1 
          
         
        
       
         + 
        
       
         B 
        
       
         y 
        
       
         ? 
        
       
         c 
        
       
         + 
        
        
        
          1 
         
        
          ρ 
         
        
        
        
          λ 
         
        
          k 
         
        
        
        
          ∥ 
         
        
          2 
         
        
       
         : 
        
       
         x 
        
       
         ∈ 
        
        
        
          R 
         
        
          q 
         
        
       
         } 
        
       
         ; 
        
       
      
        (b)y^{k+1}\in \mathop{\arg\min}\{\theta_2(y)+\frac{\rho}{2}\|Ax^{k+1}+By-c+\frac{1}{\rho}\lambda^k\|^2:x\in R^q\}; 
       
      
    (b)yk+1∈argmin{θ2?(y)+2ρ?∥Axk+1+By?c+ρ1?λk∥2:x∈Rq};
  
     
      
       
       
         ( 
        
       
         c 
        
       
         ) 
        
        
        
          λ 
         
         
         
           k 
          
         
           + 
          
         
           1 
          
         
        
       
         = 
        
        
        
          λ 
         
        
          k 
         
        
       
         + 
        
       
         ρ 
        
       
         ( 
        
       
         A 
        
        
        
          x 
         
         
         
           k 
          
         
           + 
          
         
           1 
          
         
        
       
         + 
        
       
         B 
        
        
        
          y 
         
         
         
           k 
          
         
           + 
          
         
           1 
          
         
        
       
         ? 
        
       
         c 
        
       
         ) 
        
       
         . 
        
       
      
        (c)\lambda^{k+1}=\lambda^k+\rho(Ax^{k+1}+By^{k+1}-c). 
       
      
    (c)λk+1=λk+ρ(Axk+1+Byk+1?c).
交替方向线性近似乘子法(AD-LPMM,linearized proximal)
交替方向近似乘子法(AD-PMM,proximal)的一个特例
  
     
      
       
       
         初始化: 
        
        
        
          x 
         
        
          0 
         
        
       
         ∈ 
        
        
        
          R 
         
        
          n 
         
        
       
      
        \textbf{初始化:}x^0 \in R^n 
       
      
    初始化:x0∈Rn, 
     
      
       
        
        
          y 
         
        
          0 
         
        
       
         ∈ 
        
        
        
          R 
         
        
          q 
         
        
       
      
        y^0 \in R^q 
       
      
    y0∈Rq, 
     
      
       
        
        
          λ 
         
        
          0 
         
        
       
         ∈ 
        
        
        
          R 
         
        
          m 
         
        
       
      
        \lambda^0 \in R^m 
       
      
    λ0∈Rm,
  
     
      
       
       
         ρ 
        
       
         > 
        
       
         0 
        
       
      
        \rho>0 
       
      
    ρ>0, 
     
      
       
       
         α 
        
       
         ≥ 
        
       
         ρ 
        
        
        
          λ 
         
         
         
           m 
          
         
           a 
          
         
           x 
          
         
        
       
         ( 
        
        
        
          A 
         
        
          T 
         
        
       
         A 
        
       
         ) 
        
       
      
        \alpha\ge\rho\lambda_{max}(A^TA) 
       
      
    α≥ρλmax?(ATA), 
     
      
       
       
         β 
        
       
         ≥ 
        
       
         ρ 
        
        
        
          λ 
         
         
         
           m 
          
         
           a 
          
         
           x 
          
         
        
       
         ( 
        
        
        
          B 
         
        
          T 
         
        
       
         B 
        
       
         ) 
        
       
         . 
        
       
      
        \beta\ge\rho\lambda_{max}(B^TB). 
       
      
    β≥ρλmax?(BTB).
  
     
      
       
       
         一般步骤: 
        
       
      
        \textbf{一般步骤:} 
       
      
    一般步骤:对 
     
      
       
       
         k 
        
       
         = 
        
       
         0 
        
       
         , 
        
       
         1 
        
       
         , 
        
       
      
        k=0,1, 
       
      
    k=0,1,…执行以下步骤:
  
     
      
       
       
         ( 
        
       
         a 
        
       
         ) 
        
        
        
          x 
         
         
         
           k 
          
         
           + 
          
         
           1 
          
         
        
       
         = 
        
       
         P 
        
       
         r 
        
       
         o 
        
        
        
          x 
         
         
          
          
            1 
           
          
            α 
           
          
          
          
            θ 
           
          
            1 
           
          
         
        
       
         [ 
        
        
        
          x 
         
        
          k 
         
        
       
         + 
        
        
        
          ρ 
         
        
          α 
         
        
        
        
          A 
         
        
          T 
         
        
       
         ( 
        
       
         A 
        
        
        
          x 
         
        
          k 
         
        
       
         + 
        
       
         B 
        
        
        
          y 
         
        
          k 
         
        
       
         ? 
        
       
         c 
        
       
         + 
        
        
        
          1 
         
        
          ρ 
         
        
        
        
          λ 
         
        
          k 
         
        
       
         ) 
        
       
         ] 
        
       
         ; 
        
       
      
        (a)x^{k+1}=Prox_{\frac{1}{\alpha}\theta_1}[x^k+\frac{\rho}{\alpha}A^T(Ax^k+By^k-c+\frac{1}{\rho}\lambda^k)]; 
       
      
    (a)xk+1=Proxα1?θ1??[xk+αρ?AT(Axk+Byk?c+ρ1?λk)];
  
     
      
       
       
         ( 
        
       
         b 
        
       
         ) 
        
        
        
          y 
         
         
         
           k 
          
         
           + 
          
         
           1 
          
         
        
       
         = 
        
       
         P 
        
       
         r 
        
       
         o 
        
        
        
          x 
         
         
          
          
            1 
           
          
            β 
           
          
          
          
            θ 
           
          
            2 
           
          
         
        
       
         [ 
        
        
        
          y 
         
        
          k 
         
        
       
         + 
        
        
        
          ρ 
         
        
          β 
         
        
        
        
          B 
         
        
          T 
         
        
       
         ( 
        
       
         A 
        
        
        
          x 
         
         
         
           k 
          
         
           + 
          
         
           1 
          
         
        
       
         + 
        
       
         B 
        
        
        
          y 
         
        
          k 
         
        
       
         ? 
        
       
         c 
        
       
         + 
        
        
        
          1 
         
        
          ρ 
         
        
        
        
          λ 
         
        
          k 
         
        
       
         ) 
        
       
         ] 
        
       
         ; 
        
       
      
        (b)y^{k+1}=Prox_{\frac{1}{\beta}\theta_2}[y^k+\frac{\rho}{\beta}B^T(Ax^{k+1}+By^k-c+\frac{1}{\rho}\lambda^k)]; 
       
      
    (b)yk+1=Proxβ1?θ2??[yk+βρ?BT(Axk+1+Byk?c+ρ1?λk)];
  
     
      
       
       
         ( 
        
       
         c 
        
       
         ) 
        
        
        
          λ 
         
         
         
           k 
          
         
           + 
          
         
           1 
          
         
        
       
         = 
        
        
        
          λ 
         
        
          k 
         
        
       
         + 
        
       
         ρ 
        
       
         ( 
        
       
         A 
        
        
        
          x 
         
         
         
           k 
          
         
           + 
          
         
           1 
          
         
        
       
         + 
        
       
         B 
        
        
        
          y 
         
         
         
           k 
          
         
           + 
          
         
           1 
          
         
        
       
         ? 
        
       
         c 
        
       
         ) 
        
       
         . 
        
       
      
        (c)\lambda^{k+1}=\lambda^k+\rho(Ax^{k+1}+By^{k+1}-c). 
       
      
    (c)λk+1=λk+ρ(Axk+1+Byk+1?c).
本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。 如若内容造成侵权/违法违规/事实不符,请联系我的编程经验分享网邮箱:veading@qq.com进行投诉反馈,一经查实,立即删除!