  • [!221] [TLIME] Relax convergence criterion. · ae21be67
    Jö Fahlke authored
    Merge branch 'fix-tlime-residual-limit' into 'master'
    
    The previous convergence limit, 1e-11, was originally determined
    experimentally. It worked for many BLAS implementations and architectures.
    However, with OpenBLAS on Skylake the residual norm apparently would not go
    below ~1e-10, so convergence was never achieved. In fact, even on non-Skylake
    machines the residual norm would rise above 1e-11 again after briefly dipping
    below it when iterating further.
    
    We believe that this is due to OpenBLAS selecting -- at runtime -- some
    Skylake-specific algorithm that orders operations differently, which in
    turn leads to differences in numerical cancellation. However, we have not
    verified this conclusively, nor have we identified precisely which BLAS
    algorithm is causing this.
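
As a general illustration of the suspected mechanism (not a reproduction of what OpenBLAS does, which remains unverified), the sketch below sums the same three values in two different orders. The helper names `sumForward` and `sumReverse` are hypothetical and exist only for this example.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper: sum the values front to back.
double sumForward(const std::vector<double>& v)
{
  return std::accumulate(v.begin(), v.end(), 0.0);
}

// Hypothetical helper: sum the same values back to front.
double sumReverse(const std::vector<double>& v)
{
  return std::accumulate(v.rbegin(), v.rend(), 0.0);
}

// Example input with strong cancellation: the two large terms cancel
// exactly only if they are added to each other before the small term.
std::vector<double> cancellationExample()
{
  return {1e16, -1e16, 1.0};
}
```

With IEEE double arithmetic and no fast-math style reordering by the compiler, `sumForward(cancellationExample())` gives 1.0 while `sumReverse(cancellationExample())` gives 0.0, because the small term is lost to rounding when it is added to a large one first. A kernel that merely reorders its additions can therefore end up with a different residual.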
    
    This patch raises the convergence limit to
    `sqrt(numeric_limits<field_type>::epsilon())`, which is about 1.5e-8 for
    `double`. This limit has no theoretical justification -- it was selected
    because it usually works as a convergence limit for other (completely
    unrelated) algorithms, and because it works for both Skylake and other
    architectures (AMD EPYC) in this particular case.
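
A minimal sketch of how such a limit can be computed and checked, assuming a plain comparison of the residual norm against the limit. The function names `convergenceLimit` and `hasConverged` are hypothetical and are not taken from the TLIME code. The sketch also does not say whether the real check uses `<` or `<=`, or an absolute or a relative residual.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper, not the actual TLIME implementation: returns the
// relaxed convergence limit sqrt(epsilon) for a floating-point field type.
template <class field_type>
field_type convergenceLimit()
{
  using std::sqrt;
  return sqrt(std::numeric_limits<field_type>::epsilon());
}

// Hypothetical check (assumption: "less than or equal" comparison):
// report convergence once the residual norm reaches the limit.
template <class field_type>
bool hasConverged(field_type residualNorm)
{
  return residualNorm <= convergenceLimit<field_type>();
}
```

For `double`, a residual norm that stagnates around 1e-10, as reported above for OpenBLAS on Skylake, passes this check, whereas it never reached the old limit of 1e-11.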
    
    Developed together with Sebastian Westerheide.
    
    Fixes: #48.
    
    
    See merge request !221
    
    (cherry picked from commit 9da2de98)
    
    0bb1e69d [TLIME] Relax convergence criterion.