    [TLIME] Relax convergence criterion. · 0bb1e69d
    Jö Fahlke authored
    The previous convergence limit of 1e-11 was determined experimentally.
    It worked for many BLAS implementations and architectures.  However, with
    OpenBLAS on Skylake the residual norm apparently would not go below ~1e-10,
    so convergence was never achieved.  In fact, even on non-Skylake
    architectures the residual norm would rise above 1e-11 again after briefly
    dipping below it when iterating further.
    
    We believe that this is due to OpenBLAS selecting -- at runtime -- some
    Skylake-specific algorithm that orders operations differently, which in
    turn leads to differences in numerical cancellation.  We have not, however,
    verified this conclusively, nor have we identified precisely which BLAS
    algorithm is causing this.
    
    This patch raises the convergence limit to
    `sqrt(numeric_limits<field_type>::epsilon())`.  This limit has no theoretical
    justification -- it was selected because it usually works as a convergence
    limit for other (completely unrelated) algorithms, and because it works for
    both Skylake and other architectures (AMD EPYC) in this particular case.
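
For illustration only, the relaxed limit could be computed and checked roughly as sketched below. The helper names `relaxed_tolerance` and `has_converged` are hypothetical and do not appear in the patched code; only the expression `sqrt(numeric_limits<field_type>::epsilon())` is taken from this patch. For `double` the limit evaluates to about 1.5e-8, which lies well above the ~1e-10 floor observed on Skylake.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper (not part of the patch): the relaxed limit,
// sqrt(machine epsilon) of the field type.
template <typename field_type>
field_type relaxed_tolerance() {
    return std::sqrt(std::numeric_limits<field_type>::epsilon());
}

// Hypothetical helper (not part of the patch): convergence check on a
// residual norm, which is assumed to be non-negative.
template <typename field_type>
bool has_converged(field_type residual_norm) {
    assert(residual_norm >= field_type(0));
    return residual_norm <= relaxed_tolerance<field_type>();
}
```

With these assumptions, a residual norm that stagnates around 1e-10 would satisfy `has_converged<double>`, whereas it never satisfied the old fixed limit of 1e-11.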
    
    Developed together with Sebastian Westerheide.
    
    Fixes: #48.