Commit 0bb1e69d authored by Jö Fahlke

[TLIME] Relax convergence criterion.

The previous convergence limit of 1e-11 was determined experimentally.  It
worked for many BLAS implementations and architectures.  With OpenBLAS on
Skylake, however, the residual norm apparently would not go below ~1e-10, so
convergence was never achieved.  Even on non-Skylake machines, the residual
norm would dip below 1e-11 only briefly and rise above it again if the
iteration continued.

We believe that this is due to OpenBLAS selecting -- at runtime -- some
Skylake-specific algorithm that orders its operations differently, which in
turn leads to differences in numerical cancellation.  We have, however, not
verified this conclusively, nor have we identified precisely which BLAS
algorithm is causing this.
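
As a generic illustration of how operation ordering alone can change a
floating-point result (this is not the BLAS code path in question, which
remains unidentified), consider summing the same three values in two
different orders.  The function names and the input values are made up for
this sketch, and it assumes IEEE 754 double arithmetic without
value-changing compiler options such as `-ffast-math`.

```cpp
// Illustration only: the same three numbers, summed in two different orders.
// Floating-point addition is not associative, so the results may differ.

// Adds left to right: (v[0] + v[1]) + v[2].
double sum_forward(const double (&v)[3])
{
  return (v[0] + v[1]) + v[2];
}

// Adds right to left: v[0] + (v[1] + v[2]).
double sum_backward(const double (&v)[3])
{
  return v[0] + (v[1] + v[2]);
}

// Hypothetical input: two large values that cancel, plus a small one.
// In sum_backward the small value is absorbed by rounding before the
// large values cancel; in sum_forward it survives.
const double example_values[3] = {1e16, -1e16, 1.0};
```

An algorithm that reorders its additions in a similar way could plausibly
shift the attainable residual norm by a small amount, which would be
consistent with the behaviour described above.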

This patch raises the convergence limit to
`sqrt(numeric_limits<field_type>::epsilon())`.  This limit has no theoretical
justification -- it was selected because it usually works as a convergence
limit for other (completely unrelated) algorithms, and because it works for
both Skylake and other architectures (AMD Epyc) in this particular case.
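
A minimal sketch of what such a limit looks like in code follows.  The
helper names `relaxed_limit` and `converged` are hypothetical and not taken
from the patch, and the use of `<=` in the comparison is an assumption.  For
`double`, the limit evaluates to roughly 1.5e-8, well above the ~1e-10
residual floor observed on Skylake.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper (not from the patch): the relaxed convergence limit
// sqrt(epsilon) for a given floating-point field type.
template <class field_type>
field_type relaxed_limit()
{
  using std::sqrt;
  return sqrt(std::numeric_limits<field_type>::epsilon());
}

// Hypothetical check: true once the residual norm is at or below the limit.
// Whether the real code compares with <= or < is an assumption here.
template <class field_type>
bool converged(field_type residual_norm)
{
  return residual_norm <= relaxed_limit<field_type>();
}
```

With this limit, a residual norm that stalls at about 1e-10 counts as
converged, whereas it never passed the old 1e-11 threshold.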

Developed together with Sebastian Westerheide.

Fixes: #48.
parent 8e042356
1 merge request: !221 [TLIME] Relax convergence criterion.
Pipeline #10496 passed