Feature/float128
Summary
Add support for the __float128 floating point type to dune. Since the GNU quad-precision library provides only a C interface, and the C++ standard library offers only very limited support, some cmath and type_traits utilities are added to allow the use of quad-precision types in FieldVector and FieldMatrix. Following a discussion in !515 (closed), a wrapper Dune::Float128 around the intrinsic type is provided to have better control over overload resolution.
Motivation
Quad-precision floating point types are currently not supported natively by all compilers, let alone in hardware, but some compilers provide a library implementation that might already be sufficient. The GCC quad-precision math library (libquadmath) adds this feature and is supported in GCC and clang >= 3.9. A quad-precision type can also be enabled in the Intel icc compiler.
Intent to merge: 2018-06-25 (was 2018-05-25)
Activity
- dune/common/quadmath.hh 0 → 100644
    } // namespace Dune

    namespace std
    {
    #ifndef NO_STD_NUMERIC_LIMITS_SPECIALIZATION
      template <>
      class numeric_limits<Dune::Impl::Float128>
      {
        using Float128 = Dune::Impl::Float128;

      public:
        static const bool is_specialized = true;
        static Float128 min() { return FLT128_MIN; }
        static Float128 max() { return FLT128_MAX; }
        static Float128 lowest() { return -FLT128_MAX; }
        static const int digits = FLT128_MANT_DIG;

I think that the functions are declared noexcept since C++11: http://en.cppreference.com/w/cpp/header/limits

Is there any info on whether __float128 is, in fact, usable in a constant expression? Problem is: we're only allowed to mark a function constexpr if there is at least one possible combination of (template-)arguments where the function evaluation is a constant expression. And if __float128 is not usable in a constant expression, I don't think there is any function here (or in Float128) that can be marked constexpr.

I'd say yes: http://coliru.stacked-crooked.com/a/466039c24058c3b1 But I don't have any harder evidence.
I looked in the standard and could not find anything either. The standard mentions extended integer types (which would qualify for use in constant expressions), but it does not mention extended floating-point types.
Boost has a wrapper which they say is constexpr, no idea how well they thought that through though.
Of course, you could work around all this: make the wrapper a template class, and templatize it with the wrapped type. Then do a
    using Float128 = Float128Impl::Wrapper<__float128>;
Make sure all the freestanding math functions are templates too. Then you can argue: well, I could instantiate the template with some literal type LT, in which case the wrapper would really be a literal type. And for the math functions you can argue: well, I could provide an overload constexpr LT acosq(LT);, thus I can mark the template<class T> Wrapper<T> acos(Wrapper<T>); function template as constexpr. Then whether those things really produce constant expressions depends on whether the compiler considers the underlying expressions as constant. So you could get different results on different compilers.

I have marked as constexpr only those functions that depend on __float128 or Float128 only and do not call any C functions. So, I think, the class Float128 is a LiteralType and can thus be used as an argument in constexpr functions. Maybe it can even be simplified to an aggregate type by removing all the constructors.
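To make the templated-wrapper idea from this thread concrete, here is a minimal sketch. It is not the code from quadmath.hh: the names Sketch, Wrapper and abs are hypothetical, and double stands in for __float128 so that the example does not depend on compiler support for the quad type. It only illustrates that a wrapper with constexpr constructors and operators is usable in constant expressions when instantiated with a literal type.

```cpp
#include <cassert>

// Hypothetical names throughout; an illustration, not the quadmath.hh code.
namespace Sketch {

// Thin wrapper around an arithmetic type T.
template <class T>
class Wrapper
{
public:
  constexpr Wrapper() noexcept : value_(0) {}
  constexpr Wrapper(T value) noexcept : value_(value) {}

  constexpr T value() const noexcept { return value_; }

  friend constexpr Wrapper operator+(Wrapper a, Wrapper b) noexcept
  { return Wrapper(a.value_ + b.value_); }

  friend constexpr bool operator<(Wrapper a, Wrapper b) noexcept
  { return a.value_ < b.value_; }

private:
  T value_;
};

// A math function that needs no C library call, so it can be constexpr.
template <class T>
constexpr Wrapper<T> abs(Wrapper<T> x) noexcept
{
  return x < Wrapper<T>(T(0)) ? Wrapper<T>(-x.value()) : x;
}

} // namespace Sketch

// Assumption: double replaces __float128 here to keep the sketch portable.
using Real = Sketch::Wrapper<double>;

inline void constexprDemo()
{
  // Evaluated at compile time because Wrapper<double> is a literal type.
  static_assert(Sketch::abs(Real(-2.0)).value() == 2.0, "abs is constexpr");
  constexpr Real sum = Real(1.5) + Real(2.5);
  assert(sum.value() == 4.0);
}
```

Whether the same static_assert holds with T = __float128 depends on whether the compiler treats the underlying operations as constant expressions, which is exactly the open question raised above.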
- Resolved by Simon Praetorius
added 1 commit
- c8d57a2b - test of several cmath function for Float128 added
Another Dumux test fails for expressions like
    using Scalar = Dune::Float128; ... Scalar a; using std::max; max(a, 0.0);
because 0.0 is interpreted as double and the std::max function isn't triggered. What should we do about this? Force the user to write max(a, Scalar(0.0)); every time, or overload
    Dune::Float128 max(Dune::Float128 a, double b);
The latter seems more attractive to me.
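The failure can be reproduced without quad precision. The following sketch uses a hypothetical stand-in type Scalar (a small struct around double, not Dune::Float128) and a hypothetical helper positivePart to show why the mixed call is rejected and how the explicit conversion resolves it.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for Dune::Float128: implicitly constructible from double.
struct Scalar
{
  double value;
  Scalar(double v = 0.0) : value(v) {}
  friend bool operator<(const Scalar& a, const Scalar& b) { return a.value < b.value; }
};

// Generic code written for an arbitrary field type.
template <class Field>
Field positivePart(const Field& a)
{
  using std::max;
  // max(a, 0.0) does not compile for Field = Scalar: std::max takes two
  // `const T&` parameters, and T cannot be deduced as both Scalar and double.
  // Converting the literal explicitly makes both arguments the same type.
  return max(a, Field(0.0));
}
```

With this pattern the same function body works for double, for the stand-in type, and presumably for any field type that is constructible from a double literal.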
Same for pow. I added a merge request in simon.praetorius.

That max and min fail for different floating point types is the correct behavior of std::max and std::min, respectively, since they are defined for exactly the same type. Otherwise, they could not be implemented with a reference return type, since a cast into a common_type must be performed in the ?: operator.

min() and max() don't work for mixed types, not even float mixed with double. Don't bother trying to implement it. I'm afraid users will have to use Scalar(0) as an argument.

Mixed type min/max is possible to implement, but then it is different from std::min/std::max, and in several places in Dune std::min/std::max is called explicitly. So, does it make sense to provide overloads for mixed types in the Dune::Impl namespace, to be found only if std:: is not added to the function call?

For pow it is different, since there the general signature pow(Arithmetic1,Arithmetic2) is allowed for std::pow, so it could also be implemented for Float128, and by default it casts the arguments to __float128.

pow() is another story. That is useful to have with mixed data types.

The difference is: math functions like pow() are specified to have overloads for all kinds of combinations and automatically promote their arguments, see here, case (7). min() and max() on the other hand take their arguments by reference and do not promote them. The "taking by reference" part has created all kinds of problems. Yes, we do overload them for some of the simd types, because for those the result can be a mix of both arguments, and thus the default implementation supplied by the standard library is insufficient (and would not even compile). But I'll readily admit that providing overloads for them is questionable.

As a workaround for the min/max problem: we introduce the cmath functions fmin, fmax, and these allow mixed types. Arguments are passed by value and the return type is the promoted type. So, if you call std::min/std::max you are required to have the same types, but if you do not want to bother, call fmin/fmax.

I updated the merge request in simon.praetorius, throwing out min/max and generalizing pow to hopefully the right thing.

added 1 commit
- 944cbe70 - added mixed type arguments for binary cmath functions
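The distinction drawn in this thread, promotion by value versus same-type references, can be sketched with standard types only. The function mixedMax below is a hypothetical illustration of the by-value, promoted-return-type idea; it is not the fmax overload added in the commit, and unlike a real fmax it does not treat NaN arguments specially.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

namespace Sketch {

// Hypothetical mixed-type maximum: arguments are taken by value and
// the result is returned in the common (promoted) type.
// Assumption: NaN handling of the real fmax is deliberately left out.
template <class A, class B>
typename std::common_type<A, B>::type mixedMax(A a, B b)
{
  using R = typename std::common_type<A, B>::type;
  const R ra = static_cast<R>(a);
  const R rb = static_cast<R>(b);
  return ra < rb ? rb : ra;
}

inline void promotionDemo()
{
  // std::pow promotes mixed arithmetic arguments: (float, int) yields double.
  static_assert(std::is_same<decltype(std::pow(2.0f, 3)), double>::value,
                "pow promotes its arguments");

  // The sketch behaves the same way: (float, double) yields double.
  static_assert(std::is_same<decltype(mixedMax(1.0f, 2.0)), double>::value,
                "mixedMax returns the common type");

  // std::max(1.0f, 2.0) would not compile: both arguments must have one type T.
  assert(mixedMax(1.0f, 2.0) == 2.0);
}

} // namespace Sketch
```

Returning by value is what makes the promotion possible here: a reference to one of the original arguments cannot have the promoted type.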
> because 0.0 is interpreted as double and the std::max function isn't triggered. What should we do about this? Enforcing the user to write max(a, Scalar(0.0));
This explicit instantiation is necessary in many cases. It is the recommended way to write generic operators. If you look into ISTL, this is one of the changes we had to introduce to make other field types work.
I think this should be OK, as the user seldom writes a new operator (or local operator), and if someone later wants to use it with something like Dune::Float128 it is very straightforward to fix such errors.

I'm not in favor of additional overloads, as it is not clear which ones you will need. Also, many users will write std::max(a,0.0), so you will have to change the code anyway.

added 1 commit
- 0bdd80e4 - cleanup of Float128 class and removed int-assignment test in fvectortest
- Resolved by Simon Praetorius
- Resolved by Simon Praetorius
- Resolved by Simon Praetorius
added 1 commit
- 682a0b11 - marked quadmath functions inline and noexcept
@core Any objections? Otherwise I intend to merge on 2018-06-25 (was 2018-05-25). (Edit: fixed mistyped date, thanks @markus.blatt)
Edited by Jö Fahlke
assigned to @joe
@joe may I borrow your time machine sometime? Might come in handy.
mentioned in commit fbe0ac34
mentioned in issue #124 (closed)
Fantastic! Thank you, @simon.praetorius and @joe!