[draft][AD] Add some automated differentiation utilities
This provides the ADValue<T,maxOrder,dim> class as a primitive
for automatic differentiation. It encapsulates a scalar type
T (e.g. T=double) and provides arithmetic operations.
Each ADValue is considered an intermediate result in
the evaluation of a scalar dim-variate function. It stores
not only the value, but the jet of all derivatives up
to maxOrder. Currently only maxOrder<=2 is implemented.
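To illustrate what "storing the jet" means for maxOrder=2, here is a minimal sketch. It is not the ADValue implementation from this MR: the type `toy::Jet2` and the helpers `variable` and `demo` are hypothetical names made up for the illustration. Each intermediate result carries its value, gradient and Hessian, and every operation updates all three via the product or chain rule.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

namespace toy {

// Hypothetical toy type (NOT the ADValue of this MR): value, gradient and
// Hessian of an intermediate result with respect to dim input variables.
template<std::size_t dim>
struct Jet2
{
  double v = 0;                                  // value
  std::array<double, dim> g{};                   // 1st order partials
  std::array<std::array<double, dim>, dim> h{};  // 2nd order partials
};

// Seed the k-th input variable: value x, d x_k / d x_k = 1, all else zero.
template<std::size_t dim>
Jet2<dim> variable(double x, std::size_t k)
{
  assert(k < dim);
  Jet2<dim> r;
  r.v = x;
  r.g[k] = 1;
  return r;
}

// Product rule up to second order.
template<std::size_t dim>
Jet2<dim> operator*(const Jet2<dim>& a, const Jet2<dim>& b)
{
  Jet2<dim> r;
  r.v = a.v*b.v;
  for (std::size_t i=0; i<dim; ++i)
  {
    r.g[i] = a.g[i]*b.v + a.v*b.g[i];
    for (std::size_t j=0; j<dim; ++j)
      r.h[i][j] = a.h[i][j]*b.v + a.g[i]*b.g[j] + a.g[j]*b.g[i] + a.v*b.h[i][j];
  }
  return r;
}

// Chain rule up to second order for sin: f' = cos, f'' = -sin.
template<std::size_t dim>
Jet2<dim> sin(const Jet2<dim>& a)
{
  const double df = std::cos(a.v);
  const double ddf = -std::sin(a.v);
  Jet2<dim> r;
  r.v = std::sin(a.v);
  for (std::size_t i=0; i<dim; ++i)
  {
    r.g[i] = df*a.g[i];
    for (std::size_t j=0; j<dim; ++j)
      r.h[i][j] = ddf*a.g[i]*a.g[j] + df*a.h[i][j];
  }
  return r;
}

// Usage: y = sin(x0*x1) evaluated at (x0,x1) = (2,3).
inline Jet2<2> demo()
{
  auto x0 = variable<2>(2., 0);
  auto x1 = variable<2>(3., 1);
  return sin(x0*x1);
}

} // namespace toy
```

No tape is recorded here: derivatives are propagated alongside the value while the expression is evaluated, and the storage per value is fixed by dim at compile time.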
Additionally, this provides some convenience utilities
for using this ADValue:
- Functions for easy definition of AD-aware nonlinear differentiable functions.
- Overloads for a few basic functions (abs, sin, cos, log, exp, sqrt, pow).
- Some glue code to implement dune-functions differentiable
  functions (for a callback f simply use ADFunction<f>).
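The overloads in the list above are what allows a function to be written once and evaluated with plain scalars as well as AD types. A minimal sketch of that pattern follows, assuming a hypothetical first order type `toy::Dual`, which is not part of this MR. The `using std::sin;` declarations make the standard functions visible for double, while argument-dependent lookup picks the overloads next to the AD type.

```cpp
#include <cassert>
#include <cmath>

namespace toy {

// Hypothetical first order dual number, for illustration only.
struct Dual
{
  double v;  // value
  double d;  // derivative
};

inline Dual operator*(Dual a, Dual b) { return {a.v*b.v, a.d*b.v + a.v*b.d}; }
inline Dual operator+(Dual a, double c) { return {a.v + c, a.d}; }
inline Dual sin(Dual a) { return {std::sin(a.v), std::cos(a.v)*a.d}; }
inline Dual exp(Dual a) { const double e = std::exp(a.v); return {e, e*a.d}; }

} // namespace toy

// Written once: works for T=double and for any type that provides
// the operators and overloads of sin and exp in its own namespace.
template<class T>
T f(const T& x)
{
  using std::sin;
  using std::exp;
  return exp(sin(x*x) + 3.0);
}

// Usage: compare the AD result with the hand-written derivative
// f'(x) = f(x) * cos(x*x) * 2*x.
inline void compareWithManualDerivative()
{
  const double x = 0.5;
  const toy::Dual r = f(toy::Dual{x, 1.0});  // seed dx/dx = 1
  assert(std::abs(r.v - f(x)) < 1e-12);
  assert(std::abs(r.d - f(x)*std::cos(x*x)*2*x) < 1e-12);
}
```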
Usage example:
using std::sin;
using std::exp;
using std::pow;
using namespace Dune::Indices;
// Define and initialize function arguments of a tri-variate
// twice differentiable function.
auto x0 = ADValue<double,2,3>(23., 0);
auto x1 = ADValue<double,2,3>(42., 1);
auto x2 = ADValue<double,2,3>(13., 2);
// Evaluate expression
auto y = pow(2, exp(sin(x0*x1)*sin(x2) + 3));
// Extract value of function, 1st and 2nd order partial derivatives
y.partial();     // value
y.partial(i);    // 1st order partial derivative
y.partial(i,j);  // 2nd order partial derivative
Why propose this despite the fact we have AdolC bindings?
- Disadvantages of AdolC tape-based mode:
  - This relies on global variables and is not thread safe.
  - It is much slower. For simple expressions I measured that
    it is between 10 and 100 times slower than ADValue. When using
    AD for the derivatives of the energy with a Newton method for the
    minimal surface equation, assembly takes about 20 times as long
    with AdolC compared to ADValue. The latter is essentially as fast
    as with manually implemented derivatives and moreover allows for
    multi-threading (not used in the comparison).
- AFAIK ADValue can be characterized as tapeless forward mode Taylor polynomial AD.
- AdolC also has a tape-less forward mode, which is probably
  similar to ADValue but has some conceptual restrictions:
  - Only 1st order derivatives are available, so this can't be used for Newton's method.
  - You have to decide in advance on the maximal domain dimension
    and set it using a global variable. This influences the memory
    consumption of all AD-values stored later on.
  - There's also no guarantee about thread-safety.
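For reference, the reason 1st order derivatives are not sufficient: when minimizing an energy E, a Newton step needs both the gradient and the Hessian of E. The symbols u_k and d_k are just notation chosen for this note.

```latex
\nabla^2 E(u_k)\, d_k = -\nabla E(u_k), \qquad u_{k+1} = u_k + d_k
```

If E is evaluated with AD, the gradient corresponds to the 1st order partial derivatives and the Hessian to the 2nd order ones, which is why maxOrder=2 is needed here.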
Edited by Carsten Gräser