[draft][AD] Add some automated differentiation utilities (!241) · Merge requests · fufem / dune-fufem

Carsten Gräser requested to merge feature/forward-ad into master Oct 21, 2024

This provides the ADValue<T,maxOrder,dim> class as a primitive for automatic differentiation. It encapsulates a scalar type T (e.g. T=double) and provides arithmetic operations. Each ADValue is considered an intermediate result in the evaluation of a scalar dim-variate function. It stores not only the value but the jet of all derivatives up to maxOrder. Currently only maxOrder<=2 is implemented. Additionally, this provides some convenience utilities for using ADValue:

- Functions for easy definition of AD-aware nonlinear differentiable functions.
- Overloads for a few basic functions (abs, sin, cos, log, exp, sqrt, pow).
- Some glue code to implement dune-functions differentiable functions (for a callback f simply use ADFunction<f>).
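For orientation, the rule that an overload of a univariate function g has to implement for maxOrder=2 is the second-order chain rule. This is standard calculus written in generic notation (u is the incoming intermediate result, y = g(u) the outgoing one); it is not copied from the code in this merge request:

```math
\partial_i y = g'(u)\,\partial_i u,
\qquad
\partial_i \partial_j y = g''(u)\,\partial_i u\,\partial_j u + g'(u)\,\partial_i \partial_j u
```

For g = sin, for example, this gives \partial_i y = \cos(u)\,\partial_i u and \partial_i \partial_j y = -\sin(u)\,\partial_i u\,\partial_j u + \cos(u)\,\partial_i \partial_j u.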

Usage example:

using std::sin;
using std::exp;
using std::pow;
using namespace Dune::Indices;

// Define and initialize the function arguments of a tri-variate
// twice differentiable function.
auto x0 = ADValue<double,2,3>(23., 0);
auto x1 = ADValue<double,2,3>(42., 1);
auto x2 = ADValue<double,2,3>(13., 2);

// Evaluate expression
auto y = pow(2, exp(sin(x0*x1)*sin(x2) + 3));

// Extract value of function, 1st and 2nd order partial derivatives
// (i, j: indices of the independent variables)
y.partial();
y.partial(i);
y.partial(i, j);
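To illustrate what such an evaluation does internally, here is a minimal self-contained sketch of a second-order forward-mode jet type. Everything in it (the namespace `sketch`, `Jet2`, `variable`, `chain`, `example`) is hypothetical and chosen for illustration only; it is not the ADValue implementation from this merge request and supports just `*`, `sin` and `exp`.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

namespace sketch {

// Hypothetical stand-in for ADValue<double,2,dim>: value, gradient and
// Hessian of an intermediate result with respect to dim variables.
template <std::size_t dim>
struct Jet2 {
  double v = 0.0;                                 // value
  std::array<double, dim> g{};                    // 1st order partials
  std::array<std::array<double, dim>, dim> h{};   // 2nd order partials
};

// Seed the i-th independent variable: value x, d/dx_i = 1, Hessian zero.
template <std::size_t dim>
Jet2<dim> variable(double x, std::size_t i) {
  assert(i < dim);
  Jet2<dim> r;
  r.v = x;
  r.g[i] = 1.0;
  return r;
}

// Second-order chain rule for y = f(u), given f, f' and f'' at u.v.
template <std::size_t dim>
Jet2<dim> chain(const Jet2<dim>& u, double f, double df, double ddf) {
  Jet2<dim> r;
  r.v = f;
  for (std::size_t i = 0; i < dim; ++i) {
    r.g[i] = df * u.g[i];
    for (std::size_t j = 0; j < dim; ++j)
      r.h[i][j] = ddf * u.g[i] * u.g[j] + df * u.h[i][j];
  }
  return r;
}

// Second-order product rule.
template <std::size_t dim>
Jet2<dim> operator*(const Jet2<dim>& a, const Jet2<dim>& b) {
  Jet2<dim> r;
  r.v = a.v * b.v;
  for (std::size_t i = 0; i < dim; ++i) {
    r.g[i] = a.g[i] * b.v + a.v * b.g[i];
    for (std::size_t j = 0; j < dim; ++j)
      r.h[i][j] = a.h[i][j] * b.v + a.g[i] * b.g[j]
                + a.g[j] * b.g[i] + a.v * b.h[i][j];
  }
  return r;
}

template <std::size_t dim>
Jet2<dim> sin(const Jet2<dim>& u) {
  return chain(u, std::sin(u.v), std::cos(u.v), -std::sin(u.v));
}

template <std::size_t dim>
Jet2<dim> exp(const Jet2<dim>& u) {
  const double e = std::exp(u.v);
  return chain(u, e, e, e);
}

// Evaluate y(x0, x1) = exp(sin(x0 * x1)) together with all partial
// derivatives up to order two at the point (a, b).
inline Jet2<2> example(double a, double b) {
  auto x0 = variable<2>(a, 0);
  auto x1 = variable<2>(b, 1);
  return exp(sin(x0 * x1));
}

} // namespace sketch
```

Calling `sketch::example(0.5, 2.0)` returns the value in `v`, the gradient in `g` and the Hessian in `h`. No tape and no global state are involved, since every intermediate result is a plain value.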

Why propose this despite the fact we have AdolC bindings?

Disadvantages of AdolC tape-based mode:
- This relies on global variables and is not thread safe.
- It is much slower. For simple expressions I measured it to be between 10 and 100 times slower than ADValue. When using AD for the derivatives of the energy in a Newton method for the minimal surface equation, assembly takes about 20 times as long with AdolC as with ADValue. The latter is essentially as fast as manually implemented derivatives and furthermore allows for multi-threading (not used in the comparison).
AFAIK ADValue can be characterized as tapeless forward-mode Taylor polynomial AD.
AdolC also has a tapeless forward mode, which is probably similar to ADValue but has some conceptual restrictions:
- Only 1st order derivatives are available, so this can't be used for Newton's method.
- You have to decide in advance on the maximal domain dimension and set it using a ~~macro~~ global variable. This influences the memory consumption of all AD-values stored later on.
- There's also no guarantee about thread-safety.
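The point about Newton's method can be made concrete with a small sketch: minimizing an energy J requires both J' and J'', so a first-order-only AD type is not sufficient. The code below uses a hypothetical 1D second-order jet (`sketch::Jet`, `energy`, `newtonMinimize` are illustrative names, not part of this merge request or of AdolC) and a toy energy J(x) = exp(x) - 2x instead of the minimal surface energy.

```cpp
#include <cassert>
#include <cmath>

namespace sketch {

// Hypothetical 1D second-order jet: value, 1st and 2nd derivative.
struct Jet {
  double v, d1, d2;
};

inline Jet variable(double x) { return {x, 1.0, 0.0}; }
inline Jet constant(double c) { return {c, 0.0, 0.0}; }

inline Jet operator-(const Jet& a, const Jet& b) {
  return {a.v - b.v, a.d1 - b.d1, a.d2 - b.d2};
}

// Second-order product rule.
inline Jet operator*(const Jet& a, const Jet& b) {
  return {a.v * b.v,
          a.d1 * b.v + a.v * b.d1,
          a.d2 * b.v + 2.0 * a.d1 * b.d1 + a.v * b.d2};
}

// Second-order chain rule for exp.
inline Jet exp(const Jet& u) {
  const double e = std::exp(u.v);
  return {e, e * u.d1, e * (u.d1 * u.d1 + u.d2)};
}

// Toy energy J(x) = exp(x) - 2x with minimizer x = log(2).
inline Jet energy(const Jet& x) {
  return exp(x) - constant(2.0) * x;
}

// Newton's method for J'(x) = 0; both J' and J'' come from one
// evaluation of the energy on a second-order jet.
inline double newtonMinimize(double x, int maxIter = 50, double tol = 1e-12) {
  assert(maxIter > 0);
  for (int k = 0; k < maxIter; ++k) {
    const Jet J = energy(variable(x));
    if (std::abs(J.d1) < tol)
      break;
    x -= J.d1 / J.d2;
  }
  return x;
}

} // namespace sketch
```

Starting from `sketch::newtonMinimize(0.0)`, the iteration converges to the minimizer log(2). With only first-order derivatives the update `J.d1 / J.d2` could not be formed from a single AD evaluation.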

Edited Oct 23, 2024 by Carsten Gräser
