Conceptually ISTL uses multi-indices, but there is no direct support for them in dune-istl.
On the other hand, most discretization modules use multi-index objects internally to handle nested vectors and matrices. In all of these modules we find, in some form, additional helper functions that allow accessing or modifying ISTL containers based on multi-indices or partial multi-indices.
I discussed this with @simon.praetorius and tried to collect some functions that might be helpful. Including such functions directly in dune-istl should significantly simplify the implementation of backends, e.g. in PDELab, dune-functions or Amdis. FEM currently has very limited support for nested containers; on the other hand, if the burden of writing a backend vanishes, this might perhaps change?!
This issue should act as a collection of ideas for such a multi-index interface. I start with ideas for the vectors; matrices are a bit more complex and should follow later:
we need a proper definition of a multi-index concept.
for convenience we might want to add an operator[](MultiIndex) or some free-standing access function like accessVectorEntry(Vector,MultiIndex) (see the sketch after this list).
functions to perform operations on a set of MultiIndex objects. This allows, for example, to
gather/scatter entries into a communication buffer
compute a parallel scalar product only on entries owned by the current rank
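A minimal sketch of what such a free-standing access function could look like for the simple case of uniformly nested containers (the name accessVectorEntry is taken from the point above; the recursion shown here is an assumption, not existing dune-istl code). Multi-type containers would additionally need a dynamic-to-static dispatch, see the discussion further down:

    #include <cstddef>

    // Sketch only: resolve a dynamic multi-index on a uniformly nested vector type
    // by plain recursion; stop as soon as the block cannot be indexed any further
    // (i.e. a scalar entry is reached). Assumes the multi-index length matches the
    // nesting depth of the container.
    template <class Vector, class MultiIndex>
    decltype(auto) accessVectorEntry(Vector& vector, const MultiIndex& mi, std::size_t pos = 0)
    {
      if constexpr (requires { vector[std::size_t(0)]; })
        return accessVectorEntry(vector[mi[pos]], mi, pos + 1);
      else
        return vector;  // scalar leaf reached
    }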
For the record: In dune-fufem we also have utilities for accessing nested istl-matrices using non-uniform multi-indices similar to the one for vectors in dune-functions. Most of the problems originate from using multi-type containers and corresponding dynamic non-uniform multi-indices:
Accessing multi-type containers with dynamic indices requires some dynamic to static dispatch (which comes at a runtime cost).
Any method/utility providing entry access by returning references to entries requires that the return type (i.e. the fully resolved entry type) is uniform (no mixed real/complex entries).
Access with non-uniform multi-indices cannot generically be mapped to istl-matrices because the latter always require a row and a column index. Hence one needs to provide some additional information to do this. In dune-fufem this is solved by tagging matrices using SingleRowMatrix and SingleColumnMatrix wrappers.
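To illustrate the tagging idea (a strongly simplified sketch, not the actual dune-fufem classes): the wrapper encodes whether a lone index addresses the column of a single-row matrix or the row of a single-column matrix, so the index resolution knows how to descend.

    #include <cstddef>

    // Simplified sketch of the tagging idea (not the real dune-fufem wrappers):
    // a single remaining index selects the column of the one and only row ...
    template <class Matrix>
    struct SingleRowMatrix
    {
      Matrix matrix;
      decltype(auto) operator[](std::size_t col) { return matrix[0][col]; }
    };

    // ... or the row of the one and only column.
    template <class Matrix>
    struct SingleColumnMatrix
    {
      Matrix matrix;
      decltype(auto) operator[](std::size_t row) { return matrix[row][0]; }
    };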
These are all important issues. Thanks for bringing these up here.
Having some specific implementations, e.g. of multi-index generation, in the downstream discretization modules is still needed and cannot be extracted or generalized. We discussed that there is still code duplication, for which we might find the common part that could be useful in the core module dune-istl.
One aspect is the multi-index access. It has intrinsic issues as you mentioned and it is hard to reimplement this in your own module. Maybe this is independent enough to be extracted into dune-istl.
Due to the issue of the required common element type, we thought that it would be better in some situations to have a "for-each" or "apply" function that does not return the entry but passes the container block to a generic functor handed to the algorithm. Maybe this is easier to implement (?). Still, we need to guard against invalid accesses and provide a proper recursion stop condition for the hierarchic traversal, but there are several ways to do that.
Just a sketch:
    template <class Indices, class Container, class F>
    void forEachIndex(Indices const& indices, Container&& container, F&& f)
    {
      for (auto const& index : indices)
        applyAtIndex(index, container, f);
    }

    template <class MultiIndex, class Block, class F, std::size_t i = 0>
    void applyAtIndex(MultiIndex const& mi, Block&& block, F&& f, index_constant<i> = {})
    {
      if constexpr (i < mi.max_size()) {
        if (i < mi.size()) {
          if constexpr (hasDynamicIndexAccess<Block>) {
            // access block directly
            applyAtIndex(mi, block[mi[i]], f, index_constant<i+1>{});
          }
          else if constexpr (hasStaticIndexAccess<Block>) {
            // access block by searching for the static index equal to mi[i]
            using Size = decltype(Hybrid::size(block));
            return Hybrid::switchCases(std::make_index_sequence<Size::value>{}, mi[i],
              [&](auto ii) { applyAtIndex(mi, block[ii], f, index_constant<i+1>{}); });
          }
          else {
            // leaf entry in the container
            f(mi, block);
          }
        }
        else {
          // end of multi-index
          f(mi, block);
        }
      }
      else {
        // end of multi-index (static condition)
        f(mi, block);
      }
    }
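A hypothetical usage of this sketch, assuming it is available in scope (the container and index types are just examples):

    #include <array>
    #include <cstddef>
    #include <vector>
    #include <dune/common/fvector.hh>
    #include <dune/istl/bvector.hh>

    // Visit three selected entries of a blocked vector and add 1 to each of them.
    void example()
    {
      Dune::BlockVector<Dune::FieldVector<double,2>> x(10);
      x = 0.0;
      std::vector<std::array<std::size_t,2>> indices = {{0,0}, {3,1}, {7,0}};

      forEachIndex(indices, x, [](auto const& /*multiIndex*/, auto& entry) {
        entry += 1.0;
      });
    }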
The matrix part is difficult, probably not possible in a fully generic way without anything extra on the matrix side.
Comparing those approaches, @simon.praetorius's seems to be very similar to what we do in dune-fufem, with a few differences. The most important one to me is that dune-fufem allows customizing the termination criterion of the recursion, which is needed to treat some special cases:
Historically it was very common to use FieldMatrix<K,1,1> as a scalar (because for a long time you could not use BCRSMatrix<K> directly), which mostly works. Unfortunately, there is no cast to non-const K&, such that the "is K or casts to K" termination criterion does not work.
Similarly, there's no cast of ScaledIdentityMatrix<K,1> to K& and const K&.
When assembling into something like OuterMatrix<... ScaledIdentityMatrix<K,n> ...> it's hard to decide from a dynamic multi-index whether entries exist. To avoid this, one can implement some addToEntry(...) utility that ignores non-existing entries. This again requires parametrizing the index resolution with a visitor and some custom termination criterion. A prototype (that doesn't fully convince me so far) is implemented here; a rough sketch of the idea is given below.
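A rough sketch of what such a customizable termination criterion could look like (names are made up here; the actual dune-fufem prototype works differently):

    #include <type_traits>
    #include <dune/common/fmatrix.hh>

    // The hierarchic traversal stops descending whenever IsScalarBlock<Block> is
    // true (Block passed without cv/ref qualifiers), instead of relying on
    // "is K or casts to K".
    template <class Block>
    struct IsScalarBlock : std::is_arithmetic<Block> {};

    // Treat 1x1 field matrices as scalars, even though they do not cast to K&.
    template <class K>
    struct IsScalarBlock<Dune::FieldMatrix<K,1,1>> : std::true_type {};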
In contrast, @santiago.ospina's approach seems to not provide access with dynamic multi-indices so far.
In contrast, @santiago.ospina's approach seems to not provide access with dynamic multi-indices so far.
Not supporting dynamic multi-indices with multi-type containers is a deliberate decision: the code itself is trivial, but I regard it as an anti-pattern, especially for index accessors. To avoid the run-time penalty, I provide several utility functions that traverse the container tree with hybrid multi-indices. When this is not sufficient, it forces users to hard-code the dynamic dispatch themselves and makes the run-time penalty obvious.
I think all these remarks back my original statement ... it would be helpful to include this directly in dune-istl. I further think we should try to identify the necessary use cases in the different modules.
@santiago.ospina Interestingly, I'm of the opposite opinion with respect to most of what you said:
In contrast to access by appropriate static/hybrid multi-indices, I consider access by dynamic ones to be the non-trivial case. Especially for index accessors.
For me the dynamic to static dispatch at runtime is as much an anti-pattern as using std::variant with std::visit.
In contrast to FE ansatz trees, traversal of matrix/vector containers with multi-indices is something that I have never needed so far.
To me it's pointless to provide a generic functionality if it only covers the trivial case that every user can implement herself but leaves out the complicated case that most will struggle to get right.
(edit: Removed second part of my comment, because it's a reply to another post)
@carsten.graeser Yes, thinking about it more, you are probably right with respect to the difficulty on the implementation side. If I understand it correctly, it is mostly because it requires inside knowledge of the containers and their contained types. So I retract my statement about it.
Perhaps it's worth rewinding and seeing why I have a position that is diametrically opposed to yours: I created dune-assembler, which is essentially a re-implementation of PDELab. From the very beginning, the design envisioned nested and heterogeneous containers. In particular, I tried to be very careful that most of the components allow unit safety, which is a more restrictive condition than heterogeneous containers. Thus, virtually everything has been handled with hybrid multi-indices from day zero, and as a result, I never had to fall back to dynamic multi-index container access, neither for DOFs nor for communication operations.
So, despite having a functional, complex finite element module that handles heterogeneous containers, I don't find dynamic multi-index access difficult because I don't have it in the first place. On the other hand, since I consider hybrid multi-indices a necessary condition for using heterogeneous containers, what I find difficult is the generation of such hybrid multi-indices. And this is where most of our disagreement comes from: I do not think that dynamic multi-index access is the right approach for our applications, whereas you treat it as a fundamental part of your machinery.
So, to wrap up, I think that your arguments make sense as long as one can show the need for the dynamic multi-index access. That is your case, and is totally opposite to mine.
I do see the value of having dynamic dispatch for code like @simon.praetorius's and @carsten.graeser's. It makes sense. But, as a hard-core hybrid multi-index believer, I find it very difficult to parse functions that may unintentionally trigger the dynamic dispatch (i.e. MultiTypeContainer::operator[](std::size_t)). It is simply too easy to forget and incur a run-time penalty when it is not needed. And the problem with this is that it will compile and go unnoticed for a very long time (e.g. staging/dune-typetree!106 (merged)).
So, in a generic module like dune-istl, I would like to propose to have different function names/operators for the two different approaches.
Why just dune-ISTL? I think this belongs in dune-common, as many matrix types are already there. Further, adding ISTL as a dependency to dune-typetree just for getting multi-indices doesn't sound right, does it?
So, to wrap up, I think that your arguments make sense as long as one can show the need for the dynamic multi-index access. That is your case, and is totally opposite to mine.
If this is really needed, I can repeat the obvious: the reason why dynamic multi-index access is necessary is that many people use it. Dune-functions, and thus also dune-fufem and Amdis, rely on it. As far as I remember, PDELab did, too.
Why dune-istl? Because dune-common doesn't know about nested vector/matrix types; consequently it doesn't use multi-indices.
dune-typetree... how did you get the impression that we want to introduce an additional dependency on dune-typetree? First, multi-indices are not defined in dune-typetree, and second, we don't want to enforce a particular multi-index implementation but build upon a documented interface.
But at least the TypeTree::HybridTreePath might possibly be a candidate for multi-indices. It is not implemented like this within dune-functions, but in another toolbox this might be the case. If I understand @santiago.ospina correctly, he uses the hybrid container to implement multi-indices. Nevertheless, if we can agree on a minimal interface for multi-indices, maybe we can adapt the tree-path implementation correspondingly. No need to add a dependency to dune-typetree, though.
Yes, since staging/dune-typetree!107 (merged) was merged, TypeTree::HybridTreePath is a full multi-index with respect to the MultiIndex concept given above. Dune::ReservedVector is not. This is a (sub-optimal) implementation of what would be missing.
while experimenting, I observed that we need methods to traverse different containers at once, e.g. to calculate a scalar product (see the sketch below).
This holds for the suggested forEachIndex, but also for other functions, like flatVectorForEach etc. One problem with the latter is that it's difficult to implement this in a backwards-compatible way.
The other general question is what to do when the nesting depths of the different containers differ.
One could, for example, think of a masked evaluation where the vector is BlockVector<FieldVector<double,k>>, but the bool vector has lower depth, e.g. std::vector<bool>. The reason is that one will often be in a situation where we have to consider the whole FieldVector or skip it completely. Opinions?
For different containers to work in a hierarchic traversal, we still need a static break condition. It could, for example, be the case that the two containers only share part of the hierarchy and one of them cannot be descended further; this should stop the iteration. However, my inner-product implementation linked above does not do this.
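A minimal sketch of such a pairwise traversal with a static break condition (the name forEachPair is an assumption, uniform containers only, sizes assumed to match): descend as long as both containers can still be indexed, and hand both blocks to the callback as soon as one of them cannot. With x = BlockVector<FieldVector<double,k>> and mask = std::vector<bool>, this gives exactly the masked evaluation per FieldVector block mentioned above.

    #include <cstddef>

    // Descend the common part of the two hierarchies; stop (statically) as soon
    // as one of the containers has no operator[] anymore and pass both blocks
    // to the callback. A scalar product or a masked reduction can be built on top.
    template <class X, class Y, class F>
    void forEachPair(const X& x, const Y& y, F&& f)
    {
      constexpr bool xNested = requires { x[std::size_t(0)]; };
      constexpr bool yNested = requires { y[std::size_t(0)]; };
      if constexpr (xNested && yNested) {
        for (std::size_t i = 0; i < x.size(); ++i)  // assumes matching sizes
          forEachPair(x[i], y[i], f);
      } else {
        f(x, y);  // at least one of the containers is a leaf here
      }
    }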
The questions brought up by @christi are indeed important. And we should IMHO first clarify 1. and 2.
Should include the question of whether we want to support dynamic non-uniform indices (for me the answer is yes).
... could ignore the question of a member function/operator for now. Having an external utility would be fine as a first step. However, the question of entry visitation in addition to plain access seems important to me (see above).
Before considering the question of member functions and 3. one should IMO try to generalize the concepts/utilities to matrices and think about the Single(Row|Column)Matrix approach that @simon.praetorius and I came up with in parallel.
in the communication case, I think we have to support non-uniform indices.
also in many other cases non-uniform indices make life easier, as this functionality would otherwise need to be provided by the discretization module or by the user.
I agree, a member function can easily be added later.
and finally yes, we also have to think about matrices, but I started with the vectors, as these are the ones where I currently have a better picture of what is needed.
I think it's important that the interfaces provided for matrices and vectors are consistent. Hence we should have the matrix case in mind from the beginning to avoid having to change the vector case later. To give an example: the vector-backend in dune-functions was invented with only vectors in mind and thus provides operator[] for multi-index access. In contrast, the later-developed corresponding matrix-backend in dune-fufem has to use operator() (at least until C++23, where operator[] may take several arguments).
What about adding one of our implementations of Single[Row|Col]Matrix to dune-istl? Additionally, an array-like container with static size and istl container interface would be nice (but this is not so important for now)
This may sound trivial, but there has to be an agreement on the direction of the multi-index. When reading indices left-to-right: is it Inner container index to Outer container index (IO), or Outer container index to Inner container index (OI)? This is important because:
Algorithms result in either a push or a pop, which for dynamic multi-indices may be more or less efficient depending on the direction:
Container access consumes multi-indices and is more efficient with the IO direction.
Container traversal produces multi-indices and is more efficient with the OI direction.
For dynamic multi-indices, a reversal of the indices may be very costly if performed very often.
If this is not addressed, code may seem to work until someone actually uses nested containers and finds out that there is a mismatch in the direction somewhere (flat containers are direction-agnostic, so most code will seem to work); a small illustration follows below.
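To illustrate the two conventions with a toy example (plain nested std::vector standing in for a blocked ISTL vector; not dune-istl code):

    #include <array>
    #include <cstddef>
    #include <vector>

    using Nested = std::vector<std::vector<double>>;
    using Index2 = std::array<std::size_t,2>;

    // OI (outer-to-inner): the first entry addresses the outer container.
    double getOI(const Nested& v, const Index2& mi) { return v[mi[0]][mi[1]]; }

    // IO (inner-to-outer): the outer index is the last entry, i.e. the
    // multi-index is consumed from the back.
    double getIO(const Nested& v, const Index2& mi) { return v[mi[1]][mi[0]]; }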
Different discretization modules have different conventions when they produce and consume their multi-indices:
| Module    | Direction | Type    |
|-----------|-----------|---------|
| Amdis     | OI        | Dynamic |
| Assembler | OI        | Hybrid  |
| DuMuX     | OI        | ?       |
| FuFem     | OI        | Dynamic |
| Functions | OI        | Dynamic |
| PDELab    | IO        | Dynamic |
Note: Feel free to modify this comment to add/modify entries to the table
This may sound trivial, but there has to be an agreement on the direction of the multi-index.
Anything other than outer-to-inner does not make sense to me. And I cannot imagine any reason why this should be less efficient (unless one would really remove consumed indices using a pop_front operation, but why would anyone want to do this).
BTW: Dune-fufem also uses outer first. The same is probably true for Amdis since it also relies on dune-functions. And my last information is that there's the plan that PDELab slowly moves into the direction of dune-functions.
I also think that OI is more intuitive and should be implementable without a performance penalty. As a first step we would have to decide on a multi-index interface anyway, and depending on the actual implementation a thin wrapper could then map to this interface. This does not only concern OI/IO, but also the question of pop vs. shift, which are mostly equivalent and can be handled via a small wrapper.
Discretisation modules tend to produce indices in a bottom-up approach, or in other words, they first produce inner container indices, then modify them, and finally proceed with the next outer container. This is also not very optimal for OI dynamic multi-indices as this needs a push_front (e.g. dune-functions).
This very much depends on the application. In many cases this just uses push_back(). Furthermore one could implement the other cases without push_front() within the existing dune-functions interface. However, I'd not try to do this unless someone shows that this really provides a measurable performance improvement.
[...] And my last information is that there's the plan that PDELab slowly moves into the direction of dune-functions.
That's partially one of the things I tried to do in dune-assembler: A wrapper of PDELab grid function spaces that satisfies the dune-functions interface, but the reversal of multi-indices in the localView.index(i) ended up being super costly! Now it's solved with another approach, but that's in principle why I am bringing up the multi-index direction question below. It has real performance implications.
I am very skeptical that PDELab will ever change the direction of the multi-indices because it propagates to all sorts of places in the module, but who knows...
@robert.kloefkorn, @andreas.dedner, @samuel.burbulla, @tkoch, @markus.blatt I want to fill this table regarding multi-index production on blocked containers for different frameworks. Do you have container blocking in your modules? If so, what direction of multi-index do you use?
We use container blocking in dumux, but currently in a very limited/fixed scope (one level of static blocking (multi-type vector/matrix) plus one level of dynamic blocking (blockvector/bcrs)). We thought about multi-indices a lot to simplify vector/matrix traversal and generalize assembly (support different blocking strategies). We mostly did not progress because it is hard to get a generic implementation right. So I really appreciate this initiative! For the multi-index direction, outer->inner seems like the natural choice to me; I haven't thought about the other direction.
When talking to @christi, we discussed that while we do not necessarily need a concrete MultiIndex type in dune-istl, we at least need a MultiIndex concept: how do we want to use the multi-indices? In the comments above, a few possibilities for traversing the multi-index components were already mentioned - by a pop() function, or by shifted indexing. A suggestion (up for discussion):
    template <class M>
    concept MultiIndex = requires(const M mi, std::size_t i)
    {
      requires std::integral<typename M::value_type>;
      { mi[i] }         -> std::convertible_to<typename M::value_type>;
      { mi.empty() }    -> std::convertible_to<bool>;
      { mi.size() }     -> std::convertible_to<std::size_t>;
      { mi.max_size() } -> std::convertible_to<std::size_t>;
      // must be constexpr
      typename std::integral_constant<std::size_t, mi.max_size()>;
    };
In case you have a balanced hierarchic container (the length of the multi-index is the same for all blocks), we can use a std::array-like container to represent the MultiIndex. Then max_size == size and we can implement a static recursion break condition. In the case of dynamic (unbalanced) hierarchic containers, max_size is either equal to the capacity and thus a final break condition, or it is something like std::numeric_limits<size_type>::max() and thus hardly ever used to break the recursion. See also the example above.
All std containers should have a constexpr max_size() method. Also ReservedVector does. HybridTreePath does not, but could easily be added.
Maybe we should instead ask for a conditionally constexpr size() method. But I hate to always check whether size() is constexpr and then call it in an if constexpr context, otherwise just using if; this makes the implementation harder. BTW, the new C++ standard proposal mdspan implements both a static and a dynamic size/extent function, where the static one returns std::dynamic_extent in case no static size information is available for the corresponding dimension.
Maybe max_size (or a static size) is not needed at all. Then we should agree on another way to implement the recursion break condition, e.g., stopping when the container cannot be accessed using operator[] anymore. However, the proposed hierarchic size container TypeTree::SizeTree has a "leaf" container EmptySizeTree that implements an operator[] redirecting to itself. For this, the access-based break condition does not work. Maybe this is an exotic case, though.
The issue with the different directions of the multi-index processing makes the generalization a bit more difficult. We currently find both conventions in modules, so we cannot completely ignore it. While I hear arguments for OI, there might be other modules besides PDELab implementing IO. We now have two options:
Enforce one of the directions, probably OI, since it is used by more modules.
Require a more involved MultiIndex interface that is invariant under the direction, e.g., it could provide functions pop() or get() and shift() that return either the leftmost or the rightmost index, subject to a corresponding index shift.
Maybe we could also provide a customization point so that downstream modules can simply implement their own strategy of MultiIndex traversal, e.g., something like the ShiftedMultiIndex class in dune-functions.
Option 2 and also the specialization idea seem more complicated to document. On the other hand, they provide more flexibility.
I would advocate deciding on one direction in the interface. If a module uses the reversed order, providing a wrapper reversing the order (on the fly) should be simple. Technically, both directions should work equally well (except maybe for (*) below). However, I can see many conceptual reasons for the OI order, but I have not seen a single one for the reverse order.
(*) When using the index to access a container, you first have to provide the outer index. For OI this means simply using the first one, which requires dereferencing an address; later indices are obtained by successive increments of this address. When using IO with a dynamic index, one additionally has to add the (runtime) length of the multi-index to the address before dereferencing, which may come at an extra cost. While one can argue that for some (less frequently used) index schemes the same applies when building an OI index, you only do this once, while the index is often used several times to access a container.
This issue is only problematic for container access with dynamic multi-indices. In container traversal, you control both multi-index creation and container access. And reversal is free for hybrid multi-indices.
Maybe we could also provide a customization point so that downstream modules can simply implement their own strategy of MultiIndex traversal, e.g., something like the ShiftedMultiIndex class in dune-functions.
In my opinion, this is the way. That's how I overcame the performance penalty of the reversal of the multi-indices from IO format in PDELab to OI format in dune-assembler (Customization point vs Customized point).
Introducing such customization points complicates the interface significantly. This would e.g. mean introducing a new interface to get the first (in whatever user-defined sense) entry of a container, like e.g. Dune::front(multiIndex), where in general you don't have Dune::front(multiIndex) == multiIndex.front(). I'm really looking forward to explaining this to users who struggle to see the intrinsic beauty.
Also notice that this forbids different modules from using the same index implementation: if you specialize Dune::front(MI) for the multi-index type MI as IO for module A, you cannot use it as OI with module B or for any other custom permutation in module C.
I consider it much more reasonable to teach users to use vectorBackend(reverseIndex(multiIndex)) because a certain module uses the reverse order in contrast to the general convention. And since I still have not seen any convincing reason for a different convention, I would object if this convention were anything other than OI.
I agree with you: customization points on individual multi-index operations (e.g. front(multiIndex)) would be a horrible idea. I am talking about the overall accessVectorEntry(Vector,MultiIndex), not about the individual multi-index operations.
I would not consider this a customization point, because again you cannot specialize for Vector or MultiIndex if these should be used in different modules. I guess what you have in mind is implementing:
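    // sketch of the presumably intended call, using the helper named below
    accessVectorEntry(vector, reverseMultiIndexView(multiIndex));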
where reverseMultiIndexView() is either implemented in dune-common or dune-pdelab.
If - in contrast - accessVectorEntry() should really be customized (in the sense of implemented separately) in any downstream module, we're exactly where we are now, except that we agree on a common name for this function.
I mean the second. The point of customization points is to be able to use code that you have not seen yet. In other words, dune-istl and other modules may invoke accessVectorEntry() independent of the direction of the multi-index and without seeing how it will be implemented later on, yet having a default implementation with OI direction. Yes, we agree on the name, but it is also much more than that. So I don't agree that that's "where we are now".
In other words, dune-istl and other modules may invoke accessVectorEntry() independent of the direction of the multi-index and without seeing how it will be implemented later on
That's exactly the opposite of the point brought up by @christi. He suggests (and I strongly agree) that it would be helpful to include functions providing multi-index access in dune-istl, such that they can be used by downstream modules. You suggest that dune-istl uses functions implemented downstream.
I don't think reimplementing these different algorithms in downstream modules is a viable idea. We want to reduce the complexity and not increase it.
I still don't see what the problem with an additional view-wrapper that maps IO to OI is (see the sketch below).
For example, in the case of an algebraic (i.e. not grid-based) communication, we would collect a list of multi-indices that need to be sent to another rank. If PDELab uses different semantics for the multi-indices, we would map these once and store the mapped ones.
If we want to replace major parts of the backend, we would either have very few indices, which could be wrapped, or we operate on all entries and would just use the flat-view helpers from ISTL.
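A sketch of what such a view-wrapper could look like (assumed name, not existing code): it presents an IO multi-index as an OI one without copying, by reading it back to front.

    #include <cstddef>

    // Non-owning view that reverses the apparent order of a multi-index.
    template <class MultiIndex>
    struct ReversedMultiIndexView
    {
      const MultiIndex& mi;

      using value_type = typename MultiIndex::value_type;

      value_type operator[](std::size_t i) const { return mi[mi.size() - 1 - i]; }
      auto size() const { return mi.size(); }
      auto max_size() const { return mi.max_size(); }
      bool empty() const { return mi.empty(); }
    };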
[the naming of the branch is a bit strange in order to fit that of PDELab]
First I have to say that it works, with some minor problems, but I also found some further functions that would be very helpful, e.g. given a set of indices, computing the size of this subset. One can do this with the existing functionality, but in many cases one could significantly speed things up. Also, forEachIndex and flatVectorForEach would be helpful not only on a single vector, but on a set of vectors iterated in parallel (a sketch follows below).
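A sketch of how the "set of vectors" variant could look (assumed signature, building on the single-vector access function discussed above; the argument order is only for illustration and would need care to not clash with the single-vector overload):

    #include <cstddef>

    // Apply f to the blocks of all given vectors addressed by each multi-index,
    // e.g. to accumulate a scalar product restricted to owned entries.
    // accessVectorEntry() is the (hypothetical) single-vector access from above.
    template <class Indices, class F, class... Vectors>
    void forEachIndex(const Indices& indices, F&& f, Vectors&&... vectors)
    {
      for (const auto& index : indices)
        f(index, accessVectorEntry(vectors, index)...);
    }

A restricted scalar product would then pass x and y together and accumulate the entry products in the functor.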