[WIP] Add virtualized grid
I implemented a virtualised (i.e. type erased) version of the DUNE grid interface.
It just takes dimension and world dimension as template parameters, and then can be constructed by passing another grid to the constructor. For instance:
Dune::VirtualizedGrid<2,2> ( yaspGrid );
I‘m looking forward to use this for the python bindings to improve compilation speed of JIT modules significantly.
It is currently tested with YaspGrid<1/2/3>. Other grid types have to be tested as implementations differ here and there, and performance might be improved. My gridcheck currently shows at least factor 2 in runtime compared to the non-virtualised grid.
Merge request reports
Activity
requested review from @robert.kloefkorn
assigned to @andreas.dedner
@samuel.burbulla: Seems like I can't access your repo for testing.
Cool. I was thinking about this a couple of times but never really started.
The general structure looks already quiet nice. Maybe as inspiration: In dune-functions there are type-erasure utilities implemented to simplify the structure a little bit. E.g. implementations of copy and move operations are hidden and do not need to be repeated. I have used this for the type-erasure of several classes. And I was thinking to move these utilities to, e.g., dune-common, if needed more upstream. See: https://gitlab.dune-project.org/staging/dune-functions/-/blob/master/dune/functions/common/typeerasure.hh, and for an application, see https://gitlab.com/amdis/amdis/-/blob/master/amdis/LocalOperator.hpp
There is also a small-object optimization in https://gitlab.dune-project.org/staging/dune-functions/-/blob/master/dune/functions/common/polymorphicsmallobject.hh that might improve the performance (slightly). It is used in the type-erasure utilities.
Small-object optimization may indeed improve this. As far as I can see, you're currently doing an allocation for any creating temporary object, e.g. when dereferencing an iterator or obtaining a geometry object. Those allocations may be significantly more expensive than calling the polymorphic methods.
You are right that the intermediate allocations seem to be the most significant time issue. Unfortunately, the small-object optimisation I played around with didn't lead to large benefit.
But I would also expect that perfect move semantics would create the temporary objects in the right place and wouldn't lead to too much overhead. A simple Virtualized class around an std::vector (see
test-virtualizedgrid.cc
) shows that the overhead can be reduced to a reasonable amount if we specify the right move constructors. It also shows that the dynamic lookup is not the issue.Virtualized: 0.710141 Virtualized: 0.668765 Call Standard: 0.000204375 Call Virtualized: 0.000207
There must be something within the grid interface that I oversee which leads to the overhead. In the comparison VirtualizedGrid and pure YaspGrid, my current timings for a grid check give 6x runtime.
Any further ideas? Does anyone want to have a closer look? Or: Do we want to live with that for now?
============= 1D ============= ------------------------------ YaspGrid<1>: 0.00047425 ------------------------------ Virtualized<1>: 0.00240154 ============================= ============= 2D ============= YaspGrid<2>: 0.00154621 ------------------------------ Virtualized<2>: 0.00629467 ============================= ============= 3D ============= ------------------------------ YaspGrid<3>: 0.00452396 ------------------------------ Virtualized<3>: 0.0262107 =============================
@andreas.dedner I added more small-grained tests, see
test-virtualizedgrid.cc
.Allocation of 1000 std::vectors of different sizes compared to the virtualised version:
Standard: 0.000455459 Virtual: 0.000432916
-> No measurable overhead. I think this shows that the allocation is not the issue (as long everything is perfectly forwarded) and small-object optimisation won't help out.
Calling a cheap member function, e.g.
.size()
very often on both:Call Standard (c = 700032704): 0.00495767 Call Virtual (c = 700032704): 0.00940317
-> Virtualisation doubles the runtime. Could be the vtable lookup, or the compiler optimises sth. out in the standard version. However, this seems to be the overhead we have to accept using virtualisation (even though I read that it should be about 25%).
Iteration over a 2d YaspGrid and adding up the cell volumes gives:
Standard (vol = 1): 0.00345117 Virtual (vol = 1): 0.0923742
-> That's a runtime factor of ~25x. This can not just be the overhead of the function call and needs improvement, but I don't see where the bottleneck is. There is a lot of forwarding between Implementation, Interface and Virtualization. Maybe sth. is not perfectly forwarded and we copy objects, or YaspGrid's entity volume is just optimised in the loop.
ALUGrid doesn't work so far, as ALUGrid entities have no
ilevelbegin
etc. and I don't know how to get them.ALUGrid doesn't work so far, as ALUGrid entities have no ilevelbegin etc. and I don't know how to get them.
I don't think ilevelbegin is an interface method on the entity. Aeons ago we had this for level and leafs on the entities before introducing the Views. Now only the views have
ibegin
and return something like the level or leaf versions depending on which GridView it is. The default implementation, i.e., DefaultLevelGridView make use of that old method on the entity. AluGrid implements its own version indune-alugrid/dune/alugrid/3d/gridview.hh
. YaspGrid uses the default version and therefore needs to implement ilevelbegin.I suggest to take a look at IDGrid from dune-metagrid (https://gitlab.dune-project.org/extensions/dune-metagrid/-/tree/master/dune/grid/idgrid). There one should find a reasonably good wrapping of an arbitrary Dune grid.
DefaultLevelGridView
is just a convenience class to deal with this interface change that happened long time ago. I guess YaspGrid never really updated to new interface.Thanks for pointing out. I tried to use the DefaultGridView, but now I know that I have to add a virtualised GridView as well.
@robert.kloefkorn Thanks! I will follow the Grid interface of IDGrid. The grid implementations differ here and there slightly, so I was a bit confused. Maybe we have to update some grid implementations (like Yasp), let's see.
This looks nice (could also speed up CI/testing) by reducing jit compilation.
For the Python bindings: would you
- add a
virtualize=False
parameter to themodule
function indune-grid/python/dune/grid/grid_generator.py
- provide a new function
virtualize(originalGridFunction,*args,**gwargs)
. - add some environment variable or other
"global variable"
that allows you to switch in general
or do you have some other idea?
- add a
added 2 commits
added 28 commits
-
a18af395...d50215d4 - 11 commits from branch
core:master
- 2d27e18d - Add some first draft of virtualized grid.
- 97ce75fb - First compiling version.
- 4f9856ca - Try to make gridCheck compile.
- 78462c53 - Make 1d compile.
- 8f1d1328 - Fix some dynamic casts.
- abd3d61d - Make grid check run.
- 1df26f34 - Check runtime.
- d824bacc - Fix 2d.
- e9e606cc - Specify virtual interface for 3d.
- 2628f8f8 - Clean up.
- 4ca30114 - Add cast.hh and use move in entity.
- ccf80dc7 - Add more move.
- 1c69f4eb - Clean up dynamic casts.
- d6353788 - Add some consts.
- a15e388b - Add virtualizedGrid on python side.
- 27e86214 - Improve forward semantics.
- cabfc92a - Add simple Virtualized class to test.
Toggle commit list-
a18af395...d50215d4 - 11 commits from branch
added 73 commits
-
cabfc92a...d1b659ef - 54 commits from branch
core:master
- 1b66fbe1 - Add some first draft of virtualized grid.
- 004f8ccc - First compiling version.
- e4cccb3f - Try to make gridCheck compile.
- 3c29e2d4 - Make 1d compile.
- 2abc162f - Fix some dynamic casts.
- d88d9243 - Make grid check run.
- 91a10126 - Check runtime.
- f4166bef - Fix 2d.
- c25c82f3 - Specify virtual interface for 3d.
- 71c0f83f - Clean up.
- 6ea678c6 - Add cast.hh and use move in entity.
- 7b199258 - Add more move.
- daa6be59 - Clean up dynamic casts.
- 52d93174 - Add some consts.
- 9c551459 - Add virtualizedGrid on python side.
- e6b3945f - Improve forward semantics.
- f7df2ce6 - Add simple Virtualized class to test.
- 6fcf9a82 - Add volume calculation test.
- 7683d80e - Add more small tests and use static_cast.
Toggle commit list-
cabfc92a...d1b659ef - 54 commits from branch
@samuel.burbulla - Performance improvement tricks:
I suggest to cache some of the small but likely to be called routines on the Entity class and then make those non-virtual. E.g. when updating the entity there is one virtual call, but that could already gather info like type, volume, ... and then store it in the VirtualizedEntity to be used without calls to a virtual routine.
Maybe I do something wrong, but testing with the
testEfficiency
tools in dune-alugrid fails, since not all iterators are implemented in the virtualized grid, e.g., the Interior_Partition iterator is missing. I started to add this but quickly stopped due to the huge amount of code that needs to be added. Then I thought about a more general approach for implementing several functions for different codims and different partition types and came up with a solution that combines templates, type-erasure, virtual inheritance, CRTP and variadics :-) See my code snippets https://gitlab.dune-project.org/-/snippets/70 and https://gitlab.dune-project.org/-/snippets/71 Maybe this could simplify and generalize some code.Edited by Simon PraetoriusSee !656 (merged) for a sketch of the implementation.
Ok, I have to correct me. The IdType virtualization does not play a major role. The issue with the "SUPER" slow running was something different. When running the efficiency test with YaspGrid, it simply does not do any adaption. This is strange. When wrapped into the virtualized grid, it performs adaption and thus has to run on a much finer grid. This explains the huge difference.
After fixing the adaption level, I get a factor of 2-5 in the runtime compared to YaspGrid. Still need to find out which part makes it slow.
I have extracted the benchmark code from dune-alugrid into a separate repository to test other grids than ALUGrid. Note, something else than YaspGrid does not yet work with neither the benchmark code nor the VirtualizedGrid wrapper: https://gitlab.dune-project.org/simon.praetorius/benchmark-grids. Compile and run
build-cmake/src/efficiency/main_ball_eff 1 2 2
(the last two arguments must be identical otherwise it does not work with YaspGrid)Edited by Simon PraetoriusAwesome, @simon.praetorius, you figured out one of the main remaining issues! The need to specify all codims/partitions was not viable and I was stuck at this point. Thanks a lot for this impressive piece of code.
Also many thanks for reviewing the code, I will adopt to your suggestions.
Afterwards, I will test out the benchmark-grids repo and see if I can figure out some performance bottle necks.
- dune/grid/virtualizedgrid/gridview.hh 0 → 100644
160 int overlapSize ( int codim ) const 161 { 162 return grid().overlapSize(level_, codim ); 163 } 164 165 int ghostSize ( int codim ) const 166 { 167 return grid().ghostSize(level_, codim ); 168 } 169 170 template< class DataHandle, class Data > 171 void communicate ( CommDataHandleIF< DataHandle, Data > &dataHandle, 172 InterfaceType interface, 173 CommunicationDirection direction ) const 174 { 175 return grid().communicate( dataHandle, interface, direction, level_ ); This method does not exist in grid. Note, the 4th argument need to be provided. This was also missing in IdentityGrid, see !640 (merged)
40 41 template< class I > 42 struct DUNE_PRIVATE Implementation final 43 : public Interface 44 { 45 Implementation ( I&& i ) : impl_( std::forward<I>(i) ) {} 46 virtual Implementation *clone() const override { return new Implementation( *this ); } 47 48 virtual void increment() override { ++impl(); } 49 50 virtual Entity dereference() const override 51 { 52 return VirtualizedGridEntity<codimension, GridImp::dimension, GridImp> ( *impl() ); 53 } 54 55 virtual bool equals( const VirtualizedGridHierarchicIterator<GridImp>& i ) const override - Resolved by Samuel Burbulla
- Resolved by Samuel Burbulla
- Resolved by Samuel Burbulla
- Resolved by Samuel Burbulla
- Resolved by Samuel Burbulla