Skip to content
Snippets Groups Projects

Add `update()` method to `CommDataHandleIF`

Closed Ansgar Burchardt requested to merge feature/loadbalance-update-method into master

When using a grid's load balancing feature, the following steps happen:

  1. User data is gathered (gather)
  2. The grid is manipulated
  3. User data is scattered again (scatter)

User data is often stored in a vector and accessed via the entity indices obtained from a mapper. However between steps (2.) and (3.) the grid is manipulated and the vector size and mappers need to be updated before they can be used to store the scattered user data.

This patch adds a update() method that must be called by grid implementations between step (2.) and (3.) to allow updating any vectors and mappers used to store data.

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • When adding support for load balancing element data in addition to vertex data to UGGrid, I noticed that the communication handler used for distributing data during load balancing doesn't work well with data accessed via a mapper. In fact it doesn't even work that nice for data accessed directly via an IndexSet: the test in test-parallel-ug.cc resized the data vector in every call of the gather method...

    The options I saw for improving this are:

    1. Having the user deal with it. One can add a Boolean flag to resize vectors and update mappers in the first gather call.
    2. Adding a update() method to the data handler:
      1. Using a virtual function (overhead can be neglected, users can use final anyway, easy to provide the default implementation)
      2. Using Barton-Nackman
    3. Adding a new interface for load balancing (that adds the `update() method)

    Option (1.) seems very unelegant to me, so I implemented (2.1.) as I see no advantage in (2.2.) over the simpler (2.1.) here.

    The update() method could also be used to keep using the same communication handle after a grid refinement (where the same problems occur). That is a small argument against (3.).

    This is an interface change, but existing code should just keep working the same as before.

  • Ansgar Burchardt mentioned in merge request !172 (merged)

    mentioned in merge request !172 (merged)

  • I am in favour of this. You need it when load-balancing grids that contain more than one element type. It is minimally invasive and requires no changes to user code (unless you want it to work with multi-element-type grids).

  • Uh, I don't quite understand. I always thought that you could not rely on the index sets during load balancing (including the gather and scatter steps). Thus you needed to transfer your data from vectors to maps (with the local ID as the key) before calling loadBalance(), operating on those maps during gather and scatter, and transfer the data back afterward. From what I remember from ALUGrid, you could not even rely on the global ID during scatter, because ALUGrid would overlap the scatter and grid manipulation steps, and the global ID set would only be built at the end.

    @andreas.dedner, @martin.nolte, @robert.kloefkorn: Can any of you comment: Am I getting this completely wrong or is this indeed a problem?

  • ALUGrid actually interleaves the extraction of grid elements and user data for the stream, i.e., after an element has been constructed on the process, the corresponding user data is scattered. The gathering step is interleaved in a similar way.

    As an obvious consequence, you cannot rely on the index sets during scatter. It might be possible during gather in ALUGrid, though, because the indices actually o not change. However, such an asymmetry should not be reflected in the interface.

    For the GlobaldSet: It is shipped in the data stream, too, and should be extracted before the user data. Therefore, it should be valid during scatter (otherwise, I would consider this a bug). On the other hand, I would always advise using the LocalIdSet unless you really need global ids. At least for ALUGrid, the GlobalIdSet is quite a large, artificial DUNE structure that is generated on first use. Once generated, it must be communicated during load balance, too, adding to the communication overhead.

    Summary: When suggesting your update method, you assume the grid has been fully extracted from the stream and index sets have been constructed before any user data is scattered. ALUGrid currently does not satisfy this assumption.

  • Interesting. Is this documented anywhere?

  • If ALUGrid works differently, adding the update() method isn't a good idea; therefore I'll close the MR.

    Maybe one should just use a map + IdSet with dune-uggrid as well, even though it wouldn't be strictly necessary. I've done so now for test-parallel-ug.

Please register or sign in to reply
Loading