Skip to content
Snippets Groups Projects

Improve performance of MatrixIndexSet

Merged Carsten Gräser requested to merge feature/improve-matrixindexset into master
2 unresolved threads

This improves performance of MatrixIndexSet significantly by two changes:

  • Replace the std::set used to store the column indices for each rows by a std::vector based implementation. The std::vector is kept sorted when inserting such that we can avoid duplicates. In the worst case (reverse order insertion) this may lead to O(n^2) complexity when inserting n entries in a row compared to O(n log(n)) for std::set. However, the sorted std::vector implementation still wins for relatively large n. To avoid the worst case complexity when using very dense rows, the implementation automatically switches to std::set if the size exceeds the threshold value maxVectorSize. The default maxVectorSize=2048 was selected based on benchmark results.

  • Change the stored index type from std::size_t to uint32_t to reduce memory consumption and improve performance. Notice that this excludes exceptionally large matrices with more than 2^32 columns that worked before. To give an estimate on the influence of this: If you want to build a square BCRSMatrix with 2^32 rows and two entries per row using the old MatrixIndexSet implementation this alone would need more than 0.5 TB memory. One could make the index type a template parameter, but this would be a breaking interface change.

  • Additionally this MR adds documentation of the thread safety of MatrixIndexSet. While this does not change any code, it is strictly speaking an interface change, because it introduces public guarantees that have been implementation details before. Hence you can no longer relax these guarantees (e.g. to change the implementation from row- to column-major) without breaking compatibility. However, it's IMO important to make these guarantees, because without them you cannot do multi-threaded assembly. (Notice that the 'implicit' build mode of BCRSMatrix is e.g. not thread safe.)

Edited by Carsten Gräser

Merge request reports

Pipeline #61462 passed

Pipeline passed for aa9408f1 on feature/improve-matrixindexset

Approved by

Merged by Carsten GräserCarsten Gräser 2 years ago (Apr 14, 2023 1:36pm UTC)

Merge details

  • Changes merged into master with a0db70c0.
  • Deleted the source branch.

Pipeline failed for a0db70c0 on master

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Simon Praetorius
    • I like the idea to use a flat set instead of a tree-based set for storing the nonzeros. The switch to std::set is interesting. I need to do some benchmarks myself to find out the optimal value of maxVectorSize for my platform/system.

    • Be warned: There's no simple answer to this. The sorted vector implementation is mainly problematic if you insert in (almost) reverse order. The default threshold was selected based on tests with reverse order to balance two concurrent goals: a) Use the fast vector implementation as long as possible. b) Reduce the additional complexity when switching to set as much as possible (because you essentially sort twice).

      For forward or random insertion order the vector based implementation wins for even denser rows. But I consider more than 2048 entries per row already exceptionally large and trying to optimize the default further is probably pointless. In exceptional cases you can still change the threshold according to your specific needs.

    • Please register or sign in to reply
    • Resolved by Simon Praetorius

      I am not yet 100% happy about the name maxVectorSize. When reading the code it gets clear that this is kind of a switching point. It is not related to the maximal number of entries to store in the matrix, also not the expected number of nonzeros. If we use a vector, we could clearly make also a reservation about the expected number of nonzeros per row.

      The internal storage should be more like an implementation detail. We use the vector as long as it is the fastest, then switch to something else. How to choose the maxVectorSize? Do we need to first perform some experiments and benchmarks and then can make an educated guess? But why do we switch to a std::set then and not to a std::unordered_set? Why do we store the row nonzeros in a vector and not in a ReservedVector if we know that there are just a very few entries per row?

      As I said, I like this improvement of the (default) MatrixIndexSet. But maybe it would be nice to just have an alternative or a much more generic MatrixIndexSet, where it even could be possible to configure the internal storage.

  • Carsten Gräser added 3 commits

    added 3 commits

    • d593339e - Switch internal data structure in MatrixIndexSet
    • 0a0af6fe - Switch to std::uint_least32_t for stored indices in MatrixIndexSet
    • 99fc4660 - [doc] Mention improvements of MatrixIndexSet in changelog

    Compare with previous version

  • Carsten Gräser added 3 commits

    added 3 commits

    • 39c231a6 - Switch internal data structure in MatrixIndexSet
    • 1feabdb3 - Switch to std::uint_least32_t for stored indices in MatrixIndexSet
    • e9346f26 - [doc] Mention improvements of MatrixIndexSet in changelog

    Compare with previous version

  • Carsten Gräser added 3 commits

    added 3 commits

    • 3df7ba4b - Switch internal data structure in MatrixIndexSet
    • bf28b103 - Switch to std::uint_least32_t for stored indices in MatrixIndexSet
    • 692e68c4 - [doc] Mention improvements of MatrixIndexSet in changelog

    Compare with previous version

  • Carsten Gräser added 3 commits

    added 3 commits

    • c8272a07 - Switch internal data structure in MatrixIndexSet
    • 4886d18b - Switch to std::uint_least32_t for stored indices in MatrixIndexSet
    • 107a54a8 - [doc] Mention improvements of MatrixIndexSet in changelog

    Compare with previous version

  • Looks good to me.

  • added 1 commit

    • aa9408f1 - [doc] Document thread safety guarantees of MatrixIndexSet

    Compare with previous version

  • Before merging, I also added documentation of the thread safety guarantees of MatrixIndexSet.

  • Carsten Gräser changed the description

    changed the description

  • Simon Praetorius approved this merge request

    approved this merge request

  • Carsten Gräser mentioned in commit a0db70c0

    mentioned in commit a0db70c0

  • mentioned in merge request dune-common!1251 (merged)

  • Please register or sign in to reply
    Loading