There are a few operations in our sum factorization code that need to translate to good assembly code for the overall code to perform well. I have set up godbolt pages to study them:
* `horizontal_add`
https://godbolt.org/g/ZGmSYN
* `transpose`