A short guide to getting meaningful results on where we spend our time:

# Step 1: Building

Start from a fully optimized build and add the following compiler flags:

* `-ggdb` for debug symbols
* `-fno-omit-frame-pointer` so that call stacks can be unwound via frame pointers and no samples end up unidentifiable in the dataset
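
As a minimal sketch of this step, the two flags can simply be appended to an existing optimized compiler invocation. The compiler (`g++`), the optimization level (`-O3`), and the file names `main.cc` and `program` are assumptions for illustration only; substitute whatever your build system already uses.

```shell
# Placeholders (assumptions, not taken from the project): g++, -O3, main.cc, program.
CXX="g++"
OPT_FLAGS="-O3"                                 # stand-in for your usual optimized flags
PROFILE_FLAGS="-ggdb -fno-omit-frame-pointer"   # the two additions from this step

# Assemble and print the full compiler command; run it once the names match your project.
BUILD_CMD="$CXX $OPT_FLAGS $PROFILE_FLAGS -o program main.cc"
echo "$BUILD_CMD"
```

If the project is configured through a build system instead, the same two flags would go wherever its C++ compiler flags are set.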

# Step 2: Running

Run the program (for simplicity, a sequential one) under the following wrapper:

```
perf record --call-graph fp ./program
```

# Step 3: Analyzing

For an interactive analysis, run the following in the same directory where you recorded the profile (`perf record` writes its `perf.data` file there):

```
perf report --hierarchy
```

Unfortunately, PDELab type names are so long that perf truncates them, leaving the output unreadable.

To work around this, dump the report with mangled names and demangle it separately with `c++filt`:

```
perf report --hierarchy --stdio --no-demangle > undemangled
c++filt < undemangled > demangled
```

Then read the `demangled` file.

# Step 4: Postprocessing

I am currently writing a small postprocessing tool to dig through template parameters efficiently.
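
Until such a tool is available, a rough stand-in can be sketched as a shell filter that elides template argument lists nested deeper than a chosen depth. This is not the tool mentioned above, only an illustration; the function name `collapse_templates` and the output file name in the usage line are hypothetical. It counts `<` and `>` naively per line, so names containing `operator<` or similar will confuse it.

```shell
# collapse_templates DEPTH: read text on stdin and replace every template
# argument list nested deeper than DEPTH levels with "<...>".
# Hypothetical helper; bracket counting is naive and restarts on every line.
collapse_templates() {
  awk -v max="${1:-1}" '
  BEGIN { max = max + 0 }                     # force a numeric depth limit
  {
    out = ""; depth = 0; n = length($0)
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)
      if (c == "<") {
        depth++
        if (depth <= max)          out = out c       # still within the visible levels
        else if (depth == max + 1) out = out "<..."  # first elided level
      } else if (c == ">" && depth > 0) {
        if (depth <= max + 1) out = out c            # closes a visible or elided list
        depth--
      } else if (depth <= max) {
        out = out c                                  # ordinary visible character
      }
    }
    print out
  }'
}
```

Assuming the `demangled` file from Step 3, it could be used as `collapse_templates 1 < demangled > demangled.short`, raising the depth argument to reveal more levels of template parameters.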