...
Codeblock | ||
---|---|---|
| ||
vtune -help |
Run VTune via command line interface
...
Run your application with VTune wrapper as follows:
www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/2023-0/command-line-interface.html
Example "hotspot analysis"
Codeblock | ||
---|---|---|
| ||
mpirun -np 4 vtune -collect hotspots -result-dir vtune_hotspot ./path-to_your/app.exe args_of_your_app |
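A small sketch of how the command above could be assembled in a job script. The function name `vtune_hotspots_cmd` is hypothetical and not part of VTune; it only prints the command line so it can be checked before running. It assumes the optional `--` separator between the VTune options and the target application.

```shell
#!/bin/bash
# Hypothetical helper (not part of VTune): print the collection command
# for a given rank count, result directory and application command line.
vtune_hotspots_cmd() {
  local ranks="$1" result_dir="$2"
  shift 2
  # "--" separates the VTune options from the target application and its arguments
  printf 'mpirun -np %s vtune -collect hotspots -result-dir %s -- %s\n' \
    "$ranks" "$result_dir" "$*"
}

# Print the command for the example above (4 ranks)
vtune_hotspots_cmd 4 vtune_hotspot ./path-to_your/app.exe args_of_your_app
```

Printing the command first makes it easy to spot quoting or path mistakes before spending compute time on a collection run.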
After completion, explore the results of the hotspot analysis, e.g. via
Codeblock | ||
---|---|---|
| ||
vtune -report summary -r vtune_* |
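If the glob `vtune_*` matches more than one result directory (for example, one per node in an MPI run), the reports can be requested one directory at a time. The following is a minimal sketch; the function `summarize_results` and the variable `VTUNE_BIN` are assumptions made for this example, with `VTUNE_BIN` defaulting to `vtune`.

```shell
#!/bin/bash
# Sketch: print a summary report for every result directory matching a prefix.
# VTUNE_BIN is a hypothetical override (e.g. for dry runs); it defaults to vtune.
summarize_results() {
  local prefix="$1" dir
  for dir in "$prefix"*/; do
    [ -d "$dir" ] || continue   # skip if the glob matched nothing
    dir="${dir%/}"              # drop the trailing slash
    echo "== $dir =="
    "${VTUNE_BIN:-vtune}" -report summary -r "$dir"
  done
}

# Usage (in the directory that holds the results):
#   summarize_results vtune_
```

Setting `VTUNE_BIN=echo` turns the call into a dry run that only lists the commands it would execute.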
Run VTune-GUI (not recommended)
Log in with X11 forwarding (ssh -X) and then start
Codeblock | ||
---|---|---|
| ||
vtune-gui |
Run VTune-GUI remotely on your local browser (recommended)
Log in to the supercomputer with local port forwarding and start your VTune server on an exclusive compute node (1h job):
Codeblock | ||
---|---|---|
| ||
ssh -L 127.0.0.1:55055:127.0.0.1:55055 blogin.hlrn.de
salloc -p standard96:test -t 01:00:00
ssh -L 127.0.0.1:55055:127.0.0.1:55055 $SLURM_NODELIST
module load vtune/2022
vtune-backend --web-port=55055 --enable-server-profiling & |
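The same port number appears in both ssh hops and in the `vtune-backend` call, and all three must agree. A minimal sketch that keeps the port in one place is shown here; the variable `PORT` and the helper `forward_spec` are hypothetical names for this example, and the script only prints the commands rather than running them.

```shell
#!/bin/bash
# Sketch: keep the port in one variable so both ssh hops and vtune-backend agree.
PORT="${PORT:-55055}"

# Hypothetical helper: build the argument for "ssh -L"
# (local forward bound to the loopback interface on both ends)
forward_spec() {
  printf '127.0.0.1:%s:127.0.0.1:%s' "$1" "$1"
}

# Print the commands instead of executing them
echo "ssh -L $(forward_spec "$PORT") blogin.hlrn.de"
echo "vtune-backend --web-port=$PORT --enable-server-profiling"
```

Choosing a different port (e.g. `PORT=55056`) can help if another user on the same login node already occupies the default one.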
Open 127.0.0.1:55055 in your browser (allow the security exception; on first use, set an initial password).
In 1st "Welcome" VTune tab (run MPI parallel Performance Snapshot):
Under WHAT click: Configure Analysis
-> Set application: /path-to-your-application/program.exe
-> Check: Use app. dir. as work dir.
-> In case of MPI parallelism, expand "Advanced": keep defaults but paste the following wrapper script and check "Trace MPI":
Codeblock | ||
---|---|---|
| ||
#!/bin/bash
# Prefix script
echo "Target process PID: ${VTUNE_TARGET_PID}"
# Run VTune collector (here with 4 MPI ranks)
mpirun -np 4 "$@" |
Under HOW run a general: Performance Snapshot.
(After completion/result finalization a 2nd result tab opens automatically.)
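The wrapper script hard-codes the number of MPI ranks. A variant with a configurable rank count could look like the sketch below. The variables `NP` and `MPI_LAUNCHER` are hypothetical names introduced for this example, and whether they reach the wrapper's environment depends on how the VTune server was started, so treat this as an assumption to verify.

```shell
#!/bin/bash
# Sketch of a wrapper with a configurable rank count.
# NP and MPI_LAUNCHER are hypothetical variables introduced for this example.
run_wrapped() {
  # Prefix step: VTUNE_TARGET_PID is expected to be set by VTune
  echo "Target process PID: ${VTUNE_TARGET_PID:-unknown}"
  # Launch the command passed in by VTune ("$@") under the MPI launcher
  "${MPI_LAUNCHER:-mpirun}" -np "${NP:-4}" "$@"
}

# In an actual wrapper script the last line would be:
#   run_wrapped "$@"
```

With `MPI_LAUNCHER=echo` the function prints the launch line instead of starting MPI, which is a cheap way to check the arguments.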
In 2nd "r0..." VTune tab (explore Performance Snapshot results):
-> Here you find several analysis results e.g. the HPC Perf. Characterization.
-> Under Performance Snapshot - depending on the snapshot outcome - VTune suggests more detailed follow-up analysis types:
--> For example, re-run a Hotspot analysis (after completion another tab opens).
In 3rd "r0..." VTune tab (Hotspot analysis):
-> Expand sub-tab Top-down Tree
--> In Function Stack expand "_start" fct. and expand further down to "main" fct. (first with entry under "Source File")
Codeblock | ||
---|---|---|
| ||
Under HOW (in 3rd tab r0...)
-> Run "Hotspots"
When complete (after finalizing results)
--> Expand sub-tab "Top-down Tree"
---> In "Function Stack" expand "_start" fct. and expand further down to "main" fct. (first with entry under "Source File")
---> Double click on: source_file_name.c
--> In new sub-tab "source_file_name.c" scroll down to line with max. "CPU Time: Total" to find hotspot |
...