* Re: [PATCH] perf doc: Add AMD IBS usage document
2024-06-19 9:22 [PATCH] perf doc: Add AMD IBS usage document Ravi Bangoria
@ 2024-06-19 13:28 ` Arnaldo Carvalho de Melo
2024-06-19 15:39 ` Ravi Bangoria
2024-06-19 14:51 ` Namhyung Kim
1 sibling, 1 reply; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-06-19 13:28 UTC
To: Ravi Bangoria
Cc: namhyung, irogers, peterz, mingo, mark.rutland,
alexander.shishkin, jolsa, adrian.hunter, kan.liang, yangjihong1,
linux-kernel, linux-perf-users, sandipan.das, ananth.narayan,
santosh.shukla
On Wed, Jun 19, 2024 at 09:22:34AM +0000, Ravi Bangoria wrote:
> Add a perf man page document that describes how to exploit AMD IBS with
> Linux perf. Brief intro about IBS and simple one-liner examples will help
> naive users to get started. This is not meant to be an exhaustive IBS
> guide. User should refer latest AMD64 Architecture Programmer's Manual
> for detailed description of IBS.
>
> Usage:
>
> $ man perf-amd-ibs
>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
> ---
> tools/perf/Documentation/perf-amd-ibs.txt | 126 ++++++++++++++++++++++
> tools/perf/Documentation/perf.txt | 3 +-
> 2 files changed, 128 insertions(+), 1 deletion(-)
> create mode 100644 tools/perf/Documentation/perf-amd-ibs.txt
>
> diff --git a/tools/perf/Documentation/perf-amd-ibs.txt b/tools/perf/Documentation/perf-amd-ibs.txt
> new file mode 100644
> index 000000000000..d3dfa71e320c
> --- /dev/null
> +++ b/tools/perf/Documentation/perf-amd-ibs.txt
> @@ -0,0 +1,126 @@
> +perf-amd-ibs(1)
> +===============
> +
> +NAME
> +----
> +perf-amd-ibs - Support for AMD Instruction-Based Sampling with perf tool
> +
> +SYNOPSIS
> +--------
> +[verse]
> +'perf record' -e ibs_op//
> +'perf record' -e ibs_fetch//
> +
> +DESCRIPTION
> +-----------
> +
> +Instruction-Based Sampling (IBS) provides precise Instruction Pointer (IP)
> +profiling support on AMD platforms. IBS has two independent components: IBS
> +Op and IBS Fetch. IBS Op sampling provides information about instruction
> +execution (micro-op execution to be precise) with details like d-cache
> +hit/miss, d-TLB hit/miss, cache miss latency, load/store data source, branch
> +behavior etc. IBS Fetch sampling provides information about instruction fetch
> +with details like i-cache hit/miss, i-TLB hit/miss, fetch latency etc. IBS is
> +per-smt-thread i.e. each SMT hardware thread contains standalone IBS units.
> +
> +Both, IBS Op and IBS Fetch, are exposed as PMUs by Linux and can be exploited
> +using Linux perf utility. Following files will be created at boot time if IBS
s/using Linux perf utility/using the Linux perf utility/
s/Following files/The following files/
> +is supported by the hardware and kernel.
> +
> + /sys/bus/event_source/devices/ibs_op/
> + /sys/bus/event_source/devices/ibs_fetch/
> +
> +IBS Op PMU supports two events: cycles and micro ops. IBS Fetch PMU supports
> +one event: fetch ops.
> +
> +IBS VS. REGULAR CORE PMU
> +------------------------
> +
> +IBS gives samples with precise IP, i.e. the IP recorded with IBS sample has
> +no skid. Whereas the IP recorded by regular core PMU will have some skid
> +(sample was generated at IP X but perf would record it at IP X+n). Hence,
> +regular core PMU might not help for profiling with instruction level
> +precision. Further, IBS provides additional information about the sample in
> +question. On the other hand, regular core PMU has it's own advantages like
> +plethora of events, counting mode (less interference), up to 6 parallel
> +counters, event grouping support, filtering capabilities etc.
IIRC if one does:

  perf record -e cycles:P

on AMD systems, it gets mapped to:

  ibs_op//

No?
I don't have access to my 5950X right now, so this is from memory, about
"IBS invocation from core PMUs with precise_ip set":
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=78075d947534013b4575687d19ebcbbb6d3addcd
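Whichever way the precise_ip forwarding turns out, the IBS PMUs themselves
are easy to check for from the sysfs directories the patch lists. A minimal
Python sketch, not part of perf: ibs_pmu_types() and its root parameter are
names made up here, and the only thing relied on is that each PMU directory
under /sys/bus/event_source/devices/ has a 'type' file holding the
perf_event_attr.type id:

```python
from pathlib import Path

# Directory named in the patch; each PMU registered by the kernel gets a
# subdirectory here.
SYSFS_PMU_ROOT = Path("/sys/bus/event_source/devices")


def ibs_pmu_types(root=SYSFS_PMU_ROOT):
    """Return {pmu_name: type_id} for the IBS PMUs present under root.

    'root' is a hypothetical parameter so the lookup can be pointed at a
    test directory; an empty dict means no IBS PMU was found.
    """
    found = {}
    for name in ("ibs_op", "ibs_fetch"):
        type_file = Path(root) / name / "type"
        if type_file.is_file():
            # The 'type' file holds the dynamic PMU type id as decimal text.
            found[name] = int(type_file.read_text().strip())
    return found
```

On a machine where 'perf evlist -v' shows "type: 11" for ibs_op//, this
should report the same number for ibs_op; without IBS support in the
hardware or kernel it returns an empty dict.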
One other thing to mention is 'perf mem record' that will use ibs_op//
as we can see in the cover letter for this perf-tools merge commit
upstream:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9d64bf433c53cab2f48a3fff7a1f2a696bc5229a
# perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000'
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.199 MB perf.data (2913 samples) ]
#
# ls -la perf.data
-rw-------. 1 root root 2346486 Jan 9 18:36 perf.data
# perf evlist
ibs_op//
dummy:u
# perf evlist -v
ibs_op//: type: 11, size: 136, config: 0, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT, read_format: ID, disabled: 1, inherit: 1, freq: 1, sample_id_all: 1
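The 'perf evlist -v' line above is the quickest confirmation that
'perf mem record' ended up on ibs_op//, and it is regular enough to pick
apart in a script. A rough Python sketch, assuming the
"name: key: value, key: value" layout seen in that output stays as shown;
parse_evlist_line() and sampling_mode() are hypothetical helpers, not perf
APIs:

```python
import re

# A field is either a bare word or a brace group such as
# "{ sample_period, sample_freq }", followed by ": " and a comma-free value.
FIELD_RE = re.compile(r"(\{[^}]*\}|\w+): ([^,]+)")


def parse_evlist_line(line):
    """Split one 'perf evlist -v' line into (event_name, attrs dict)."""
    # The event name ends at the first ": "; names like "dummy:u" contain a
    # colon but no colon-space, so they survive intact.
    name, _, rest = line.partition(": ")
    attrs = {key.strip(): value.strip() for key, value in FIELD_RE.findall(rest)}
    return name, attrs


def sampling_mode(attrs):
    """Return ('freq', n) or ('period', n) depending on the freq flag."""
    value = int(attrs["{ sample_period, sample_freq }"])
    # perf_event_attr.freq selects which member of the union is in use.
    return ("freq", value) if attrs.get("freq") == "1" else ("period", value)
```

For the line above this gives ('freq', 4000), i.e. 'perf mem record'
sampled ibs_op// at 4000 Hz rather than with a fixed period.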
More examples are available in the merge commit from when ibs_op support
was added to 'perf c2c' and 'perf mem':
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d465bff130bf4ca17b6980abe51164ace1e0cba4
Showing how you can use 'perf report -D' to extract info about these
samples should be interesting as well:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0429796e45ec17eee26d7a59de92271c275d7666
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=291dcb98d7ee5cd719f4c5991d977794b1829c16
> +EXAMPLES
> +--------
> +
> +IBS Op PMU
> +~~~~~~~~~~
> +
> +System-wide profile, cycles event, sampling period: 100000
> +
> + $ sudo perf record -e ibs_op// -c 100000 -a
> +
> +Per-cpu profile (cpu10), cycles event, sampling period: 100000
> +
> + $ sudo perf record -e ibs_op// -c 100000 -C 10
> +
> +Per-cpu profile (cpu10), cycles event, sampling freq: 1000
> +
> + $ sudo perf record -e ibs_op// -F 1000 -C 10
> +
> +System-wide profile, uOps event, sampling period: 100000
> +
> + $ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -a
> +
> +Same command, but also capture IBS register raw dump along with perf sample:
> +
> + $ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -a --raw-samples
> +
> +System-wide profile, uOps event, sampling period: 100000, L3MissOnly (Zen4 onward)
> +
> + $ sudo perf record -e ibs_op/cnt_ctl=1,l3missonly=1/ -c 100000 -a
> +
> +Per process(upstream v6.2 onward), uOps event, sampling period: 100000
> +
> + $ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -p 1234
> +
> +Per process(upstream v6.2 onward), uOps event, sampling period: 100000
> +
> + $ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -- ls
> +
> +To analyse recorded profile in aggregate mode
> +
> + $ sudo perf report
> + /* Select a line and press 'a' to drill down at instruction level. */
> +
> +To go over each sample
> +
> + $ sudo perf script
Here I think it would be good to have an example of such output.
> +
> +Raw dump of IBS registers when profiled with --raw-samples
> +
> + $ sudo perf report -D
> + /* Look for PERF_RECORD_SAMPLE */
Ditto
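While adding sample output, it may also help readers to see that all the
record examples in this section share one shape: a PMU name, optional
comma-separated terms between the slashes, and either -c or -F. A small
Python sketch of that composition, using only terms that appear in this
patch (cnt_ctl, l3missonly, rand_en); ibs_event() and record_argv() are
illustrative names, not part of perf:

```python
def ibs_event(pmu, **terms):
    """Build an event string such as 'ibs_op/cnt_ctl=1,l3missonly=1/'."""
    if pmu not in ("ibs_op", "ibs_fetch"):
        raise ValueError(f"not an IBS PMU: {pmu}")
    # Keyword order is preserved, so terms appear as they were passed.
    body = ",".join(f"{key}={value}" for key, value in terms.items())
    return f"{pmu}/{body}/"


def record_argv(event, period=None, freq=None, target=("-a",)):
    """Return the 'perf record' argument list for one IBS event.

    Exactly one of period (-c) or freq (-F) must be given; target defaults
    to system-wide (-a) and can be e.g. ("-C", "10") or ("-p", "1234").
    """
    if (period is None) == (freq is None):
        raise ValueError("pass exactly one of period= or freq=")
    rate = ["-c", str(period)] if period is not None else ["-F", str(freq)]
    return ["perf", "record", "-e", event, *rate, *target]
```

For instance, record_argv(ibs_event("ibs_op", cnt_ctl=1), period=100000)
reproduces the system-wide uOps example, and passing target=("-C", "10")
with freq=1000 reproduces the per-cpu frequency one.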
> +
> +IBS applied in a real world usecase
> +
> +~90% regression was observed in tbench with specific scheduler hint which
> +was counter intuitive. IBS profile of good and bad run captured using perf
> +helped in identifying exact cause of the problem:
> +
> + https://lore.kernel.org/r/20220921063638.2489-1-kprateek.nayak@amd.com
> +
> +IBS Fetch PMU
> +~~~~~~~~~~~~~
> +
> +Similar commands can be used with Fetch PMU as well.
> +
> +System-wide profile, fetch ops event, sampling period: 100000
> +
> + $ sudo perf record -e ibs_fetch// -c 100000 -a
> +
> +System-wide profile, fetch ops event, sampling period: 100000, Random enable
> +
> + $ sudo perf record -e ibs_fetch/rand_en=1/ -c 100000 -a
> +
> +etc.
> +
> +SEE ALSO
> +--------
> +
> +linkperf:perf-record[1], linkperf:perf-script[1], linkperf:perf-report[1]
perf-mem, perf-c2c
> diff --git a/tools/perf/Documentation/perf.txt b/tools/perf/Documentation/perf.txt
> index 09f516f3fdfb..cbcc2e4d557e 100644
> --- a/tools/perf/Documentation/perf.txt
> +++ b/tools/perf/Documentation/perf.txt
> @@ -82,7 +82,8 @@ linkperf:perf-stat[1], linkperf:perf-top[1],
> linkperf:perf-record[1], linkperf:perf-report[1],
> linkperf:perf-list[1]
>
> -linkperf:perf-annotate[1],linkperf:perf-archive[1],linkperf:perf-arm-spe[1],
> +linkperf:perf-amd-ibs[1], linkperf:perf-annotate[1],
> +linkperf:perf-archive[1], linkperf:perf-arm-spe[1],
> linkperf:perf-bench[1], linkperf:perf-buildid-cache[1],
> linkperf:perf-buildid-list[1], linkperf:perf-c2c[1],
> linkperf:perf-config[1], linkperf:perf-data[1], linkperf:perf-diff[1],
> --
> 2.45.2
* Re: [PATCH] perf doc: Add AMD IBS usage document
2024-06-19 9:22 [PATCH] perf doc: Add AMD IBS usage document Ravi Bangoria
2024-06-19 13:28 ` Arnaldo Carvalho de Melo
@ 2024-06-19 14:51 ` Namhyung Kim
2024-06-19 15:56 ` Ravi Bangoria
1 sibling, 1 reply; 5+ messages in thread
From: Namhyung Kim @ 2024-06-19 14:51 UTC
To: Ravi Bangoria
Cc: acme, irogers, peterz, mingo, mark.rutland, alexander.shishkin,
jolsa, adrian.hunter, kan.liang, yangjihong1, linux-kernel,
linux-perf-users, sandipan.das, ananth.narayan, santosh.shukla,
Stephane Eranian
Hello,
Adding Stephane to CC.
On Wed, Jun 19, 2024 at 2:23 AM Ravi Bangoria <ravi.bangoria@amd.com> wrote:
>
> Add a perf man page document that describes how to exploit AMD IBS with
> Linux perf. Brief intro about IBS and simple one-liner examples will help
> naive users to get started. This is not meant to be an exhaustive IBS
> guide. User should refer latest AMD64 Architecture Programmer's Manual
> for detailed description of IBS.
>
> Usage:
>
> $ man perf-amd-ibs
>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Thanks a lot for adding this documentation! A nitpick below..
> ---
> tools/perf/Documentation/perf-amd-ibs.txt | 126 ++++++++++++++++++++++
> tools/perf/Documentation/perf.txt | 3 +-
> 2 files changed, 128 insertions(+), 1 deletion(-)
> create mode 100644 tools/perf/Documentation/perf-amd-ibs.txt
>
> diff --git a/tools/perf/Documentation/perf-amd-ibs.txt b/tools/perf/Documentation/perf-amd-ibs.txt
> new file mode 100644
> index 000000000000..d3dfa71e320c
> --- /dev/null
> +++ b/tools/perf/Documentation/perf-amd-ibs.txt
> @@ -0,0 +1,126 @@
> +perf-amd-ibs(1)
> +===============
> +
> +NAME
> +----
> +perf-amd-ibs - Support for AMD Instruction-Based Sampling with perf tool
> +
> +SYNOPSIS
> +--------
> +[verse]
> +'perf record' -e ibs_op//
> +'perf record' -e ibs_fetch//
> +
> +DESCRIPTION
> +-----------
> +
> +Instruction-Based Sampling (IBS) provides precise Instruction Pointer (IP)
> +profiling support on AMD platforms. IBS has two independent components: IBS
> +Op and IBS Fetch. IBS Op sampling provides information about instruction
> +execution (micro-op execution to be precise) with details like d-cache
> +hit/miss, d-TLB hit/miss, cache miss latency, load/store data source, branch
> +behavior etc. IBS Fetch sampling provides information about instruction fetch
> +with details like i-cache hit/miss, i-TLB hit/miss, fetch latency etc. IBS is
> +per-smt-thread i.e. each SMT hardware thread contains standalone IBS units.
> +
> +Both, IBS Op and IBS Fetch, are exposed as PMUs by Linux and can be exploited
> +using Linux perf utility. Following files will be created at boot time if IBS
> +is supported by the hardware and kernel.
> +
> + /sys/bus/event_source/devices/ibs_op/
> + /sys/bus/event_source/devices/ibs_fetch/
> +
> +IBS Op PMU supports two events: cycles and micro ops. IBS Fetch PMU supports
> +one event: fetch ops.
> +
> +IBS VS. REGULAR CORE PMU
> +------------------------
> +
> +IBS gives samples with precise IP, i.e. the IP recorded with IBS sample has
> +no skid. Whereas the IP recorded by regular core PMU will have some skid
> +(sample was generated at IP X but perf would record it at IP X+n). Hence,
> +regular core PMU might not help for profiling with instruction level
> +precision. Further, IBS provides additional information about the sample in
> +question. On the other hand, regular core PMU has it's own advantages like
> +plethora of events, counting mode (less interference), up to 6 parallel
> +counters, event grouping support, filtering capabilities etc.
> +
> +EXAMPLES
> +--------
> +
> +IBS Op PMU
> +~~~~~~~~~~
> +
> +System-wide profile, cycles event, sampling period: 100000
> +
> + $ sudo perf record -e ibs_op// -c 100000 -a
> +
> +Per-cpu profile (cpu10), cycles event, sampling period: 100000
> +
> + $ sudo perf record -e ibs_op// -c 100000 -C 10
> +
> +Per-cpu profile (cpu10), cycles event, sampling freq: 1000
> +
> + $ sudo perf record -e ibs_op// -F 1000 -C 10
> +
> +System-wide profile, uOps event, sampling period: 100000
> +
> + $ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -a
> +
> +Same command, but also capture IBS register raw dump along with perf sample:
> +
> + $ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -a --raw-samples
> +
> +System-wide profile, uOps event, sampling period: 100000, L3MissOnly (Zen4 onward)
> +
> + $ sudo perf record -e ibs_op/cnt_ctl=1,l3missonly=1/ -c 100000 -a
> +
> +Per process(upstream v6.2 onward), uOps event, sampling period: 100000
> +
> + $ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -p 1234
> +
> +Per process(upstream v6.2 onward), uOps event, sampling period: 100000
> +
> + $ sudo perf record -e ibs_op/cnt_ctl=1/ -c 100000 -- ls
> +
> +To analyse recorded profile in aggregate mode
> +
> + $ sudo perf report
> + /* Select a line and press 'a' to drill down at instruction level. */
> +
> +To go over each sample
> +
> + $ sudo perf script
> +
> +Raw dump of IBS registers when profiled with --raw-samples
> +
> + $ sudo perf report -D
> + /* Look for PERF_RECORD_SAMPLE */
> +
> +IBS applied in a real world usecase
> +
> +~90% regression was observed in tbench with specific scheduler hint which
> +was counter intuitive. IBS profile of good and bad run captured using perf
> +helped in identifying exact cause of the problem:
> +
> + https://lore.kernel.org/r/20220921063638.2489-1-kprateek.nayak@amd.com
> +
> +IBS Fetch PMU
> +~~~~~~~~~~~~~
> +
> +Similar commands can be used with Fetch PMU as well.
> +
> +System-wide profile, fetch ops event, sampling period: 100000
> +
> + $ sudo perf record -e ibs_fetch// -c 100000 -a
> +
> +System-wide profile, fetch ops event, sampling period: 100000, Random enable
Can you please add a brief description of what 'random enable' means?
Thanks,
Namhyung
> +
> + $ sudo perf record -e ibs_fetch/rand_en=1/ -c 100000 -a
> +
> +etc.
> +
> +SEE ALSO
> +--------
> +
> +linkperf:perf-record[1], linkperf:perf-script[1], linkperf:perf-report[1]
> diff --git a/tools/perf/Documentation/perf.txt b/tools/perf/Documentation/perf.txt
> index 09f516f3fdfb..cbcc2e4d557e 100644
> --- a/tools/perf/Documentation/perf.txt
> +++ b/tools/perf/Documentation/perf.txt
> @@ -82,7 +82,8 @@ linkperf:perf-stat[1], linkperf:perf-top[1],
> linkperf:perf-record[1], linkperf:perf-report[1],
> linkperf:perf-list[1]
>
> -linkperf:perf-annotate[1],linkperf:perf-archive[1],linkperf:perf-arm-spe[1],
> +linkperf:perf-amd-ibs[1], linkperf:perf-annotate[1],
> +linkperf:perf-archive[1], linkperf:perf-arm-spe[1],
> linkperf:perf-bench[1], linkperf:perf-buildid-cache[1],
> linkperf:perf-buildid-list[1], linkperf:perf-c2c[1],
> linkperf:perf-config[1], linkperf:perf-data[1], linkperf:perf-diff[1],
> --
> 2.45.2
>