From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ayush Jain <ayush.jain3@amd.com>, Ian Rogers <irogers@google.com>
Cc: Sandipan Das <sandipan.das@amd.com>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
peterz@infradead.org, Ingo Molnar <mingo@kernel.org>,
mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
Adrian Hunter <adrian.hunter@intel.com>,
kjain@linux.ibm.com, atrajeev@linux.vnet.ibm.com,
barnali@linux.ibm.com, ananth.narayan@amd.com,
ravi.bangoria@amd.com, santosh.shukla@amd.com
Subject: Re: [PATCH] perf test: Retry without grouping for all metrics test
Date: Wed, 6 Dec 2023 10:08:31 -0300
Message-ID: <ZXByT1K6enTh2EHT@kernel.org>
In-Reply-To: <1320e6e3-c029-2a8c-e8b7-2cfbb781518a@amd.com>
On Wed, Jun 14, 2023 at 05:08:21PM +0530, Ayush Jain wrote:
> On 6/14/2023 2:37 PM, Sandipan Das wrote:
> > There are cases where a metric uses more events than the number of
> > counters. E.g. AMD Zen, Zen 2 and Zen 3 processors have four data fabric
> > counters but the "nps1_die_to_dram" metric has eight events. By default,
> > the constituent events are placed in a group. Since the events cannot be
> > scheduled at the same time, the metric is not computed. The all metrics
> > test also fails because of this.
Hmm, I'm not able to reproduce the problem here, before applying
this patch:
[root@five ~]# grep -m1 "model name" /proc/cpuinfo
model name : AMD Ryzen 9 5950X 16-Core Processor
[root@five ~]# perf test -vvv "perf all metrics test"
104: perf all metrics test :
--- start ---
test child forked, pid 1379713
Testing branch_misprediction_ratio
Testing all_remote_links_outbound
Testing nps1_die_to_dram
Testing macro_ops_dispatched
Testing all_l2_cache_accesses
Testing all_l2_cache_hits
Testing all_l2_cache_misses
Testing ic_fetch_miss_ratio
Testing l2_cache_accesses_from_l2_hwpf
Testing l2_cache_misses_from_l2_hwpf
Testing op_cache_fetch_miss_ratio
Testing l3_read_miss_latency
Testing l1_itlb_misses
test child finished with 0
---- end ----
perf all metrics test: Ok
[root@five ~]#
[root@five ~]# perf stat -M nps1_die_to_dram -a sleep 2
Performance counter stats for 'system wide':
0 dram_channel_data_controller_4 # 10885.3 MiB nps1_die_to_dram (49.96%)
31,334,338 dram_channel_data_controller_1 (50.01%)
0 dram_channel_data_controller_6 (50.04%)
54,679,601 dram_channel_data_controller_3 (50.04%)
38,420,402 dram_channel_data_controller_0 (50.04%)
0 dram_channel_data_controller_5 (49.99%)
54,012,661 dram_channel_data_controller_2 (49.96%)
0 dram_channel_data_controller_7 (49.96%)
2.001465439 seconds time elapsed
[root@five ~]#
[root@five ~]# perf stat -v -M nps1_die_to_dram -a sleep 2
Using CPUID AuthenticAMD-25-21-0
metric expr dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7 for nps1_die_to_dram
found event dram_channel_data_controller_4
found event dram_channel_data_controller_1
found event dram_channel_data_controller_6
found event dram_channel_data_controller_3
found event dram_channel_data_controller_0
found event dram_channel_data_controller_5
found event dram_channel_data_controller_2
found event dram_channel_data_controller_7
Parsing metric events 'dram_channel_data_controller_4/metric-id=dram_channel_data_controller_4/,dram_channel_data_controller_1/metric-id=dram_channel_data_controller_1/,dram_channel_data_controller_6/metric-id=dram_channel_data_controller_6/,dram_channel_data_controller_3/metric-id=dram_channel_data_controller_3/,dram_channel_data_controller_0/metric-id=dram_channel_data_controller_0/,dram_channel_data_controller_5/metric-id=dram_channel_data_controller_5/,dram_channel_data_controller_2/metric-id=dram_channel_data_controller_2/,dram_channel_data_controller_7/metric-id=dram_channel_data_controller_7/'
dram_channel_data_controller_4 -> amd_df/metric-id=dram_channel_data_controller_4,dram_channel_data_controller_4/
dram_channel_data_controller_1 -> amd_df/metric-id=dram_channel_data_controller_1,dram_channel_data_controller_1/
Multiple errors dropping message: Cannot find PMU `dram_channel_data_controller_1'. Missing kernel support? (<no help>)
dram_channel_data_controller_6 -> amd_df/metric-id=dram_channel_data_controller_6,dram_channel_data_controller_6/
Multiple errors dropping message: Cannot find PMU `dram_channel_data_controller_6'. Missing kernel support? (<no help>)
dram_channel_data_controller_3 -> amd_df/metric-id=dram_channel_data_controller_3,dram_channel_data_controller_3/
Multiple errors dropping message: Cannot find PMU `dram_channel_data_controller_3'. Missing kernel support? (<no help>)
dram_channel_data_controller_0 -> amd_df/metric-id=dram_channel_data_controller_0,dram_channel_data_controller_0/
Multiple errors dropping message: Cannot find PMU `dram_channel_data_controller_0'. Missing kernel support? (<no help>)
dram_channel_data_controller_5 -> amd_df/metric-id=dram_channel_data_controller_5,dram_channel_data_controller_5/
Multiple errors dropping message: Cannot find PMU `dram_channel_data_controller_5'. Missing kernel support? (<no help>)
dram_channel_data_controller_2 -> amd_df/metric-id=dram_channel_data_controller_2,dram_channel_data_controller_2/
Multiple errors dropping message: Cannot find PMU `dram_channel_data_controller_2'. Missing kernel support? (<no help>)
dram_channel_data_controller_7 -> amd_df/metric-id=dram_channel_data_controller_7,dram_channel_data_controller_7/
Matched metric-id dram_channel_data_controller_4 to dram_channel_data_controller_4
Matched metric-id dram_channel_data_controller_1 to dram_channel_data_controller_1
Matched metric-id dram_channel_data_controller_6 to dram_channel_data_controller_6
Matched metric-id dram_channel_data_controller_3 to dram_channel_data_controller_3
Matched metric-id dram_channel_data_controller_0 to dram_channel_data_controller_0
Matched metric-id dram_channel_data_controller_5 to dram_channel_data_controller_5
Matched metric-id dram_channel_data_controller_2 to dram_channel_data_controller_2
Matched metric-id dram_channel_data_controller_7 to dram_channel_data_controller_7
Control descriptor is not initialized
dram_channel_data_controller_4: 0 2001175127 999996394
dram_channel_data_controller_1: 32346663 2001169897 1000709803
dram_channel_data_controller_6: 0 2001168377 1001193443
dram_channel_data_controller_3: 47551247 2001166947 1001198122
dram_channel_data_controller_0: 38975242 2001165217 1001182923
dram_channel_data_controller_5: 0 2001163067 1000464054
dram_channel_data_controller_2: 49934162 2001160907 999974934
dram_channel_data_controller_7: 0 2001150317 999968825
Performance counter stats for 'system wide':
0 dram_channel_data_controller_4 # 10297.2 MiB nps1_die_to_dram (49.97%)
32,346,663 dram_channel_data_controller_1 (50.01%)
0 dram_channel_data_controller_6 (50.03%)
47,551,247 dram_channel_data_controller_3 (50.03%)
38,975,242 dram_channel_data_controller_0 (50.03%)
0 dram_channel_data_controller_5 (49.99%)
49,934,162 dram_channel_data_controller_2 (49.97%)
0 dram_channel_data_controller_7 (49.97%)
2.001196512 seconds time elapsed
[root@five ~]#
What am I missing?
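For reference, the percentages in the rightmost column can be recomputed from the -v lines above, which appear to be laid out as "name: count enabled running". A minimal sketch, assuming that layout; pct_running is a hypothetical helper name, not something perf provides:

```sh
# Hypothetical helper (not part of perf): reads `perf stat -v` output on
# stdin and, for each "name: count enabled running" line, prints the share
# of the enabled time during which the event was actually on a counter.
pct_running() {
    awk 'NF == 4 && $1 ~ /:$/ && $3 + 0 > 0 {
        sub(/:$/, "", $1)                       # drop the trailing colon
        printf "%s %.2f%%\n", $1, 100 * $4 / $3 # running / enabled
    }'
}
```

It could be used as "perf stat -v -M nps1_die_to_dram -a sleep 2 2>&1 | pct_running". Values close to 50% for all eight events would be consistent with the events being multiplexed over the four counters rather than scheduled together as one group.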
Ian, I also stumbled on this:
[root@five ~]# perf stat -M dram_channel_data_controller_4
Cannot find metric or group `dram_channel_data_controller_4'
^C
Performance counter stats for 'system wide':
284,908.91 msec cpu-clock # 32.002 CPUs utilized
6,485,456 context-switches # 22.763 K/sec
719 cpu-migrations # 2.524 /sec
32,800 page-faults # 115.125 /sec
189,779,273,552 cycles # 0.666 GHz (83.33%)
2,893,165,259 stalled-cycles-frontend # 1.52% frontend cycles idle (83.33%)
24,807,157,349 stalled-cycles-backend # 13.07% backend cycles idle (83.33%)
99,286,488,807 instructions # 0.52 insn per cycle
# 0.25 stalled cycles per insn (83.33%)
24,120,737,678 branches # 84.661 M/sec (83.33%)
1,907,540,278 branch-misses # 7.91% of all branches (83.34%)
8.902784776 seconds time elapsed
[root@five ~]#
[root@five ~]# perf stat -e dram_channel_data_controller_4
^C
Performance counter stats for 'system wide':
0 dram_channel_data_controller_4
1.189638741 seconds time elapsed
[root@five ~]#
I.e. -M should bail out as soon as it prints "Cannot find metric or group `dram_channel_data_controller_4'", instead of going on to count the default events, no?
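Until that is sorted out, a script can guard against it by checking the name against the metric list first. A minimal sketch; metric_known is a hypothetical helper, and the list is passed in as arguments so the check itself does not depend on perf:

```sh
# Hypothetical helper: succeeds if the first argument equals one of the
# remaining arguments (exact match, no prefix or group matching).
metric_known() {
    mk_wanted=$1
    shift
    for mk_name in "$@"; do
        [ "$mk_name" = "$mk_wanted" ] && return 0
    done
    return 1
}
```

E.g. "metric_known "$m" $(perf list --raw-dump metrics) && perf stat -M "$m" -a sleep 2", using the same "perf list --raw-dump metrics" invocation the test script loops over. Note that this only covers metrics, not metric groups.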
- Arnaldo
> > Before announcing failure, the test can try multiple options for each
> > available metric. After system-wide mode fails, retry once again with
> > the "--metric-no-group" option.
> >
> > E.g.
> >
> > $ sudo perf test -v 100
> >
> > Before:
> >
> > 100: perf all metrics test :
> > --- start ---
> > test child forked, pid 672731
> > Testing branch_misprediction_ratio
> > Testing all_remote_links_outbound
> > Testing nps1_die_to_dram
> > Metric 'nps1_die_to_dram' not printed in:
> > Error:
> > Invalid event (dram_channel_data_controller_4) in per-thread mode, enable system wide with '-a'.
> > Testing macro_ops_dispatched
> > Testing all_l2_cache_accesses
> > Testing all_l2_cache_hits
> > Testing all_l2_cache_misses
> > Testing ic_fetch_miss_ratio
> > Testing l2_cache_accesses_from_l2_hwpf
> > Testing l2_cache_misses_from_l2_hwpf
> > Testing op_cache_fetch_miss_ratio
> > Testing l3_read_miss_latency
> > Testing l1_itlb_misses
> > test child finished with -1
> > ---- end ----
> > perf all metrics test: FAILED!
> >
> > After:
> >
> > 100: perf all metrics test :
> > --- start ---
> > test child forked, pid 672887
> > Testing branch_misprediction_ratio
> > Testing all_remote_links_outbound
> > Testing nps1_die_to_dram
> > Testing macro_ops_dispatched
> > Testing all_l2_cache_accesses
> > Testing all_l2_cache_hits
> > Testing all_l2_cache_misses
> > Testing ic_fetch_miss_ratio
> > Testing l2_cache_accesses_from_l2_hwpf
> > Testing l2_cache_misses_from_l2_hwpf
> > Testing op_cache_fetch_miss_ratio
> > Testing l3_read_miss_latency
> > Testing l1_itlb_misses
> > test child finished with 0
> > ---- end ----
> > perf all metrics test: Ok
> >
>
> Issue gets resolved after applying this patch
>
> $ ./perf test 102 -vvv
> $102: perf all metrics test :
> $--- start ---
> $test child forked, pid 244991
> $Testing branch_misprediction_ratio
> $Testing all_remote_links_outbound
> $Testing nps1_die_to_dram
> $Testing all_l2_cache_accesses
> $Testing all_l2_cache_hits
> $Testing all_l2_cache_misses
> $Testing ic_fetch_miss_ratio
> $Testing l2_cache_accesses_from_l2_hwpf
> $Testing l2_cache_misses_from_l2_hwpf
> $Testing l3_read_miss_latency
> $Testing l1_itlb_misses
> $test child finished with 0
> $---- end ----
> $perf all metrics test: Ok
>
> > Reported-by: Ayush Jain <ayush.jain3@amd.com>
> > Signed-off-by: Sandipan Das <sandipan.das@amd.com>
>
> Tested-by: Ayush Jain <ayush.jain3@amd.com>
>
> > ---
> > tools/perf/tests/shell/stat_all_metrics.sh | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/tools/perf/tests/shell/stat_all_metrics.sh b/tools/perf/tests/shell/stat_all_metrics.sh
> > index 54774525e18a..1e88ea8c5677 100755
> > --- a/tools/perf/tests/shell/stat_all_metrics.sh
> > +++ b/tools/perf/tests/shell/stat_all_metrics.sh
> > @@ -16,6 +16,13 @@ for m in $(perf list --raw-dump metrics); do
> > then
> > continue
> > fi
> > + # Failed again, possibly there are not enough counters so retry system wide
> > + # mode but without event grouping.
> > + result=$(perf stat -M "$m" --metric-no-group -a sleep 0.01 2>&1)
> > + if [[ "$result" =~ ${m:0:50} ]]
> > + then
> > + continue
> > + fi
> > # Failed again, possibly the workload was too small so retry with something
> > # longer.
> > result=$(perf stat -M "$m" perf bench internals synthesize 2>&1)
>
> Thanks & Regards,
> Ayush Jain
--
- Arnaldo