* [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
@ 2026-03-15 10:57 Athira Rajeev
2026-03-23 10:39 ` Venkat
0 siblings, 1 reply; 11+ messages in thread
From: Athira Rajeev @ 2026-03-15 10:57 UTC (permalink / raw)
To: acme, jolsa, adrian.hunter, vmolnaro, mpetlan, tmricht, maddy,
irogers, namhyung
Cc: linux-perf-users, linuxppc-dev, atrajeev, hbathini, Tejas.Manhas1,
Tanushree.Shah, Shivani.Nittor
Currently, the "perf all PMU test" runs "perf stat -e <event> true"
and performs the following checks:
- If the return code is zero, look for "not supported" to decide the
  pass scenario.
- Check for "not supported" to ignore the event.
- Look for "No permission to enable" to skip the event.
- If the output has "Bad event name", fail the test.
- Use "Access to performance monitoring and observability operations is
  limited." to ignore failures due to access limitations.

If the event is not seen and it is supported, the test retries with a
longer workload, "perf bench internals synthesize".
- Here, if the output contains <event>, the test passes.

Snippet of code check:
```
output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
if echo "$output" | grep -q "$p"
```
- If the output does not contain the event, the test is considered a
  failure.

But this results in a false pass for events in some cases, for
example when perf stat fails as below:

# ./perf stat -e pmu/event/ true
event syntax error: 'pmu/event/'
\___ Bad event or PMU
Unable to find PMU or event on a PMU of 'pmu'
Run 'perf list' for a list of valid events
Usage: perf stat [<options>] [<command>]
-e, --event <event> event selector. use 'perf list' to list available events
# echo $?
129

Since this run has a non-zero return code and its output contains none
of the failure strings checked in the test, the test falls through to
the longer workload check. Because the failure log contains the event
name, the test declares the event as "supported".

Since not every failure string can be added to the check, update
the testcase to check the return code before proceeding to the longer
workload run.
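
The false pass described above can be reproduced without perf. The following is a minimal sketch, assuming a hard-coded copy of the error text and exit status from the example; the variables `old_verdict` and `new_verdict` are illustrative only and are not part of stat_all_pmu.sh.

```shell
# Stand-in for the failing "perf stat -e pmu/event/ true" run shown above.
# The text and exit status are copied from the example, not produced by perf.
p="pmu/event/"
output="event syntax error: 'pmu/event/'
Unable to find PMU or event on a PMU of 'pmu'"
stat_result=129

# Existing check: only looks for the event name in the output, so the
# error message itself satisfies it.
if printf '%s\n' "$output" | grep -q "$p"; then
  old_verdict="supported"
else
  old_verdict="failed"
fi

# Same check, but with the return code consulted first.
if [ "$stat_result" -ne 0 ]; then
  new_verdict="failed"
elif printf '%s\n' "$output" | grep -q "$p"; then
  new_verdict="supported"
else
  new_verdict="failed"
fi

echo "old check: $old_verdict, with return code check: $new_verdict"
```

The only difference between the two branches is the order of the tests: once the exit status is examined before the grep, the event name appearing in an error message can no longer count as success.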

Another missing scenario is when the event is supported only with
system-wide monitoring. For example:
# ./perf stat -e pmu/event/ true
Error:
No supported events found.
Unsupported event (pmu/event/H) in per-thread mode, enable system wide with '-a'.

Update the testcase to check with "perf stat -a -e $p" as well.

Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
---
tools/perf/tests/shell/stat_all_pmu.sh | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/tools/perf/tests/shell/stat_all_pmu.sh b/tools/perf/tests/shell/stat_all_pmu.sh
index 9c466c0efa85..6c4d59cbfa5f 100755
--- a/tools/perf/tests/shell/stat_all_pmu.sh
+++ b/tools/perf/tests/shell/stat_all_pmu.sh
@@ -53,6 +53,26 @@ do
continue
fi
+ # check with system wide if it is supported.
+ output=$(perf stat -a -e "$p" true 2>&1)
+ stat_result=$?
+ if echo "$output" | grep -q "not supported"
+ then
+ # Event not supported, so ignore.
+ echo "not supported"
+ continue
+ fi
+
+ # Possible access limitations and permissions have been checked.
+ # At this step, a non-zero return code from "perf stat" needs to
+ # be reported as a failure for the user to investigate.
+ if [ $stat_result -ne 0 ]
+ then
+ echo "perf stat failed with non-zero return code"
+ err=1
+ continue
+ fi
+
# We failed to see the event and it is supported. Possibly the workload was
# too small so retry with something longer.
output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
--
2.47.3
^ permalink raw reply related [flat|nested] 11+ messages in thread

* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
2026-03-15 10:57 [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test Athira Rajeev
@ 2026-03-23 10:39 ` Venkat
2026-04-01 20:40 ` Ian Rogers
0 siblings, 1 reply; 11+ messages in thread
From: Venkat @ 2026-03-23 10:39 UTC (permalink / raw)
To: Athira Rajeev
Cc: acme, jolsa, adrian.hunter, vmolnaro, mpetlan, tmricht, maddy,
irogers, namhyung, linux-perf-users, linuxppc-dev, hbathini,
Tejas.Manhas1, Tanushree.Shah, Shivani.Nittor

> On 15 Mar 2026, at 4:27 PM, Athira Rajeev <atrajeev@linux.ibm.com> wrote:
>
> Currently in "perf all PMU test", for "perf stat -e <event> true",
> below checks are done:
> - if return code is zero, look for "not supported" to decide pass
> scenario
> - check for "not supported" to ignore the event
> - looks for "No permission to enable" to skip the event.
> - If output has "Bad event name", fail the test.
> - Use "Access to performance monitoring and observability operations is
> limited." to ignore fail due to access limitations
>
> If we failed to see event and it is supported, retries with longer
> workload "perf bench internals synthesize".
> - Here if output has <event>, the test is a pass.
>
> Snippet of code check:
> ```
> output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
> if echo "$output" | grep -q "$p"
> ```
> - if output doesn't have event printed in logs, considers it fail.
>
> But this results in false pass for events in some cases.
> Example, if perf stat fails as below:
>
> # ./perf stat -e pmu/event/ true
> event syntax error: 'pmu/event/'
> \___ Bad event or PMU
>
> Unable to find PMU or event on a PMU of 'pmu'
> Run 'perf list' for a list of valid events
>
> Usage: perf stat [<options>] [<command>]
>
> -e, --event <event> event selector. use 'perf list' to list available events
> # echo $?
> 129
>
> Since this has non-zero return code and doesn't have the
> fail strings being checked in the test, it will enter check using
> longer workload. and since the output fail log has event, it
> declares test as "supported".
>
> Since all the fail strings can't be added in the check, update
> the testcase to check return code before proceeding to longer
> workload run.
>
> Another missing scenario is when system wide monitoring is supported
> example:
> # ./perf stat -e pmu/event/ true
> Error:
> No supported events found.
> Unsupported event (pmu/event/H) in per-thread mode, enable system wide with '-a'.
>
> Update testcase to check with "perf stat -a -e $p" as well
>
> Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
> ---

Tested this patch.

With this patch:

Testing hv_24x7/CPM_ADJUNCT_INST/ -- perf stat failed with non-zero return code
Testing hv_24x7/CPM_ADJUNCT_PCYC/ -- perf stat failed with non-zero return code

Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>

Regards,
Venkat.

> tools/perf/tests/shell/stat_all_pmu.sh | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/tools/perf/tests/shell/stat_all_pmu.sh b/tools/perf/tests/shell/stat_all_pmu.sh
> index 9c466c0efa85..6c4d59cbfa5f 100755
> --- a/tools/perf/tests/shell/stat_all_pmu.sh
> +++ b/tools/perf/tests/shell/stat_all_pmu.sh
> @@ -53,6 +53,26 @@ do
> continue
> fi
>
> + # check with system wide if it is supported.
> + output=$(perf stat -a -e "$p" true 2>&1)
> + stat_result=$?
> + if echo "$output" | grep -q "not supported"
> + then
> + # Event not supported, so ignore.
> + echo "not supported"
> + continue
> + fi
> +
> + # checked through possible access limitations and permissions.
> + # At this step, non-zero return code from "perf stat" needs to
> + # reported as fail for the user to investigate
> + if [ $stat_result -ne 0 ]
> + then
> + echo "perf stat failed with non-zero return code"
> + err=1
> + continue
> + fi
> +
> # We failed to see the event and it is supported. Possibly the workload was
> # too small so retry with something longer.
> output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
> --
> 2.47.3
>

^ permalink raw reply [flat|nested] 11+ messages in thread

* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
2026-03-23 10:39 ` Venkat
@ 2026-04-01 20:40 ` Ian Rogers
2026-04-01 23:48 ` Namhyung Kim
2026-04-02 17:32 ` Falcon, Thomas
0 siblings, 2 replies; 11+ messages in thread
From: Ian Rogers @ 2026-04-01 20:40 UTC (permalink / raw)
To: Venkat, Athira Rajeev, Thomas Falcon
Cc: acme, jolsa, adrian.hunter, vmolnaro, mpetlan, tmricht, maddy,
namhyung, linux-perf-users, linuxppc-dev, hbathini, Tejas.Manhas1,
Tanushree.Shah, Shivani.Nittor

On Mon, Mar 23, 2026 at 3:40 AM Venkat <venkat88@linux.ibm.com> wrote:
>
> > On 15 Mar 2026, at 4:27 PM, Athira Rajeev <atrajeev@linux.ibm.com> wrote:
> >
> > Currently in "perf all PMU test", for "perf stat -e <event> true",
> > below checks are done:
> > - if return code is zero, look for "not supported" to decide pass
> > scenario
> > - check for "not supported" to ignore the event
> > - looks for "No permission to enable" to skip the event.
> > - If output has "Bad event name", fail the test.
> > - Use "Access to performance monitoring and observability operations is
> > limited." to ignore fail due to access limitations
> >
> > If we failed to see event and it is supported, retries with longer
> > workload "perf bench internals synthesize".
> > - Here if output has <event>, the test is a pass.
> >
> > Snippet of code check:
> > ```
> > output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
> > if echo "$output" | grep -q "$p"
> > ```
> > - if output doesn't have event printed in logs, considers it fail.
> >
> > But this results in false pass for events in some cases.
> > Example, if perf stat fails as below:
> >
> > # ./perf stat -e pmu/event/ true
> > event syntax error: 'pmu/event/'
> > \___ Bad event or PMU
> >
> > Unable to find PMU or event on a PMU of 'pmu'
> > Run 'perf list' for a list of valid events
> >
> > Usage: perf stat [<options>] [<command>]
> >
> > -e, --event <event> event selector. use 'perf list' to list available events
> > # echo $?
> > 129
> >
> > Since this has non-zero return code and doesn't have the
> > fail strings being checked in the test, it will enter check using
> > longer workload. and since the output fail log has event, it
> > declares test as "supported".
> >
> > Since all the fail strings can't be added in the check, update
> > the testcase to check return code before proceeding to longer
> > workload run.
> >
> > Another missing scenario is when system wide monitoring is supported
> > example:
> > # ./perf stat -e pmu/event/ true
> > Error:
> > No supported events found.
> > Unsupported event (pmu/event/H) in per-thread mode, enable system wide with '-a'.
> >
> > Update testcase to check with "perf stat -a -e $p" as well
> >
> > Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
> > ---
>
> Tested this patch.
>
> With this patch:
>
> Testing hv_24x7/CPM_ADJUNCT_INST/ -- perf stat failed with non-zero return code
> Testing hv_24x7/CPM_ADJUNCT_PCYC/ -- perf stat failed with non-zero return code
>
> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>

Testing on an Intel Alderlake the test is now failing:
```
...
Testing offcore_requests_outstanding.l3_miss_demand_data_rd -- supported
Testing ocr.full_streaming_wr.any_response -- perf stat failed with non-zero return code
Testing ocr.partial_streaming_wr.any_response -- perf stat failed with non-zero return code
Testing ocr.streaming_wr.any_response -- supported
...
```

Running `perf stat` manually reveals an issue with the event:
```
$ sudo perf stat -vv -e ocr.full_streaming_wr.any_response -a sleep 1
Using CPUID GenuineIntel-6-B7-1
Attempt to add: cpu_atom/ocr.full_streaming_wr.any_response/
..after resolving event: cpu_atom/event=0xb7,period=0x186a3,umask=0x1,offcore_rsp=0x800000010000/
ocr.full_streaming_wr.any_response -> cpu_atom/ocr.full_streaming_wr.any_response/
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
  type                             10 (cpu_atom)
  size                             144
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  config                           0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
  disabled                         1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  config                           0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
  disabled                         1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3
  config                           0x1b7 (ocr.demand_data_rd.l3_hit.snoop_hit_no_fwd)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  { bp_addr, config1 }             0x800000010000
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8
sys_perf_event_open failed, error -22
switching off deferred callchain support
Warning:
ocr.full_streaming_wr.any_response event is not supported by the kernel.
The sys_perf_event_open() syscall failed for event (ocr.full_streaming_wr.any_response): Invalid argument
"dmesg | grep -i perf" may provide additional information.

Error:
No supported events found.
The sys_perf_event_open() syscall failed for event (ocr.full_streaming_wr.any_response): Invalid argument
"dmesg | grep -i perf" may provide additional information.
```

This looks like a latent Intel cpu_atom PMU bug. Thomas, wdyt?

Thanks,
Ian

> Regards,
> Venkat.
>
> > tools/perf/tests/shell/stat_all_pmu.sh | 20 ++++++++++++++++++++
> > 1 file changed, 20 insertions(+)
> >
> > diff --git a/tools/perf/tests/shell/stat_all_pmu.sh b/tools/perf/tests/shell/stat_all_pmu.sh
> > index 9c466c0efa85..6c4d59cbfa5f 100755
> > --- a/tools/perf/tests/shell/stat_all_pmu.sh
> > +++ b/tools/perf/tests/shell/stat_all_pmu.sh
> > @@ -53,6 +53,26 @@ do
> > continue
> > fi
> >
> > + # check with system wide if it is supported.
> > + output=$(perf stat -a -e "$p" true 2>&1)
> > + stat_result=$?
> > + if echo "$output" | grep -q "not supported"
> > + then
> > + # Event not supported, so ignore.
> > + echo "not supported"
> > + continue
> > + fi
> > +
> > + # checked through possible access limitations and permissions.
> > + # At this step, non-zero return code from "perf stat" needs to
> > + # reported as fail for the user to investigate
> > + if [ $stat_result -ne 0 ]
> > + then
> > + echo "perf stat failed with non-zero return code"
> > + err=1
> > + continue
> > + fi
> > +
> > # We failed to see the event and it is supported. Possibly the workload was
> > # too small so retry with something longer.
> > output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
> > --
> > 2.47.3
> >
>

^ permalink raw reply [flat|nested] 11+ messages in thread

* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
2026-04-01 20:40 ` Ian Rogers
@ 2026-04-01 23:48 ` Namhyung Kim
2026-04-01 23:57 ` Ian Rogers
2026-04-02 17:32 ` Falcon, Thomas
1 sibling, 1 reply; 11+ messages in thread
From: Namhyung Kim @ 2026-04-01 23:48 UTC (permalink / raw)
To: Ian Rogers
Cc: Venkat, Athira Rajeev, Thomas Falcon, acme, jolsa, adrian.hunter,
vmolnaro, mpetlan, tmricht, maddy, linux-perf-users, linuxppc-dev,
hbathini, Tejas.Manhas1, Tanushree.Shah, Shivani.Nittor

On Wed, Apr 01, 2026 at 01:40:47PM -0700, Ian Rogers wrote:
> This looks like a latent Intel cpu_atom PMU bug. Thomas, wdyt?

Are you ok with the change itself then?

I'm not sure what's the expectation when the test runs with a regular
user. I assume the intention of this change is running as root..

Thanks,
Namhyung

^ permalink raw reply [flat|nested] 11+ messages in thread

* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
2026-04-01 23:48 ` Namhyung Kim
@ 2026-04-01 23:57 ` Ian Rogers
2026-04-02 15:45 ` Athira Rajeev
0 siblings, 1 reply; 11+ messages in thread
From: Ian Rogers @ 2026-04-01 23:57 UTC (permalink / raw)
To: Namhyung Kim
Cc: Venkat, Athira Rajeev, Thomas Falcon, acme, jolsa, adrian.hunter,
vmolnaro, mpetlan, tmricht, maddy, linux-perf-users, linuxppc-dev,
hbathini, Tejas.Manhas1, Tanushree.Shah, Shivani.Nittor

On Wed, Apr 1, 2026 at 4:48 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Wed, Apr 01, 2026 at 01:40:47PM -0700, Ian Rogers wrote:
> > This looks like a latent Intel cpu_atom PMU bug. Thomas, wdyt?
>
> Are you ok with the change itself then?
>
> I'm not sure what's the expectation when the test runs with a regular
> user. I assume the intention of this change is running as root..

So the test is failing as root on Intel, and the log output is
extensive. I'd prefer not to have the log output, so we may need to
work past some known broken items. Otherwise, the change looks okay,
but there are inconsistencies: `grep -q "<not supported>"` is used in
the existing code, while `grep -q "not supported"` is used here.

Thanks,
Ian

> Thanks,
> Namhyung
>

^ permalink raw reply [flat|nested] 11+ messages in thread

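
The two grep patterns mentioned in the message above do not match the same lines. A small sketch of the difference follows; `err_line` is copied from the `perf stat -vv` output earlier in this thread, while `counter_line` is only an assumed shape for the line perf stat prints for an unsupported counter, and the helper `matches` is hypothetical.

```shell
# Warning line copied from the perf stat -vv output earlier in the thread.
err_line='ocr.full_streaming_wr.any_response event is not supported by the kernel.'
# Assumed shape of a perf stat counter line for an unsupported event.
counter_line='   <not supported>      pmu/event/'

# Hypothetical helper: matches PATTERN TEXT, prints "yes" or "no".
matches() {
  if printf '%s\n' "$2" | grep -q -- "$1"; then echo yes; else echo no; fi
}

loose_on_err=$(matches "not supported" "$err_line")
strict_on_err=$(matches "<not supported>" "$err_line")
strict_on_counter=$(matches "<not supported>" "$counter_line")

echo "loose/err=$loose_on_err strict/err=$strict_on_err strict/counter=$strict_on_counter"
```

The looser pattern also matches free-form error text, so an event that fails for another reason but happens to print "not supported" would be silently ignored rather than reported.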
* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
2026-04-01 23:57 ` Ian Rogers
@ 2026-04-02 15:45 ` Athira Rajeev
2026-04-03 1:27 ` Namhyung Kim
0 siblings, 1 reply; 11+ messages in thread
From: Athira Rajeev @ 2026-04-02 15:45 UTC (permalink / raw)
To: Ian Rogers, Namhyung Kim
Cc: Venkat, Thomas Falcon, acme, jolsa, adrian.hunter, vmolnaro,
mpetlan, tmricht, maddy, linux-perf-users, linuxppc-dev, hbathini,
Tejas.Manhas1, Tanushree.Shah, Shivani.Nittor

> On 2 Apr 2026, at 5:27 AM, Ian Rogers <irogers@google.com> wrote:
>
> On Wed, Apr 1, 2026 at 4:48 PM Namhyung Kim <namhyung@kernel.org> wrote:
>>
>> On Wed, Apr 01, 2026 at 01:40:47PM -0700, Ian Rogers wrote:
>>> This looks like a latent Intel cpu_atom PMU bug. Thomas, wdyt?
>>
>> Are you ok with the change itself then?
>>
>> I'm not sure what's the expectation when the test runs with a regular
>> user. I assume the intention of this change is running as root..
>
> So the test is failing as root on Intel, and the log output is
> extensive. I'd prefer not to have the log output, so we may need to
> work past some known broken items. Otherwise, the change looks okay,
> but there are inconsistencies: `grep -q "<not supported>"` is used in
> the existing code, while `grep -q "not supported"` is used here.
>
> Thanks,
> Ian
>
>> Thanks,
>> Namhyung
>>

Hi Namhyung, Ian,

Thanks for trying the change and responding. Sorry, I missed the
comment from Sashiko. Next time I will check explicitly for Sashiko
comments too.

This patch is intended to address two things:

1) Earlier, the test resulted in a "false pass" if it fell back to the
last check, where we run with the longer workload. This is because the
error message contains the event name "$p" and hence matches the grep
check.

Example:

# ./perf stat -e hv_24x7/CPM_ADJUNCT_PCYC/ ./perf bench internals synthesize
event syntax error: 'hv_24x7/CPM_ADJUNCT_PCYC/'
\___ Bad event or PMU

Unable to find PMU or event on a PMU of 'hv_24x7'
Run 'perf list' for a list of valid events

Usage: perf stat [<options>] [<command>]

-e, --event <event> event selector. use 'perf list' to list available events
# echo $?
129

Here the test checks for:
<<>>
output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
if echo "$output" | grep -q "$p"
<<>>

Since the error message contains the event name, it matches the grep
check and the test declares the event as a pass. To catch this, the
patch adds a check for the return code:

+ # checked through possible access limitations and permissions.
+ # At this step, non-zero return code from "perf stat" needs to
+ # reported as fail for the user to investigate
+ if [ $stat_result -ne 0 ]
+ then
+ echo "perf stat failed with non-zero return code"
+ err=1
+ continue
+ fi

2) Events that are supported only in system-wide monitoring. Namhyung
is right that this will result in a failure for a regular user. I will
take care of adding a check around that.

The reason for using the "not supported" text here is as follows.

Example: there is an event on powerpc, "vpa_dtl/dtl_all/", which fails
as below when run with per-thread monitoring:

# ./perf stat -e vpa_dtl/dtl_all/ true
Error:
No supported events found.
Unsupported event (vpa_dtl/dtl_all/H) in per-thread mode, enable system wide with '-a'.

Next, running with system-wide monitoring:
# ./perf stat -a -e vpa_dtl/dtl_all/ true
Error:
No supported events found.
The sys_perf_event_open() syscall failed for event (vpa_dtl/dtl_all/H): Operation not supported
"dmesg | grep -i perf" may provide additional information.

This is because this event is supported only for sampling system wide,
not for counting. So the patch attempts to use "-a"; if that still
fails, we look for "not supported" in the logs.

Namhyung, Ian,
If the addition of "-a" can cause a regression (like the Intel one, if
the event is not supposed to be run system wide), how about adding a
case like this:
- Look for "enable system wide with '-a'" in the error logs.
- If the logs match this message and the user is root, attempt with
  "-a" next.
- With "-a", if the logs have "Operation not supported", the test can
  continue to the next event.

Thanks
Athira

^ permalink raw reply [flat|nested] 11+ messages in thread

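
The three-step proposal at the end of the message above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helpers `wants_system_wide` and `system_wide_unsupported` are hypothetical names that do not exist in stat_all_pmu.sh, and the strings they match are the ones quoted in this thread.

```shell
# Hypothetical helper, not part of stat_all_pmu.sh.
# Usage: wants_system_wide OUTPUT UID
# Succeeds when perf suggested "-a" and the caller is root (uid 0).
wants_system_wide() {
  printf '%s\n' "$1" | grep -q "enable system wide with '-a'" || return 1
  [ "$2" -eq 0 ]
}

# Hypothetical helper, not part of stat_all_pmu.sh.
# Usage: system_wide_unsupported OUTPUT
# Succeeds when the "-a" run failed with "Operation not supported".
system_wide_unsupported() {
  printf '%s\n' "$1" | grep -q "Operation not supported"
}

# Sample messages copied from the vpa_dtl/dtl_all/ example in the thread.
per_thread_msg="Unsupported event (vpa_dtl/dtl_all/H) in per-thread mode, enable system wide with '-a'."
system_wide_msg="The sys_perf_event_open() syscall failed for event (vpa_dtl/dtl_all/H): Operation not supported"

if wants_system_wide "$per_thread_msg" 0 && system_wide_unsupported "$system_wide_msg"; then
  echo "not supported"
fi
```

In the test loop, the uid argument would presumably come from `id -u`, and the second message from an actual `perf stat -a -e "$p" true` run rather than a fixed string.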
* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
2026-04-02 15:45 ` Athira Rajeev
@ 2026-04-03 1:27 ` Namhyung Kim
0 siblings, 0 replies; 11+ messages in thread
From: Namhyung Kim @ 2026-04-03 1:27 UTC (permalink / raw)
To: Athira Rajeev
Cc: Ian Rogers, Venkat, Thomas Falcon, acme, jolsa, adrian.hunter,
vmolnaro, mpetlan, tmricht, maddy, linux-perf-users, linuxppc-dev,
hbathini, Tejas.Manhas1, Tanushree.Shah, Shivani.Nittor

On Thu, Apr 02, 2026 at 09:15:54PM +0530, Athira Rajeev wrote:
> If the addition of “-a” can cause regression ( like intel one if its not suppose to be run system wide ),
> how about adding a case like this:
> - Look for "enable system wide with '-a’ “ in the error logs
> - If logs matches this message and if user is root, attempt with -a next.
> - With “-a”, If the logs has "Operation not supported” , test can continue to next event.

Sounds good, you can check perf_event_paranoid as well as root. Then
you don't need to check the result of "-a" for -EOPNOTSUPP.

Thanks,
Namhyung

^ permalink raw reply [flat|nested] 11+ messages in thread

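
The suggestion to consider perf_event_paranoid as well as root could look roughly like the sketch below. The helper `can_run_system_wide` is a hypothetical name, and the threshold used (levels below 1 permit CPU-wide events for unprivileged users) is an assumption taken from the kernel sysctl documentation, not something stated in this thread; capabilities such as CAP_PERFMON are deliberately ignored here.

```shell
# Hypothetical helper, not part of stat_all_pmu.sh.
# Usage: can_run_system_wide UID PARANOID_LEVEL
# Succeeds for root, or when the paranoid level is below 1 (assumed
# threshold for unprivileged CPU-wide event access).
can_run_system_wide() {
  [ "$1" -eq 0 ] || [ "$2" -lt 1 ]
}

# Fixed sample values so the sketch does not depend on the host.
if can_run_system_wide 1000 2; then
  regular_user_default="retry with -a"
else
  regular_user_default="skip -a"
fi
echo "uid 1000, paranoid 2: $regular_user_default"
```

On a real system the arguments would presumably be `"$(id -u)"` and the contents of `/proc/sys/kernel/perf_event_paranoid`; gating the "-a" retry this way avoids running it where it is bound to fail on permissions.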
* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
2026-04-01 20:40 ` Ian Rogers
2026-04-01 23:48 ` Namhyung Kim
@ 2026-04-02 17:32 ` Falcon, Thomas
2026-04-03 7:36 ` Mi, Dapeng
1 sibling, 1 reply; 11+ messages in thread
From: Falcon, Thomas @ 2026-04-02 17:32 UTC (permalink / raw)
To: atrajeev@linux.ibm.com, venkat88@linux.ibm.com, Rogers, Ian
Cc: Kleen, Andi, Shivani.Nittor@ibm.com, tmricht@linux.ibm.com,
hbathini@linux.vnet.ibm.com, mpetlan@redhat.com,
Tanushree.Shah@ibm.com, Hunter, Adrian,
linux-perf-users@vger.kernel.org, maddy@linux.ibm.com, Chen, Zide,
vmolnaro@redhat.com, Tejas.Manhas1@ibm.com,
linuxppc-dev@lists.ozlabs.org, acme@kernel.org, jolsa@kernel.org,
Mi, Dapeng1, namhyung@kernel.org

On Wed, 2026-04-01 at 13:40 -0700, Ian Rogers wrote:
> On Mon, Mar 23, 2026 at 3:40 AM Venkat <venkat88@linux.ibm.com>
> wrote:
> >
> > > On 15 Mar 2026, at 4:27 PM, Athira Rajeev
> > > <atrajeev@linux.ibm.com> wrote:
> > >
> > > Currently in "perf all PMU test", for "perf stat -e <event>
> > > true",
> > > below checks are done:
> > > - if return code is zero, look for "not supported" to decide pass
> > > scenario
> > > - check for "not supported" to ignore the event
> > > - looks for "No permission to enable" to skip the event.
> > > - If output has "Bad event name", fail the test.
> > > - Use "Access to performance monitoring and observability
> > > operations is
> > > limited." to ignore fail due to access limitations
> > >
> > > If we failed to see event and it is supported, retries with
> > > longer
> > > workload "perf bench internals synthesize".
> > > - Here if output has <event>, the test is a pass.
> > >
> > > Snippet of code check:
> > > ```
> > > output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
> > > if echo "$output" | grep -q "$p"
> > > ```
> > > - if output doesn't have event printed in logs, considers it
> > > fail.
> > >
> > > But this results in false pass for events in some cases.
> > > Example, if perf stat fails as below:
> > >
> > > # ./perf stat -e pmu/event/ true
> > > event syntax error: 'pmu/event/'
> > > \___ Bad event or PMU
> > >
> > > Unable to find PMU or event on a PMU of 'pmu'
> > > Run 'perf list' for a list of valid events
> > >
> > > Usage: perf stat [<options>] [<command>]
> > >
> > > -e, --event <event> event selector. use 'perf list' to list
> > > available events
> > > # echo $?
> > > 129
> > >
> > > Since this has non-zero return code and doesn't have the
> > > fail strings being checked in the test, it will enter check using
> > > longer workload. and since the output fail log has event, it
> > > declares test as "supported".
> > >
> > > Since all the fail strings can't be added in the check, update
> > > the testcase to check return code before proceeding to longer
> > > workload run.
> > >
> > > Another missing scenario is when system wide monitoring is
> > > supported
> > > example:
> > > # ./perf stat -e pmu/event/ true
> > > Error:
> > > No supported events found.
> > > Unsupported event (pmu/event/H) in per-thread mode, enable
> > > system wide with '-a'.
> > >
> > > Update testcase to check with "perf stat -a -e $p" as well
> > >
> > > Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
> > > ---
> >
> > Tested this patch.
> >
> > With this patch:
> >
> > Testing hv_24x7/CPM_ADJUNCT_INST/ -- perf stat failed with non-zero
> > return code
> > Testing hv_24x7/CPM_ADJUNCT_PCYC/ -- perf stat failed with non-zero
> > return code
> >
> > Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>
> Testing on an Intel Alderlake the test is now failing:
> ```
> ...
> Testing offcore_requests_outstanding.l3_miss_demand_data_rd --
> supported
> Testing ocr.full_streaming_wr.any_response -- perf stat failed with
> non-zero return code
> Testing ocr.partial_streaming_wr.any_response -- perf stat failed
> with
> non-zero return code
> Testing ocr.streaming_wr.any_response -- supported
> ...
> ```
>
> Running `perf stat` manually reveals an issue with the event:
> ```
> $ sudo perf stat -vv -e ocr.full_streaming_wr.any_response -a sleep
> 1
> Using CPUID GenuineIntel-6-B7-1
> Attempt to add: cpu_atom/ocr.full_streaming_wr.any_response/
> ..after resolving event:
> cpu_atom/event=0xb7,period=0x186a3,umask=0x1,offcore_rsp=0x8000000100
> 00/
> ocr.full_streaming_wr.any_response ->
> cpu_atom/ocr.full_streaming_wr.any_response/
> Control descriptor is not initialized
> ------------------------------------------------------------
> perf_event_attr:
> type 10 (cpu_atom)
> size 144
> ------------------------------------------------------------
> perf_event_attr:
> type 0 (PERF_TYPE_HARDWARE)
> config 0xa00000000
> (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
> disabled 1
> ------------------------------------------------------------
> sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3
> ------------------------------------------------------------
> perf_event_attr:
> type 0 (PERF_TYPE_HARDWARE)
> config 0x400000000
> (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
> disabled 1
> ------------------------------------------------------------
> sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3
> config 0x1b7
> (ocr.demand_data_rd.l3_hit.snoop_hit_no_fwd)
> sample_type IDENTIFIER
> read_format
> TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> { bp_addr, config1 } 0x800000010000
> ------------------------------------------------------------
> sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8
> sys_perf_event_open failed, error -22
> switching off deferred callchain support
> Warning:
> ocr.full_streaming_wr.any_response event is not supported by the
> kernel.
> The sys_perf_event_open() syscall failed for event
> (ocr.full_streaming_wr.any_response): Invalid argument
> "dmesg | grep -i perf" may provide additional information.
>
> Error:
> No supported events found.
> The sys_perf_event_open() syscall failed for event
> (ocr.full_streaming_wr.any_response): Invalid argument
> "dmesg | grep -i perf" may provide additional information.
> ```
>
> This looks like a latent Intel cpu_atom PMU bug. Thomas, wdyt?

+Dapeng, Zide, Andi

Thanks,
Tom

>
> Thanks,
> Ian
>
> > Regards,
> > Venkat.
> >
> > > tools/perf/tests/shell/stat_all_pmu.sh | 20 ++++++++++++++++++++
> > > 1 file changed, 20 insertions(+)
> > >
> > > diff --git a/tools/perf/tests/shell/stat_all_pmu.sh
> > > b/tools/perf/tests/shell/stat_all_pmu.sh
> > > index 9c466c0efa85..6c4d59cbfa5f 100755
> > > --- a/tools/perf/tests/shell/stat_all_pmu.sh
> > > +++ b/tools/perf/tests/shell/stat_all_pmu.sh
> > > @@ -53,6 +53,26 @@ do
> > > continue
> > > fi
> > >
> > > + # check with system wide if it is supported.
> > > + output=$(perf stat -a -e "$p" true 2>&1)
> > > + stat_result=$?
> > > + if echo "$output" | grep -q "not supported"
> > > + then
> > > + # Event not supported, so ignore.
> > > + echo "not supported"
> > > + continue
> > > + fi
> > > +
> > > + # checked through possible access limitations and permissions.
> > > + # At this step, non-zero return code from "perf stat" needs to
> > > + # reported as fail for the user to investigate
> > > + if [ $stat_result -ne 0 ]
> > > + then
> > > + echo "perf stat failed with non-zero return code"
> > > + err=1
> > > + continue
> > > + fi
> > > +
> > > # We failed to see the event and it is supported. Possibly the
> > > workload was
> > > # too small so retry with something longer.
> > > output=$(perf stat -e "$p" perf bench internals synthesize
> > > 2>&1)
> > > --
> > > 2.47.3
> > >
>

^ permalink raw reply [flat|nested] 11+ messages in thread

* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test 2026-04-02 17:32 ` Falcon, Thomas @ 2026-04-03 7:36 ` Mi, Dapeng 2026-04-03 15:39 ` Ian Rogers 0 siblings, 1 reply; 11+ messages in thread From: Mi, Dapeng @ 2026-04-03 7:36 UTC (permalink / raw) To: Falcon, Thomas, atrajeev@linux.ibm.com, venkat88@linux.ibm.com, Rogers, Ian Cc: Kleen, Andi, Shivani.Nittor@ibm.com, tmricht@linux.ibm.com, hbathini@linux.vnet.ibm.com, mpetlan@redhat.com, Tanushree.Shah@ibm.com, Hunter, Adrian, linux-perf-users@vger.kernel.org, maddy@linux.ibm.com, Chen, Zide, vmolnaro@redhat.com, Tejas.Manhas1@ibm.com, linuxppc-dev@lists.ozlabs.org, acme@kernel.org, jolsa@kernel.org, Mi, Dapeng1, namhyung@kernel.org On 4/3/2026 1:32 AM, Falcon, Thomas wrote: > On Wed, 2026-04-01 at 13:40 -0700, Ian Rogers wrote: >> On Mon, Mar 23, 2026 at 3:40 AM Venkat <venkat88@linux.ibm.com> >> wrote: >>> >>> >>>> On 15 Mar 2026, at 4:27 PM, Athira Rajeev >>>> <atrajeev@linux.ibm.com> wrote: >>>> >>>> Currently in "perf all PMU test", for "perf stat -e <event> >>>> true", >>>> below checks are done: >>>> - if return code is zero, look for "not supported" to decide pass >>>> scenario >>>> - check for "not supported" to ignore the event >>>> - looks for "No permission to enable" to skip the event. >>>> - If output has "Bad event name", fail the test. >>>> - Use "Access to performance monitoring and observability >>>> operations is >>>> limited." to ignore fail due to access limitations >>>> >>>> If we failed to see event and it is supported, retries with >>>> longer >>>> workload "perf bench internals synthesize". >>>> - Here if output has <event>, the test is a pass. >>>> >>>> Snippet of code check: >>>> ``` >>>> output=$(perf stat -e "$p" perf bench internals synthesize 2>&1) >>>> if echo "$output" | grep -q "$p" >>>> ``` >>>> - if output doesn't have event printed in logs, considers it >>>> fail. >>>> >>>> But this results in false pass for events in some cases. 
>>>> Example, if perf stat fails as below: >>>> >>>> # ./perf stat -e pmu/event/ true >>>> event syntax error: 'pmu/event/' >>>> \___ Bad event or PMU >>>> >>>> Unable to find PMU or event on a PMU of 'pmu' >>>> Run 'perf list' for a list of valid events >>>> >>>> Usage: perf stat [<options>] [<command>] >>>> >>>> -e, --event <event> event selector. use 'perf list' to list >>>> available events >>>> # echo $? >>>> 129 >>>> >>>> Since this has non-zero return code and doesn't have the >>>> fail strings being checked in the test, it will enter check using >>>> longer workload. and since the output fail log has event, it >>>> declares test as "supported". >>>> >>>> Since all the fail strings can't be added in the check, update >>>> the testcase to check return code before proceeding to longer >>>> workload run. >>>> >>>> Another missing scenario is when system wide monitoring is >>>> supported >>>> example: >>>> # ./perf stat -e pmu/event/ true >>>> Error: >>>> No supported events found. >>>> Unsupported event (pmu/event/H) in per-thread mode, enable >>>> system wide with '-a'. >>>> >>>> Update testcase to check with "perf stat -a -e $p" as well >>>> >>>> Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com> >>>> --- >>> Tested this patch. >>> >>> >>> With this patch: >>> >>> Testing hv_24x7/CPM_ADJUNCT_INST/ -- perf stat failed with non-zero >>> return code >>> Testing hv_24x7/CPM_ADJUNCT_PCYC/ -- perf stat failed with non-zero >>> return code >>> >>> >>> >>> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> >> Testing on an Intel Alderlake the test is now failing: >> ``` >> ... >> Testing offcore_requests_outstanding.l3_miss_demand_data_rd -- >> supported >> Testing ocr.full_streaming_wr.any_response -- perf stat failed with >> non-zero return code >> Testing ocr.partial_streaming_wr.any_response -- perf stat failed >> with >> non-zero return code >> Testing ocr.streaming_wr.any_response -- supported >> ... 
>> ```
>>
>> Running `perf stat` manually reveals an issue with the event:
>> ```
>> $ sudo perf stat -vv -e ocr.full_streaming_wr.any_response -a sleep
>> 1
>> Using CPUID GenuineIntel-6-B7-1
>> Attempt to add: cpu_atom/ocr.full_streaming_wr.any_response/
>> ..after resolving event:
>> cpu_atom/event=0xb7,period=0x186a3,umask=0x1,offcore_rsp=0x8000000100
>> 00/
>> ocr.full_streaming_wr.any_response ->
>> cpu_atom/ocr.full_streaming_wr.any_response/
>> Control descriptor is not initialized
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 10 (cpu_atom)
>> size 144
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 0 (PERF_TYPE_HARDWARE)
>> config 0xa00000000
>> (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
>> disabled 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 0 (PERF_TYPE_HARDWARE)
>> config 0x400000000
>> (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
>> disabled 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3
>> config 0x1b7
>> (ocr.demand_data_rd.l3_hit.snoop_hit_no_fwd)
>> sample_type IDENTIFIER
>> read_format
>> TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> { bp_addr, config1 } 0x800000010000
>> ------------------------------------------------------------
>> sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8
>> sys_perf_event_open failed, error -22
>> switching off deferred callchain support
>> Warning:
>> ocr.full_streaming_wr.any_response event is not supported by the
>> kernel.
>> The sys_perf_event_open() syscall failed for event
>> (ocr.full_streaming_wr.any_response): Invalid argument
>> "dmesg | grep -i perf" may provide additional information.
>>
>> Error:
>> No supported events found.
>> The sys_perf_event_open() syscall failed for event
>> (ocr.full_streaming_wr.any_response): Invalid argument
>> "dmesg | grep -i perf" may provide additional information.
>> ```
>>
>> This looks like a latent Intel cpu_atom PMU bug. Thomas, wdyt?

Hmm, it looks like the error is caused by an incorrect valid bitmask for
the OFFCORE_RSP_x MSRs. Currently the valid bitmask of the OFFCORE_RSP_x
MSR is set to 0x3fffffffff in intel_grt_extra_regs[], while the MSR value
is set to 0x800000010000 for the ocr.full_streaming_wr.any_response event.
Bit 47 is treated as an invalid bit, so the event creation is aborted.

Based on the description in "Table 21-56. MSR_OFFCORE_RSPx Request Type
Definition" in the SDM, bit 47 should be a valid bit now. I suppose bit 47
was not a valid bit when the ADL PMU support was added, but the definition
was updated later and the bit became valid.

With the constant updates to the perf event lists
(https://github.com/intel/perfmon), we have noticed some mismatches
between the events hardcoded in the driver and the perfmon event list. We
are currently summarizing them. Once the list is finalized, we will submit
a patchset to fix them.

Thanks.

> +Dapeng, Zide, Andi
>
> Thanks,
> Tom
>
>> Thanks,
>> Ian
>>
>>> Regards,
>>> Venkat.
>>>
>>>
>>>
>>>> tools/perf/tests/shell/stat_all_pmu.sh | 20 ++++++++++++++++++++
>>>> 1 file changed, 20 insertions(+)
>>>>
>>>> diff --git a/tools/perf/tests/shell/stat_all_pmu.sh
>>>> b/tools/perf/tests/shell/stat_all_pmu.sh
>>>> index 9c466c0efa85..6c4d59cbfa5f 100755
>>>> --- a/tools/perf/tests/shell/stat_all_pmu.sh
>>>> +++ b/tools/perf/tests/shell/stat_all_pmu.sh
>>>> @@ -53,6 +53,26 @@ do
>>>>      continue
>>>>    fi
>>>>
>>>> +  # check with system wide if it is supported.
>>>> +  output=$(perf stat -a -e "$p" true 2>&1)
>>>> +  stat_result=$?
>>>> +  if echo "$output" | grep -q "not supported"
>>>> +  then
>>>> +    # Event not supported, so ignore.
>>>> +    echo "not supported"
>>>> +    continue
>>>> +  fi
>>>> +
>>>> +  # checked through possible access limitations and permissions.
>>>> +  # At this step, non-zero return code from "perf stat" needs to
>>>> +  # reported as fail for the user to investigate
>>>> +  if [ $stat_result -ne 0 ]
>>>> +  then
>>>> +    echo "perf stat failed with non-zero return code"
>>>> +    err=1
>>>> +    continue
>>>> +  fi
>>>> +
>>>>    # We failed to see the event and it is supported. Possibly the
>>>> workload was
>>>>    # too small so retry with something longer.
>>>>    output=$(perf stat -e "$p" perf bench internals synthesize
>>>> 2>&1)
>>>> --
>>>> 2.47.3
>>>>
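The mask check described in the reply above can be reproduced with plain
shell arithmetic. This is only an illustrative sketch, not the kernel
code: the helper name invalid_bits is made up here, and the two constants
are the ones quoted in the thread (0x3fffffffff as the valid mask from
intel_grt_extra_regs[], 0x800000010000 as the offcore_rsp value of the
ocr.full_streaming_wr.any_response event).

```shell
# Hypothetical helper, not kernel code: print the bit numbers that are set
# in the value ($1) but are not allowed by the valid mask ($2).
invalid_bits() {
    extra=$(( $1 & ~$2 ))     # bits of the value that fall outside the mask
    bit=0
    out=""
    while [ "$bit" -lt 64 ]; do
        if [ $(( (extra >> bit) & 1 )) -ne 0 ]; then
            out="${out:+$out }$bit"
        fi
        bit=$(( bit + 1 ))
    done
    echo "$out"
}

# Constants quoted in the thread: event value first, then the valid mask.
invalid_bits 0x800000010000 0x3fffffffff    # prints: 47
```

Bit 16 of the value lies inside the mask and bit 47 does not, which is
consistent with the "error -22" (Invalid argument) seen in the -vv output.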
* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
2026-04-03 7:36 ` Mi, Dapeng
@ 2026-04-03 15:39 ` Ian Rogers
2026-04-07 0:48 ` Mi, Dapeng
0 siblings, 1 reply; 11+ messages in thread
From: Ian Rogers @ 2026-04-03 15:39 UTC (permalink / raw)
To: Mi, Dapeng, Falcon, Thomas, Kleen, Andi
Cc: atrajeev@linux.ibm.com, venkat88@linux.ibm.com,
Shivani.Nittor@ibm.com, tmricht@linux.ibm.com,
hbathini@linux.vnet.ibm.com, mpetlan@redhat.com,
Tanushree.Shah@ibm.com, Hunter, Adrian,
linux-perf-users@vger.kernel.org, maddy@linux.ibm.com, Chen, Zide,
vmolnaro@redhat.com, Tejas.Manhas1@ibm.com,
linuxppc-dev@lists.ozlabs.org, acme@kernel.org, jolsa@kernel.org,
Mi, Dapeng1, namhyung@kernel.org

On Fri, Apr 3, 2026 at 12:36 AM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>
> On 4/3/2026 1:32 AM, Falcon, Thomas wrote:
> > On Wed, 2026-04-01 at 13:40 -0700, Ian Rogers wrote:
> >> [...]
> >> This looks like a latent Intel cpu_atom PMU bug. Thomas, wdyt?
>
> Hmm, it looks like the error is caused by an incorrect valid bitmask for
> the OFFCORE_RSP_x MSRs. Currently the valid bitmask of the OFFCORE_RSP_x
> MSR is set to 0x3fffffffff in intel_grt_extra_regs[], while the MSR value
> is set to 0x800000010000 for the ocr.full_streaming_wr.any_response event.
> Bit 47 is treated as an invalid bit, so the event creation is aborted.
>
> Based on the description in "Table 21-56. MSR_OFFCORE_RSPx Request Type
> Definition" in the SDM, bit 47 should be a valid bit now. I suppose bit 47
> was not a valid bit when the ADL PMU support was added, but the definition
> was updated later and the bit became valid.
>
> With the constant updates to the perf event lists
> (https://github.com/intel/perfmon), we have noticed some mismatches
> between the events hardcoded in the driver and the perfmon event list. We
> are currently summarizing them. Once the list is finalized, we will submit
> a patchset to fix them.

That's great. If it takes too long, perhaps we could just remove the
events for now.

Thanks,
Ian

> Thanks.
>
> > +Dapeng, Zide, Andi
> >
> > Thanks,
> > Tom
> [...]
* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
2026-04-03 15:39 ` Ian Rogers
@ 2026-04-07 0:48 ` Mi, Dapeng
0 siblings, 0 replies; 11+ messages in thread
From: Mi, Dapeng @ 2026-04-07 0:48 UTC (permalink / raw)
To: Ian Rogers, Falcon, Thomas, Kleen, Andi
Cc: atrajeev@linux.ibm.com, venkat88@linux.ibm.com,
Shivani.Nittor@ibm.com, tmricht@linux.ibm.com,
hbathini@linux.vnet.ibm.com, mpetlan@redhat.com,
Tanushree.Shah@ibm.com, Hunter, Adrian,
linux-perf-users@vger.kernel.org, maddy@linux.ibm.com, Chen, Zide,
vmolnaro@redhat.com, Tejas.Manhas1@ibm.com,
linuxppc-dev@lists.ozlabs.org, acme@kernel.org, jolsa@kernel.org,
Mi, Dapeng1, namhyung@kernel.org

On 4/3/2026 11:39 PM, Ian Rogers wrote:
> On Fri, Apr 3, 2026 at 12:36 AM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>> [...]
>> With the constant updates to the perf event lists
>> (https://github.com/intel/perfmon), we have noticed some mismatches
>> between the events hardcoded in the driver and the perfmon event list. We
>> are currently summarizing them. Once the list is finalized, we will submit
>> a patchset to fix them.
> That's great. If it takes too long, perhaps we could just remove the
> events for now.

I suppose it won't take too long. I plan to post the patchset in the next
release cycle. The code changes are simple, but they need a lot of time to
verify on all kinds of platforms.

Thanks.

>
> Thanks,
> Ian
> [...]
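For reference, the decision order that the patch under discussion adds
ahead of the longer-workload retry can be condensed into a small function.
This is a sketch that only mirrors the quoted diff: the function name and
the returned labels are made up for illustration, and it takes the exit
status and the captured output as arguments instead of running perf.

```shell
# Hypothetical helper: classify one "perf stat -a -e <event> true" attempt.
# $1 is the exit status of perf stat, $2 is its captured output.
classify_stat_attempt() {
    if printf '%s\n' "$2" | grep -q "not supported"; then
        echo "not supported"   # event is ignored by the test
    elif [ "$1" -ne 0 ]; then
        echo "failed"          # reported as a failure to investigate
    else
        echo "retry"           # continue to the longer workload
    fi
}

# The "Bad event or PMU" case from the patch description exits with 129.
classify_stat_attempt 129 "event syntax error: 'pmu/event/'"    # prints: failed
```

Checking the exit status before the retry is what stops an error message
that merely echoes the event name from being counted as "supported".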
Thread overview: 11+ messages
2026-03-15 10:57 [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test Athira Rajeev
2026-03-23 10:39 ` Venkat
2026-04-01 20:40 ` Ian Rogers
2026-04-01 23:48 ` Namhyung Kim
2026-04-01 23:57 ` Ian Rogers
2026-04-02 15:45 ` Athira Rajeev
2026-04-03 1:27 ` Namhyung Kim
2026-04-02 17:32 ` Falcon, Thomas
2026-04-03 7:36 ` Mi, Dapeng
2026-04-03 15:39 ` Ian Rogers
2026-04-07 0:48 ` Mi, Dapeng