public inbox for linux-perf-users@vger.kernel.org
* [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
@ 2026-03-15 10:57 Athira Rajeev
  2026-03-23 10:39 ` Venkat
  0 siblings, 1 reply; 11+ messages in thread
From: Athira Rajeev @ 2026-03-15 10:57 UTC (permalink / raw)
  To: acme, jolsa, adrian.hunter, vmolnaro, mpetlan, tmricht, maddy,
	irogers, namhyung
  Cc: linux-perf-users, linuxppc-dev, atrajeev, hbathini, Tejas.Manhas1,
	Tanushree.Shah, Shivani.Nittor

Currently, in the "perf all PMU test", the following checks are done on
the output of "perf stat -e <event> true":
- If the return code is zero, look for "not supported" to decide the
  pass scenario.
- Check for "not supported" to ignore the event.
- Look for "No permission to enable" to skip the event.
- If the output has "Bad event name", fail the test.
- Use "Access to performance monitoring and observability operations is
  limited." to ignore failures due to access limitations.

If the event was not seen and it is supported, the test retries with a
longer workload, "perf bench internals synthesize".
- Here, if the output has <event>, the test is a pass.

Snippet of code check:
  ```
  output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
  if echo "$output" | grep -q "$p"
  ```
- If the output does not contain the event, the test considers it a
  failure.

But this results in a false pass for events in some cases.
For example, if perf stat fails as below:

 # ./perf stat -e pmu/event/  true
 event syntax error: 'pmu/event/'
                     \___ Bad event or PMU

 Unable to find PMU or event on a PMU of 'pmu'
 Run 'perf list' for a list of valid events

  Usage: perf stat [<options>] [<command>]

    -e, --event <event>   event selector. use 'perf list' to list available events
 # echo $?
 129

Since this run has a non-zero return code and contains none of the
failure strings checked by the test, the test falls through to the check
with the longer workload. Since the failure log contains the event name,
the test declares the event as "supported".

Since not every failure string can be added to the check, update the
testcase to check the return code before proceeding to the longer
workload run.

Another missing scenario is when the event is supported only with system
wide monitoring. Example:
 # ./perf stat -e pmu/event/ true
 Error:
 No supported events found.
  Unsupported event (pmu/event/H) in per-thread mode, enable system wide with '-a'.

Update the testcase to check with "perf stat -a -e $p" as well.

Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
---
 tools/perf/tests/shell/stat_all_pmu.sh | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/tools/perf/tests/shell/stat_all_pmu.sh b/tools/perf/tests/shell/stat_all_pmu.sh
index 9c466c0efa85..6c4d59cbfa5f 100755
--- a/tools/perf/tests/shell/stat_all_pmu.sh
+++ b/tools/perf/tests/shell/stat_all_pmu.sh
@@ -53,6 +53,26 @@ do
     continue
   fi
 
+  # check with system wide if it is supported.
+  output=$(perf stat -a -e "$p" true 2>&1)
+  stat_result=$?
+  if echo "$output" | grep -q "not supported"
+  then
+    # Event not supported, so ignore.
+    echo "not supported"
+    continue
+  fi
+
+  # Checked through possible access limitations and permissions.
+  # At this step, a non-zero return code from "perf stat" needs to be
+  # reported as a failure for the user to investigate.
+  if [ $stat_result -ne 0 ]
+  then
+    echo "perf stat failed with non-zero return code"
+    err=1
+    continue
+  fi
+
   # We failed to see the event and it is supported. Possibly the workload was
   # too small so retry with something longer.
   output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
-- 
2.47.3



* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
  2026-03-15 10:57 [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test Athira Rajeev
@ 2026-03-23 10:39 ` Venkat
  2026-04-01 20:40   ` Ian Rogers
  0 siblings, 1 reply; 11+ messages in thread
From: Venkat @ 2026-03-23 10:39 UTC (permalink / raw)
  To: Athira Rajeev
  Cc: acme, jolsa, adrian.hunter, vmolnaro, mpetlan, tmricht, maddy,
	irogers, namhyung, linux-perf-users, linuxppc-dev, hbathini,
	Tejas.Manhas1, Tanushree.Shah, Shivani.Nittor



> On 15 Mar 2026, at 4:27 PM, Athira Rajeev <atrajeev@linux.ibm.com> wrote:
> 
> Currently in "perf all PMU test", for "perf stat -e <event> true",
> below checks are done:
> - if return code is zero, look for "not supported" to decide pass
>  scenario
> - check for "not supported" to ignore the event
> - looks for "No permission to enable" to skip the event.
> - If output has "Bad event name", fail the test.
> - Use "Access to performance monitoring and observability operations is
>  limited." to ignore fail due to access limitations
> 
> If we failed to see event and it is supported, retries with longer
> workload "perf bench internals synthesize".
> - Here if output has <event>, the test is a pass.
> 
> Snippet of code check:
>  ```
>  output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
>  if echo "$output" | grep -q "$p"
>  ```
> - if output doesn't have event printed in logs, considers it fail.
> 
> But this results in false pass for events in some cases.
> Example, if perf stat fails as below:
> 
> # ./perf stat -e pmu/event/  true
> event syntax error: 'pmu/event/'
>                     \___ Bad event or PMU
> 
> Unable to find PMU or event on a PMU of 'pmu'
> Run 'perf list' for a list of valid events
> 
>  Usage: perf stat [<options>] [<command>]
> 
>    -e, --event <event>   event selector. use 'perf list' to list available events
> # echo $?
> 129
> 
> Since this has non-zero return code and doesn't have the
> fail strings being checked in the test, it will enter check using
> longer workload. and since the output fail log has event, it
> declares test as "supported".
> 
> Since all the fail strings can't be added in the check, update
> the testcase to check return code before proceeding to longer
> workload run.
> 
> Another missing scenario is when system wide monitoring is supported
> example:
> # ./perf stat -e pmu/event/ true
> Error:
> No supported events found.
>  Unsupported event (pmu/event/H) in per-thread mode, enable system wide with '-a'.
> 
> Update testcase to check with "perf stat -a -e $p" as well
> 
> Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
> ---

Tested this patch.


With this patch:

Testing hv_24x7/CPM_ADJUNCT_INST/ -- perf stat failed with non-zero return code
Testing hv_24x7/CPM_ADJUNCT_PCYC/ -- perf stat failed with non-zero return code



Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>

Regards,
Venkat.



> tools/perf/tests/shell/stat_all_pmu.sh | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
> 
> diff --git a/tools/perf/tests/shell/stat_all_pmu.sh b/tools/perf/tests/shell/stat_all_pmu.sh
> index 9c466c0efa85..6c4d59cbfa5f 100755
> --- a/tools/perf/tests/shell/stat_all_pmu.sh
> +++ b/tools/perf/tests/shell/stat_all_pmu.sh
> @@ -53,6 +53,26 @@ do
>     continue
>   fi
> 
> +  # check with system wide if it is supported.
> +  output=$(perf stat -a -e "$p" true 2>&1)
> +  stat_result=$?
> +  if echo "$output" | grep -q "not supported"
> +  then
> +    # Event not supported, so ignore.
> +    echo "not supported"
> +    continue
> +  fi
> +
> +  # checked through possible access limitations and permissions.
> +  # At this step, non-zero return code from "perf stat" needs to
> +  # reported as fail for the user to investigate
> +  if [ $stat_result -ne 0 ]
> +  then
> +    echo "perf stat failed with non-zero return code"
> +    err=1
> +    continue
> +  fi
> +
>   # We failed to see the event and it is supported. Possibly the workload was
>   # too small so retry with something longer.
>   output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
> -- 
> 2.47.3
> 



* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
  2026-03-23 10:39 ` Venkat
@ 2026-04-01 20:40   ` Ian Rogers
  2026-04-01 23:48     ` Namhyung Kim
  2026-04-02 17:32     ` Falcon, Thomas
  0 siblings, 2 replies; 11+ messages in thread
From: Ian Rogers @ 2026-04-01 20:40 UTC (permalink / raw)
  To: Venkat, Athira Rajeev, Thomas Falcon
  Cc: acme, jolsa, adrian.hunter, vmolnaro, mpetlan, tmricht, maddy,
	namhyung, linux-perf-users, linuxppc-dev, hbathini, Tejas.Manhas1,
	Tanushree.Shah, Shivani.Nittor

On Mon, Mar 23, 2026 at 3:40 AM Venkat <venkat88@linux.ibm.com> wrote:
>
>
>
> > On 15 Mar 2026, at 4:27 PM, Athira Rajeev <atrajeev@linux.ibm.com> wrote:
> >
> > Currently in "perf all PMU test", for "perf stat -e <event> true",
> > below checks are done:
> > - if return code is zero, look for "not supported" to decide pass
> >  scenario
> > - check for "not supported" to ignore the event
> > - looks for "No permission to enable" to skip the event.
> > - If output has "Bad event name", fail the test.
> > - Use "Access to performance monitoring and observability operations is
> >  limited." to ignore fail due to access limitations
> >
> > If we failed to see event and it is supported, retries with longer
> > workload "perf bench internals synthesize".
> > - Here if output has <event>, the test is a pass.
> >
> > Snippet of code check:
> >  ```
> >  output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
> >  if echo "$output" | grep -q "$p"
> >  ```
> > - if output doesn't have event printed in logs, considers it fail.
> >
> > But this results in false pass for events in some cases.
> > Example, if perf stat fails as below:
> >
> > # ./perf stat -e pmu/event/  true
> > event syntax error: 'pmu/event/'
> >                     \___ Bad event or PMU
> >
> > Unable to find PMU or event on a PMU of 'pmu'
> > Run 'perf list' for a list of valid events
> >
> >  Usage: perf stat [<options>] [<command>]
> >
> >    -e, --event <event>   event selector. use 'perf list' to list available events
> > # echo $?
> > 129
> >
> > Since this has non-zero return code and doesn't have the
> > fail strings being checked in the test, it will enter check using
> > longer workload. and since the output fail log has event, it
> > declares test as "supported".
> >
> > Since all the fail strings can't be added in the check, update
> > the testcase to check return code before proceeding to longer
> > workload run.
> >
> > Another missing scenario is when system wide monitoring is supported
> > example:
> > # ./perf stat -e pmu/event/ true
> > Error:
> > No supported events found.
> >  Unsupported event (pmu/event/H) in per-thread mode, enable system wide with '-a'.
> >
> > Update testcase to check with "perf stat -a -e $p" as well
> >
> > Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
> > ---
>
> Tested this patch.
>
>
> With this patch:
>
> Testing hv_24x7/CPM_ADJUNCT_INST/ -- perf stat failed with non-zero return code
> Testing hv_24x7/CPM_ADJUNCT_PCYC/ -- perf stat failed with non-zero return code
>
>
>
> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>

Testing on an Intel Alderlake the test is now failing:
```
...
Testing offcore_requests_outstanding.l3_miss_demand_data_rd -- supported
Testing ocr.full_streaming_wr.any_response -- perf stat failed with non-zero return code
Testing ocr.partial_streaming_wr.any_response -- perf stat failed with non-zero return code
Testing ocr.streaming_wr.any_response -- supported
...
```

Running `perf stat` manually reveals an issue with the event:
```
$ sudo perf stat -vv -e ocr.full_streaming_wr.any_response -a sleep 1
Using CPUID GenuineIntel-6-B7-1
Attempt to add: cpu_atom/ocr.full_streaming_wr.any_response/
..after resolving event:
cpu_atom/event=0xb7,period=0x186a3,umask=0x1,offcore_rsp=0x800000010000/
ocr.full_streaming_wr.any_response ->
cpu_atom/ocr.full_streaming_wr.any_response/
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
 type                             10 (cpu_atom)
 size                             144
------------------------------------------------------------
perf_event_attr:
 type                             0 (PERF_TYPE_HARDWARE)
 config                           0xa00000000
(cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
 disabled                         1
------------------------------------------------------------
sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 3
------------------------------------------------------------
perf_event_attr:
 type                             0 (PERF_TYPE_HARDWARE)
 config                           0x400000000
(cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
 disabled                         1
------------------------------------------------------------
sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 3
 config                           0x1b7
(ocr.demand_data_rd.l3_hit.snoop_hit_no_fwd)
 sample_type                      IDENTIFIER
 read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
 disabled                         1
 inherit                          1
 { bp_addr, config1 }             0x800000010000
------------------------------------------------------------
sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8
sys_perf_event_open failed, error -22
switching off deferred callchain support
Warning:
ocr.full_streaming_wr.any_response event is not supported by the kernel.
The sys_perf_event_open() syscall failed for event
(ocr.full_streaming_wr.any_response): Invalid argument
"dmesg | grep -i perf" may provide additional information.

Error:
No supported events found.
The sys_perf_event_open() syscall failed for event
(ocr.full_streaming_wr.any_response): Invalid argument
"dmesg | grep -i perf" may provide additional information.
```

This looks like a latent Intel cpu_atom PMU bug. Thomas, wdyt?

Thanks,
Ian

> Regards,
> Venkat.
>
>
>
> > tools/perf/tests/shell/stat_all_pmu.sh | 20 ++++++++++++++++++++
> > 1 file changed, 20 insertions(+)
> >
> > diff --git a/tools/perf/tests/shell/stat_all_pmu.sh b/tools/perf/tests/shell/stat_all_pmu.sh
> > index 9c466c0efa85..6c4d59cbfa5f 100755
> > --- a/tools/perf/tests/shell/stat_all_pmu.sh
> > +++ b/tools/perf/tests/shell/stat_all_pmu.sh
> > @@ -53,6 +53,26 @@ do
> >     continue
> >   fi
> >
> > +  # check with system wide if it is supported.
> > +  output=$(perf stat -a -e "$p" true 2>&1)
> > +  stat_result=$?
> > +  if echo "$output" | grep -q "not supported"
> > +  then
> > +    # Event not supported, so ignore.
> > +    echo "not supported"
> > +    continue
> > +  fi
> > +
> > +  # checked through possible access limitations and permissions.
> > +  # At this step, non-zero return code from "perf stat" needs to
> > +  # reported as fail for the user to investigate
> > +  if [ $stat_result -ne 0 ]
> > +  then
> > +    echo "perf stat failed with non-zero return code"
> > +    err=1
> > +    continue
> > +  fi
> > +
> >   # We failed to see the event and it is supported. Possibly the workload was
> >   # too small so retry with something longer.
> >   output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
> > --
> > 2.47.3
> >
>


* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
  2026-04-01 20:40   ` Ian Rogers
@ 2026-04-01 23:48     ` Namhyung Kim
  2026-04-01 23:57       ` Ian Rogers
  2026-04-02 17:32     ` Falcon, Thomas
  1 sibling, 1 reply; 11+ messages in thread
From: Namhyung Kim @ 2026-04-01 23:48 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Venkat, Athira Rajeev, Thomas Falcon, acme, jolsa, adrian.hunter,
	vmolnaro, mpetlan, tmricht, maddy, linux-perf-users, linuxppc-dev,
	hbathini, Tejas.Manhas1, Tanushree.Shah, Shivani.Nittor

On Wed, Apr 01, 2026 at 01:40:47PM -0700, Ian Rogers wrote:
> This looks like a latent Intel cpu_atom PMU bug. Thomas, wdyt?

Are you ok with the change itself then?

I'm not sure what the expectation is when the test runs as a regular
user.  I assume the intention of this change is running as root.

Thanks,
Namhyung



* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
  2026-04-01 23:48     ` Namhyung Kim
@ 2026-04-01 23:57       ` Ian Rogers
  2026-04-02 15:45         ` Athira Rajeev
  0 siblings, 1 reply; 11+ messages in thread
From: Ian Rogers @ 2026-04-01 23:57 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Venkat, Athira Rajeev, Thomas Falcon, acme, jolsa, adrian.hunter,
	vmolnaro, mpetlan, tmricht, maddy, linux-perf-users, linuxppc-dev,
	hbathini, Tejas.Manhas1, Tanushree.Shah, Shivani.Nittor

On Wed, Apr 1, 2026 at 4:48 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Wed, Apr 01, 2026 at 01:40:47PM -0700, Ian Rogers wrote:
> > This looks like a latent Intel cpu_atom PMU bug. Thomas, wdyt?
>
> Are you ok with the change itself then?
>
> I'm not sure what's the expectation when the test runs with a regular
> user.  I assume the intention of this change is running as root..

So the test is failing as root on Intel, and the log output is
extensive. I'd prefer not to have the log output, so we may need to
work past some known broken items. Otherwise, the change looks okay,
but there are inconsistencies: `grep -q "<not supported>"` is used in
the existing code, while `grep -q "not supported"` is used here.
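
To illustrate how the two patterns differ, here is a rough sketch.
`matches` is a hypothetical helper, not something in stat_all_pmu.sh,
and the counter line is only an assumed shape of perf stat output:

```shell
# Hypothetical helper: print "yes" if fixed string $2 occurs in text $1.
matches() {
  if printf '%s\n' "$1" | grep -qF -- "$2"; then
    echo yes
  else
    echo no
  fi
}

# An errno-style failure message, as printed when the syscall fails.
err_text="The sys_perf_event_open() syscall failed for event (pmu/event/H): Operation not supported"
# Assumed shape of a counter line for an event perf stat could not count.
stat_text="   <not supported>      pmu/event/"

matches "$err_text" "not supported"      # broader pattern: matches
matches "$err_text" "<not supported>"    # stricter pattern: does not match
matches "$stat_text" "<not supported>"   # stricter pattern: matches
```

The broader pattern also catches errno text, so the two checks are not
interchangeable.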

Thanks,
Ian

> Thanks,
> Namhyung
>


* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
  2026-04-01 23:57       ` Ian Rogers
@ 2026-04-02 15:45         ` Athira Rajeev
  2026-04-03  1:27           ` Namhyung Kim
  0 siblings, 1 reply; 11+ messages in thread
From: Athira Rajeev @ 2026-04-02 15:45 UTC (permalink / raw)
  To: Ian Rogers, Namhyung Kim
  Cc: Venkat, Thomas Falcon, acme, jolsa, adrian.hunter, vmolnaro,
	mpetlan, tmricht, maddy, linux-perf-users, linuxppc-dev, hbathini,
	Tejas.Manhas1, Tanushree.Shah, Shivani.Nittor



> On 2 Apr 2026, at 5:27 AM, Ian Rogers <irogers@google.com> wrote:
> 
> On Wed, Apr 1, 2026 at 4:48 PM Namhyung Kim <namhyung@kernel.org> wrote:
>> 
>> On Wed, Apr 01, 2026 at 01:40:47PM -0700, Ian Rogers wrote:
>>> This looks like a latent Intel cpu_atom PMU bug. Thomas, wdyt?
>> 
>> Are you ok with the change itself then?
>> 
>> I'm not sure what's the expectation when the test runs with a regular
>> user.  I assume the intention of this change is running as root..
> 
> So the test is failing as root on Intel, and the log output is
> extensive. I'd prefer not to have the log output, so we may need to
> work past some known broken items. Otherwise, the change looks okay,
> but there are inconsistencies: `grep -q "<not supported>"` is used in
> the existing code, while `grep -q "not supported"` is used here.
> 
> Thanks,
> Ian
> 
>> Thanks,
>> Namhyung
>> 

Hi Namhyung, Ian

Thanks for trying the change and responding.

Sorry, I missed the comment from Sashiko. Next time I will check explicitly for Sashiko comments too.

The patch has two intentions:

1) Earlier, the test resulted in a "false pass" if it fell back to the last check, where we run with a longer workload. This is because
  the error message contains the event name "$p" and hence matches the "grep" check.

Example:

   # ./perf stat -e hv_24x7/CPM_ADJUNCT_PCYC/ ./perf bench internals synthesize
    event syntax error: 'hv_24x7/CPM_ADJUNCT_PCYC/'
                     \___ Bad event or PMU

    Unable to find PMU or event on a PMU of 'hv_24x7'
    Run 'perf list' for a list of valid events

    Usage: perf stat [<options>] [<command>]

        -e, --event <event>   event selector. use 'perf list' to list available events
  # echo $?
     129


Here the test checks for:
  <<>>
   output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
   if echo "$output" | grep -q "$p"
  <<>>

Since the error message contains the event name, it matches the grep check and the test is declared a pass.

To catch this, the patch adds a check for the return code:

+  # checked through possible access limitations and permissions.
+  # At this step, non-zero return code from "perf stat" needs to
+  # reported as fail for the user to investigate
+  if [ $stat_result -ne 0 ]
+  then
+    echo "perf stat failed with non-zero return code"
+    err=1
+    continue
+  fi
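
The ordering this relies on can be sketched as below. `classify_retry`
is a hypothetical helper used only for illustration, not code from the
patch; it takes the captured output and exit status as arguments so it
can be tried without running perf:

```shell
# Hypothetical helper: classify the result of one "perf stat" run.
# $1 = event name, $2 = captured output, $3 = exit status of perf stat
classify_retry() {
  # Check the exit status first: an error message can contain the
  # event name, so grepping for the name alone is not enough.
  if [ "$3" -ne 0 ]; then
    echo "failed"
    return 1
  fi
  if printf '%s\n' "$2" | grep -qF -- "$1"; then
    echo "supported"
    return 0
  fi
  echo "missing"
  return 1
}
```

With the exit status 129 and the "event syntax error" text shown above,
this reports "failed" even though the text contains the event name.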

2) Events that are supported only in system wide monitoring.

Namhyung is right that this will result in a failure for a regular user. I will add a check around that.
The reason for using the "not supported" text here is:

Example: there is an event on powerpc, "vpa_dtl/dtl_all/", which fails when run with per-thread monitoring:

 # ./perf stat  -e vpa_dtl/dtl_all/ true
    Error:  
    No supported events found.
    Unsupported event (vpa_dtl/dtl_all/H) in per-thread mode, enable system wide with '-a'.

Next, running with system wide monitoring:

  # ./perf stat -a -e vpa_dtl/dtl_all/ true
     Error:
     No supported events found.
     The sys_perf_event_open() syscall failed for event (vpa_dtl/dtl_all/H): Operation not supported
     "dmesg | grep -i perf" may provide additional information.

This is because this event is supported only for "sampling" in system wide mode, and not for counting.
So the patch attempts to use "-a"; if that still fails, we look for "not supported" in the logs.

Namhyung, Ian,

If the addition of "-a" can cause a regression (like the Intel one, if the event is not supposed to be run system wide),
how about adding a case like this:
- Look for "enable system wide with '-a'" in the error logs.
	- If the logs match this message and the user is root, attempt with -a next.
	- With "-a", if the logs have "Operation not supported", the test can continue to the next event.
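
A rough sketch of the first two steps above, assuming a hypothetical
helper `next_step` (the name and its return strings are illustrative
only, not part of the test script):

```shell
# Hypothetical helper: decide what to do after the per-thread run.
# $1 = output of the per-thread "perf stat -e <event> true" run
# $2 = "yes" if system wide monitoring may be attempted (e.g. root)
next_step() {
  if ! printf '%s\n' "$1" | grep -qF "enable system wide with '-a'"; then
    # perf did not suggest '-a', so keep the existing flow.
    echo "no-retry"
  elif [ "$2" = "yes" ]; then
    echo "retry-with-a"
  else
    # '-a' is needed but cannot be used by this user; skip the event.
    echo "skip"
  fi
}
```

Only events for which perf itself suggests '-a' would then be retried
system wide, so other events keep their current behaviour.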

Thanks
Athira




* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
  2026-04-01 20:40   ` Ian Rogers
  2026-04-01 23:48     ` Namhyung Kim
@ 2026-04-02 17:32     ` Falcon, Thomas
  2026-04-03  7:36       ` Mi, Dapeng
  1 sibling, 1 reply; 11+ messages in thread
From: Falcon, Thomas @ 2026-04-02 17:32 UTC (permalink / raw)
  To: atrajeev@linux.ibm.com, venkat88@linux.ibm.com, Rogers, Ian
  Cc: Kleen, Andi, Shivani.Nittor@ibm.com, tmricht@linux.ibm.com,
	hbathini@linux.vnet.ibm.com, mpetlan@redhat.com,
	Tanushree.Shah@ibm.com, Hunter, Adrian,
	linux-perf-users@vger.kernel.org, maddy@linux.ibm.com, Chen, Zide,
	vmolnaro@redhat.com, Tejas.Manhas1@ibm.com,
	linuxppc-dev@lists.ozlabs.org, acme@kernel.org, jolsa@kernel.org,
	Mi, Dapeng1, namhyung@kernel.org

On Wed, 2026-04-01 at 13:40 -0700, Ian Rogers wrote:
> On Mon, Mar 23, 2026 at 3:40 AM Venkat <venkat88@linux.ibm.com>
> wrote:
> > 
> > 
> > 
> > > On 15 Mar 2026, at 4:27 PM, Athira Rajeev
> > > <atrajeev@linux.ibm.com> wrote:
> > > 
> > > Currently in "perf all PMU test", for "perf stat -e <event>
> > > true",
> > > below checks are done:
> > > - if return code is zero, look for "not supported" to decide pass
> > >  scenario
> > > - check for "not supported" to ignore the event
> > > - looks for "No permission to enable" to skip the event.
> > > - If output has "Bad event name", fail the test.
> > > - Use "Access to performance monitoring and observability
> > > operations is
> > >  limited." to ignore fail due to access limitations
> > > 
> > > If we failed to see event and it is supported, retries with
> > > longer
> > > workload "perf bench internals synthesize".
> > > - Here if output has <event>, the test is a pass.
> > > 
> > > Snippet of code check:
> > >  ```
> > >  output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
> > >  if echo "$output" | grep -q "$p"
> > >  ```
> > > - if output doesn't have event printed in logs, considers it
> > > fail.
> > > 
> > > But this results in false pass for events in some cases.
> > > Example, if perf stat fails as below:
> > > 
> > > # ./perf stat -e pmu/event/  true
> > > event syntax error: 'pmu/event/'
> > >                     \___ Bad event or PMU
> > > 
> > > Unable to find PMU or event on a PMU of 'pmu'
> > > Run 'perf list' for a list of valid events
> > > 
> > >  Usage: perf stat [<options>] [<command>]
> > > 
> > >    -e, --event <event>   event selector. use 'perf list' to list
> > > available events
> > > # echo $?
> > > 129
> > > 
> > > Since this has non-zero return code and doesn't have the
> > > fail strings being checked in the test, it will enter check using
> > > longer workload. and since the output fail log has event, it
> > > declares test as "supported".
> > > 
> > > Since all the fail strings can't be added in the check, update
> > > the testcase to check return code before proceeding to longer
> > > workload run.
> > > 
> > > Another missing scenario is when system wide monitoring is
> > > supported
> > > example:
> > > # ./perf stat -e pmu/event/ true
> > > Error:
> > > No supported events found.
> > >  Unsupported event (pmu/event/H) in per-thread mode, enable
> > > system wide with '-a'.
> > > 
> > > Update testcase to check with "perf stat -a -e $p" as well
> > > 
> > > Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
> > > ---
> > 
> > Tested this patch.
> > 
> > 
> > With this patch:
> > 
> > Testing hv_24x7/CPM_ADJUNCT_INST/ -- perf stat failed with non-zero
> > return code
> > Testing hv_24x7/CPM_ADJUNCT_PCYC/ -- perf stat failed with non-zero
> > return code
> > 
> > 
> > 
> > Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> 
> Testing on an Intel Alderlake the test is now failing:
> ```
> ...
> Testing offcore_requests_outstanding.l3_miss_demand_data_rd --
> supported
> Testing ocr.full_streaming_wr.any_response -- perf stat failed with
> non-zero return code
> Testing ocr.partial_streaming_wr.any_response -- perf stat failed
> with
> non-zero return code
> Testing ocr.streaming_wr.any_response -- supported
> ...
> ```
> 
> Running `perf stat` manually reveals an issue with the event:
> ```
> $ sudo perf stat -vv -e ocr.full_streaming_wr.any_response -a sleep
> 1
> Using CPUID GenuineIntel-6-B7-1
> Attempt to add: cpu_atom/ocr.full_streaming_wr.any_response/
> ..after resolving event:
> cpu_atom/event=0xb7,period=0x186a3,umask=0x1,offcore_rsp=0x8000000100
> 00/
> ocr.full_streaming_wr.any_response ->
> cpu_atom/ocr.full_streaming_wr.any_response/
> Control descriptor is not initialized
> ------------------------------------------------------------
> perf_event_attr:
>  type                             10 (cpu_atom)
>  size                             144
> ------------------------------------------------------------
> perf_event_attr:
>  type                             0 (PERF_TYPE_HARDWARE)
>  config                           0xa00000000
> (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
>  disabled                         1
> ------------------------------------------------------------
> sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 3
> ------------------------------------------------------------
> perf_event_attr:
>  type                             0 (PERF_TYPE_HARDWARE)
>  config                           0x400000000
> (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
>  disabled                         1
> ------------------------------------------------------------
> sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 3
>  config                           0x1b7
> (ocr.demand_data_rd.l3_hit.snoop_hit_no_fwd)
>  sample_type                      IDENTIFIER
>  read_format                     
> TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>  disabled                         1
>  inherit                          1
>  { bp_addr, config1 }             0x800000010000
> ------------------------------------------------------------
> sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8
> sys_perf_event_open failed, error -22
> switching off deferred callchain support
> Warning:
> ocr.full_streaming_wr.any_response event is not supported by the
> kernel.
> The sys_perf_event_open() syscall failed for event
> (ocr.full_streaming_wr.any_response): Invalid argument
> "dmesg | grep -i perf" may provide additional information.
> 
> Error:
> No supported events found.
> The sys_perf_event_open() syscall failed for event
> (ocr.full_streaming_wr.any_response): Invalid argument
> "dmesg | grep -i perf" may provide additional information.
> ```
> 
> This looks like a latent Intel cpu_atom PMU bug. Thomas, wdyt?

+Dapeng, Zide, Andi

Thanks,
Tom

> 
> Thanks,
> Ian
> 
> > Regards,
> > Venkat.
> > 
> > 
> > 
> > > tools/perf/tests/shell/stat_all_pmu.sh | 20 ++++++++++++++++++++
> > > 1 file changed, 20 insertions(+)
> > > 
> > > diff --git a/tools/perf/tests/shell/stat_all_pmu.sh
> > > b/tools/perf/tests/shell/stat_all_pmu.sh
> > > index 9c466c0efa85..6c4d59cbfa5f 100755
> > > --- a/tools/perf/tests/shell/stat_all_pmu.sh
> > > +++ b/tools/perf/tests/shell/stat_all_pmu.sh
> > > @@ -53,6 +53,26 @@ do
> > >     continue
> > >   fi
> > > 
> > > +  # check with system wide if it is supported.
> > > +  output=$(perf stat -a -e "$p" true 2>&1)
> > > +  stat_result=$?
> > > +  if echo "$output" | grep -q "not supported"
> > > +  then
> > > +    # Event not supported, so ignore.
> > > +    echo "not supported"
> > > +    continue
> > > +  fi
> > > +
> > > +  # checked through possible access limitations and permissions.
> > > +  # At this step, non-zero return code from "perf stat" needs to
> > > +  # reported as fail for the user to investigate
> > > +  if [ $stat_result -ne 0 ]
> > > +  then
> > > +    echo "perf stat failed with non-zero return code"
> > > +    err=1
> > > +    continue
> > > +  fi
> > > +
> > >   # We failed to see the event and it is supported. Possibly the
> > > workload was
> > >   # too small so retry with something longer.
> > >   output=$(perf stat -e "$p" perf bench internals synthesize
> > > 2>&1)
> > > --
> > > 2.47.3
> > > 
> > 



* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
  2026-04-02 15:45         ` Athira Rajeev
@ 2026-04-03  1:27           ` Namhyung Kim
  0 siblings, 0 replies; 11+ messages in thread
From: Namhyung Kim @ 2026-04-03  1:27 UTC (permalink / raw)
  To: Athira Rajeev
  Cc: Ian Rogers, Venkat, Thomas Falcon, acme, jolsa, adrian.hunter,
	vmolnaro, mpetlan, tmricht, maddy, linux-perf-users, linuxppc-dev,
	hbathini, Tejas.Manhas1, Tanushree.Shah, Shivani.Nittor

On Thu, Apr 02, 2026 at 09:15:54PM +0530, Athira Rajeev wrote:
> If the addition of “-a” can cause regression ( like intel one if its not suppose to be run system wide ), 
> how about adding a case like this:
> - Look for "enable system wide with '-a’ “ in the error logs
> 	- If logs matches this message and if user is root, attempt with -a next.
>        - With “-a”, If the logs has "Operation not supported” , test can continue to next event.

Sounds good, you can check perf_event_paranoid as well as root.  Then
you don't need to check the result of "-a" for -EOPNOTSUPP.
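
Something like the following could express that check. This is only a
sketch: `can_run_system_wide` is a hypothetical helper, it assumes that
perf_event_paranoid values below 1 allow CPU-wide events for
unprivileged users, and it ignores capabilities such as CAP_PERFMON:

```shell
# Hypothetical helper: succeed if system wide monitoring may be attempted.
# $1 = effective user id, $2 = value of kernel.perf_event_paranoid
# Assumption: values below 1 allow CPU-wide events for unprivileged users.
can_run_system_wide() {
  [ "$1" -eq 0 ] || [ "$2" -lt 1 ]
}
```

In the test it might be called as
`can_run_system_wide "$(id -u)" "$(cat /proc/sys/kernel/perf_event_paranoid)"`.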

Thanks,
Namhyung



* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
  2026-04-02 17:32     ` Falcon, Thomas
@ 2026-04-03  7:36       ` Mi, Dapeng
  2026-04-03 15:39         ` Ian Rogers
  0 siblings, 1 reply; 11+ messages in thread
From: Mi, Dapeng @ 2026-04-03  7:36 UTC (permalink / raw)
  To: Falcon, Thomas, atrajeev@linux.ibm.com, venkat88@linux.ibm.com,
	Rogers, Ian
  Cc: Kleen, Andi, Shivani.Nittor@ibm.com, tmricht@linux.ibm.com,
	hbathini@linux.vnet.ibm.com, mpetlan@redhat.com,
	Tanushree.Shah@ibm.com, Hunter, Adrian,
	linux-perf-users@vger.kernel.org, maddy@linux.ibm.com, Chen, Zide,
	vmolnaro@redhat.com, Tejas.Manhas1@ibm.com,
	linuxppc-dev@lists.ozlabs.org, acme@kernel.org, jolsa@kernel.org,
	Mi, Dapeng1, namhyung@kernel.org


On 4/3/2026 1:32 AM, Falcon, Thomas wrote:
> On Wed, 2026-04-01 at 13:40 -0700, Ian Rogers wrote:
>> On Mon, Mar 23, 2026 at 3:40 AM Venkat <venkat88@linux.ibm.com>
>> wrote:
>>>
>>>
>>>> On 15 Mar 2026, at 4:27 PM, Athira Rajeev
>>>> <atrajeev@linux.ibm.com> wrote:
>>>>
>>>> Currently in "perf all PMU test", for "perf stat -e <event>
>>>> true",
>>>> below checks are done:
>>>> - if return code is zero, look for "not supported" to decide pass
>>>>  scenario
>>>> - check for "not supported" to ignore the event
>>>> - looks for "No permission to enable" to skip the event.
>>>> - If output has "Bad event name", fail the test.
>>>> - Use "Access to performance monitoring and observability
>>>> operations is
>>>>  limited." to ignore fail due to access limitations
>>>>
>>>> If we failed to see event and it is supported, retries with
>>>> longer
>>>> workload "perf bench internals synthesize".
>>>> - Here if output has <event>, the test is a pass.
>>>>
>>>> Snippet of code check:
>>>>  ```
>>>>  output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
>>>>  if echo "$output" | grep -q "$p"
>>>>  ```
>>>> - if output doesn't have event printed in logs, considers it
>>>> fail.
>>>>
>>>> But this results in false pass for events in some cases.
>>>> Example, if perf stat fails as below:
>>>>
>>>> # ./perf stat -e pmu/event/  true
>>>> event syntax error: 'pmu/event/'
>>>>                     \___ Bad event or PMU
>>>>
>>>> Unable to find PMU or event on a PMU of 'pmu'
>>>> Run 'perf list' for a list of valid events
>>>>
>>>>  Usage: perf stat [<options>] [<command>]
>>>>
>>>>    -e, --event <event>   event selector. use 'perf list' to list
>>>> available events
>>>> # echo $?
>>>> 129
>>>>
>>>> Since this has non-zero return code and doesn't have the
>>>> fail strings being checked in the test, it will enter check using
>>>> longer workload. and since the output fail log has event, it
>>>> declares test as "supported".
>>>>
>>>> Since all the fail strings can't be added in the check, update
>>>> the testcase to check return code before proceeding to longer
>>>> workload run.
>>>>
>>>> Another missing scenario is when system wide monitoring is
>>>> supported
>>>> example:
>>>> # ./perf stat -e pmu/event/ true
>>>> Error:
>>>> No supported events found.
>>>>  Unsupported event (pmu/event/H) in per-thread mode, enable
>>>> system wide with '-a'.
>>>>
>>>> Update testcase to check with "perf stat -a -e $p" as well
>>>>
>>>> Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
>>>> ---
>>> Tested this patch.
>>>
>>>
>>> With this patch:
>>>
>>> Testing hv_24x7/CPM_ADJUNCT_INST/ -- perf stat failed with non-zero
>>> return code
>>> Testing hv_24x7/CPM_ADJUNCT_PCYC/ -- perf stat failed with non-zero
>>> return code
>>>
>>>
>>>
>>> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>> Testing on an Intel Alderlake the test is now failing:
>> ```
>> ...
>> Testing offcore_requests_outstanding.l3_miss_demand_data_rd --
>> supported
>> Testing ocr.full_streaming_wr.any_response -- perf stat failed with
>> non-zero return code
>> Testing ocr.partial_streaming_wr.any_response -- perf stat failed
>> with
>> non-zero return code
>> Testing ocr.streaming_wr.any_response -- supported
>> ...
>> ```
>>
>> Running `perf stat` manually reveals an issue with the event:
>> ```
>> $ sudo perf stat -vv -e ocr.full_streaming_wr.any_response -a sleep
>> 1
>> Using CPUID GenuineIntel-6-B7-1
>> Attempt to add: cpu_atom/ocr.full_streaming_wr.any_response/
>> ..after resolving event:
>> cpu_atom/event=0xb7,period=0x186a3,umask=0x1,offcore_rsp=0x8000000100
>> 00/
>> ocr.full_streaming_wr.any_response ->
>> cpu_atom/ocr.full_streaming_wr.any_response/
>> Control descriptor is not initialized
>> ------------------------------------------------------------
>> perf_event_attr:
>>  type                             10 (cpu_atom)
>>  size                             144
>> ------------------------------------------------------------
>> perf_event_attr:
>>  type                             0 (PERF_TYPE_HARDWARE)
>>  config                           0xa00000000
>> (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
>>  disabled                         1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 3
>> ------------------------------------------------------------
>> perf_event_attr:
>>  type                             0 (PERF_TYPE_HARDWARE)
>>  config                           0x400000000
>> (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
>>  disabled                         1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 3
>>  config                           0x1b7
>> (ocr.demand_data_rd.l3_hit.snoop_hit_no_fwd)
>>  sample_type                      IDENTIFIER
>>  read_format                     
>> TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>>  disabled                         1
>>  inherit                          1
>>  { bp_addr, config1 }             0x800000010000
>> ------------------------------------------------------------
>> sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8
>> sys_perf_event_open failed, error -22
>> switching off deferred callchain support
>> Warning:
>> ocr.full_streaming_wr.any_response event is not supported by the
>> kernel.
>> The sys_perf_event_open() syscall failed for event
>> (ocr.full_streaming_wr.any_response): Invalid argument
>> "dmesg | grep -i perf" may provide additional information.
>>
>> Error:
>> No supported events found.
>> The sys_perf_event_open() syscall failed for event
>> (ocr.full_streaming_wr.any_response): Invalid argument
>> "dmesg | grep -i perf" may provide additional information.
>> ```
>>
>> This looks like a latent Intel cpu_atom PMU bug. Thomas, wdyt?

Hmm, it looks like the error is caused by an invalid bitmask for the
OFFCORE_RSP_x MSRs. Currently the valid bitmask of the OFFCORE_RSP_x MSR is
set to 0x3fffffffff in intel_grt_extra_regs[], while the MSR value is set to
0x800000010000 for the ocr.full_streaming_wr.any_response event. Bit 47 is
treated as an invalid bit, and event creation is then aborted.
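
The arithmetic can be reproduced with shell integer math. This is only a
sketch of a generic valid-mask test using the two values quoted above; it
does not reproduce the kernel code, and the variable names are chosen for
illustration.

```shell
# Values quoted in this thread.
valid_mask=0x3fffffffff      # mask from intel_grt_extra_regs[]
config1=0x800000010000       # offcore_rsp value of the failing event

# Bits set in the request but absent from the valid mask.
invalid=$(( $config1 & ~$valid_mask ))
invalid_hex=$(printf '0x%x' "$invalid")

# Index of the highest rejected bit.
bit=0
v=$invalid
while [ "$v" -gt 1 ]
do
  v=$(( v >> 1 ))
  bit=$(( bit + 1 ))
done

echo "rejected bits: $invalid_hex (highest bit: $bit)"
```

Bit 16 of the request lies inside the 38-bit mask, so only bit 47 is left
over after masking.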

Based on the description in "Table 21-56. MSR_OFFCORE_RSPx Request Type
Definition" in the SDM, bit 47 should be a valid bit now. I suppose bit 47
was not a valid bit when ADL PMU support was added, but it was updated and
became valid later.

Along with the constant updates to the perf event lists
(https://github.com/intel/perfmon), we have noticed some mismatches
between the driver's hardcoded events and the perfmon event list.
Currently we are summarizing the mismatches. Once they are finalized, we
will submit a patchset to fix them.

Thanks.

> +Dapeng, Zide, Andi
>
> Thanks,
> Tom
>
>> Thanks,
>> Ian
>>
>>> Regards,
>>> Venkat.
>>>
>>>
>>>
>>>> tools/perf/tests/shell/stat_all_pmu.sh | 20 ++++++++++++++++++++
>>>> 1 file changed, 20 insertions(+)
>>>>
>>>> diff --git a/tools/perf/tests/shell/stat_all_pmu.sh
>>>> b/tools/perf/tests/shell/stat_all_pmu.sh
>>>> index 9c466c0efa85..6c4d59cbfa5f 100755
>>>> --- a/tools/perf/tests/shell/stat_all_pmu.sh
>>>> +++ b/tools/perf/tests/shell/stat_all_pmu.sh
>>>> @@ -53,6 +53,26 @@ do
>>>>     continue
>>>>   fi
>>>>
>>>> +  # check with system wide if it is supported.
>>>> +  output=$(perf stat -a -e "$p" true 2>&1)
>>>> +  stat_result=$?
>>>> +  if echo "$output" | grep -q "not supported"
>>>> +  then
>>>> +    # Event not supported, so ignore.
>>>> +    echo "not supported"
>>>> +    continue
>>>> +  fi
>>>> +
>>>> +  # checked through possible access limitations and permissions.
>>>> +  # At this step, non-zero return code from "perf stat" needs to
>>>> +  # reported as fail for the user to investigate
>>>> +  if [ $stat_result -ne 0 ]
>>>> +  then
>>>> +    echo "perf stat failed with non-zero return code"
>>>> +    err=1
>>>> +    continue
>>>> +  fi
>>>> +
>>>>   # We failed to see the event and it is supported. Possibly the
>>>> workload was
>>>>   # too small so retry with something longer.
>>>>   output=$(perf stat -e "$p" perf bench internals synthesize
>>>> 2>&1)
>>>> --
>>>> 2.47.3
>>>>


* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
  2026-04-03  7:36       ` Mi, Dapeng
@ 2026-04-03 15:39         ` Ian Rogers
  2026-04-07  0:48           ` Mi, Dapeng
  0 siblings, 1 reply; 11+ messages in thread
From: Ian Rogers @ 2026-04-03 15:39 UTC (permalink / raw)
  To: Mi, Dapeng, Falcon, Thomas, Kleen, Andi
  Cc: atrajeev@linux.ibm.com, venkat88@linux.ibm.com,
	Shivani.Nittor@ibm.com, tmricht@linux.ibm.com,
	hbathini@linux.vnet.ibm.com, mpetlan@redhat.com,
	Tanushree.Shah@ibm.com, Hunter, Adrian,
	linux-perf-users@vger.kernel.org, maddy@linux.ibm.com, Chen, Zide,
	vmolnaro@redhat.com, Tejas.Manhas1@ibm.com,
	linuxppc-dev@lists.ozlabs.org, acme@kernel.org, jolsa@kernel.org,
	Mi, Dapeng1, namhyung@kernel.org

On Fri, Apr 3, 2026 at 12:36 AM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>
>
> On 4/3/2026 1:32 AM, Falcon, Thomas wrote:
> > On Wed, 2026-04-01 at 13:40 -0700, Ian Rogers wrote:
> >> On Mon, Mar 23, 2026 at 3:40 AM Venkat <venkat88@linux.ibm.com>
> >> wrote:
> >>>
> >>>
> >>>> On 15 Mar 2026, at 4:27 PM, Athira Rajeev
> >>>> <atrajeev@linux.ibm.com> wrote:
> >>>>
> >>>> Currently in "perf all PMU test", for "perf stat -e <event>
> >>>> true",
> >>>> below checks are done:
> >>>> - if return code is zero, look for "not supported" to decide pass
> >>>>  scenario
> >>>> - check for "not supported" to ignore the event
> >>>> - looks for "No permission to enable" to skip the event.
> >>>> - If output has "Bad event name", fail the test.
> >>>> - Use "Access to performance monitoring and observability
> >>>> operations is
> >>>>  limited." to ignore fail due to access limitations
> >>>>
> >>>> If we failed to see event and it is supported, retries with
> >>>> longer
> >>>> workload "perf bench internals synthesize".
> >>>> - Here if output has <event>, the test is a pass.
> >>>>
> >>>> Snippet of code check:
> >>>>  ```
> >>>>  output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
> >>>>  if echo "$output" | grep -q "$p"
> >>>>  ```
> >>>> - if output doesn't have event printed in logs, considers it
> >>>> fail.
> >>>>
> >>>> But this results in false pass for events in some cases.
> >>>> Example, if perf stat fails as below:
> >>>>
> >>>> # ./perf stat -e pmu/event/  true
> >>>> event syntax error: 'pmu/event/'
> >>>>                     \___ Bad event or PMU
> >>>>
> >>>> Unable to find PMU or event on a PMU of 'pmu'
> >>>> Run 'perf list' for a list of valid events
> >>>>
> >>>>  Usage: perf stat [<options>] [<command>]
> >>>>
> >>>>    -e, --event <event>   event selector. use 'perf list' to list
> >>>> available events
> >>>> # echo $?
> >>>> 129
> >>>>
> >>>> Since this has non-zero return code and doesn't have the
> >>>> fail strings being checked in the test, it will enter check using
> >>>> longer workload. and since the output fail log has event, it
> >>>> declares test as "supported".
> >>>>
> >>>> Since all the fail strings can't be added in the check, update
> >>>> the testcase to check return code before proceeding to longer
> >>>> workload run.
> >>>>
> >>>> Another missing scenario is when system wide monitoring is
> >>>> supported
> >>>> example:
> >>>> # ./perf stat -e pmu/event/ true
> >>>> Error:
> >>>> No supported events found.
> >>>>  Unsupported event (pmu/event/H) in per-thread mode, enable
> >>>> system wide with '-a'.
> >>>>
> >>>> Update testcase to check with "perf stat -a -e $p" as well
> >>>>
> >>>> Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
> >>>> ---
> >>> Tested this patch.
> >>>
> >>>
> >>> With this patch:
> >>>
> >>> Testing hv_24x7/CPM_ADJUNCT_INST/ -- perf stat failed with non-zero
> >>> return code
> >>> Testing hv_24x7/CPM_ADJUNCT_PCYC/ -- perf stat failed with non-zero
> >>> return code
> >>>
> >>>
> >>>
> >>> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
> >> Testing on an Intel Alderlake the test is now failing:
> >> ```
> >> ...
> >> Testing offcore_requests_outstanding.l3_miss_demand_data_rd --
> >> supported
> >> Testing ocr.full_streaming_wr.any_response -- perf stat failed with
> >> non-zero return code
> >> Testing ocr.partial_streaming_wr.any_response -- perf stat failed
> >> with
> >> non-zero return code
> >> Testing ocr.streaming_wr.any_response -- supported
> >> ...
> >> ```
> >>
> >> Running `perf stat` manually reveals an issue with the event:
> >> ```
> >> $ sudo perf stat -vv -e ocr.full_streaming_wr.any_response -a sleep
> >> 1
> >> Using CPUID GenuineIntel-6-B7-1
> >> Attempt to add: cpu_atom/ocr.full_streaming_wr.any_response/
> >> ..after resolving event:
> >> cpu_atom/event=0xb7,period=0x186a3,umask=0x1,offcore_rsp=0x8000000100
> >> 00/
> >> ocr.full_streaming_wr.any_response ->
> >> cpu_atom/ocr.full_streaming_wr.any_response/
> >> Control descriptor is not initialized
> >> ------------------------------------------------------------
> >> perf_event_attr:
> >>  type                             10 (cpu_atom)
> >>  size                             144
> >> ------------------------------------------------------------
> >> perf_event_attr:
> >>  type                             0 (PERF_TYPE_HARDWARE)
> >>  config                           0xa00000000
> >> (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
> >>  disabled                         1
> >> ------------------------------------------------------------
> >> sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 3
> >> ------------------------------------------------------------
> >> perf_event_attr:
> >>  type                             0 (PERF_TYPE_HARDWARE)
> >>  config                           0x400000000
> >> (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
> >>  disabled                         1
> >> ------------------------------------------------------------
> >> sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 3
> >>  config                           0x1b7
> >> (ocr.demand_data_rd.l3_hit.snoop_hit_no_fwd)
> >>  sample_type                      IDENTIFIER
> >>  read_format
> >> TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> >>  disabled                         1
> >>  inherit                          1
> >>  { bp_addr, config1 }             0x800000010000
> >> ------------------------------------------------------------
> >> sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8
> >> sys_perf_event_open failed, error -22
> >> switching off deferred callchain support
> >> Warning:
> >> ocr.full_streaming_wr.any_response event is not supported by the
> >> kernel.
> >> The sys_perf_event_open() syscall failed for event
> >> (ocr.full_streaming_wr.any_response): Invalid argument
> >> "dmesg | grep -i perf" may provide additional information.
> >>
> >> Error:
> >> No supported events found.
> >> The sys_perf_event_open() syscall failed for event
> >> (ocr.full_streaming_wr.any_response): Invalid argument
> >> "dmesg | grep -i perf" may provide additional information.
> >> ```
> >>
> >> This looks like a latent Intel cpu_atom PMU bug. Thomas, wdyt?
>
> Hmm, it looks like the error is caused by an invalid bitmask for the
> OFFCORE_RSP_x MSRs. Currently the valid bitmask of the OFFCORE_RSP_x MSR is
> set to 0x3fffffffff in intel_grt_extra_regs[], while the MSR value is set to
> 0x800000010000 for the ocr.full_streaming_wr.any_response event. Bit 47 is
> treated as an invalid bit, and event creation is then aborted.
>
> Based on the description in "Table 21-56. MSR_OFFCORE_RSPx Request Type
> Definition" in the SDM, bit 47 should be a valid bit now. I suppose bit 47
> was not a valid bit when ADL PMU support was added, but it was updated and
> became valid later.
>
> Along with the constant updates to the perf event lists
> (https://github.com/intel/perfmon), we have noticed some mismatches
> between the driver's hardcoded events and the perfmon event list.
> Currently we are summarizing the mismatches. Once they are finalized, we
> will submit a patchset to fix them.

That's great. If it takes too long, perhaps we could just remove the
events for now.

Thanks,
Ian

> Thanks.
>
> > +Dapeng, Zide, Andi
> >
> > Thanks,
> > Tom
> >
> >> Thanks,
> >> Ian
> >>
> >>> Regards,
> >>> Venkat.
> >>>
> >>>
> >>>
> >>>> tools/perf/tests/shell/stat_all_pmu.sh | 20 ++++++++++++++++++++
> >>>> 1 file changed, 20 insertions(+)
> >>>>
> >>>> diff --git a/tools/perf/tests/shell/stat_all_pmu.sh
> >>>> b/tools/perf/tests/shell/stat_all_pmu.sh
> >>>> index 9c466c0efa85..6c4d59cbfa5f 100755
> >>>> --- a/tools/perf/tests/shell/stat_all_pmu.sh
> >>>> +++ b/tools/perf/tests/shell/stat_all_pmu.sh
> >>>> @@ -53,6 +53,26 @@ do
> >>>>     continue
> >>>>   fi
> >>>>
> >>>> +  # check with system wide if it is supported.
> >>>> +  output=$(perf stat -a -e "$p" true 2>&1)
> >>>> +  stat_result=$?
> >>>> +  if echo "$output" | grep -q "not supported"
> >>>> +  then
> >>>> +    # Event not supported, so ignore.
> >>>> +    echo "not supported"
> >>>> +    continue
> >>>> +  fi
> >>>> +
> >>>> +  # checked through possible access limitations and permissions.
> >>>> +  # At this step, non-zero return code from "perf stat" needs to
> >>>> +  # reported as fail for the user to investigate
> >>>> +  if [ $stat_result -ne 0 ]
> >>>> +  then
> >>>> +    echo "perf stat failed with non-zero return code"
> >>>> +    err=1
> >>>> +    continue
> >>>> +  fi
> >>>> +
> >>>>   # We failed to see the event and it is supported. Possibly the
> >>>> workload was
> >>>>   # too small so retry with something longer.
> >>>>   output=$(perf stat -e "$p" perf bench internals synthesize
> >>>> 2>&1)
> >>>> --
> >>>> 2.47.3
> >>>>


* Re: [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test
  2026-04-03 15:39         ` Ian Rogers
@ 2026-04-07  0:48           ` Mi, Dapeng
  0 siblings, 0 replies; 11+ messages in thread
From: Mi, Dapeng @ 2026-04-07  0:48 UTC (permalink / raw)
  To: Ian Rogers, Falcon, Thomas, Kleen, Andi
  Cc: atrajeev@linux.ibm.com, venkat88@linux.ibm.com,
	Shivani.Nittor@ibm.com, tmricht@linux.ibm.com,
	hbathini@linux.vnet.ibm.com, mpetlan@redhat.com,
	Tanushree.Shah@ibm.com, Hunter, Adrian,
	linux-perf-users@vger.kernel.org, maddy@linux.ibm.com, Chen, Zide,
	vmolnaro@redhat.com, Tejas.Manhas1@ibm.com,
	linuxppc-dev@lists.ozlabs.org, acme@kernel.org, jolsa@kernel.org,
	Mi, Dapeng1, namhyung@kernel.org


On 4/3/2026 11:39 PM, Ian Rogers wrote:
> On Fri, Apr 3, 2026 at 12:36 AM Mi, Dapeng <dapeng1.mi@linux.intel.com> wrote:
>>
>> On 4/3/2026 1:32 AM, Falcon, Thomas wrote:
>>> On Wed, 2026-04-01 at 13:40 -0700, Ian Rogers wrote:
>>>> On Mon, Mar 23, 2026 at 3:40 AM Venkat <venkat88@linux.ibm.com>
>>>> wrote:
>>>>>
>>>>>> On 15 Mar 2026, at 4:27 PM, Athira Rajeev
>>>>>> <atrajeev@linux.ibm.com> wrote:
>>>>>>
>>>>>> Currently in "perf all PMU test", for "perf stat -e <event>
>>>>>> true",
>>>>>> below checks are done:
>>>>>> - if return code is zero, look for "not supported" to decide pass
>>>>>>  scenario
>>>>>> - check for "not supported" to ignore the event
>>>>>> - looks for "No permission to enable" to skip the event.
>>>>>> - If output has "Bad event name", fail the test.
>>>>>> - Use "Access to performance monitoring and observability
>>>>>> operations is
>>>>>>  limited." to ignore fail due to access limitations
>>>>>>
>>>>>> If we failed to see event and it is supported, retries with
>>>>>> longer
>>>>>> workload "perf bench internals synthesize".
>>>>>> - Here if output has <event>, the test is a pass.
>>>>>>
>>>>>> Snippet of code check:
>>>>>>  ```
>>>>>>  output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
>>>>>>  if echo "$output" | grep -q "$p"
>>>>>>  ```
>>>>>> - if output doesn't have event printed in logs, considers it
>>>>>> fail.
>>>>>>
>>>>>> But this results in false pass for events in some cases.
>>>>>> Example, if perf stat fails as below:
>>>>>>
>>>>>> # ./perf stat -e pmu/event/  true
>>>>>> event syntax error: 'pmu/event/'
>>>>>>                     \___ Bad event or PMU
>>>>>>
>>>>>> Unable to find PMU or event on a PMU of 'pmu'
>>>>>> Run 'perf list' for a list of valid events
>>>>>>
>>>>>>  Usage: perf stat [<options>] [<command>]
>>>>>>
>>>>>>    -e, --event <event>   event selector. use 'perf list' to list
>>>>>> available events
>>>>>> # echo $?
>>>>>> 129
>>>>>>
>>>>>> Since this has non-zero return code and doesn't have the
>>>>>> fail strings being checked in the test, it will enter check using
>>>>>> longer workload. and since the output fail log has event, it
>>>>>> declares test as "supported".
>>>>>>
>>>>>> Since all the fail strings can't be added in the check, update
>>>>>> the testcase to check return code before proceeding to longer
>>>>>> workload run.
>>>>>>
>>>>>> Another missing scenario is when system wide monitoring is
>>>>>> supported
>>>>>> example:
>>>>>> # ./perf stat -e pmu/event/ true
>>>>>> Error:
>>>>>> No supported events found.
>>>>>>  Unsupported event (pmu/event/H) in per-thread mode, enable
>>>>>> system wide with '-a'.
>>>>>>
>>>>>> Update testcase to check with "perf stat -a -e $p" as well
>>>>>>
>>>>>> Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
>>>>>> ---
>>>>> Tested this patch.
>>>>>
>>>>>
>>>>> With this patch:
>>>>>
>>>>> Testing hv_24x7/CPM_ADJUNCT_INST/ -- perf stat failed with non-zero
>>>>> return code
>>>>> Testing hv_24x7/CPM_ADJUNCT_PCYC/ -- perf stat failed with non-zero
>>>>> return code
>>>>>
>>>>>
>>>>>
>>>>> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
>>>> Testing on an Intel Alderlake the test is now failing:
>>>> ```
>>>> ...
>>>> Testing offcore_requests_outstanding.l3_miss_demand_data_rd --
>>>> supported
>>>> Testing ocr.full_streaming_wr.any_response -- perf stat failed with
>>>> non-zero return code
>>>> Testing ocr.partial_streaming_wr.any_response -- perf stat failed
>>>> with
>>>> non-zero return code
>>>> Testing ocr.streaming_wr.any_response -- supported
>>>> ...
>>>> ```
>>>>
>>>> Running `perf stat` manually reveals an issue with the event:
>>>> ```
>>>> $ sudo perf stat -vv -e ocr.full_streaming_wr.any_response -a sleep
>>>> 1
>>>> Using CPUID GenuineIntel-6-B7-1
>>>> Attempt to add: cpu_atom/ocr.full_streaming_wr.any_response/
>>>> ..after resolving event:
>>>> cpu_atom/event=0xb7,period=0x186a3,umask=0x1,offcore_rsp=0x8000000100
>>>> 00/
>>>> ocr.full_streaming_wr.any_response ->
>>>> cpu_atom/ocr.full_streaming_wr.any_response/
>>>> Control descriptor is not initialized
>>>> ------------------------------------------------------------
>>>> perf_event_attr:
>>>>  type                             10 (cpu_atom)
>>>>  size                             144
>>>> ------------------------------------------------------------
>>>> perf_event_attr:
>>>>  type                             0 (PERF_TYPE_HARDWARE)
>>>>  config                           0xa00000000
>>>> (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
>>>>  disabled                         1
>>>> ------------------------------------------------------------
>>>> sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 3
>>>> ------------------------------------------------------------
>>>> perf_event_attr:
>>>>  type                             0 (PERF_TYPE_HARDWARE)
>>>>  config                           0x400000000
>>>> (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
>>>>  disabled                         1
>>>> ------------------------------------------------------------
>>>> sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 3
>>>>  config                           0x1b7
>>>> (ocr.demand_data_rd.l3_hit.snoop_hit_no_fwd)
>>>>  sample_type                      IDENTIFIER
>>>>  read_format
>>>> TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>>>>  disabled                         1
>>>>  inherit                          1
>>>>  { bp_addr, config1 }             0x800000010000
>>>> ------------------------------------------------------------
>>>> sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8
>>>> sys_perf_event_open failed, error -22
>>>> switching off deferred callchain support
>>>> Warning:
>>>> ocr.full_streaming_wr.any_response event is not supported by the
>>>> kernel.
>>>> The sys_perf_event_open() syscall failed for event
>>>> (ocr.full_streaming_wr.any_response): Invalid argument
>>>> "dmesg | grep -i perf" may provide additional information.
>>>>
>>>> Error:
>>>> No supported events found.
>>>> The sys_perf_event_open() syscall failed for event
>>>> (ocr.full_streaming_wr.any_response): Invalid argument
>>>> "dmesg | grep -i perf" may provide additional information.
>>>> ```
>>>>
>>>> This looks like a latent Intel cpu_atom PMU bug. Thomas, wdyt?
>> Hmm, it looks like the error is caused by an invalid bitmask for the
>> OFFCORE_RSP_x MSRs. Currently the valid bitmask of the OFFCORE_RSP_x MSR is
>> set to 0x3fffffffff in intel_grt_extra_regs[], while the MSR value is set to
>> 0x800000010000 for the ocr.full_streaming_wr.any_response event. Bit 47 is
>> treated as an invalid bit, and event creation is then aborted.
>>
>> Based on the description in "Table 21-56. MSR_OFFCORE_RSPx Request Type
>> Definition" in the SDM, bit 47 should be a valid bit now. I suppose bit 47
>> was not a valid bit when ADL PMU support was added, but it was updated and
>> became valid later.
>>
>> Along with the constant updates to the perf event lists
>> (https://github.com/intel/perfmon), we have noticed some mismatches
>> between the driver's hardcoded events and the perfmon event list.
>> Currently we are summarizing the mismatches. Once they are finalized, we
>> will submit a patchset to fix them.
> That's great. If it takes too long, perhaps we could just remove the
> events for now.

I suppose it won't take too long. I plan to post the patchset in the next
release cycle. The code changes are simple, but they need a lot of time to
verify on all kinds of platforms. Thanks.


>
> Thanks,
> Ian
>
>> Thanks.
>>
>>> +Dapeng, Zide, Andi
>>>
>>> Thanks,
>>> Tom
>>>
>>>> Thanks,
>>>> Ian
>>>>
>>>>> Regards,
>>>>> Venkat.
>>>>>
>>>>>
>>>>>
>>>>>> tools/perf/tests/shell/stat_all_pmu.sh | 20 ++++++++++++++++++++
>>>>>> 1 file changed, 20 insertions(+)
>>>>>>
>>>>>> diff --git a/tools/perf/tests/shell/stat_all_pmu.sh
>>>>>> b/tools/perf/tests/shell/stat_all_pmu.sh
>>>>>> index 9c466c0efa85..6c4d59cbfa5f 100755
>>>>>> --- a/tools/perf/tests/shell/stat_all_pmu.sh
>>>>>> +++ b/tools/perf/tests/shell/stat_all_pmu.sh
>>>>>> @@ -53,6 +53,26 @@ do
>>>>>>     continue
>>>>>>   fi
>>>>>>
>>>>>> +  # check with system wide if it is supported.
>>>>>> +  output=$(perf stat -a -e "$p" true 2>&1)
>>>>>> +  stat_result=$?
>>>>>> +  if echo "$output" | grep -q "not supported"
>>>>>> +  then
>>>>>> +    # Event not supported, so ignore.
>>>>>> +    echo "not supported"
>>>>>> +    continue
>>>>>> +  fi
>>>>>> +
>>>>>> +  # checked through possible access limitations and permissions.
>>>>>> +  # At this step, non-zero return code from "perf stat" needs to
>>>>>> +  # reported as fail for the user to investigate
>>>>>> +  if [ $stat_result -ne 0 ]
>>>>>> +  then
>>>>>> +    echo "perf stat failed with non-zero return code"
>>>>>> +    err=1
>>>>>> +    continue
>>>>>> +  fi
>>>>>> +
>>>>>>   # We failed to see the event and it is supported. Possibly the
>>>>>> workload was
>>>>>>   # too small so retry with something longer.
>>>>>>   output=$(perf stat -e "$p" perf bench internals synthesize
>>>>>> 2>&1)
>>>>>> --
>>>>>> 2.47.3
>>>>>>


end of thread, other threads:[~2026-04-07  0:48 UTC | newest]

Thread overview: 11+ messages
2026-03-15 10:57 [PATCH] tools/perf/test: Check for perf stat return code in perf all PMU test Athira Rajeev
2026-03-23 10:39 ` Venkat
2026-04-01 20:40   ` Ian Rogers
2026-04-01 23:48     ` Namhyung Kim
2026-04-01 23:57       ` Ian Rogers
2026-04-02 15:45         ` Athira Rajeev
2026-04-03  1:27           ` Namhyung Kim
2026-04-02 17:32     ` Falcon, Thomas
2026-04-03  7:36       ` Mi, Dapeng
2026-04-03 15:39         ` Ian Rogers
2026-04-07  0:48           ` Mi, Dapeng
