linux-perf-users.vger.kernel.org archive mirror
* [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
@ 2025-10-01 19:50 Anubhav Shelat
  2025-10-01 20:43 ` Ian Rogers
  2025-10-02  6:55 ` Thomas Richter
  0 siblings, 2 replies; 24+ messages in thread
From: Anubhav Shelat @ 2025-10-01 19:50 UTC (permalink / raw)
  To: mpetlan, acme, namhyung, irogers, linux-perf-users
  Cc: peterz, mingo, mark.rutland, alexander.shishkin, jolsa,
	adrian.hunter, kan.liang, dapeng1.mi, james.clark, Anubhav Shelat

On aarch64 systems, when running the leader sampling test, the cycle
count on the leader event is consistently ~30 cycles less than the cycle
count on the slave event, causing the test to fail. This looks like the
result of some hardware property of aarch64 processors, so allow for a
small difference in cycles between the leader and slave events on
aarch64 systems.

Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
---
 tools/perf/tests/shell/record.sh | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/perf/tests/shell/record.sh b/tools/perf/tests/shell/record.sh
index b1ad24fb3b33..dff83d64e970 100755
--- a/tools/perf/tests/shell/record.sh
+++ b/tools/perf/tests/shell/record.sh
@@ -280,7 +280,12 @@ test_leader_sampling() {
   while IFS= read -r line
   do
     cycles=$(echo $line | awk '{for(i=1;i<=NF;i++) if($i=="cycles:") print $(i-1)}')
-    if [ $(($index%2)) -ne 0 ] && [ ${cycles}x != ${prev_cycles}x ]
+    # On aarch64 systems the leader event gets stopped ~30 cycles before the slave, so allow some
+    # difference
+    if [ "$(uname -m)" = "aarch64" ] && (( cycles - prev_cycles < 50 ))
+    then
+	    valid_counts=$(($valid_counts+1))
+    elif [ $(($index%2)) -ne 0 ] && [ ${cycles}x != ${prev_cycles}x ]
     then
       invalid_counts=$(($invalid_counts+1))
     else
-- 
2.47.3



* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-01 19:50 [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64 Anubhav Shelat
@ 2025-10-01 20:43 ` Ian Rogers
  2025-10-02  6:55 ` Thomas Richter
  1 sibling, 0 replies; 24+ messages in thread
From: Ian Rogers @ 2025-10-01 20:43 UTC (permalink / raw)
  To: Anubhav Shelat
  Cc: mpetlan, acme, namhyung, linux-perf-users, peterz, mingo,
	mark.rutland, alexander.shishkin, jolsa, adrian.hunter, kan.liang,
	dapeng1.mi, james.clark

On Wed, Oct 1, 2025 at 12:52 PM Anubhav Shelat <ashelat@redhat.com> wrote:
>
> On aarch64 systems, when running the leader sampling test, the cycle
> count on the leader is consistently ~30 cycles less than the cycle count
> on the slave event causing the test to fail. This looks like the result
> of some hardware property of aarch64 processors, so allow for a small
> difference in cycles between the leader and slave events on aarch64
> systems.
>
> Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
> ---
>  tools/perf/tests/shell/record.sh | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/tests/shell/record.sh b/tools/perf/tests/shell/record.sh
> index b1ad24fb3b33..dff83d64e970 100755
> --- a/tools/perf/tests/shell/record.sh
> +++ b/tools/perf/tests/shell/record.sh
> @@ -280,7 +280,12 @@ test_leader_sampling() {
>    while IFS= read -r line
>    do
>      cycles=$(echo $line | awk '{for(i=1;i<=NF;i++) if($i=="cycles:") print $(i-1)}')
> -    if [ $(($index%2)) -ne 0 ] && [ ${cycles}x != ${prev_cycles}x ]
> +    # On aarch64 systems the leader event gets stopped ~30 cycles before the slave, so allow some
> +    # difference
> +    if [ "$(uname -m)" = "aarch64" ] && (( cycles - prev_cycles < 50 ))

If cycles is 0 then this will always pass; should this be checking a range?
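
For illustration, a bounded version of the check might look like the
sketch below. It reuses the patch's 50-cycle limit and the variable
names from record.sh; the lower bound and the positive-leader guard are
assumptions, not tested values:

  delta=$((cycles - prev_cycles))
  # Require a positive leader count and a small non-negative delta,
  # so a cycle count of 0 no longer passes unconditionally.
  if [ "$(uname -m)" = "aarch64" ] && [ "$prev_cycles" -gt 0 ] &&
     [ "$delta" -ge 0 ] && [ "$delta" -lt 50 ]
  then
    valid_counts=$(($valid_counts+1))
  fi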

Thanks,
Ian

> +    then
> +           valid_counts=$(($valid_counts+1))
> +    elif [ $(($index%2)) -ne 0 ] && [ ${cycles}x != ${prev_cycles}x ]
>      then
>        invalid_counts=$(($invalid_counts+1))
>      else
> --
> 2.47.3
>


* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-01 19:50 [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64 Anubhav Shelat
  2025-10-01 20:43 ` Ian Rogers
@ 2025-10-02  6:55 ` Thomas Richter
       [not found]   ` <CA+G8DhL49FWD47bkbcXYeb9T=AbxNhC-ypqjkNxRnW0JqmYnPw@mail.gmail.com>
  1 sibling, 1 reply; 24+ messages in thread
From: Thomas Richter @ 2025-10-02  6:55 UTC (permalink / raw)
  To: Anubhav Shelat, mpetlan, acme, namhyung, irogers,
	linux-perf-users
  Cc: peterz, mingo, mark.rutland, alexander.shishkin, jolsa,
	adrian.hunter, kan.liang, dapeng1.mi, james.clark

On 10/1/25 21:50, Anubhav Shelat wrote:
> On aarch64 systems, when running the leader sampling test, the cycle
> count on the leader is consistently ~30 cycles less than the cycle count
> on the slave event causing the test to fail. This looks like the result
> of some hardware property of aarch64 processors, so allow for a small
> difference in cycles between the leader and slave events on aarch64
> systems.

I have observed the same behavior on s390 as well, and I suspect other
platforms run into similar issues.

Can we use a larger range to allow the test to pass?

I suggest we 'ignore' a small percentage of hits that violate the
range, let's say 10%.
So if 90% of the cycles are in the allowed range, the test is good.
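
Expressed against the counters the test already keeps, that could be a
sketch along these lines (valid_counts and invalid_counts are the
test's existing variables; the 10% threshold is just the number
suggested above):

  # Fail only if more than 10% of the sampled pairs disagree.
  total=$((valid_counts + invalid_counts))
  if [ "$total" -gt 0 ] && [ $((invalid_counts * 100 / total)) -gt 10 ]
  then
    echo "Leader sampling [Failed inconsistent cycles count]"
    err=1
  fi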

Just my 2 cents from debugging this on about 5 different s390 machine
generations, with and without virtualization and with different
workloads.

[...]

-- 
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
IBM Deutschland Research & Development GmbH

Vorsitzender des Aufsichtsrats: Wolfgang Wendt

Geschäftsführung: David Faller

Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294


* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
       [not found]   ` <CA+G8DhL49FWD47bkbcXYeb9T=AbxNhC-ypqjkNxRnW0JqmYnPw@mail.gmail.com>
@ 2025-10-02 17:44     ` Anubhav Shelat
  2025-10-07  5:47     ` Thomas Richter
  1 sibling, 0 replies; 24+ messages in thread
From: Anubhav Shelat @ 2025-10-02 17:44 UTC (permalink / raw)
  To: Thomas Richter
  Cc: mpetlan, acme, namhyung, irogers, linux-perf-users, peterz, mingo,
	mark.rutland, alexander.shishkin, jolsa, adrian.hunter, kan.liang,
	dapeng1.mi, james.clark

On Oct 1, 2025 at 9:44 PM, Ian Rogers wrote:
> If cycles is 0 then this will always pass, should this be checking a range?

Yes, you're right, this will be better.

On Oct 2, 2025 at 7:56 AM, Thomas Richter wrote:
> Can we use a larger range to allow the test to pass?

What range do you get on s390? When I do group measurements using
"perf record -e "{cycles,cycles}:Su" perf test -w brstack" like in the
test, I always get a difference of somewhere between 20 and 50 cycles.
I haven't tested on s390x, but I see no cycle count difference when
testing the same command on x86. I have observed much larger, more
varied differences when using software events.

Anubhav



* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
       [not found]   ` <CA+G8DhL49FWD47bkbcXYeb9T=AbxNhC-ypqjkNxRnW0JqmYnPw@mail.gmail.com>
  2025-10-02 17:44     ` Anubhav Shelat
@ 2025-10-07  5:47     ` Thomas Richter
  2025-10-07 12:34       ` James Clark
  1 sibling, 1 reply; 24+ messages in thread
From: Thomas Richter @ 2025-10-07  5:47 UTC (permalink / raw)
  To: Anubhav Shelat
  Cc: mpetlan, acme, namhyung, irogers, linux-perf-users, peterz, mingo,
	mark.rutland, alexander.shishkin, jolsa, adrian.hunter, kan.liang,
	dapeng1.mi, james.clark

On 10/2/25 15:39, Anubhav Shelat wrote:
> On Oct 1, 2025 at 9:44 PM, Ian Rogers wrote:
>> If cycles is 0 then this will always pass, should this be checking a
> range?
> 
> Yes you're right this will be better.
> 
> On Oct 2, 2025 at 7:56 AM, Thomas Richter wrote:
>> Can we use a larger range to allow the test to pass?
> 
> What range do you get on s390? When I do group measurements using "perf
> record -e "{cycles,cycles}:Su" perf test -w brstack" like in the test I
> always get somewhere between 20 and 50 cycles difference. I haven't tested
> on s390x, but I see no cycle count difference when testing the same command
> on x86. I have observed much larger, more varied differences when using
> software events.
> 
> Anubhav
> 

Here is the output of the following commands:

 # perf record -e "{cycles,cycles}:Su" -- ./perf test -w brstack
 # perf script | grep brstack

perf 1110782 426394.696874:    6885000 cycles:           116fc9e brstack_bench+0xae (/r>
perf 1110782 426394.696875:    1377000 cycles:           116fb98 brstack_foo+0x0 (/root>
perf 1110782 426394.696877:    1377000 cycles:           116fb48 brstack_bar+0x0 (/root>
perf 1110782 426394.696878:    1377000 cycles:           116fc94 brstack_bench+0xa4 (/r>
perf 1110782 426394.696880:    1377000 cycles:           116fc84 brstack_bench+0x94 (/r>
perf 1110782 426394.696881:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
perf 1110782 426394.696883:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
perf 1110782 426394.696884:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
perf 1110782 426394.696885:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
perf 1110782 426394.696887:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
perf 1110782 426394.696888:    1377000 cycles:           116fc98 brstack_bench+0xa8 (/r>
perf 1110782 426394.696890:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
perf 1110782 426394.696891:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
perf 1110782 426394.703542:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
perf 1110782 426394.703542:   30971975 cycles:           116fb7c brstack_bar+0x34 (/roo>
perf 1110782 426394.703543:    1377000 cycles:           116fc76 brstack_bench+0x86 (/r>
perf 1110782 426394.703545:    1377000 cycles:           116fc06 brstack_bench+0x16 (/r>
perf 1110782 426394.703546:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
perf 1110782 426394.703547:    1377000 cycles:           116fc20 brstack_bench+0x30 (/r>
perf 1110782 426394.703549:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
perf 1110782 426394.703550:    1377000 cycles:           116fcbc brstack_bench+0xcc

The values are usually identical, apart from one or two which are way
off. Ignoring those would be good.
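
One quick way to quantify those runoff values (a sketch; in this perf
script layout the sample period is field 4):

  # Count how often each sampled period occurs; the runoff values
  # show up with a count of 1 next to the dominant period.
  perf script | grep brstack | awk '{ print $4 }' | sort -n | uniq -c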

-- 
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
IBM Deutschland Research & Development GmbH

Vorsitzender des Aufsichtsrats: Wolfgang Wendt

Geschäftsführung: David Faller

Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294


* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-07  5:47     ` Thomas Richter
@ 2025-10-07 12:34       ` James Clark
  2025-10-08  7:52         ` Namhyung Kim
  2025-10-08 10:48         ` Thomas Richter
  0 siblings, 2 replies; 24+ messages in thread
From: James Clark @ 2025-10-07 12:34 UTC (permalink / raw)
  To: Thomas Richter, Anubhav Shelat
  Cc: mpetlan, acme, namhyung, irogers, linux-perf-users, peterz, mingo,
	mark.rutland, alexander.shishkin, jolsa, adrian.hunter, kan.liang,
	dapeng1.mi



On 07/10/2025 6:47 am, Thomas Richter wrote:
> On 10/2/25 15:39, Anubhav Shelat wrote:
>> On Oct 1, 2025 at 9:44 PM, Ian Rogers wrote:
>>> If cycles is 0 then this will always pass, should this be checking a
>> range?
>>
>> Yes you're right this will be better.
>>
>> On Oct 2, 2025 at 7:56 AM, Thomas Richter wrote:
>>> Can we use a larger range to allow the test to pass?
>>
>> What range do you get on s390? When I do group measurements using "perf
>> record -e "{cycles,cycles}:Su" perf test -w brstack" like in the test I
>> always get somewhere between 20 and 50 cycles difference. I haven't tested
>> on s390x, but I see no cycle count difference when testing the same command
>> on x86. I have observed much larger, more varied differences when using
>> software events.
>>
>> Anubhav
>>
> 
> Here is the output of the
> 
>   # perf record  -e "{cycles,cycles}:Su" -- ./perf test -w brstack
>   # perf script | grep brstack
> 
> commands:
> 
> perf 1110782 426394.696874:    6885000 cycles:           116fc9e brstack_bench+0xae (/r>
> perf 1110782 426394.696875:    1377000 cycles:           116fb98 brstack_foo+0x0 (/root>
> perf 1110782 426394.696877:    1377000 cycles:           116fb48 brstack_bar+0x0 (/root>
> perf 1110782 426394.696878:    1377000 cycles:           116fc94 brstack_bench+0xa4 (/r>
> perf 1110782 426394.696880:    1377000 cycles:           116fc84 brstack_bench+0x94 (/r>
> perf 1110782 426394.696881:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
> perf 1110782 426394.696883:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
> perf 1110782 426394.696884:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
> perf 1110782 426394.696885:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
> perf 1110782 426394.696887:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
> perf 1110782 426394.696888:    1377000 cycles:           116fc98 brstack_bench+0xa8 (/r>
> perf 1110782 426394.696890:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
> perf 1110782 426394.696891:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
> perf 1110782 426394.703542:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
> perf 1110782 426394.703542:   30971975 cycles:           116fb7c brstack_bar+0x34 (/roo>
> perf 1110782 426394.703543:    1377000 cycles:           116fc76 brstack_bench+0x86 (/r>
> perf 1110782 426394.703545:    1377000 cycles:           116fc06 brstack_bench+0x16 (/r>
> perf 1110782 426394.703546:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
> perf 1110782 426394.703547:    1377000 cycles:           116fc20 brstack_bench+0x30 (/r>
> perf 1110782 426394.703549:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
> perf 1110782 426394.703550:    1377000 cycles:           116fcbc brstack_bench+0xcc
> 
> They are usual identical values beside one or two which are way off. Ignoring those would
> be good.
> 

FWIW I ran 100+ iterations on my Arm Juno and N1SDP boards and the test 
passed every time.

Are we sure there isn't some kind of race condition or bug that the test 
has found, rather than a bug in the test?

At least "This looks like the result of some hardware property of 
aarch64 processors" in the commit message can't be accurate, as this 
isn't the case everywhere.

James



* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-07 12:34       ` James Clark
@ 2025-10-08  7:52         ` Namhyung Kim
  2025-10-08 10:48         ` Thomas Richter
  1 sibling, 0 replies; 24+ messages in thread
From: Namhyung Kim @ 2025-10-08  7:52 UTC (permalink / raw)
  To: James Clark
  Cc: Thomas Richter, Anubhav Shelat, mpetlan, acme, irogers,
	linux-perf-users, peterz, mingo, mark.rutland, alexander.shishkin,
	jolsa, adrian.hunter, kan.liang, dapeng1.mi

Hello,

On Tue, Oct 07, 2025 at 01:34:46PM +0100, James Clark wrote:
> 
> 
> On 07/10/2025 6:47 am, Thomas Richter wrote:
> > On 10/2/25 15:39, Anubhav Shelat wrote:
> > > On Oct 1, 2025 at 9:44 PM, Ian Rogers wrote:
> > > > If cycles is 0 then this will always pass, should this be checking a
> > > range?
> > > 
> > > Yes you're right this will be better.
> > > 
> > > On Oct 2, 2025 at 7:56 AM, Thomas Richter wrote:
> > > > Can we use a larger range to allow the test to pass?
> > > 
> > > What range do you get on s390? When I do group measurements using "perf
> > > record -e "{cycles,cycles}:Su" perf test -w brstack" like in the test I
> > > always get somewhere between 20 and 50 cycles difference. I haven't tested
> > > on s390x, but I see no cycle count difference when testing the same command
> > > on x86. I have observed much larger, more varied differences when using
> > > software events.
> > > 
> > > Anubhav
> > > 
> > 
> > Here is the output of the
> > 
> >   # perf record  -e "{cycles,cycles}:Su" -- ./perf test -w brstack
> >   # perf script | grep brstack
> > 
> > commands:
> > 
> > perf 1110782 426394.696874:    6885000 cycles:           116fc9e brstack_bench+0xae (/r>
> > perf 1110782 426394.696875:    1377000 cycles:           116fb98 brstack_foo+0x0 (/root>
> > perf 1110782 426394.696877:    1377000 cycles:           116fb48 brstack_bar+0x0 (/root>
> > perf 1110782 426394.696878:    1377000 cycles:           116fc94 brstack_bench+0xa4 (/r>
> > perf 1110782 426394.696880:    1377000 cycles:           116fc84 brstack_bench+0x94 (/r>
> > perf 1110782 426394.696881:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
> > perf 1110782 426394.696883:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
> > perf 1110782 426394.696884:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
> > perf 1110782 426394.696885:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
> > perf 1110782 426394.696887:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
> > perf 1110782 426394.696888:    1377000 cycles:           116fc98 brstack_bench+0xa8 (/r>
> > perf 1110782 426394.696890:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
> > perf 1110782 426394.696891:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
> > perf 1110782 426394.703542:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
> > perf 1110782 426394.703542:   30971975 cycles:           116fb7c brstack_bar+0x34 (/roo>
> > perf 1110782 426394.703543:    1377000 cycles:           116fc76 brstack_bench+0x86 (/r>
> > perf 1110782 426394.703545:    1377000 cycles:           116fc06 brstack_bench+0x16 (/r>
> > perf 1110782 426394.703546:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
> > perf 1110782 426394.703547:    1377000 cycles:           116fc20 brstack_bench+0x30 (/r>
> > perf 1110782 426394.703549:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
> > perf 1110782 426394.703550:    1377000 cycles:           116fcbc brstack_bench+0xcc
> > 
> > They are usual identical values beside one or two which are way off. Ignoring those would
> > be good.
> > 
> 
> FWIW I ran 100+ iterations my Arm Juno and N1SDP boards and the test passed
> every time.
> 
> Are we sure there isn't some kind of race condition or bug that the test has
> found? Rather than a bug in the test?

I suspect this too.

> 
> At least "This looks like the result of some hardware property of aarch64
> processors" in the commit message can't be accurate as this isn't the case
> everywhere.

I guess this depends on the hardware's capability to start/stop the PMU
globally rather than doing it for each counter.

Maybe we can have two sets of checks depending on the hardware.  For
example, we keep the existing test on x86 (and selected ARM machines?)
and add a range test for the others.  It'd be great if the kernel could
expose that information to userspace.
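
As a rough sketch of that idea in the test itself (the per-architecture
values below are illustrative assumptions, not a vetted list):

  # Pick the comparison mode per architecture: exact match where the
  # counters are known to start/stop together, a small tolerance
  # elsewhere.
  case "$(uname -m)" in
    x86_64) max_delta=0 ;;   # leader and slave expected identical
    *)      max_delta=50 ;;  # allow a small gap (assumed value)
  esac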

Thanks,
Namhyung



* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-07 12:34       ` James Clark
  2025-10-08  7:52         ` Namhyung Kim
@ 2025-10-08 10:48         ` Thomas Richter
  2025-10-08 11:24           ` James Clark
  1 sibling, 1 reply; 24+ messages in thread
From: Thomas Richter @ 2025-10-08 10:48 UTC (permalink / raw)
  To: James Clark, Anubhav Shelat
  Cc: mpetlan, acme, namhyung, irogers, linux-perf-users, peterz, mingo,
	mark.rutland, alexander.shishkin, jolsa, adrian.hunter, kan.liang,
	dapeng1.mi

On 10/7/25 14:34, James Clark wrote:
> 
> 
> On 07/10/2025 6:47 am, Thomas Richter wrote:
>> On 10/2/25 15:39, Anubhav Shelat wrote:
>>> On Oct 1, 2025 at 9:44 PM, Ian Rogers wrote:
>>>> If cycles is 0 then this will always pass, should this be checking a
>>> range?
>>>
>>> Yes you're right this will be better.
>>>
>>> On Oct 2, 2025 at 7:56 AM, Thomas Richter wrote:
>>>> Can we use a larger range to allow the test to pass?
>>>
>>> What range do you get on s390? When I do group measurements using "perf
>>> record -e "{cycles,cycles}:Su" perf test -w brstack" like in the test I
>>> always get somewhere between 20 and 50 cycles difference. I haven't tested
>>> on s390x, but I see no cycle count difference when testing the same command
>>> on x86. I have observed much larger, more varied differences when using
>>> software events.
>>>
>>> Anubhav
>>>
>>
>> Here is the output of the
>>
>>   # perf record  -e "{cycles,cycles}:Su" -- ./perf test -w brstack
>>   # perf script | grep brstack
>>
>> commands:
>>
>> perf 1110782 426394.696874:    6885000 cycles:           116fc9e brstack_bench+0xae (/r>
>> perf 1110782 426394.696875:    1377000 cycles:           116fb98 brstack_foo+0x0 (/root>
>> perf 1110782 426394.696877:    1377000 cycles:           116fb48 brstack_bar+0x0 (/root>
>> perf 1110782 426394.696878:    1377000 cycles:           116fc94 brstack_bench+0xa4 (/r>
>> perf 1110782 426394.696880:    1377000 cycles:           116fc84 brstack_bench+0x94 (/r>
>> perf 1110782 426394.696881:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>> perf 1110782 426394.696883:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>> perf 1110782 426394.696884:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>> perf 1110782 426394.696885:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>> perf 1110782 426394.696887:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>> perf 1110782 426394.696888:    1377000 cycles:           116fc98 brstack_bench+0xa8 (/r>
>> perf 1110782 426394.696890:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>> perf 1110782 426394.696891:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
>> perf 1110782 426394.703542:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>> perf 1110782 426394.703542:   30971975 cycles:           116fb7c brstack_bar+0x34 (/roo>
>> perf 1110782 426394.703543:    1377000 cycles:           116fc76 brstack_bench+0x86 (/r>
>> perf 1110782 426394.703545:    1377000 cycles:           116fc06 brstack_bench+0x16 (/r>
>> perf 1110782 426394.703546:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
>> perf 1110782 426394.703547:    1377000 cycles:           116fc20 brstack_bench+0x30 (/r>
>> perf 1110782 426394.703549:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
>> perf 1110782 426394.703550:    1377000 cycles:           116fcbc brstack_bench+0xcc
>>
>> They are usual identical values beside one or two which are way off. Ignoring those would
>> be good.
>>
> 
> FWIW I ran 100+ iterations my Arm Juno and N1SDP boards and the test passed every time.
> 
> Are we sure there isn't some kind of race condition or bug that the test has found? Rather than a bug in the test?
There is always the possibility of a bug; that cannot be ruled out for certain.
However, as LPARs on s390 run on top of a hypervisor, there is a chance of the
Linux guest being stopped while the hardware keeps running.

I see these runoff values time and again; roughly every second run fails
with one runoff value.

Hope this helps

-- 
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
IBM Deutschland Research & Development GmbH

Vorsitzender des Aufsichtsrats: Wolfgang Wendt

Geschäftsführung: David Faller

Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294


* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-08 10:48         ` Thomas Richter
@ 2025-10-08 11:24           ` James Clark
  2025-10-09 12:14             ` Thomas Richter
       [not found]             ` <CA+G8Dh+Odf40jdY4h1knjU+3sSjZokMx6OdzRT3o9v1=ndKORQ@mail.gmail.com>
  0 siblings, 2 replies; 24+ messages in thread
From: James Clark @ 2025-10-08 11:24 UTC (permalink / raw)
  To: Thomas Richter, Anubhav Shelat, Namhyung Kim
  Cc: mpetlan, acme, irogers, linux-perf-users, peterz, mingo,
	mark.rutland, alexander.shishkin, jolsa, adrian.hunter, kan.liang,
	dapeng1.mi



On 08/10/2025 11:48 am, Thomas Richter wrote:
> On 10/7/25 14:34, James Clark wrote:
>>
>>
>> On 07/10/2025 6:47 am, Thomas Richter wrote:
>>> On 10/2/25 15:39, Anubhav Shelat wrote:
>>>> On Oct 1, 2025 at 9:44 PM, Ian Rogers wrote:
>>>>> If cycles is 0 then this will always pass, should this be checking a
>>>> range?
>>>>
>>>> Yes you're right this will be better.
>>>>
>>>> On Oct 2, 2025 at 7:56 AM, Thomas Richter wrote:
>>>>> Can we use a larger range to allow the test to pass?
>>>>
>>>> What range do you get on s390? When I do group measurements using "perf
>>>> record -e "{cycles,cycles}:Su" perf test -w brstack" like in the test I
>>>> always get somewhere between 20 and 50 cycles difference. I haven't tested
>>>> on s390x, but I see no cycle count difference when testing the same command
>>>> on x86. I have observed much larger, more varied differences when using
>>>> software events.
>>>>
>>>> Anubhav
>>>>
>>>
>>> Here is the output of the
>>>
>>>    # perf record  -e "{cycles,cycles}:Su" -- ./perf test -w brstack
>>>    # perf script | grep brstack
>>>
>>> commands:
>>>
>>> perf 1110782 426394.696874:    6885000 cycles:           116fc9e brstack_bench+0xae (/r>
>>> perf 1110782 426394.696875:    1377000 cycles:           116fb98 brstack_foo+0x0 (/root>
>>> perf 1110782 426394.696877:    1377000 cycles:           116fb48 brstack_bar+0x0 (/root>
>>> perf 1110782 426394.696878:    1377000 cycles:           116fc94 brstack_bench+0xa4 (/r>
>>> perf 1110782 426394.696880:    1377000 cycles:           116fc84 brstack_bench+0x94 (/r>
>>> perf 1110782 426394.696881:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>> perf 1110782 426394.696883:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>> perf 1110782 426394.696884:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>> perf 1110782 426394.696885:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>> perf 1110782 426394.696887:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>> perf 1110782 426394.696888:    1377000 cycles:           116fc98 brstack_bench+0xa8 (/r>
>>> perf 1110782 426394.696890:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>> perf 1110782 426394.696891:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
>>> perf 1110782 426394.703542:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>> perf 1110782 426394.703542:   30971975 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>> perf 1110782 426394.703543:    1377000 cycles:           116fc76 brstack_bench+0x86 (/r>
>>> perf 1110782 426394.703545:    1377000 cycles:           116fc06 brstack_bench+0x16 (/r>
>>> perf 1110782 426394.703546:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
>>> perf 1110782 426394.703547:    1377000 cycles:           116fc20 brstack_bench+0x30 (/r>
>>> perf 1110782 426394.703549:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
>>> perf 1110782 426394.703550:    1377000 cycles:           116fcbc brstack_bench+0xcc
>>>
>>> They are usual identical values beside one or two which are way off. Ignoring those would
>>> be good.
>>>
>>
>> FWIW I ran 100+ iterations my Arm Juno and N1SDP boards and the test passed every time.
>>
>> Are we sure there isn't some kind of race condition or bug that the test has found? Rather than a bug in the test?
> There is always a possibility of a bug, that can not be ruled out for certain.
> However as LPARs on s390 run on top of a hypervisor, there is a chance for the
> linux guest being stopped while hardware keeps running.
> 

I have no idea what's going on or how that works, so maybe this question 
is useless, but doesn't that mean that guests can determine/guess the 
counter values from other guests? If the hardware keeps the counter 
running when the guest isn't, it sounds like something is leaking from 
one guest to another. Shouldn't the hypervisor be saving and restoring 
context?

> I see these runoff values time and again, roughly every second run fails with
> one runoff value
> 
> Hope this helps
> 

That may explain the issue for s390 then, but I'm assuming it doesn't 
explain the issues on Arm if the failures there aren't in a VM. But even 
if they were in a VM, the PMU is fully virtualised and the events would 
be stopped and resumed when the guest is switched out.

James



* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-08 11:24           ` James Clark
@ 2025-10-09 12:14             ` Thomas Richter
       [not found]             ` <CA+G8Dh+Odf40jdY4h1knjU+3sSjZokMx6OdzRT3o9v1=ndKORQ@mail.gmail.com>
  1 sibling, 0 replies; 24+ messages in thread
From: Thomas Richter @ 2025-10-09 12:14 UTC (permalink / raw)
  To: James Clark, Anubhav Shelat, Namhyung Kim
  Cc: mpetlan, acme, irogers, linux-perf-users, peterz, mingo,
	mark.rutland, alexander.shishkin, jolsa, adrian.hunter, kan.liang,
	dapeng1.mi

On 10/8/25 13:24, James Clark wrote:
> 
> 
> On 08/10/2025 11:48 am, Thomas Richter wrote:
>> On 10/7/25 14:34, James Clark wrote:
>>>
>>>
>>> On 07/10/2025 6:47 am, Thomas Richter wrote:
>>>> On 10/2/25 15:39, Anubhav Shelat wrote:
>>>>> On Oct 1, 2025 at 9:44 PM, Ian Rogers wrote:
>>>>>> If cycles is 0 then this will always pass, should this be checking a
>>>>> range?
>>>>>
>>>>> Yes you're right this will be better.
>>>>>
>>>>> On Oct 2, 2025 at 7:56 AM, Thomas Richter wrote:
>>>>>> Can we use a larger range to allow the test to pass?
>>>>>
>>>>> What range do you get on s390? When I do group measurements using "perf
>>>>> record -e "{cycles,cycles}:Su" perf test -w brstack" like in the test I
>>>>> always get somewhere between 20 and 50 cycles difference. I haven't tested
>>>>> on s390x, but I see no cycle count difference when testing the same command
>>>>> on x86. I have observed much larger, more varied differences when using
>>>>> software events.
>>>>>
>>>>> Anubhav
>>>>>
>>>>
>>>> Here is the output of the
>>>>
>>>>    # perf record  -e "{cycles,cycles}:Su" -- ./perf test -w brstack
>>>>    # perf script | grep brstack
>>>>
>>>> commands:
>>>>
>>>> perf 1110782 426394.696874:    6885000 cycles:           116fc9e brstack_bench+0xae (/r>
>>>> perf 1110782 426394.696875:    1377000 cycles:           116fb98 brstack_foo+0x0 (/root>
>>>> perf 1110782 426394.696877:    1377000 cycles:           116fb48 brstack_bar+0x0 (/root>
>>>> perf 1110782 426394.696878:    1377000 cycles:           116fc94 brstack_bench+0xa4 (/r>
>>>> perf 1110782 426394.696880:    1377000 cycles:           116fc84 brstack_bench+0x94 (/r>
>>>> perf 1110782 426394.696881:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>>> perf 1110782 426394.696883:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>>> perf 1110782 426394.696884:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>>> perf 1110782 426394.696885:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>>> perf 1110782 426394.696887:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>>> perf 1110782 426394.696888:    1377000 cycles:           116fc98 brstack_bench+0xa8 (/r>
>>>> perf 1110782 426394.696890:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>>> perf 1110782 426394.696891:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
>>>> perf 1110782 426394.703542:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>>> perf 1110782 426394.703542:   30971975 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>>> perf 1110782 426394.703543:    1377000 cycles:           116fc76 brstack_bench+0x86 (/r>
>>>> perf 1110782 426394.703545:    1377000 cycles:           116fc06 brstack_bench+0x16 (/r>
>>>> perf 1110782 426394.703546:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
>>>> perf 1110782 426394.703547:    1377000 cycles:           116fc20 brstack_bench+0x30 (/r>
>>>> perf 1110782 426394.703549:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
>>>> perf 1110782 426394.703550:    1377000 cycles:           116fcbc brstack_bench+0xcc
>>>>
>>>> They are usual identical values beside one or two which are way off. Ignoring those would
>>>> be good.
>>>>
>>>
>>> FWIW I ran 100+ iterations my Arm Juno and N1SDP boards and the test passed every time.
>>>
>>> Are we sure there isn't some kind of race condition or bug that the test has found? Rather than a bug in the test?
>> There is always a possibility of a bug, that can not be ruled out for certain.
>> However as LPARs on s390 run on top of a hypervisor, there is a chance for the
>> linux guest being stopped while hardware keeps running.
>>
> 
> I have no idea what's going on or how that works, so maybe this question is useless, but doesn't that mean that guests can determine/guess the counter values from other guests? If the hardware keeps the counter running when the guest isn't, that sounds like something is leaking from one guest to another? Should the hypervisor not be saving and restoring context?

I don't have enough knowledge myself to answer that, but I'll try to find out.
However, I guess that the hypervisor saves and restores context with respect to
other guests. Maybe the hypervisor does some work on behalf of the currently
running guest? I'll start digging...

> 
>> I see these runoff values time and again, roughly every second run fails with
>> one runoff value
>>
>> Hope this helps
>>
> 
> That may explain the issue for s390 then, but I'm assuming it doesn't explain the issues on Arm if the failures there aren't in a VM. But even if they were in a VM, the PMU is fully virtualised and the events would be stopped and resumed when the guest is switched out.
> 
> James
> 
-- 
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
IBM Deutschland Research & Development GmbH

Vorsitzender des Aufsichtsrats: Wolfgang Wendt

Geschäftsführung: David Faller

Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294


* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
       [not found]             ` <CA+G8Dh+Odf40jdY4h1knjU+3sSjZokMx6OdzRT3o9v1=ndKORQ@mail.gmail.com>
@ 2025-10-09 13:55               ` Anubhav Shelat
  2025-10-09 14:17                 ` James Clark
  2025-10-09 14:08               ` James Clark
  1 sibling, 1 reply; 24+ messages in thread
From: Anubhav Shelat @ 2025-10-09 13:55 UTC (permalink / raw)
  To: James Clark
  Cc: Thomas Richter, Namhyung Kim, mpetlan, acme, irogers,
	linux-perf-users, peterz, mingo, mark.rutland, alexander.shishkin,
	jolsa, adrian.hunter, kan.liang, dapeng1.mi

I tested on a new Arm machine and I'm seeing a similar issue to the one
Thomas described, but the test only fails every 20 or so runs, and I'm
not seeing the issue that I previously mentioned.

Running test #15
 10bc60-10bcc4 g test_loop
perf does have symbol 'test_loop'
 10c354-10c418 l brstack
perf does have symbol 'brstack'
Basic leader sampling test
Basic leader sampling test [Success]
Invalid Counts: 1
Valid Counts: 27
Running test #16
 10bc60-10bcc4 g test_loop
perf does have symbol 'test_loop'
 10c354-10c418 l brstack
perf does have symbol 'brstack'
Basic leader sampling test
Basic leader sampling test [Success]
Invalid Counts: 1
Valid Counts: 27
Running test #17
 10bc60-10bcc4 g test_loop
perf does have symbol 'test_loop'
 10c354-10c418 l brstack
perf does have symbol 'brstack'
Basic leader sampling test
Leader sampling [Failed inconsistent cycles count]
Invalid Counts: 8
Valid Counts: 28

Initially I thought it was the throttling issue mentioned in the
comment in test_leader_sampling, but there's another thread that says
it's fixed:
https://lore.kernel.org/lkml/20250520181644.2673067-2-kan.liang@linux.intel.com/



* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
       [not found]             ` <CA+G8Dh+Odf40jdY4h1knjU+3sSjZokMx6OdzRT3o9v1=ndKORQ@mail.gmail.com>
  2025-10-09 13:55               ` Anubhav Shelat
@ 2025-10-09 14:08               ` James Clark
  1 sibling, 0 replies; 24+ messages in thread
From: James Clark @ 2025-10-09 14:08 UTC (permalink / raw)
  To: Anubhav Shelat
  Cc: Thomas Richter, Namhyung Kim, mpetlan, acme, irogers,
	linux-perf-users, peterz, mingo, mark.rutland, alexander.shishkin,
	jolsa, adrian.hunter, kan.liang, dapeng1.mi



On 09/10/2025 2:43 pm, Anubhav Shelat wrote:
> I tested on a new arm machine and I'm getting a similar issue as Thomas,

Which are your new and old Arm machines exactly? And which kernel 
versions did you run the test on?

> but the test fails every 20 or so runs and I'm not getting the issue that I
> previously mentioned.
> 

What do you mean here? Below I see the leader sampling test failure, 
which I thought was the same issue that was previously mentioned?

> Running test #15
>   10bc60-10bcc4 g test_loop
> perf does have symbol 'test_loop'
>   10c354-10c418 l brstack
> perf does have symbol 'brstack'
> Basic leader sampling test
> Basic leader sampling test [Success]
> Invalid Counts: 1
> Valid Counts: 27
> Running test #16
>   10bc60-10bcc4 g test_loop
> perf does have symbol 'test_loop'
>   10c354-10c418 l brstack
> perf does have symbol 'brstack'
> Basic leader sampling test
> Basic leader sampling test [Success]
> Invalid Counts: 1
> Valid Counts: 27
> Running test #17
>   10bc60-10bcc4 g test_loop
> perf does have symbol 'test_loop'
>   10c354-10c418 l brstack
> perf does have symbol 'brstack'
> Basic leader sampling test
> Leader sampling [Failed inconsistent cycles count]
> Invalid Counts: 8
> Valid Counts: 28
> 
> Initially I thought it was the throttling issue mentioned in the comment in
> test_leader_sampling, but there's another thread that says it's fixed:
> https://lore.kernel.org/lkml/20250520181644.2673067-2-kan.liang@linux.intel.com/

* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-09 13:55               ` Anubhav Shelat
@ 2025-10-09 14:17                 ` James Clark
       [not found]                   ` <CA+G8DhKQkTKoNer5GfZedPUj4xMizWVJUWFocP2eQ_cmPJtBOQ@mail.gmail.com>
  0 siblings, 1 reply; 24+ messages in thread
From: James Clark @ 2025-10-09 14:17 UTC (permalink / raw)
  To: Anubhav Shelat
  Cc: Thomas Richter, Namhyung Kim, mpetlan, acme, irogers,
	linux-perf-users, peterz, mingo, mark.rutland, alexander.shishkin,
	jolsa, adrian.hunter, kan.liang, dapeng1.mi



On 09/10/2025 2:55 pm, Anubhav Shelat wrote:
> I tested on a new arm machine and I'm getting a similar issue as
> Thomas, but the test fails every 20 or so runs and I'm not getting the
> issue that I previously mentioned.
> 
> Running test #15
>   10bc60-10bcc4 g test_loop
> perf does have symbol 'test_loop'
>   10c354-10c418 l brstack
> perf does have symbol 'brstack'
> Basic leader sampling test
> Basic leader sampling test [Success]
> Invalid Counts: 1
> Valid Counts: 27
> Running test #16
>   10bc60-10bcc4 g test_loop
> perf does have symbol 'test_loop'
>   10c354-10c418 l brstack
> perf does have symbol 'brstack'
> Basic leader sampling test
> Basic leader sampling test [Success]
> Invalid Counts: 1
> Valid Counts: 27
> Running test #17
>   10bc60-10bcc4 g test_loop
> perf does have symbol 'test_loop'
>   10c354-10c418 l brstack
> perf does have symbol 'brstack'
> Basic leader sampling test
> Leader sampling [Failed inconsistent cycles count]
> Invalid Counts: 8
> Valid Counts: 28
> 
> Initially I thought it was the throttling issue mentioned in the
> comment in test_leader_sampling, but there's another thread that says
> that it's fixed:
> https://lore.kernel.org/lkml/20250520181644.2673067-2-kan.liang@linux.intel.com/
> 
> 

After reading that patch, it seems like we should actually be removing 
the 80% tolerance from the leader sampling test. Both instances of the 
cycles count should be the same now.
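
A strict version would then be a sketch like the following, reusing the
pairing logic the test already has:

  # With the throttling fix in place, leader and slave periods should
  # match exactly on every sampled pair.
  if [ $(($index%2)) -ne 0 ] && [ "${cycles}" != "${prev_cycles}" ]
  then
    invalid_counts=$(($invalid_counts+1))
  fi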

(Excluding s390) I'm starting to think you were hitting this bug on an 
older kernel? Or something else is going wrong that we should get to 
the bottom of. The test could have found something, and we shouldn't 
ignore it yet.

> On Thu, Oct 9, 2025 at 2:43 PM Anubhav Shelat <ashelat@redhat.com> wrote:
>>
>> I tested on a new arm machine and I'm getting a similar issue as Thomas, but the test fails every 20 or so runs and I'm not getting the issue that I previously mentioned.
>>
>> Running test #15
>>   10bc60-10bcc4 g test_loop
>> perf does have symbol 'test_loop'
>>   10c354-10c418 l brstack
>> perf does have symbol 'brstack'
>> Basic leader sampling test
>> Basic leader sampling test [Success]
>> Invalid Counts: 1
>> Valid Counts: 27
>> Running test #16
>>   10bc60-10bcc4 g test_loop
>> perf does have symbol 'test_loop'
>>   10c354-10c418 l brstack
>> perf does have symbol 'brstack'
>> Basic leader sampling test
>> Basic leader sampling test [Success]
>> Invalid Counts: 1
>> Valid Counts: 27
>> Running test #17
>>   10bc60-10bcc4 g test_loop
>> perf does have symbol 'test_loop'
>>   10c354-10c418 l brstack
>> perf does have symbol 'brstack'
>> Basic leader sampling test
>> Leader sampling [Failed inconsistent cycles count]
>> Invalid Counts: 8
>> Valid Counts: 28
>>
>> Initially I thought it was the throttling issue mentioned in the comment in test_leadership_sampling, but there's another thread says that it's fixed:
>> https://lore.kernel.org/lkml/20250520181644.2673067-2-kan.liang@linux.intel.com/
>>
>>
>> On Wed, Oct 8, 2025 at 12:24 PM James Clark <james.clark@linaro.org> wrote:
>>>
>>>
>>>
>>> On 08/10/2025 11:48 am, Thomas Richter wrote:
>>>> On 10/7/25 14:34, James Clark wrote:
>>>>>
>>>>>
>>>>> On 07/10/2025 6:47 am, Thomas Richter wrote:
>>>>>> On 10/2/25 15:39, Anubhav Shelat wrote:
>>>>>>> On Oct 1, 2025 at 9:44 PM, Ian Rogers wrote:
>>>>>>>> If cycles is 0 then this will always pass, should this be checking a
>>>>>>> range?
>>>>>>>
>>>>>>> Yes you're right this will be better.
>>>>>>>
>>>>>>> On Oct 2, 2025 at 7:56 AM, Thomas Richter wrote:
>>>>>>>> Can we use a larger range to allow the test to pass?
>>>>>>>
>>>>>>> What range do you get on s390? When I do group measurements using "perf
>>>>>>> record -e "{cycles,cycles}:Su" perf test -w brstack" like in the test I
>>>>>>> always get somewhere between 20 and 50 cycles difference. I haven't tested
>>>>>>> on s390x, but I see no cycle count difference when testing the same command
>>>>>>> on x86. I have observed much larger, more varied differences when using
>>>>>>> software events.
>>>>>>>
>>>>>>> Anubhav
>>>>>>>
>>>>>>
>>>>>> Here is the output of the
>>>>>>
>>>>>>     # perf record  -e "{cycles,cycles}:Su" -- ./perf test -w brstack
>>>>>>     # perf script | grep brstack
>>>>>>
>>>>>> commands:
>>>>>>
>>>>>> perf 1110782 426394.696874:    6885000 cycles:           116fc9e brstack_bench+0xae (/r>
>>>>>> perf 1110782 426394.696875:    1377000 cycles:           116fb98 brstack_foo+0x0 (/root>
>>>>>> perf 1110782 426394.696877:    1377000 cycles:           116fb48 brstack_bar+0x0 (/root>
>>>>>> perf 1110782 426394.696878:    1377000 cycles:           116fc94 brstack_bench+0xa4 (/r>
>>>>>> perf 1110782 426394.696880:    1377000 cycles:           116fc84 brstack_bench+0x94 (/r>
>>>>>> perf 1110782 426394.696881:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>>>>> perf 1110782 426394.696883:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>>>>> perf 1110782 426394.696884:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>>>>> perf 1110782 426394.696885:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>>>>> perf 1110782 426394.696887:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>>>>> perf 1110782 426394.696888:    1377000 cycles:           116fc98 brstack_bench+0xa8 (/r>
>>>>>> perf 1110782 426394.696890:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>>>>> perf 1110782 426394.696891:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
>>>>>> perf 1110782 426394.703542:    1377000 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>>>>> perf 1110782 426394.703542:   30971975 cycles:           116fb7c brstack_bar+0x34 (/roo>
>>>>>> perf 1110782 426394.703543:    1377000 cycles:           116fc76 brstack_bench+0x86 (/r>
>>>>>> perf 1110782 426394.703545:    1377000 cycles:           116fc06 brstack_bench+0x16 (/r>
>>>>>> perf 1110782 426394.703546:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
>>>>>> perf 1110782 426394.703547:    1377000 cycles:           116fc20 brstack_bench+0x30 (/r>
>>>>>> perf 1110782 426394.703549:    1377000 cycles:           116fc9e brstack_bench+0xae (/r>
>>>>>> perf 1110782 426394.703550:    1377000 cycles:           116fcbc brstack_bench+0xcc
>>>>>>
>>>>>> They are usually identical values apart from one or two which are way off. Ignoring
>>>>>> those would be good.
>>>>>>
>>>>>
>>>>> FWIW I ran 100+ iterations my Arm Juno and N1SDP boards and the test passed every time.
>>>>>
>>>>> Are we sure there isn't some kind of race condition or bug that the test has found? Rather than a bug in the test?
>>>> There is always a possibility of a bug; that cannot be ruled out for certain.
>>>> However, as LPARs on s390 run on top of a hypervisor, there is a chance of the
>>>> linux guest being stopped while the hardware keeps running.
>>>>
>>>
>>> I have no idea what's going on or how that works, so maybe this question
>>> is useless, but doesn't that mean that guests can determine/guess the
>>> counter values from other guests? If the hardware keeps the counter
>>> running when the guest isn't, that sounds like something is leaking from
>>> one guest to another? Should the hypervisor not be saving and restoring
>>> context?
>>>
>>>> I see these outlier values time and again; roughly every second run fails with
>>>> one such outlier.
>>>>
>>>> Hope this helps
>>>>
>>>
>>> That may explain the issue for s390 then, but I'm assuming it doesn't
>>> explain the issues on Arm if the failures there aren't in a VM. But even
>>> if they were in a VM, the PMU is fully virtualised and the events would
>>> be stopped and resumed when the guest is switched out.
>>>
>>> James
>>>
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
       [not found]                   ` <CA+G8DhKQkTKoNer5GfZedPUj4xMizWVJUWFocP2eQ_cmPJtBOQ@mail.gmail.com>
@ 2025-10-09 14:59                     ` James Clark
  2025-10-09 15:22                       ` Anubhav Shelat
  2025-10-13 15:36                       ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 24+ messages in thread
From: James Clark @ 2025-10-09 14:59 UTC (permalink / raw)
  To: Anubhav Shelat
  Cc: Thomas Richter, Namhyung Kim, mpetlan, acme, irogers,
	linux-perf-users, peterz, mingo, mark.rutland, alexander.shishkin,
	jolsa, adrian.hunter, kan.liang, dapeng1.mi



On 09/10/2025 3:43 pm, Anubhav Shelat wrote:
> The first machine was running kernel 6.12.0-55.37.1.el10_0.aarch64 on a KVM
> virtual machine.
> The second machine was running kernel 6.12.0-119.el10.aarch64 also on a KVM.
> 

That's quite old. Make sure you test on the latest kernel before sending 
patches. The tests in mainline should be targeting the latest kernel, 
especially in this case because the throttling fix didn't have a fixes 
tag so won't be backported.

That change to fix throttling and group sampling is only from v6.16.

Also what hardware is the VM running on?

> On Thu, Oct 9, 2025 at 3:17 PM James Clark <james.clark@linaro.org> wrote:
>> After reading that patch it seems like we should actually be removing
>> the 80% tolerance from the leader sampling test. Both instances of the
>> cycles counts should be the same now.
> 
> If there's no tolerance then the leader sampling test would fail much more
> often. In most of the runs there's at least one case where the leader event
> has much fewer cycles.
> 

That's assuming that we've agreed that any difference in cycle counts is 
expected and valid. I don't agree that's the case yet and I think it's a 
bug. I only see identical counts, and the commit message in Kan's fix 
describes that the values should be the same for all architectures.
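
As a quick sanity check (a sketch only, assuming the two-event
"{cycles,cycles}:Su" group used throughout this thread, where samples
come out in leader/sibling pairs), the pairing can be verified straight
from the perf script output:

  # Pull out each sample's count, pair up consecutive samples, and
  # flag any pair that is not an exact match.
  perf script | grep -oE '[0-9]+ cycles:' | awk '{print $1}' \
    | paste - - | awk '$1 != $2 { print "mismatch:", $1, $2 }'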

>> (Excluding s390) I'm starting to think you were hitting this bug on an
>> older kernel? Or something else is going wrong that we should  get to
>> the bottom of. The test could have found something and we shouldn't
>> ignore it yet.
> 
> I agree that the first bug I mentioned might be from an older kernel, but
> there's still the case here where the cycle counts don't match. I'll keep
> looking into it.
> 
> Anubhav
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-09 14:59                     ` James Clark
@ 2025-10-09 15:22                       ` Anubhav Shelat
  2025-10-13 15:36                       ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 24+ messages in thread
From: Anubhav Shelat @ 2025-10-09 15:22 UTC (permalink / raw)
  To: James Clark
  Cc: Thomas Richter, Namhyung Kim, mpetlan, acme, irogers,
	linux-perf-users, peterz, mingo, mark.rutland, alexander.shishkin,
	jolsa, adrian.hunter, kan.liang, dapeng1.mi

On Thu, Oct 9, 2025 at 4:00 PM James Clark <james.clark@linaro.org> wrote:
> Also what hardware is the VM running on?
Both were running on an HPE Apollo CN99XX server.

> That's assuming that we've agreed that any difference in cycle counts is
> expected and valid. I don't agree that's the case yet and I think it's a
> bug. I only see identical counts, and the commit message in Kan's fix
> describes that the values should be the same for all architectures.
True


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-09 14:59                     ` James Clark
  2025-10-09 15:22                       ` Anubhav Shelat
@ 2025-10-13 15:36                       ` Arnaldo Carvalho de Melo
  2025-10-14  8:29                         ` James Clark
  1 sibling, 1 reply; 24+ messages in thread
From: Arnaldo Carvalho de Melo @ 2025-10-13 15:36 UTC (permalink / raw)
  To: James Clark
  Cc: Anubhav Shelat, Thomas Richter, Namhyung Kim, mpetlan, irogers,
	linux-perf-users, peterz, mingo, mark.rutland, alexander.shishkin,
	jolsa, adrian.hunter, kan.liang, dapeng1.mi

On Thu, Oct 09, 2025 at 03:59:54PM +0100, James Clark wrote:
> 
> 
> On 09/10/2025 3:43 pm, Anubhav Shelat wrote:
> > The first machine was running kernel 6.12.0-55.37.1.el10_0.aarch64 on a KVM
> > virtual machine.
> > The second machine was running kernel 6.12.0-119.el10.aarch64 also on a KVM.
 
> That's quite old. Make sure you test on the latest kernel before sending

While I agree with you, I think that 6.12 is not really that 6.12ish, as
it's a distro kernel, an enterprise one at that, so tons of backports.

Having said that, yeah, the right thing is to build the latest upstream
kernel and see if it works; if it works, try to identify backports, and
only when it's determined that it is something present upstream,
report it publicly.

> patches. The tests in mainline should be targeting the latest kernel,
> especially in this case because the throttling fix didn't have a fixes tag
> so won't be backported.

Right.
 
> That change to fix throttling and group sampling is only from v6.16.
 
> Also what hardware is the VM running on?

Anubhav, please provide this info,

Thanks,

- Arnaldo
 
> > On Thu, Oct 9, 2025 at 3:17 PM James Clark <james.clark@linaro.org> wrote:
> > > After reading that patch it seems like we should actually be removing
> > > the 80% tolerance from the leader sampling test. Both instances of the
> > > cycles counts should be the same now.

> > If there's no tolerance then the leader sampling test would fail much more
> > often. In most of the runs there's at least one case where the leader event
> > has much fewer cycles.
 
> That's assuming that we've agreed that any difference in cycle counts is
> expected and valid. I don't agree that's the case yet and I think it's a
> bug. I only see identical counts, and the commit message in Kan's fix
> describes that the values should be the same for all architectures.
 
> > > (Excluding s390) I'm starting to think you were hitting this bug on an
> > > older kernel? Or something else is going wrong that we should  get to
> > > the bottom of. The test could have found something and we shouldn't
> > > ignore it yet.

> > I agree that the first bug I mentioned might be from an older kernel, but
> > there's still the case here where the cycle counts don't match. I'll keep
> > looking into it.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-13 15:36                       ` Arnaldo Carvalho de Melo
@ 2025-10-14  8:29                         ` James Clark
  2025-10-16 14:42                           ` Anubhav Shelat
  0 siblings, 1 reply; 24+ messages in thread
From: James Clark @ 2025-10-14  8:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Anubhav Shelat, Thomas Richter, Namhyung Kim, mpetlan, irogers,
	linux-perf-users, peterz, mingo, mark.rutland, alexander.shishkin,
	jolsa, adrian.hunter, kan.liang, dapeng1.mi



On 13/10/2025 4:36 pm, Arnaldo Carvalho de Melo wrote:
> On Thu, Oct 09, 2025 at 03:59:54PM +0100, James Clark wrote:
>>
>>
>> On 09/10/2025 3:43 pm, Anubhav Shelat wrote:
>>> The first machine was running kernel 6.12.0-55.37.1.el10_0.aarch64 on a KVM
>>> virtual machine.
>>> The second machine was running kernel 6.12.0-119.el10.aarch64 also on a KVM.
>   
>> That's quite old. Make sure you test on the latest kernel before sending
> 
> While I agree with you, I think that 6.12 is not really that 6.12ish, as
> it's a distro kernel, an enterprise one at that, so tons of backports.
> 
> Having said that, yeah, the right thing is to build the latest upstream
> kernel and see if it works; if it works, try to identify backports, and
> only when it's determined that it is something present upstream,
> report it publicly.
> 
>> patches. The tests in mainline should be targeting the latest kernel,
>> especially in this case because the throttling fix didn't have a fixes tag
>> so won't be backported.
> 
> Right.
>   
>> That change to fix throttling and group sampling is only from v6.16.
>   
>> Also what hardware is the VM running on?
> 
> Anubhav, please provide this info,

On the adjacent thread:

 > Both were running on an HPE Apollo CN99XX server.

But now that I've seen the older kernel version, I think the issue is 
that the throttling fix isn't present and the specific hardware isn't 
important. If Anubhav confirms a newer kernel fixes it we should look 
into possibly removing the tolerance from the test and start looking for 
exact matches.
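
Something like the following in test_leader_sampling is what that would
look like (a minimal sketch, not a tested patch; index, cycles and
prev_cycles are the variables the test already tracks):

  # Hypothetical strict check: every sibling sample (odd index) must
  # match the preceding leader sample exactly, with no tolerance.
  if [ $((index % 2)) -ne 0 ] && [ "$cycles" != "$prev_cycles" ]
  then
    invalid_counts=$((invalid_counts + 1))
  else
    valid_counts=$((valid_counts + 1))
  fi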

James

> 
> Thanks,
> 
> - Arnaldo
>   
>>> On Thu, Oct 9, 2025 at 3:17 PM James Clark <james.clark@linaro.org> wrote:
>>>> After reading that patch it seems like we should actually be removing
>>>> the 80% tolerance from the leader sampling test. Both instances of the
>>>> cycles counts should be the same now.
> 
>>> If there's no tolerance then the leader sampling test would fail much more
>>> often. In most of the runs there's at least one case where the leader event
>>> has much fewer cycles.
>   
>> That's assuming that we've agreed that any difference in cycle counts is
>> expected and valid. I don't agree that's the case yet and I think it's a
>> bug. I only see identical counts, and the commit message in Kan's fix
>> describes that the values should be the same for all architectures.
>   
>>>> (Excluding s390) I'm starting to think you were hitting this bug on an
>>>> older kernel? Or something else is going wrong that we should  get to
>>>> the bottom of. The test could have found something and we shouldn't
>>>> ignore it yet.
> 
>>> I agree that the first bug I mentioned might be from an older kernel, but
>>> there's still the case here where the cycle counts don't match. I'll keep
>>> looking into it.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-14  8:29                         ` James Clark
@ 2025-10-16 14:42                           ` Anubhav Shelat
  2025-10-16 14:46                             ` James Clark
  0 siblings, 1 reply; 24+ messages in thread
From: Anubhav Shelat @ 2025-10-16 14:42 UTC (permalink / raw)
  To: James Clark
  Cc: Arnaldo Carvalho de Melo, Thomas Richter, Namhyung Kim, mpetlan,
	irogers, linux-perf-users, peterz, mingo, mark.rutland,
	alexander.shishkin, jolsa, adrian.hunter, kan.liang, dapeng1.mi

Sorry for the late reply.

On Tue, Oct 14, 2025 at 9:29 AM James Clark <james.clark@linaro.org> wrote:
>
> But now that I've seen the older kernel version, I think the issue is
> that the throttling fix isn't present and the specific hardware isn't
> important. If Anubhav confirms a newer kernel fixes it we should look
> into possibly removing the tolerance from the test and start looking for
> exact matches.

I tested again on 6.12.0-124.7.1.el10_1.aarch64 KVM with hardware
Ampere Mt Snow Altramax. I'm getting the same error that I was
initially getting (previously a 28 cycle difference between leader and
slave event), although now the difference is 18 cycles. So I don't
think this is just an issue with older kernels:
[root@ampere-mtsnow-altramax-02-vm-10 perf]# ./perf record -e
"{cycles,cycles}:Su" ./perf test -w brstack
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.021 MB perf.data (266 samples) ]
[root@ampere-mtsnow-altramax-02-vm-10 perf]# ./perf script -i
perf.data | grep brstack
            perf   96507 163361.511570:     434471 cycles:  53d4e8 brstack_bench+0x44 (/root/linux/tools/perf/perf)
            perf   96507 163361.511570:     434489 cycles:  53d4e8 brstack_bench+0x44 (/root/linux/tools/perf/perf)
            perf   96507 163361.511733:     422961 cycles:  53d424 brstack_bar+0x24 (/root/linux/tools/perf/perf)
            perf   96507 163361.511733:     422979 cycles:  53d424 brstack_bar+0x24 (/root/linux/tools/perf/perf)
            perf   96507 163361.511887:     402299 cycles:  53d4fc brstack_bench+0x58 (/root/linux/tools/perf/perf)
            perf   96507 163361.511887:     402317 cycles:  53d4fc brstack_bench+0x58 (/root/linux/tools/perf/perf)
            perf   96507 163361.512048:     429218 cycles:  53d4b0 brstack_bench+0xc (/root/linux/tools/perf/perf)
            perf   96507 163361.512048:     429236 cycles:  53d4b0 brstack_bench+0xc (/root/linux/tools/perf/perf)
            perf   96507 163361.512221:     462615 cycles:  53d430 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf   96507 163361.512221:     462633 cycles:  53d430 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf   96507 163361.512448:     494407 cycles:  53d520 brstack_bench+0x7c (/root/linux/tools/perf/perf)
            perf   96507 163361.512448:     494425 cycles:  53d520 brstack_bench+0x7c (/root/linux/tools/perf/perf)
            perf   96507 163361.512638:     512031 cycles:  53d540 brstack_bench+0x9c (/root/linux/tools/perf/perf)
            perf   96507 163361.512638:     512049 cycles:  53d540 brstack_bench+0x9c (/root/linux/tools/perf/perf)
            perf   96507 163361.512830:     518522 cycles:  53d4fc brstack_bench+0x58 (/root/linux/tools/perf/perf)
            perf   96507 163361.512830:     518540 cycles:  53d4fc brstack_bench+0x58 (/root/linux/tools/perf/perf)
            perf   96507 163361.513028:     538824 cycles:  53d51c brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf   96507 163361.513028:     538842 cycles:  53d51c brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf   96507 163361.513232:     559322 cycles:  53d430 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf   96507 163361.513232:     559340 cycles:  53d430 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf   96507 163361.513456:     577880 cycles:  53d4a8 brstack_bench+0x4 (/root/linux/tools/perf/perf)
            perf   96507 163361.513456:     577898 cycles:  53d4a8 brstack_bench+0x4 (/root/linux/tools/perf/perf)
            perf   96507 163361.513674:     593892 cycles:  53d51c brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf   96507 163361.513674:     593910 cycles:  53d51c brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf   96507 163361.513894:     602458 cycles:  53d498 brstack_foo+0x48 (/root/linux/tools/perf/perf)
            perf   96507 163361.513894:     602476 cycles:  53d498 brstack_foo+0x48 (/root/linux/tools/perf/perf)
            perf   96507 163361.514116:     613613 cycles:  53d444 brstack_bar+0x44 (/root/linux/tools/perf/perf)
            perf   96507 163361.514116:     613631 cycles:  53d444 brstack_bar+0x44 (/root/linux/tools/perf/perf)
            perf   96507 163361.514345:     624328 cycles:  53d44c brstack_bar+0x4c (/root/linux/tools/perf/perf)
            perf   96507 163361.514345:     624346 cycles:  53d44c brstack_bar+0x4c (/root/linux/tools/perf/perf)
            perf   96507 163361.514576:     633823 cycles:  53d51c brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf   96507 163361.514576:     633841 cycles:  53d51c brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf   96507 163361.514810:     641207 cycles:  53d430 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf   96507 163361.514810:     641225 cycles:  53d430 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf   96507 163361.515046:     647897 cycles:  53d45c brstack_foo+0xc (/root/linux/tools/perf/perf)
            perf   96507 163361.515046:     647915 cycles:  53d45c brstack_foo+0xc (/root/linux/tools/perf/perf)
            perf   96507 163361.515282:     653546 cycles:  53d438 brstack_bar+0x38 (/root/linux/tools/perf/perf)
            perf   96507 163361.515282:     653564 cycles:  53d438 brstack_bar+0x38 (/root/linux/tools/perf/perf)
            perf   96507 163361.515525:     658367 cycles:  53d434 brstack_bar+0x34 (/root/linux/tools/perf/perf)
            perf   96507 163361.515525:     658385 cycles:  53d434 brstack_bar+0x34 (/root/linux/tools/perf/perf)
            perf   96507 163361.515767:     663233 cycles:  53d46c brstack_foo+0x1c (/root/linux/tools/perf/perf)
            perf   96507 163361.515767:     663251 cycles:  53d46c brstack_foo+0x1c (/root/linux/tools/perf/perf)
            perf   96507 163361.516009:     665425 cycles:  53d444 brstack_bar+0x44 (/root/linux/tools/perf/perf)
            perf   96507 163361.516009:     665443 cycles:  53d444 brstack_bar+0x44 (/root/linux/tools/perf/perf)
            perf   96507 163361.516251:     668471 cycles:  53d410 brstack_bar+0x10 (/root/linux/tools/perf/perf)
            perf   96507 163361.516251:     668489 cycles:  53d410 brstack_bar+0x10 (/root/linux/tools/perf/perf)
            perf   96507 163361.516493:     671104 cycles:  53d404 brstack_bar+0x4 (/root/linux/tools/perf/perf)
            perf   96507 163361.516493:     671122 cycles:  53d404 brstack_bar+0x4 (/root/linux/tools/perf/perf)
            perf   96507 163361.516736:     673910 cycles:  53d4dc brstack_bench+0x38 (/root/linux/tools/perf/perf)
            perf   96507 163361.516736:     673928 cycles:  53d4dc brstack_bench+0x38 (/root/linux/tools/perf/perf)
            perf   96507 163361.516980:     676712 cycles:  53d540 brstack_bench+0x9c (/root/linux/tools/perf/perf)
            perf   96507 163361.516980:     676730 cycles:  53d540 brstack_bench+0x9c (/root/linux/tools/perf/perf)
            perf   96507 163361.517225:     678976 cycles:  53d51c brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf   96507 163361.517225:     678994 cycles:  53d51c brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf   96507 163361.517472:     681095 cycles:  53d51c brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf   96507 163361.517472:     681113 cycles:  53d51c brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf   96507 163361.517719:     682992 cycles:  53d520 brstack_bench+0x7c (/root/linux/tools/perf/perf)
            perf   96507 163361.517719:     683010 cycles:  53d520 brstack_bench+0x7c (/root/linux/tools/perf/perf)
            perf   96507 163361.517966:     683967 cycles:  53d5bc brstack+0x60 (/root/linux/tools/perf/perf)
            perf   96507 163361.517966:     683985 cycles:  53d5bc brstack+0x60 (/root/linux/tools/perf/perf)
            perf   96507 163361.518213:     685021 cycles:  53d410 brstack_bar+0x10 (/root/linux/tools/perf/perf)
            perf   96507 163361.518213:     685039 cycles:  53d410 brstack_bar+0x10 (/root/linux/tools/perf/perf)
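
For what it's worth, the skew above can be summarized mechanically (a
rough sketch, again assuming the leader/sibling pairing of the
two-event group):

  # Print the number of pairs, the average sibling-minus-leader delta,
  # and the largest delta seen.
  perf script | grep -oE '[0-9]+ cycles:' | awk '{print $1}' \
    | paste - - \
    | awk '{ d = $2 - $1; sum += d; if (d > max) max = d; n++ }
           END { printf "pairs=%d avg_delta=%.1f max_delta=%d\n", n, sum/n, max }'

On the run above, every pair differs by exactly 18 cycles.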

If another sibling event is added, then the two sibling events have the
same number of cycles as each other, while the leader still differs.
[root@ampere-mtsnow-altramax-02-vm-10 perf]# ./perf record -e
"{cycles,cycles,cycles}:Su" ./perf test -w brstack
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.024 MB perf.data (390 samples) ]
[root@ampere-mtsnow-altramax-02-vm-10 perf]# ./perf script | grep brstack
            perf  104580 166706.554289:     448152 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.554289:     448170 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.554289:     448170 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.554461:     435718 cycles:  537f3c brstack_foo+0xc (/root/linux/tools/perf/perf)
            perf  104580 166706.554461:     435736 cycles:  537f3c brstack_foo+0xc (/root/linux/tools/perf/perf)
            perf  104580 166706.554461:     435736 cycles:  537f3c brstack_foo+0xc (/root/linux/tools/perf/perf)
            perf  104580 166706.554626:     427053 cycles:  537f24 brstack_bar+0x44 (/root/linux/tools/perf/perf)
            perf  104580 166706.554626:     427071 cycles:  537f24 brstack_bar+0x44 (/root/linux/tools/perf/perf)
            perf  104580 166706.554626:     427071 cycles:  537f24 brstack_bar+0x44 (/root/linux/tools/perf/perf)
            perf  104580 166706.554798:     451335 cycles:  537ef8 brstack_bar+0x18 (/root/linux/tools/perf/perf)
            perf  104580 166706.554798:     451353 cycles:  537ef8 brstack_bar+0x18 (/root/linux/tools/perf/perf)
            perf  104580 166706.554798:     451353 cycles:  537ef8 brstack_bar+0x18 (/root/linux/tools/perf/perf)
            perf  104580 166706.554980:     480005 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.554980:     480023 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.554980:     480023 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.555171:     507300 cycles:  537ef8 brstack_bar+0x18 (/root/linux/tools/perf/perf)
            perf  104580 166706.555171:     507318 cycles:  537ef8 brstack_bar+0x18 (/root/linux/tools/perf/perf)
            perf  104580 166706.555171:     507318 cycles:  537ef8 brstack_bar+0x18 (/root/linux/tools/perf/perf)
            perf  104580 166706.555371:     531265 cycles:  537ffc brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf  104580 166706.555371:     531283 cycles:  537ffc brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf  104580 166706.555371:     531283 cycles:  537ffc brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf  104580 166706.555577:     551570 cycles:  537fc8 brstack_bench+0x44 (/root/linux/tools/perf/perf)
            perf  104580 166706.555577:     551588 cycles:  537fc8 brstack_bench+0x44 (/root/linux/tools/perf/perf)
            perf  104580 166706.555577:     551588 cycles:  537fc8 brstack_bench+0x44 (/root/linux/tools/perf/perf)
            perf  104580 166706.555788:     568657 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.555788:     568675 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.555788:     568675 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.556004:     584038 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.556004:     584056 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.556004:     584056 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.556223:     597582 cycles:  537f24 brstack_bar+0x44 (/root/linux/tools/perf/perf)
            perf  104580 166706.556223:     597600 cycles:  537f24 brstack_bar+0x44 (/root/linux/tools/perf/perf)
            perf  104580 166706.556223:     597600 cycles:  537f24 brstack_bar+0x44 (/root/linux/tools/perf/perf)
            perf  104580 166706.556451:     609457 cycles:  538000 brstack_bench+0x7c (/root/linux/tools/perf/perf)
            perf  104580 166706.556451:     609475 cycles:  538000 brstack_bench+0x7c (/root/linux/tools/perf/perf)
            perf  104580 166706.556451:     609475 cycles:  538000 brstack_bench+0x7c (/root/linux/tools/perf/perf)
            perf  104580 166706.556679:     619984 cycles:  537f04 brstack_bar+0x24 (/root/linux/tools/perf/perf)
            perf  104580 166706.556679:     620002 cycles:  537f04 brstack_bar+0x24 (/root/linux/tools/perf/perf)
            perf  104580 166706.556679:     620002 cycles:  537f04 brstack_bar+0x24 (/root/linux/tools/perf/perf)
            perf  104580 166706.556910:     627492 cycles:  537f3c brstack_foo+0xc (/root/linux/tools/perf/perf)
            perf  104580 166706.556910:     627510 cycles:  537f3c brstack_foo+0xc (/root/linux/tools/perf/perf)
            perf  104580 166706.556910:     627510 cycles:  537f3c brstack_foo+0xc (/root/linux/tools/perf/perf)
            perf  104580 166706.557143:     634910 cycles:  537f24 brstack_bar+0x44 (/root/linux/tools/perf/perf)
            perf  104580 166706.557143:     634928 cycles:  537f24 brstack_bar+0x44 (/root/linux/tools/perf/perf)
            perf  104580 166706.557143:     634928 cycles:  537f24 brstack_bar+0x44 (/root/linux/tools/perf/perf)
            perf  104580 166706.557378:     641501 cycles:  538000 brstack_bench+0x7c (/root/linux/tools/perf/perf)
            perf  104580 166706.557378:     641519 cycles:  538000 brstack_bench+0x7c (/root/linux/tools/perf/perf)
            perf  104580 166706.557378:     641519 cycles:  538000 brstack_bench+0x7c (/root/linux/tools/perf/perf)
            perf  104580 166706.557620:     647392 cycles:  537ef8 brstack_bar+0x18 (/root/linux/tools/perf/perf)
            perf  104580 166706.557620:     647410 cycles:  537ef8 brstack_bar+0x18 (/root/linux/tools/perf/perf)
            perf  104580 166706.557620:     647410 cycles:  537ef8 brstack_bar+0x18 (/root/linux/tools/perf/perf)
            perf  104580 166706.557863:     652652 cycles:  5380a4 brstack+0x68 (/root/linux/tools/perf/perf)
            perf  104580 166706.557863:     652670 cycles:  5380a4 brstack+0x68 (/root/linux/tools/perf/perf)
            perf  104580 166706.557863:     652670 cycles:  5380a4 brstack+0x68 (/root/linux/tools/perf/perf)
            perf  104580 166706.558103:     655312 cycles:  537f14 brstack_bar+0x34 (/root/linux/tools/perf/perf)
            perf  104580 166706.558103:     655330 cycles:  537f14 brstack_bar+0x34 (/root/linux/tools/perf/perf)
            perf  104580 166706.558103:     655330 cycles:  537f14 brstack_bar+0x34 (/root/linux/tools/perf/perf)
            perf  104580 166706.558344:     657722 cycles:  537ffc brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf  104580 166706.558344:     657740 cycles:  537ffc brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf  104580 166706.558344:     657740 cycles:  537ffc brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf  104580 166706.558585:     660962 cycles:  537ffc brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf  104580 166706.558585:     660980 cycles:  537ffc brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf  104580 166706.558585:     660980 cycles:  537ffc brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf  104580 166706.558831:     664313 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.558831:     664331 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.558831:     664331 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.559075:     667265 cycles:  537f54 brstack_foo+0x24 (/root/linux/tools/perf/perf)
            perf  104580 166706.559075:     667283 cycles:  537f54 brstack_foo+0x24 (/root/linux/tools/perf/perf)
            perf  104580 166706.559075:     667283 cycles:  537f54 brstack_foo+0x24 (/root/linux/tools/perf/perf)
            perf  104580 166706.559321:     668769 cycles:  537f60 brstack_foo+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.559321:     668787 cycles:  537f60 brstack_foo+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.559321:     668787 cycles:  537f60 brstack_foo+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.559566:     670697 cycles:  537ffc brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf  104580 166706.559566:     670715 cycles:  537ffc brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf  104580 166706.559566:     670715 cycles:  537ffc brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf  104580 166706.559811:     672257 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.559811:     672275 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.559811:     672275 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.560059:     673958 cycles:  537f24 brstack_bar+0x44 (/root/linux/tools/perf/perf)
            perf  104580 166706.560059:     673976 cycles:  537f24 brstack_bar+0x44 (/root/linux/tools/perf/perf)
            perf  104580 166706.560059:     673976 cycles:  537f24 brstack_bar+0x44 (/root/linux/tools/perf/perf)
            perf  104580 166706.560306:     675608 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.560306:     675626 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.560306:     675626 cycles:  537f10 brstack_bar+0x30 (/root/linux/tools/perf/perf)
            perf  104580 166706.560554:     676111 cycles:  537ffc brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf  104580 166706.560554:     676129 cycles:  537ffc brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf  104580 166706.560554:     676129 cycles:  537ffc brstack_bench+0x78 (/root/linux/tools/perf/perf)
            perf  104580 166706.560812:     677092 cycles:  537f9c brstack_bench+0x18 (/root/linux/tools/perf/perf)
            perf  104580 166706.560812:     677110 cycles:  537f9c brstack_bench+0x18 (/root/linux/tools/perf/perf)
            perf  104580 166706.560812:     677110 cycles:  537f9c brstack_bench+0x18 (/root/linux/tools/perf/perf)
            perf  104580 166706.561108:     677768 cycles:  537f68 brstack_foo+0x38 (/root/linux/tools/perf/perf)
            perf  104580 166706.561108:     677786 cycles:  537f68 brstack_foo+0x38 (/root/linux/tools/perf/perf)
            perf  104580 166706.561108:     677786 cycles:  537f68 brstack_foo+0x38 (/root/linux/tools/perf/perf)

I'm thinking the problem might be in drivers/perf/arm_pmuv3.c with how
the PMU counters are enabled. x86 has a function that enables all of
its counters at once (arch/x86/events/core.c: x86_pmu_enable_all), so
maybe arm needs something similar.

Anubhav


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-16 14:42                           ` Anubhav Shelat
@ 2025-10-16 14:46                             ` James Clark
  2025-10-16 14:59                               ` Anubhav Shelat
  0 siblings, 1 reply; 24+ messages in thread
From: James Clark @ 2025-10-16 14:46 UTC (permalink / raw)
  To: Anubhav Shelat
  Cc: Arnaldo Carvalho de Melo, Thomas Richter, Namhyung Kim, mpetlan,
	irogers, linux-perf-users, peterz, mingo, mark.rutland,
	alexander.shishkin, jolsa, adrian.hunter, kan.liang, dapeng1.mi



On 16/10/2025 3:42 pm, Anubhav Shelat wrote:
> Sorry for the late reply.
> 
> On Tue, Oct 14, 2025 at 9:29 AM James Clark <james.clark@linaro.org> wrote:
>>
>> But now that I've seen the older kernel version, I think the issue is
>> that the throttling fix isn't present and the specific hardware isn't
>> important. If Anubhav confirms a newer kernel fixes it we should look
>> into possibly removing the tolerance from the test and start looking for
>> exact matches.
> 
> I tested again on 6.12.0-124.7.1.el10_1.aarch64 KVM with hardware
> Ampere Mt Snow Altramax. I'm getting the same error that I was
> initially getting (28 cycle difference between leader and slave

But the throttling fix was in 6.16, so I wouldn't expect it to work in 6.12.

> event), although now the difference is 18 cycles. So I don't think
> this is just an issue with older kernels:
> [root@ampere-mtsnow-altramax-02-vm-10 perf]# ./perf record -e
> "{cycles,cycles}:Su" ./perf test -w brstack
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.021 MB perf.data (266 samples) ]
> [root@ampere-mtsnow-altramax-02-vm-10 perf]# ./perf script -i
> perf.data | grep brstack
> [... quoted perf script output snipped; see the full listing in the
> previous message ...]
> 
> If another sibling event is added, then the two sibling events have the
> same number of cycles as each other, while the leader still differs.
> [root@ampere-mtsnow-altramax-02-vm-10 perf]# ./perf record -e
> "{cycles,cycles,cycles}:Su" ./perf test -w brstack
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.024 MB perf.data (390 samples) ]
> [root@ampere-mtsnow-altramax-02-vm-10 perf]# ./perf script | grep brstack
> [... quoted perf script output snipped; see the full listing in the
> previous message ...]
> 
> I'm thinking the problem might be in drivers/perf/arm_pmuv3.c with how
> the PMU counters are enabled. x86 has a function that enables all of
> its counters at once (arch/x86/events/core.c: x86_pmu_enable_all), so
> maybe arm needs something similar.

But I don't see the issue on the latest kernel, so I'm not sure a change 
needs to be made.

> 
> Anubhav
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-16 14:46                             ` James Clark
@ 2025-10-16 14:59                               ` Anubhav Shelat
  2025-10-16 15:02                                 ` James Clark
  0 siblings, 1 reply; 24+ messages in thread
From: Anubhav Shelat @ 2025-10-16 14:59 UTC (permalink / raw)
  To: James Clark
  Cc: Arnaldo Carvalho de Melo, Thomas Richter, Namhyung Kim, mpetlan,
	irogers, linux-perf-users, peterz, mingo, mark.rutland,
	alexander.shishkin, jolsa, adrian.hunter, kan.liang, dapeng1.mi

On Thu, Oct 16, 2025 at 3:46 PM James Clark <james.clark@linaro.org> wrote:
>
> But the throttling fix was in 6.16, so I wouldn't expect it to work in 6.12.

This is using perf built from the upstream source so the patch is included.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-16 14:59                               ` Anubhav Shelat
@ 2025-10-16 15:02                                 ` James Clark
  2025-10-16 15:50                                   ` Anubhav Shelat
  0 siblings, 1 reply; 24+ messages in thread
From: James Clark @ 2025-10-16 15:02 UTC (permalink / raw)
  To: Anubhav Shelat
  Cc: Arnaldo Carvalho de Melo, Thomas Richter, Namhyung Kim, mpetlan,
	irogers, linux-perf-users, peterz, mingo, mark.rutland,
	alexander.shishkin, jolsa, adrian.hunter, kan.liang, dapeng1.mi



On 16/10/2025 3:59 pm, Anubhav Shelat wrote:
> On Thu, Oct 16, 2025 at 3:46 PM James Clark <james.clark@linaro.org> wrote:
>>
>> But the throttling fix was in 6.16, so I wouldn't expect it to work in 6.12.
> 
> This is using perf built from the upstream source so the patch is included.
> 

The throttling fix that was linked previously in the thread [1] was in 
the kernel though. Unless you are talking about a different patch than 
that one? And that patch is in Perf?

[1]: 
https://lore.kernel.org/lkml/20250520181644.2673067-2-kan.liang@linux.intel.com/


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-16 15:02                                 ` James Clark
@ 2025-10-16 15:50                                   ` Anubhav Shelat
  2025-10-16 17:50                                     ` James Clark
  0 siblings, 1 reply; 24+ messages in thread
From: Anubhav Shelat @ 2025-10-16 15:50 UTC (permalink / raw)
  To: James Clark
  Cc: Arnaldo Carvalho de Melo, Thomas Richter, Namhyung Kim, mpetlan,
	irogers, linux-perf-users, peterz, mingo, mark.rutland,
	alexander.shishkin, jolsa, adrian.hunter, kan.liang, dapeng1.mi

Yes. I had built perf manually from linux/tools/perf v6.17 so the
patch was definitely included.

On Thu, Oct 16, 2025 at 4:03 PM James Clark <james.clark@linaro.org> wrote:
>
>
>
> On 16/10/2025 3:59 pm, Anubhav Shelat wrote:
> > On Thu, Oct 16, 2025 at 3:46 PM James Clark <james.clark@linaro.org> wrote:
> >>
> >> But the throttling fix was in 6.16, so I wouldn't expect it to work in 6.12.
> >
> > This is using perf built from the upstream source so the patch is included.
> >
>
> The throttling fix that was linked previously in the thread [1] was in
> the kernel though. Unless you are talking about a different patch than
> that one? And that patch is in Perf?
>
> [1]:
> https://lore.kernel.org/lkml/20250520181644.2673067-2-kan.liang@linux.intel.com/
>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-16 15:50                                   ` Anubhav Shelat
@ 2025-10-16 17:50                                     ` James Clark
  2025-10-23 13:00                                       ` Anubhav Shelat
  0 siblings, 1 reply; 24+ messages in thread
From: James Clark @ 2025-10-16 17:50 UTC (permalink / raw)
  To: Anubhav Shelat
  Cc: Arnaldo Carvalho de Melo, Thomas Richter, Namhyung Kim, mpetlan,
	irogers, linux-perf-users, peterz, mingo, mark.rutland,
	alexander.shishkin, jolsa, adrian.hunter, kan.liang, dapeng1.mi



On 16/10/2025 4:50 pm, Anubhav Shelat wrote:
> Yes. I had built perf manually from linux/tools/perf v6.17, so the
> patch was definitely included.

But the patch is for the kernel, not for Perf. So building Perf isn't
enough; you need to run the test on a 6.16+ kernel too.
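
The quick way to see the distinction is that the two are versioned
independently (a sketch; exact output varies by distro):

  uname -r        # the running kernel, which is where the throttling fix lives
  perf version    # the perf tool binary, built and versioned separately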

> 
> On Thu, Oct 16, 2025 at 4:03 PM James Clark <james.clark@linaro.org> wrote:
>>
>>
>>
>> On 16/10/2025 3:59 pm, Anubhav Shelat wrote:
>>> On Thu, Oct 16, 2025 at 3:46 PM James Clark <james.clark@linaro.org> wrote:
>>>>
>>>> But the throttling fix was in 6.16, so I wouldn't expect it to work in 6.12.
>>>
> >>> This is using perf built from the upstream source, so the patch is included.
>>>
>>
>> The throttling fix that was linked previously in the thread [1] was in
>> the kernel though. Unless you are talking about a different patch than
>> that one? And that patch is in Perf?
>>
>> [1]:
>> https://lore.kernel.org/lkml/20250520181644.2673067-2-kan.liang@linux.intel.com/
>>
> 


* Re: [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64
  2025-10-16 17:50                                     ` James Clark
@ 2025-10-23 13:00                                       ` Anubhav Shelat
  0 siblings, 0 replies; 24+ messages in thread
From: Anubhav Shelat @ 2025-10-23 13:00 UTC (permalink / raw)
  To: James Clark
  Cc: Arnaldo Carvalho de Melo, Thomas Richter, Namhyung Kim, mpetlan,
	irogers, linux-perf-users, peterz, mingo, mark.rutland,
	alexander.shishkin, jolsa, adrian.hunter, kan.liang, dapeng1.mi

Yeah, you're right. I'll send a patch to remove the tolerance setting,
then. It still fails intermittently, but that's a different issue.
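
One simple way to reproduce the intermittent failure is to loop the
shell test; a sketch, assuming it runs from the tools/perf build
directory ("record" is a substring match on the test descriptions):

  for i in $(seq 1 50); do
    ./perf test record >/dev/null 2>&1 || echo "failed on run $i"
  done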

On Thu, Oct 16, 2025 at 6:50 PM James Clark <james.clark@linaro.org> wrote:
>
>
>
> On 16/10/2025 4:50 pm, Anubhav Shelat wrote:
> > Yes. I had built perf manually from linux/tools/perf v6.17, so the
> > patch was definitely included.
>
> But the patch is for the kernel, not for Perf. So building Perf isn't
> enough; you need to run the test on a 6.16+ kernel too.
>
> >
> > On Thu, Oct 16, 2025 at 4:03 PM James Clark <james.clark@linaro.org> wrote:
> >>
> >>
> >>
> >> On 16/10/2025 3:59 pm, Anubhav Shelat wrote:
> >>> On Thu, Oct 16, 2025 at 3:46 PM James Clark <james.clark@linaro.org> wrote:
> >>>>
> >>>> But the throttling fix was in 6.16, so I wouldn't expect it to work in 6.12.
> >>>
> > >>> This is using perf built from the upstream source, so the patch is included.
> >>>
> >>
> >> The throttling fix that was linked previously in the thread [1] was in
> >> the kernel though. Unless you are talking about a different patch than
> >> that one? And that patch is in Perf?
> >>
> >> [1]:
> >> https://lore.kernel.org/lkml/20250520181644.2673067-2-kan.liang@linux.intel.com/
> >>
> >
>


end of thread (newest message: 2025-10-23 13:00 UTC)

Thread overview: 24+ messages
2025-10-01 19:50 [PATCH] perf tests record: allow for some difference in cycle count in leader sampling test on aarch64 Anubhav Shelat
2025-10-01 20:43 ` Ian Rogers
2025-10-02  6:55 ` Thomas Richter
     [not found]   ` <CA+G8DhL49FWD47bkbcXYeb9T=AbxNhC-ypqjkNxRnW0JqmYnPw@mail.gmail.com>
2025-10-02 17:44     ` Anubhav Shelat
2025-10-07  5:47     ` Thomas Richter
2025-10-07 12:34       ` James Clark
2025-10-08  7:52         ` Namhyung Kim
2025-10-08 10:48         ` Thomas Richter
2025-10-08 11:24           ` James Clark
2025-10-09 12:14             ` Thomas Richter
     [not found]             ` <CA+G8Dh+Odf40jdY4h1knjU+3sSjZokMx6OdzRT3o9v1=ndKORQ@mail.gmail.com>
2025-10-09 13:55               ` Anubhav Shelat
2025-10-09 14:17                 ` James Clark
     [not found]                   ` <CA+G8DhKQkTKoNer5GfZedPUj4xMizWVJUWFocP2eQ_cmPJtBOQ@mail.gmail.com>
2025-10-09 14:59                     ` James Clark
2025-10-09 15:22                       ` Anubhav Shelat
2025-10-13 15:36                       ` Arnaldo Carvalho de Melo
2025-10-14  8:29                         ` James Clark
2025-10-16 14:42                           ` Anubhav Shelat
2025-10-16 14:46                             ` James Clark
2025-10-16 14:59                               ` Anubhav Shelat
2025-10-16 15:02                                 ` James Clark
2025-10-16 15:50                                   ` Anubhav Shelat
2025-10-16 17:50                                     ` James Clark
2025-10-23 13:00                                       ` Anubhav Shelat
2025-10-09 14:08               ` James Clark
