From: Thomas Richter <tmricht@linux.ibm.com>
To: James Clark <james.clark@linaro.org>,
Anubhav Shelat <ashelat@redhat.com>
Cc: mpetlan@redhat.com, acme@kernel.org, namhyung@kernel.org,
irogers@google.com, linux-perf-users@vger.kernel.org,
peterz@infradead.org, mingo@redhat.com, mark.rutland@arm.com,
alexander.shishkin@linux.intel.com, jolsa@kernel.org,
adrian.hunter@intel.com, kan.liang@linux.intel.com,
dapeng1.mi@linux.intel.com
Subject: Re: [PATCH] Revert "perf test: Allow tolerance for leader sampling test"
Date: Wed, 29 Oct 2025 08:37:07 +0100
Message-ID: <a6218625-a08e-4466-837d-7bdd0c8631a5@linux.ibm.com>
In-Reply-To: <be8f4890-1c45-4116-9c9f-1e40e908747b@linaro.org>
On 10/28/25 16:23, James Clark wrote:
>
>
> On 28/10/2025 12:55 pm, Thomas Richter wrote:
.....
>> When I skip the grep it actually gets worse, there are more runaway values:
>> # perf record -e "{cycles,cycles}:Su" -- perf test -w brstack
>> [ perf record: Woken up 2 times to write data ]
>> [ perf record: Captured and wrote 0.012 MB perf.data (50 samples) ]
>> # perf script | head -20
>> perf 919810 6726.456179: 2754000 cycles: 3ff95608ec8 _dl_map_object_from_fd+0xb18 (/usr/lib/ld64.so.1)
>> perf 919810 6726.456179: 58638457 cycles: 3ff95608ec8 _dl_map_object_from_fd+0xb18 (/usr/lib/ld64.so.1)
>> perf 919810 6726.456182: 1377000 cycles: 3ff9560a696 check_match+0x76 (/usr/lib/ld64.so.1)
>> perf 919810 6726.456182: 1377000 cycles: 3ff9560fa6a _dl_relocate_object_no_relro+0x5fa (/usr/lib/ld64.so.1)
>
> Can you share the raw output for the second sample as well? Or even the whole file would be better.
OK, I will attach a perf.data file from today; hopefully it gets delivered to you.
See attachment perf.data.tmrs390 (binary file, big-endian, from s390).
>
> It's the addresses from this sample that are confusing. 0x3ff95608ec8 is the same for both counters on the first sample (correctly), but the second sample has 0x3ff9560a696 and 0x3ff9560fa6a even though the cycles counts are the same.
>
The command
./perf record -r 99 -e "{cycles,cycles}:Su" -- ./perf test -w brstack
tests group leader sampling in tests/shell/record.sh and
fails most of the time on s390.
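To make the failure concrete, here is a rough sketch of the kind of check involved: consecutive perf script lines are treated as (leader, sibling) pairs and their cycles values are compared. This is not the code from tests/shell/record.sh; the function name, the regular expression and the tolerance parameter are assumptions made for illustration. The input lines are the four perf script lines quoted above.

```python
import re

# Matches the "<value> cycles:" field of one `perf script` output line.
PERIOD_RE = re.compile(r":\s+(\d+)\s+cycles:")

def mismatched_pairs(lines, tolerance=0.0):
    """Pair consecutive samples as (leader, sibling) and return the pairs
    whose values differ by more than `tolerance` (a fraction of the larger
    value). Hypothetical helper, not the logic used by the perf test."""
    values = [int(m.group(1)) for m in map(PERIOD_RE.search, lines) if m]
    pairs = zip(values[0::2], values[1::2])
    return [(a, b) for a, b in pairs if abs(a - b) > tolerance * max(a, b)]

sample = [
    "perf 919810 6726.456179: 2754000 cycles: 3ff95608ec8 _dl_map_object_from_fd+0xb18 (/usr/lib/ld64.so.1)",
    "perf 919810 6726.456179: 58638457 cycles: 3ff95608ec8 _dl_map_object_from_fd+0xb18 (/usr/lib/ld64.so.1)",
    "perf 919810 6726.456182: 1377000 cycles: 3ff9560a696 check_match+0x76 (/usr/lib/ld64.so.1)",
    "perf 919810 6726.456182: 1377000 cycles: 3ff9560fa6a _dl_relocate_object_no_relro+0x5fa (/usr/lib/ld64.so.1)",
]

print(mismatched_pairs(sample))
```

With a tolerance of 0 only the first pair (2754000 vs. 58638457) is reported; the second pair matches exactly.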
The command opens event cycles (as group leader) for sampling, and the s390
sampling facility is started with the default frequency of 4000.
This can be seen in the debug output:
perf record opening and mmapping events
Opening: cycles
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
size 136
config 0 (PERF_COUNT_HW_CPU_CYCLES)
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|READ|ID|PERIOD
read_format ID|GROUP|LOST
disabled 1
inherit 1
exclude_kernel 1
exclude_hv 1
mmap 1
comm 1
freq 1
enable_on_exec 1
task 1
sample_id_all 1
mmap2 1
comm_exec 1
ksymbol 1
bpf_event 1
build_id 1
....
Next event cycles is opened in the s390 counting facility:
Opening: cycles
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
size 136
config 0 (PERF_COUNT_HW_CPU_CYCLES)
sample_type IP|TID|TIME|READ|ID|PERIOD
read_format ID|GROUP|LOST
inherit 1
exclude_kernel 1
exclude_hv 1
sample_id_all 1
So now there are 2 hardware events, which are mapped on s390 to:
1. An event handled by the CPU Measurement sampling facility; the hardware writes
32-byte samples to buffers. The frequency of 4000 Hz translates
to a sample every 1300000 instructions. Interrupt driven.
2. An event handled by the CPU Measurement counting facility; the hardware
runs in the background and increments counters accordingly.
All available counters (about 400) are running in the background
and are read via an assembler instruction until stopped. No interrupts.
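The relation between the 4000 Hz frequency and the interval of 1300000 is plain arithmetic, sketched below. The function names are hypothetical, and this is not how the s390 driver derives the interval; it only shows that the two numbers together imply a fixed rate of 5.2e9 counted events per second.

```python
def implied_rate_hz(sample_freq_hz, interval):
    """Event rate implied by taking one sample every `interval` events
    at `sample_freq_hz` samples per second."""
    return sample_freq_hz * interval

def interval_for(sample_freq_hz, rate_hz):
    """Sampling interval needed to reach `sample_freq_hz` at a given event rate."""
    return rate_hz // sample_freq_hz

rate = implied_rate_hz(4000, 1_300_000)
print(rate)                      # 5200000000
print(interval_for(4000, rate))  # 1300000
```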
If I understand this setup correctly, the first event is the group
leader and either both events run and are active or none of them.
That is the reason why both values should be identical.
Is this true?
Now given two independent CPU measurement units on s390, one running
in the background incrementing counters, the other interrupt driven
reading samples, there is always room for the two counters to differ.
The question is how much and how often.
When I look at the debug output of the perf.data file, I get this:
55805554120788 0x22a8 [0x68]: PERF_RECORD_SAMPLE(IP, 0x2):
14135/14135: 0x3ff9ae90340 period: 1300000 addr: 0
... sample_read:
.... group nr 2
..... id 00000000000000b4, value 000000000115b5c0, lost 0
..... id 00000000000000bc, value 000000000195ac03, lost 0
... thread: perf:14135
...... dso: /usr/lib/ld64.so.1
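For reference, the two hexadecimal group values in this sample can be decoded and compared with a few lines of Python. The regular expression and the helper name are assumptions for illustration; the two input lines are copied from the dump above.

```python
import re

# Matches one "id ..., value ..., lost ..." line of the sample_read dump.
VALUE_RE = re.compile(r"id\s+([0-9a-f]+),\s+value\s+([0-9a-f]+),\s+lost\s+(\d+)")

def parse_group(lines):
    """Return a list of (id, value) integer pairs from sample_read lines."""
    out = []
    for line in lines:
        m = VALUE_RE.search(line)
        if m:
            out.append((int(m.group(1), 16), int(m.group(2), 16)))
    return out

dump = [
    "..... id 00000000000000b4, value 000000000115b5c0, lost 0",
    "..... id 00000000000000bc, value 000000000195ac03, lost 0",
]

(leader_id, leader), (sibling_id, sibling) = parse_group(dump)
print(leader, sibling, sibling - leader)  # 18200000 26586115 8386115
```

In this one sample the second counter is ahead of the first by 8386115, which is more than six times the 1300000 period.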
The first value is the count from the sampling event; it gets
incremented at the 4000 Hz sampling frequency:
# perf report -D -i/tmp/perf.data.tmrs390|grep 00000000000000b4,|head -10
..... id 00000000000000b4, value 000000000101dfa0, lost 0
..... id 00000000000000b4, value 000000000115b5c0, lost 0
..... id 00000000000000b4, value 00000000013d6200, lost 0
..... id 00000000000000b4, value 0000000001513820, lost 0
..... id 00000000000000b4, value 0000000001650e40, lost 0
..... id 00000000000000b4, value 00000000018cba80, lost 0
..... id 00000000000000b4, value 0000000001a090a0, lost 0
..... id 00000000000000b4, value 0000000001b466c0, lost 0
..... id 00000000000000b4, value 0000000001c83ce0, lost 0
..... id 00000000000000b4, value 0000000001dc1300, lost 0
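value 115b5c0 - 101dfa0 = 13d620 (hex) --> 1300000 (decimal), the sample period.
So that value always increments by the period or a whole multiple of it
(for example 13d6200 - 115b5c0 = 27ac40, which is 2600000).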
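The step size can be checked for all ten values listed above with a short script. The hex strings are copied from the grep output; the variable names are mine.

```python
PERIOD = 1_300_000

# The ten values of id b4 from the grep output, in order.
hex_values = [
    "101dfa0", "115b5c0", "13d6200", "1513820", "1650e40",
    "18cba80", "1a090a0", "1b466c0", "1c83ce0", "1dc1300",
]

values = [int(v, 16) for v in hex_values]
deltas = [b - a for a, b in zip(values, values[1:])]

# Express every step as (whole periods, remainder).
steps = [divmod(d, PERIOD) for d in deltas]
for d, (periods, rest) in zip(deltas, steps):
    print(f"{d:>8} = {periods} x {PERIOD} + {rest}")
```

Every remainder is 0; seven steps are exactly one period and two steps are two periods.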
The other counter id is:
# perf report -D -i/tmp/perf.data.tmrs390|grep 00000000000000bc,| sort | uniq -d
..... id 00000000000000bc, value 000000000195ac03, lost 0
..... id 00000000000000bc, value 0000000002fd8b45, lost 0
..... id 00000000000000bc, value 0000000005f0b1ce, lost 0
#
It reads out the value of counter 0 (cycles) 85 times, but has only 3 different
values.
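Note that sort | uniq -d prints only lines that occur more than once, so values seen a single time do not show up in that output. A per-value histogram would show every distinct value together with its count. The sketch below is one way to build it; the helper name and regular expression are assumptions, and the input list is an illustrative arrangement of lines from this mail, with made-up repetition, not the real file contents.

```python
import re
from collections import Counter

# Matches one "id ..., value ..., lost ..." line of `perf report -D` output.
VALUE_RE = re.compile(r"id\s+([0-9a-f]+),\s+value\s+([0-9a-f]+),\s+lost\s+(\d+)")

def value_histogram(lines, counter_id):
    """Count how often each value was read for the given event id."""
    hist = Counter()
    for line in lines:
        m = VALUE_RE.search(line)
        if m and int(m.group(1), 16) == counter_id:
            hist[int(m.group(2), 16)] += 1
    return hist

# Illustrative input only: the repetition here is made up.
lines = [
    "..... id 00000000000000b4, value 000000000115b5c0, lost 0",
    "..... id 00000000000000bc, value 000000000195ac03, lost 0",
    "..... id 00000000000000b4, value 00000000013d6200, lost 0",
    "..... id 00000000000000bc, value 000000000195ac03, lost 0",
    "..... id 00000000000000bc, value 0000000002fd8b45, lost 0",
]

hist = value_histogram(lines, 0xbc)
for value, count in sorted(hist.items()):
    print(f"{value:#x}: {count}")
```

To run it on real data, the lines could be read from the output of perf report -D instead of the hard-coded list.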
Anyway, what does perf script print out? The value of the sampling frequency?
Where does perf record read out the value of the counter event?
Any ideas where to start debugging?
Thanks a lot.
--
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Wolfgang Wendt
Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294
[-- Attachment #2: perf.data.tmrs390 --]
[-- Type: application/octet-stream, Size: 25888 bytes --]