linux-perf-users.vger.kernel.org archive mirror
From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: namhyung@kernel.org, irogers@google.com, jolsa@kernel.org,
	adrian.hunter@intel.com, linux-perf-users@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/3] perf mem: Fix missed p-core mem events on ADL and RPL
Date: Fri, 6 Sep 2024 12:08:52 -0400
Message-ID: <8644996b-33d6-4eee-890c-f23a3c830b77@linux.intel.com>
In-Reply-To: <ZtsO-v3pUVezKBgE@x1>



On 2024-09-06 10:17 a.m., Arnaldo Carvalho de Melo wrote:
> On Thu, Sep 05, 2024 at 03:47:03PM -0400, Liang, Kan wrote:
>> On 2024-09-05 3:33 p.m., Arnaldo Carvalho de Melo wrote:
>>> On Thu, Sep 05, 2024 at 10:07:36AM -0700, kan.liang@linux.intel.com wrote:
>>>> From: Kan Liang <kan.liang@linux.intel.com>
>>>>
>>>> The p-core mem events are missed when launching perf mem record on ADL
>>>> and RPL.
>>>>
>>>> root@number:~# perf mem record sleep 1
>>>> Memory events are enabled on a subset of CPUs: 16-27
>>>> [ perf record: Woken up 1 times to write data ]
>>>> [ perf record: Captured and wrote 0.032 MB perf.data ]
>>>> root@number:~# perf evlist
>>>> cpu_atom/mem-loads,ldlat=30/P
>>>> cpu_atom/mem-stores/P
>>>> dummy:u
>>>>
>>>> The 'record' variable in struct perf_mem_event indicates whether a mem
>>>> event in a mem_events[] should be recorded. The current code only
>>>> configures the variable for the first eligible PMU. That is good enough
>>>> for a non-hybrid machine, or for a hybrid machine whose PMUs share the
>>>> same mem_events[]. However, if different PMUs on a hybrid machine use
>>>> different mem_events[], e.g., ADL or RPL, the 'record' for the second
>>>> PMU never gets a chance to be set, so the mem_events[] of the second
>>>> PMU are always ignored.
>>>>
>>>> Perf mem doesn't support per-PMU configuration now, so a per-PMU
>>>> mem_events[] 'record' variable doesn't make sense. Make it global.
>>>> That also avoids searching for the per-PMU mem_events[] via
>>>> perf_pmu__mem_events_ptr every time.
>>>>
>>>> Fixes: abbdd79b786e ("perf mem: Clean up perf_mem_events__name()")
>>>> Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
>>>> Closes: https://lore.kernel.org/lkml/Zthu81fA3kLC2CS2@x1/
>>>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
>>>
>>> Looks better:
>>>
>>> root@number:~# perf report --header-only | grep 'cmdline\|event'
>>> # cmdline : /home/acme/bin/perf mem record ls 
>>> # event : name = cpu_atom/mem-loads,ldlat=30/P, , id = { 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511 }, type = 10 (cpu_atom), size = 136, config = 0x5d0 (mem-loads), { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|ADDR|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format = ID|LOST, disabled = 1, inherit = 1, freq = 1, enable_on_exec = 1, precise_ip = 3, sample_id_all = 1, { bp_addr, config1 } = 0x1f
>>> # event : name = cpu_atom/mem-stores/P, , id = { 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523 }, type = 10 (cpu_atom), size = 136, config = 0x6d0 (mem-stores), { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|ADDR|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format = ID|LOST, disabled = 1, inherit = 1, freq = 1, enable_on_exec = 1, precise_ip = 3, sample_id_all = 1
>>> # event : name = cpu_core/mem-loads-aux/, , id = { 524, 525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539 }, type = 4 (cpu_core), size = 136, config = 0x8203 (mem-loads-aux), { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|ADDR|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format = ID|LOST, disabled = 1, inherit = 1, freq = 1, enable_on_exec = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1
>>> # event : name = cpu_core/mem-loads,ldlat=30/, , id = { 541, 542, 543, 544, 545, 546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556 }, type = 4 (cpu_core), size = 136, config = 0x1cd (mem-loads), { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|ADDR|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format = ID|LOST, inherit = 1, freq = 1, precise_ip = 2, sample_id_all = 1, exclude_guest = 1, { bp_addr, config1 } = 0x1f
>>> # event : name = cpu_core/mem-stores/P, , id = { 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 572 }, type = 4 (cpu_core), size = 136, config = 0x2cd (mem-stores), { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|ADDR|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format = ID|LOST, disabled = 1, inherit = 1, freq = 1, enable_on_exec = 1, precise_ip = 3, sample_id_all = 1
>>> # event : name = dummy:u, , id = { 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597, 598, 599, 600 }, type = 1 (software), size = 136, config = 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq } = 1, sample_type = IP|TID|TIME|ADDR|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format = ID|LOST, inherit = 1, exclude_kernel = 1, exclude_hv = 1, mmap = 1, comm = 1, task = 1, mmap_data = 1, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1
>>> # intel_pt pmu capabilities: topa_multiple_entries=1, psb_cyc=1, single_range_output=1, mtc_periods=249, ip_filtering=1, output_subsys=0, cr3_filtering=1, psb_periods=3f, event_trace=0, cycle_thresholds=3f, power_event_trace=0, mtc=1, payloads_lip=0, ptwrite=1, num_address_ranges=2, max_subleaf=1, topa_output=1, tnt_disable=0
>>> root@number:~# perf evlist
>>> cpu_atom/mem-loads,ldlat=30/P
>>> cpu_atom/mem-stores/P
>>> cpu_core/mem-loads-aux/
>>> cpu_core/mem-loads,ldlat=30/
>>> cpu_core/mem-stores/P
>>> dummy:u
>>> root@number:~#
>>>
>>> But can we reconstruct the events relationship (group, :S, etc) from
>>> what we have in the perf.data header?
>>>
>>
>> Do you mean show the group relation in the perf evlist?
>>
>> $perf mem record sleep 1
>> [ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 0.027 MB perf.data (10 samples) ]
>>
>> $perf evlist -g
>> cpu_atom/mem-loads,ldlat=30/P
>> cpu_atom/mem-stores/P
>> {cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}
>> cpu_core/mem-stores/P
>> dummy:u
>>
>> The -g option already did it, although the group modifier looks lost.
> 
> Right, I can reproduce that, but I wonder if we shouldn't make this '-g'
> option the default?

I think evlist means a list of events, so only outputting the events by
default makes sense to me. With -g, the extra relationship information
is provided.

> 
> -----
> 
> Committer testing:
> 
>   root@number:~# perf evlist -g
>   cpu_atom/mem-loads,ldlat=30/P
>   cpu_atom/mem-stores/P
>   {cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}
>   cpu_core/mem-stores/P
>   dummy:u
>   root@number:~#
> 
> The :S for '{cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}' is
> not being added by 'perf evlist -g', to be checked.
> 
> -----

It should be a generic issue, not just for perf evlist -g.

The same issue can be observed for perf report.
$perf report --header-only | grep 'cmdline\|group'
# cmdline : /home/kan/tmp/perf-tools-next/tools/perf/perf record -e {cycles,instructions}:u sleep 1
# group: {cycles,instructions}

I think it's because the per-group modifiers are converted to per-event
modifiers and stored in the evsel when parsing the group. It's hard to
reconstruct the accurate group string relying only on the evsel, unless
we record the group string somewhere, e.g., in the leader evsel, when
parsing it.
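
To illustrate the point with a toy model: the sketch below is plain
Python, not the perf code. Evsel, parse_group(), format_group() and the
group_string field are all made up for illustration, event names are
assumed to contain no commas, and only the 'u' modifier is modelled (as a
single exclude_kernel flag). It shows that two different group strings
collapse to the same per-event state, and how keeping the original string
on the leader would let the formatter print it back.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Evsel:
    """Toy stand-in for an event selector; not perf's struct evsel."""
    name: str
    exclude_kernel: bool = False          # what a 'u' modifier turns into here
    group_string: Optional[str] = None    # hypothetical field, leader only


def parse_group(spec: str, keep_string: bool = False) -> List[Evsel]:
    """Parse '{a,b}:mods'; group modifiers are pushed down to every event."""
    body, _, tail = spec.partition("}")
    group_mods = tail.lstrip(":")
    evsels = []
    for item in body.lstrip("{").split(","):
        name, _, event_mods = item.partition(":")
        mods = event_mods + group_mods
        evsels.append(Evsel(name=name, exclude_kernel="u" in mods))
    if keep_string:
        evsels[0].group_string = spec     # remember what the user typed
    return evsels


def format_group(evsels: List[Evsel]) -> str:
    """Rebuild a group string from the event list."""
    leader = evsels[0]
    if leader.group_string is not None:
        return leader.group_string
    # Only names survive; the group-level modifier cannot be told apart
    # from per-event modifiers any more.
    return "{" + ",".join(e.name for e in evsels) + "}"


a = parse_group("{cycles,instructions}:u")
b = parse_group("{cycles:u,instructions:u}")
print(a == b)                 # True: both specs give the same per-event state
print(format_group(a))        # {cycles,instructions}
print(format_group(parse_group("{cycles,instructions}:u", keep_string=True)))
                              # {cycles,instructions}:u
```

Storing the string on the leader is only one possible place to keep it;
the sketch just shows why the per-event state alone is not enough.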

Thanks,
Kan
