From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ian Rogers <irogers@google.com>,
Namhyung Kim <namhyung@kernel.org>, Jiri Olsa <jolsa@kernel.org>,
Adrian Hunter <adrian.hunter@intel.com>,
linux-perf-users@vger.kernel.org,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: perf mem record not getting the mem_load_aux events by default
Date: Thu, 5 Sep 2024 13:23:27 -0400
Message-ID: <caba86ad-bcba-4e1e-acc4-b18d769db87d@linux.intel.com>
In-Reply-To: <Zth-DBdaSXodeFqn@x1>

On 2024-09-04 11:34 a.m., Arnaldo Carvalho de Melo wrote:
> On Wed, Sep 04, 2024 at 11:20:57AM -0400, Liang, Kan wrote:
>>
>>
>> On 2024-09-04 10:30 a.m., Arnaldo Carvalho de Melo wrote:
>>> Hi Kan,
>>>
>>> Recently I presented about 'perf mem record' and found that I had to use
>>> 'perf record' directly, as 'perf mem record' on an Intel Hybrid system
>>> wasn't selecting the required aux event:
>>>
>>> http://vger.kernel.org/~acme/prez/lsfmm-bpf-2024/#/19
>>>
>>> The previous slides show the problem and the one above shows what worked
>>> for me.
>>>
>>> I saw this while trying to fix that:
>>>
>>> Author: Kan Liang <kan.liang@linux.intel.com>
>>> commit abbdd79b786e036e60f01b7907977943ebe7a74d
>>> Date: Tue Jan 23 10:50:32 2024 -0800
>>>
>>> perf mem: Clean up perf_mem_events__name()
>>>
>>> Introduce a generic perf_mem_events__name(). Remove the ARCH-specific
>>> one.
>>>
>>> The mem_load events may have a different format. Add ldlat and aux_event
>>> in the struct perf_mem_event to indicate the format and the extra aux
>>> event.
>>>
>>> Add perf_mem_events_intel_aux[] to support the extra mem_load_aux event.
>>>
>>> Rename perf_mem_events__name to perf_pmu__mem_events_name.
>>>
>>> --------------------------
>>>
>>> So there are provisions for selecting the right events, but it doesn't
>>> seem to be working when I tried. Can you take a look at what I describe
>>> on those slides and see what I am doing wrong?
>>>
>>
>> If I understand the example in the slides correctly, the issue is that
>> no mem events from big core are selected when running perf mem record,
>> rather than wrong mem events are selected.
>>
>> I don't see an obvious issue. That looks like a regression in perf
>> mem record. I will find an Alder Lake or Raptor Lake machine to take a
>> closer look.
>
> My expectation was for whatever is needed for having those events to be
> put in place, like I did manually, and indeed, limiting it to cpu_core:
>
> taskset -c 0 \
> perf record --weight --data \
> --event '{cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/P}:S' \
> --event cpu_core/mem-stores/ find / > /dev/null
>
> I.e. lots of boilerplate for using 'perf mem record', we should at least
> have some sort of warning about the 'perf mem record' experience having
> to be restricted to workloads running on PMUs where it can take place,
> perhaps making 'perf mem record' to restrict the CPUs used for a session
> to be the ones with the needed resources... and we have that already:
>
> root@number:~# perf mem record sleep 1
> Memory events are enabled on a subset of CPUs: 16-27
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.032 MB perf.data ]
> root@number:~#
>
> But...
>
> root@number:~# perf evlist
> cpu_atom/mem-loads,ldlat=30/P
> cpu_atom/mem-stores/P
> dummy:u
> root@number:~# perf evlist -v
> cpu_atom/mem-loads,ldlat=30/P: type: 10 (cpu_atom), size: 136, config: 0x5d0 (mem-loads), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, precise_ip: 3, sample_id_all: 1, { bp_addr, config1 }: 0x1f
> cpu_atom/mem-stores/P: type: 10 (cpu_atom), size: 136, config: 0x6d0 (mem-stores), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, precise_ip: 3, sample_id_all: 1
> dummy:u: type: 1 (software), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|ADDR|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
> root@number:~#
>
> It is not setting up the required
>
> --event '{cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/P}:S'
>
> part, right?
>
Right, it's a bug on ADL and RPL. The p-core of ADL and RPL requires
mem-loads-aux to work around a HW defect, so there are different
mem_events for e-core and p-core: perf_mem_events_intel[] and
perf_mem_events_intel_aux[], respectively.

Ideally, perf should initialize and set the corresponding config bit for
both mem_events. However, the current code only does it for the first
PMU, which causes the trouble: the second PMU (p-core) is always ignored.

Other hybrid machines are not impacted, because the workaround is not
required there, so both e-core and p-core share the same
perf_mem_events_intel[].
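
To make the failure mode concrete, here is a toy model in C. It is only
a sketch: struct toy_pmu, enum toy_table and the toy_*() helpers are
hypothetical names made up for illustration and do not mirror the real
perf source. It contrasts setting up only the first PMU with walking
every PMU, assuming the e-core PMU comes first as in the evlist output
quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not taken from the perf tree. */
enum toy_table {
	TOY_TABLE_PLAIN,	/* stands in for perf_mem_events_intel[] */
	TOY_TABLE_AUX,		/* stands in for perf_mem_events_intel_aux[] */
};

struct toy_pmu {
	const char *name;
	bool needs_aux;		/* true for the ADL/RPL p-core */
	bool mem_events_ready;	/* set once this PMU's mem events are set up */
};

/* Pick the event table a PMU would use in this model. */
static enum toy_table toy_event_table(const struct toy_pmu *pmu)
{
	assert(pmu != NULL);
	return pmu->needs_aux ? TOY_TABLE_AUX : TOY_TABLE_PLAIN;
}

/* Models the bug as described: only the first PMU is set up. */
static void toy_init_first_only(struct toy_pmu *pmus, size_t n)
{
	if (n > 0)
		pmus[0].mem_events_ready = true;
}

/* Models the intended behaviour: every PMU is set up. */
static void toy_init_all(struct toy_pmu *pmus, size_t n)
{
	for (size_t i = 0; i < n; i++)
		pmus[i].mem_events_ready = true;
}

/* Count the PMUs whose mem events are usable. */
static size_t toy_count_ready(const struct toy_pmu *pmus, size_t n)
{
	size_t ready = 0;

	for (size_t i = 0; i < n; i++)
		if (pmus[i].mem_events_ready)
			ready++;
	return ready;
}
```

With the array { cpu_atom, cpu_core }, toy_init_first_only() leaves
cpu_core without mem events, which matches the evlist output above where
only cpu_atom events show up. toy_init_all() covers both PMUs, and
toy_event_table() then selects the aux table for cpu_core only.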

The patch set to fix it has been posted. Please take a look.
https://lore.kernel.org/lkml/20240905170737.4070743-1-kan.liang@linux.intel.com/

BTW: I found a regression with perf mem record -e while testing.
The fix for it is also in the above patch set.

Thanks,
Kan
> To make this more useful perhaps we should, in addition to warning that
> is running just on those CPUs, when we specify a workload (sleep 1) in
> the above case, limit that workload to that set of CPUs so that we can
> get those mem events on all of the workload runtime?
>
> We would just add a new warning for that behaviour, etc.
>
> - Arnaldo
>