From: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
To: Eric Lin <eric.lin@sifive.com>, Ian Rogers <irogers@google.com>,
Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Mark Rutland <mark.rutland@arm.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Jiri Olsa <jolsa@kernel.org>,
Adrian Hunter <adrian.hunter@intel.com>,
Kan Liang <kan.liang@linux.intel.com>,
James Clark <james.clark@arm.com>,
linux-perf-users <linux-perf-users@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
vincent.chen@sifive.com, greentime.hu@sifive.com,
Samuel Holland <samuel.holland@sifive.com>
Subject: Re: [PATCH] perf pmus: Fix duplicate events caused segfault
Date: Wed, 7 Aug 2024 00:12:26 +0530 [thread overview]
Message-ID: <2EA2700A-027A-4AD4-B93F-0426D6BF76D2@linux.vnet.ibm.com> (raw)
In-Reply-To: <CAPqJEFr78B_74PCsxxHdDZtdrJVUL6j6u4vauCaoTaR7Rr=Rrw@mail.gmail.com>
> On 6 Aug 2024, at 3:26 PM, Eric Lin <eric.lin@sifive.com> wrote:
>
> This Message Is From an External Sender
> This message came from outside your organization.
> Report Suspicious
>
> Hi Ian,
>
> On Tue, Aug 6, 2024 at 12:00 PM Ian Rogers <irogers@google.com> wrote:
> >
> > On Mon, Aug 5, 2024 at 8:29 PM Eric Lin <eric.lin@sifive.com> wrote:
> > >
> > >
> > > Hi Arnaldo,
> > >
> > > On Mon, Aug 5, 2024 at 11:43 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > >
> > > > On Mon, Aug 05, 2024 at 07:54:33PM +0530, Athira Rajeev wrote:
> > > > >
> > > > >
> > > > > > On 4 Aug 2024, at 8:36 PM, Eric Lin <eric.lin@sifive.com> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On Sun, Jul 21, 2024 at 11:44 PM Eric Lin <eric.lin@sifive.com> wrote:
> > > > > >>
> > > > > >> Hi Athira,
> > > > > >>
> > > > > >> On Sat, Jul 20, 2024 at 4:35 PM Athira Rajeev
> > > > > >> <atrajeev@linux.vnet.ibm.com> wrote:
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >>>> On 19 Jul 2024, at 1:46 PM, Eric Lin <eric.lin@sifive.com> wrote:
> > > > > >>>>
> > > > > >>>> Currently, if vendor JSON files have two duplicate event names,
> > > > > >>>> the "perf list" command will trigger a segfault.
> > > > > >>>>
> > > > > >>>> In commit e6ff1eed3584 ("perf pmu: Lazily add JSON events"),
> > > > > >>>> pmu_events_table__num_events() gets the number of JSON events
> > > > > >>>> from table_pmu->num_entries, which includes duplicate events
> > > > > >>>> if there are duplicate event names in the JSON files.
> > > > > >>>
> > > > > >>> Hi Eric,
> > > > > >>>
> > > > > >>> Let us consider there are duplicate event names in the JSON files, say :
> > > > > >>>
> > > > > >>> metric.json with: EventName as pmu_cache_miss, EventCode as 0x1
> > > > > >>> cache.json with: EventName as pmu_cache_miss, EventCode as 0x2
> > > > > >>>
> > > > > >>> If we fix the segfault and proceed, still “perf list” will list only one entry for pmu_cache_miss with may be 0x1/0x2 as event code ?
> > > > > >>> Can you check the result to confirm what “perf list” will list in this case ? If it’s going to have only one entry in perf list, does it mean there are two event codes for pmu_cache_miss and it can work with either of the event code ?
> > > > > >>>
> > > > > >>
> > > > > >> Sorry for the late reply.
> > > > > >> Yes, I've checked if there are duplicate pmu_cache_miss events in the
> > > > > >> JSON files, the perf list will have only one entry in perf list.
> > > > > >>
> > > > > >>> If it happens to be a mistake in json file to have duplicate entry with different event code (ex: with some broken commit), I am thinking if the better fix is to keep only the valid entry in json file ?
> > > > > >>>
> > > > > >>
> > > > > >> Yes, I agree we should fix the duplicate events in vendor JSON files.
> > > > > >>
> > > > > >> According to this code snippet [1], it seems the perf tool originally
> > > > > >> allowed duplicate events to exist and it will skip the duplicate
> > > > > >> events not shown on the perf list.
> > > > > >> However, after this commit e6ff1eed3584 ("perf pmu: Lazily add JSON
> > > > > >> events"), if there are two duplicate events, it causes a segfault.
> > > > > >>
> > > > > >> Can I ask, do you have any suggestions? Thanks.
> > > > > >>
> > > > > >> [1] https://github.com/torvalds/linux/blob/master/tools/perf/util/pmus.c#L491
> > > > > >>
> > > > > >
> > > > > > Kindly ping.
> > > > > >
> > > > > > Can I ask, are there any more comments about this patch? Thanks.
> > > > > >
> > > > > Hi Eric,
> > > > >
> > > > > The functions there says alias and to skip duplicate alias. I am not sure if that is for events
> > > > >
> > > > > Namhyung, Ian, Arnaldo
> > > > > Any comments here ?
> > > >
> > > > So I was trying to reproduce the problem here before looking at the
> > > > patch, tried a simple:
> > > >
> > > > ⬢[acme@toolbox perf-tools-next]$ git diff
> > > > diff --git a/tools/perf/pmu-events/arch/x86/rocketlake/cache.json b/tools/perf/pmu-events/arch/x86/rocketlake/cache.json
> > > > index 2e93b7835b41442b..167a41b0309b7cfc 100644
> > > > --- a/tools/perf/pmu-events/arch/x86/rocketlake/cache.json
> > > > +++ b/tools/perf/pmu-events/arch/x86/rocketlake/cache.json
> > > > @@ -1,4 +1,13 @@
> > > > [
> > > > + {
> > > > + "BriefDescription": "Counts the number of cache lines replaced in L1 data cache.",
> > > > + "Counter": "0,1,2,3",
> > > > + "EventCode": "0x51",
> > > > + "EventName": "L1D.REPLACEMENT",
> > > > + "PublicDescription": "Counts L1D data line replacements including opportunistic replacements, and replacements that require stall-for-replace or block-for-replace.",
> > > > + "SampleAfterValue": "100003",
> > > > + "UMask": "0x1"
> > > > + },
> > > > {
> > > > "BriefDescription": "Counts the number of cache lines replaced in L1 data cache.",
> > > > "Counter": "0,1,2,3",
> > > > ⬢[acme@toolbox perf-tools-next]$ grep L1D.REPLACEMENT tools/perf/pmu-events/arch/x86/rocketlake/cache.json
> > > > "EventName": "L1D.REPLACEMENT",
> > > > "EventName": "L1D.REPLACEMENT",
> > > > ⬢[acme@toolbox perf-tools-next]$
> > > >
> > > > I.e. duplicated that whole event definition:
> > > >
> > > > Did a make clean and a rebuild and:
> > > >
> > > > root@x1:/home/acme/git/pahole# perf list l1d.replacement
> > > >
> > > > List of pre-defined events (to be used in -e or -M):
> > > >
> > > >
> > > > cache:
> > > > l1d.replacement
> > > > [Counts the number of cache lines replaced in L1 data cache. Unit: cpu_core]
> > > > root@x1:/home/acme/git/pahole# perf list > /dev/null
> > > > root@x1:/home/acme/git/pahole#
> > > >
> > > > No crash, can you provide instructions on how to reproduce the problem?
> > > >
> > > > I would like to use the experience to add a 'perf test' to show this
> > > > failing and then after the patch it passing that new test.
> > > >
> > >
> > > The segfault occurs when the vendor JSON files contain two events with duplicate names.
> > >
> > > I tried adding two duplicate events "L1D.REPLACEMENT" and "L1D_PEND_MISS.FB_FULL" and the issue can be reproduced on my laptop on the x86 platform.
> > >
> > > ericl@EricL-ThinkPadX1-TW 11:11 ~/linux_src/linux/tools/perf (master)
> > > git diff
> > > diff --git a/tools/perf/pmu-events/arch/x86/tigerlake/cache.json b/tools/perf/pmu-events/arch/x86/tigerlake/cache.json
> > > index f4144a1110be..71062a82699d 100644
> > > --- a/tools/perf/pmu-events/arch/x86/tigerlake/cache.json
> > > +++ b/tools/perf/pmu-events/arch/x86/tigerlake/cache.json
> > > @@ -8,6 +8,24 @@
> > > "SampleAfterValue": "100003",
> > > "UMask": "0x1"
> > > },
> > > + {
> > > + "BriefDescription": "Counts the number of cache lines replaced in L1 data cache.",
> > > + "Counter": "0,1,2,3",
> > > + "EventCode": "0x51",
> > > + "EventName": "L1D.REPLACEMENT",
> > > + "PublicDescription": "Counts L1D data line replacements including opportunistic replacements, and replacements that require stall-for-replace or block-for-replace.",
> > > + "SampleAfterValue": "100003",
> > > + "UMask": "0x1"
> > > + },
> > > + {
> > > + "BriefDescription": "Number of cycles a demand request has waited due to L1D Fill Buffer (FB) unavailability.",
> > > + "Counter": "0,1,2,3",
> > > + "EventCode": "0x48",
> > > + "EventName": "L1D_PEND_MISS.FB_FULL",
> > > + "PublicDescription": "Counts number of cycles a demand request has waited due to L1D Fill Buffer (FB) unavailability. Demand requests include cacheable/uncacheable demand load, store, lock or SW prefetch accesses.",
> > > + "SampleAfterValue": "1000003",
> > > + "UMask": "0x2"
> > > + },
> > > {
> > > "BriefDescription": "Number of cycles a demand request has waited due to L1D Fill Buffer (FB) unavailability.",
> > > "Counter": "0,1,2,3",
> > >
> > > ericl@EricL-ThinkPadX1-TW 11:11 ~/linux_src/linux/tools/perf (master)
> > > $ ./perf list
> > > Segmentation fault (core dumped)
> >
> > Hi Eric,
> >
> > the series we were asking you to test was:
> > https://lore.kernel.org/linux-perf-users/20240805194424.597244-1-irogers@google.com/
> > It can be convenient to use the b4 tool to grab a set of patches. It
> > is also in the perf-tools-next tree:
> > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/
> > in the tmp.perf-tools-next branch, so you could clone that.
> >
> > Having duplicate events doesn't make sense. When a sysfs event and
> > json event exist with the same name, the json events values override
> > those of the sysfs event:
> > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/pmu.c?h=tmp.perf-tools-next#n599
> > Two json events doesn't have a clear meaning and which would be found
> > would be dependent on a binary search. To avoid the problem the linked
> > to series forbids duplicate events in the json and adds a build test
> > building all architectures json. As this would break due to duplicate
> > events, as your patch shows, there are json fixes for RISC-V and ARM.
> >
Hi Arnaldo, Ian
Thanks for quick response and providing the fix.
I tested with tmp.perf-tools-next branch from git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git
The build reports "Duplicate event” error when having duplicate events in the JSON files
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Thanks
Athira
>
> I tested the series and added duplicate events on the JSON files
>
> diff --git a/tools/perf/pmu-events/arch/riscv/sifive/u74/memory.json b/tools/perf/pmu-events/arch/riscv/sifive/u74/memory.json
> index be1a46312ac3..48f5aec4875a 100644
> --- a/tools/perf/pmu-events/arch/riscv/sifive/u74/memory.json
> +++ b/tools/perf/pmu-events/arch/riscv/sifive/u74/memory.json
> @@ -1,4 +1,9 @@
> [
> + {
> + "EventName": "ICACHE_RETIRED",
> + "EventCode": "0x0000102",
> + "BriefDescription": "Instruction cache miss"
> + },
> {
> "EventName": "ICACHE_RETIRED",
> "EventCode": "0x0000102",
> @@ -29,4 +34,4 @@
> "EventCode": "0x0002002",
> "BriefDescription": "UTLB miss"
> }
>
> When building the perf tool, it will show the build error as follows:
>
>
> CC tests/pmu-events.o
> CC util/cacheline.o
> Traceback (most recent call last):
> File "pmu-events/jevents.py", line 1313, in <module>
> main()
> File "pmu-events/jevents.py", line 1304, in main
> ftw(arch_path, [], process_one_file)
> File "pmu-events/jevents.py", line 1245, in ftw
> ftw(item.path, parents + [item.name], action)
> File "pmu-events/jevents.py", line 1243, in ftw
> action(parents, item)
> File "pmu-events/jevents.py", line 646, in process_one_file
> print_pending_events()
> File "pmu-events/jevents.py", line 510, in print_pending_events
> assert event.name != last_name, f"Duplicate event: {last_pmu}/{last_name}/ in {_pending_events_tblname}"
> AssertionError: Duplicate event: default_core/icache_retired/ in pmu_events__sifive_u74
> LD arch/riscv/util/perf-util-in.o
> pmu-events/Build:35: recipe for target 'pmu-events/pmu-events.c' failed
> make[3]: *** [pmu-events/pmu-events.c] Error 1
> make[3]: *** Deleting file 'pmu-events/pmu-events.c'
> Makefile.perf:763: recipe for target 'pmu-events/pmu-events-in.o' failed
> make[2]: *** [pmu-events/pmu-events-in.o] Error 2
> make[2]: *** Waiting for unfinished jobs....
> CC bench/futex-lock-pi.o
>
> I think the patch series can detect if the vendor JSON file has duplicate events when building the perf tool. Thanks.
>
>
> Best regards,
> Eric Lin
>
> > Thanks,
> > Ian
next prev parent reply other threads:[~2024-08-06 18:43 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-19 8:16 [PATCH] perf pmus: Fix duplicate events caused segfault Eric Lin
2024-07-20 8:08 ` Athira Rajeev
2024-07-21 15:44 ` Eric Lin
2024-08-04 15:06 ` Eric Lin
2024-08-05 14:24 ` Athira Rajeev
2024-08-05 15:43 ` Arnaldo Carvalho de Melo
[not found] ` <CAPqJEFraOmS72dQQK2ou9EoxbCKZ8m_+DhQQfPmCy6wfxfQWzQ@mail.gmail.com>
2024-08-06 4:00 ` Ian Rogers
[not found] ` <CAPqJEFr78B_74PCsxxHdDZtdrJVUL6j6u4vauCaoTaR7Rr=Rrw@mail.gmail.com>
2024-08-06 18:42 ` Athira Rajeev [this message]
2024-08-07 2:53 ` Eric Lin
2024-08-05 17:02 ` Ian Rogers
2024-08-05 19:48 ` Ian Rogers
2024-08-05 20:18 ` Arnaldo Carvalho de Melo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2EA2700A-027A-4AD4-B93F-0426D6BF76D2@linux.vnet.ibm.com \
--to=atrajeev@linux.vnet.ibm.com \
--cc=acme@kernel.org \
--cc=adrian.hunter@intel.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=eric.lin@sifive.com \
--cc=greentime.hu@sifive.com \
--cc=irogers@google.com \
--cc=james.clark@arm.com \
--cc=jolsa@kernel.org \
--cc=kan.liang@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=mark.rutland@arm.com \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
--cc=samuel.holland@sifive.com \
--cc=vincent.chen@sifive.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).