Re: perf record dumps core all the time on s390 (6.6.0rc1) part2


From: Thomas Richter <tmricht@linux.ibm.com>
To: Ian Rogers <irogers@google.com>,
	Sumanth Korikkar <sumanthk@linux.ibm.com>,
	"linux-perf-use." <linux-perf-users@vger.kernel.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>
Subject: Re: perf record dumps core all the time on s390 (6.6.0rc1) part2
Date: Wed, 13 Sep 2023 14:16:26 +0200
Message-ID: <589cb8dc-fc2b-9461-be23-f2e57f1ee46f@linux.ibm.com>
In-Reply-To: <07625379-ca42-f899-40fc-58a8fc32cb5e@linux.ibm.com>

On 9/13/23 11:17, Thomas Richter wrote:
> Using Linux kernel version 6.6.0rc1, the command
> 
>   # perf record -e cycles
> 
> dumps core immediately on s390 running on a z/VM virtual machine.
> It works on an s390 LPAR (same kernel).
> 
> This issue showed up during my vacation on 25-Aug-2023 in linux-next
> and since Monday 11-Sep-2023 it shows up in linux 6.6.0rc1.
> 
> The event cycles is treated as a software event and it fails in:
> 
>  (gdb) n
>  1014           pmu->events_table = perf_pmu__find_events_table(pmu);
>  (gdb) p *pmu
>   $4 = {name = 0x152a150 "software", alias_name = 0x0, id = 0x0, type = 1,
>           selectable = false, is_core = false,
>   is_uncore = false, auxtrace = false, formats_checked = false,
>   config_masks_present = false,
>   config_masks_computed = false, max_precise = -1, default_config = 0x0, cpus = 0x0,
>   format = {next = 0x152a0c8, prev = 0x152a0c8},
>   aliases = {next = 0x152a0d8, prev = 0x152a0d8}, events_table = 0x0, sysfs_aliases = 0,
>   loaded_json_aliases = 0, sysfs_aliases_loaded = false,
>   cpu_aliases_added = false, caps_initialized = false,
>   nr_caps = 0, caps = {next = 0x152a100, prev = 0x152a100},
>   list = {next = 0x0, prev = 0x0}, config_masks = {0, 0, 0, 0},
>   missing_features = {exclude_guest = false}}
>  (gdb) n
> 
>  Program received signal SIGSEGV, Segmentation fault.
>  0x0000000001344558 in perf_pmu__find_events_table (pmu=0x152a090) at pmu-events/pmu-events.c:1970
>  1970           for (i = 0; i < table->num_pmus; i++) {
>  (gdb) p table
>  $5 = (const struct pmu_events_table *) 0x0
>  (gdb)
> 
> (gdb) where
>  #0  perf_pmu__find_events_table (pmu=pmu@entry=0x2aa003d48d0)
>         at pmu-events/pmu-events.c:1970
>  #1  0x000002aa00180cee in perf_pmu__lookup (pmus=0x2aa0039baf8 <other_pmus>,
>         dirfd=dirfd@entry=3, lookup_name=lookup_name@entry=0x2aa003dca83 "software")
>         at util/pmu.c:1014
>  #2  0x000002aa0018151e in perf_pmu__find2 (name=0x2aa003dca83 "software",
>         dirfd=<optimized out>) at util/pmus.c:150
>  #3  pmu_read_sysfs (core_only=false) at util/pmus.c:198
>  #4  0x000002aa00181e04 in pmu_read_sysfs (core_only=false) at util/pmus.c:179
>  #5  perf_pmus__find_by_type (type=0) at util/pmus.c:238
>  #6  perf_pmus__find_by_type (type=type@entry=0) at util/pmus.c:231
>  #7  0x000002aa0012ee4a in parse_events_add_numeric (parse_state=0x3ffffff7718,
>         parse_state@entry=0x100000000,
>         list=0x2aa003d47b0, list@entry=<error reading variable:
>                 value has been optimized out>, type=type@entry=0,
>     config=<optimized out>, head_config=<optimized out>, head_config@entry=0x0, wildcard=true)
>     at util/parse-events.c:1347
>  #8  0x000002aa0017b7c2 in parse_events_parse (_parse_state=0x100000000,
>         _parse_state@entry=0x3ffffff7718, scanner=<optimized out>) at util/parse-events.y:418
>  #9  0x000002aa0012ca6e in parse_events__scanner (parse_state=0x3ffffff7718, input=0x0,
>         str=0x3ffffffa58a "cycles") at util/parse-events.c:1822
>  #10 __parse_events (evlist=0x3ffffff7830, str=str@entry=0x3ffffffa58a "cycles",
>         pmu_filter=<optimized out>, err=err@entry=0x3ffffff7830, fake_pmu=fake_pmu@entry=0x0,
>         warn_if_reordered=true) at util/parse-events.c:2094
>  ....rest omitted.....
> 
> 
> The variable table is never assigned a value and remains a NULL pointer.
> This happens because on a z/VM virtual machine (on s390) there
> is no CPU Measurement facility and the directories
>    /sys/devices/cpum_cf
>    /sys/devices/cpum_sf
> do not exist. Therefore the event cycles is treated as a software event
> and this default does not work with the latest pmu-events rework.
> 
> Note: This core dump will also happen on an LPAR when the CPU Measurement
> facility is not configured. In that case the PMU directories above do not
> exist on the LPAR either.
> 
> The other PMUs for s390,
>    /sys/devices/pai_crypto
>    /sys/devices/pai_ext
> do exist on z/VM and LPAR on s390 (when configured).
> 
> Do you have an idea how to fix this?
> Since the pmu-event subtree was completely redesigned, I would
> like some guidance on how to proceed.
> 
> Thanks a lot for your help.
> 


When I use software events like cpu-clock or cs, the commands

 # ./perf stat -e cs -- true
 Segmentation fault (core dumped)
 # ./perf stat -e cpu-clock -- true
 Segmentation fault (core dumped)
 #

also dump core. This should not happen as these events are defined
even when no hardware PMU is available.
Debugging this reveals this call chain:

  perf_pmus__find_by_type(type=1)
  +--> pmu_read_sysfs(core_only=false)
       +--> perf_pmu__find2(dirfd=3, name=0x152a113 "software")
            +--> perf_pmu__lookup(pmus=0x14f0568 <other_pmus>, dirfd=3,
                                  lookup_name=0x152a113 "software")
                 +--> perf_pmu__find_events_table (pmu=0x1532130)

Now the pmu is "software" and it tries to find a proper table
generated by the pmu-event generation process for s390:

 # cd pmu-events/
 # ./jevents.py  s390 all /root/linux/tools/perf/pmu-events/arch |\
        grep -E '^const struct pmu_table_entry'
 const struct pmu_table_entry pmu_events__cf_z10[] = {
 const struct pmu_table_entry pmu_events__cf_z13[] = {
 const struct pmu_table_entry pmu_metrics__cf_z13[] = {
 const struct pmu_table_entry pmu_events__cf_z14[] = {
 const struct pmu_table_entry pmu_metrics__cf_z14[] = {
 const struct pmu_table_entry pmu_events__cf_z15[] = {
 const struct pmu_table_entry pmu_metrics__cf_z15[] = {
 const struct pmu_table_entry pmu_events__cf_z16[] = {
 const struct pmu_table_entry pmu_metrics__cf_z16[] = {
 const struct pmu_table_entry pmu_events__cf_z196[] = {
 const struct pmu_table_entry pmu_events__cf_zec12[] = {
 const struct pmu_table_entry pmu_metrics__cf_zec12[] = {
 const struct pmu_table_entry pmu_events__test_soc_cpu[] = {
 const struct pmu_table_entry pmu_metrics__test_soc_cpu[] = {
 const struct pmu_table_entry pmu_events__test_soc_sys[] = {
 #

However, a PMU named "software" is not listed, as can be seen in the
generated const struct pmu_events_map pmu_events_map[].

So in function perf_pmu__find_events_table(), the variable
table is initialized to NULL but never set to a proper
value. The function scans all generated pmu_events_map[]
entries, but none matches, because the tables are
specific to the s390 CPU Measurement facility:


        i = 0;
        for (;;) {
                const struct pmu_events_map *map = &pmu_events_map[i++];
                if (!map->arch)
                        break;

        --> the maps are there because the build generated them

                if (!strcmp_cpuid_str(map->cpuid, cpuid)) {
                        table = &map->event_table;
                        break;
                }
        --> Since no CPU string matches, the table variable remains 0x0
        }
        free(cpuid);
        if (!pmu)
                return table;

        --> The pmu is "software", so it is not NULL and there is no return here

        for (i = 0; i < table->num_pmus; i++) {
                const struct pmu_table_entry *table_pmu = &table->pmus[i];
        --> and here perf dies because table is 0x0

To me something is missing: either generate a pmu_events_map[] entry for
software events, or return early and handle these events somewhere else.
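
To illustrate the second option, here is a minimal, self-contained sketch
of an early return when no table matches. It is not the perf source and not
a proposed patch: the struct layouts are reduced stand-ins, demo_map[],
find_cpu_table() and find_events_table() are hypothetical names, the cpuid
strings are made up, and a plain strcmp() replaces strcmp_cpuid_str(). The
"return NULL when no PMU in the table matches" behavior after the loop is
also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reduced stand-ins for the generated types; NOT the real perf layouts. */
struct pmu_table_entry {
	const char *pmu_name;
};

struct pmu_events_table {
	const struct pmu_table_entry *pmus;
	unsigned int num_pmus;
};

struct pmu_events_map {
	const char *arch;	/* NULL terminates the array */
	const char *cpuid;	/* made-up string for this sketch */
	struct pmu_events_table event_table;
};

static const struct pmu_table_entry demo_pmus[] = { { "cpum_cf" } };

static const struct pmu_events_map demo_map[] = {
	{ "s390", "known-cpu", { demo_pmus, 1 } },
	{ NULL, NULL, { NULL, 0 } },
};

/* Scan the map; returns NULL when no cpuid matches (the z/VM case). */
static const struct pmu_events_table *find_cpu_table(const char *cpuid)
{
	assert(cpuid != NULL);
	for (size_t i = 0; demo_map[i].arch; i++) {
		if (!strcmp(demo_map[i].cpuid, cpuid))
			return &demo_map[i].event_table;
	}
	return NULL;
}

/* pmu_name may be NULL, meaning "any PMU", like the !pmu check above. */
static const struct pmu_events_table *
find_events_table(const char *cpuid, const char *pmu_name)
{
	const struct pmu_events_table *table = find_cpu_table(cpuid);

	/* The guard: never touch table->num_pmus when table is NULL. */
	if (!table || !pmu_name)
		return table;

	for (unsigned int i = 0; i < table->num_pmus; i++) {
		if (!strcmp(table->pmus[i].pmu_name, pmu_name))
			return table;
	}
	return NULL;	/* assumption: no PMU in this table matched */
}
```

With this guard, a lookup for "software" on a machine without a matching
CPU table returns NULL instead of dereferencing a NULL pointer. The caller
must then cope with events_table being NULL.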

The events cpu-clock or cs are completely software driven and do not
need any CPU measurement hardware at all.
Something is very strange and I am confused.
Does this work on other platforms?

Thanks 

-- 
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
Chairman of the Supervisory Board: Gregor Pillen
Management: David Faller
Registered office: Böblingen / Commercial register: Amtsgericht Stuttgart, HRB 243294

