* [PATCH 0/2] perf bench: Fix core dumps on s390 z/VM
@ 2025-03-18 9:51 Thomas Richter
2025-03-18 9:51 ` [PATCH 1/2] perf/bench: Fix perf bench internals pmu-scan core dump Thomas Richter
2025-03-18 9:51 ` [PATCH 2/2] perf/bench: Double free of dynamic allocated memory Thomas Richter
0 siblings, 2 replies; 5+ messages in thread
From: Thomas Richter @ 2025-03-18 9:51 UTC (permalink / raw)
To: linux-kernel, linux-s390, linux-perf-users, acme, namhyung,
irogers, acme
Cc: agordeev, gor, sumanthk, hca
On s390 z/VM command 'perf bench internals pmu-scan' dumps core,
due to a
- double free of the same memory chunck
- free of memory chunk located to read-only data segment
Thomas Richter (2):
perf/bench: perf bench internals pmu-scan dumps core
perf/bench: Double free of dynamic allocated memory
tools/perf/util/cpumap.c | 7 +------
tools/perf/util/pmu.c | 11 +++++++----
2 files changed, 8 insertions(+), 10 deletions(-)
--
2.48.1
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 1/2] perf/bench: Fix perf bench internals pmu-scan core dump
2025-03-18 9:51 [PATCH 0/2] perf bench: Fix core dumps on s390 z/VM Thomas Richter
@ 2025-03-18 9:51 ` Thomas Richter
2025-03-18 16:28 ` Ian Rogers
2025-03-18 9:51 ` [PATCH 2/2] perf/bench: Double free of dynamic allocated memory Thomas Richter
1 sibling, 1 reply; 5+ messages in thread
From: Thomas Richter @ 2025-03-18 9:51 UTC (permalink / raw)
To: linux-kernel, linux-s390, linux-perf-users, acme, namhyung,
irogers, acme
Cc: agordeev, gor, sumanthk, hca, Thomas Richter
On s390 z/VM systems the command 'perf bench internals pmu-scan'
dumps core, as can be seen:
# ./perf bench internals pmu-scan
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
double free or corruption (out)
Aborted (core dumped)
# gdb ./perf core.xxxx
....
#9 0x00000000012fb57a in perf_pmu__delete (pmu=0x160e370 <tool>)
at util/pmu.c:2318
#10 0x00000000012fbfca in perf_pmus__destroy () at util/pmus.c:103
#11 0x0000000001186f72 in save_result () at bench/pmu-scan.c:71
#12 0x00000000011873c2 in run_pmu_scan () at bench/pmu-scan.c:140
#13 0x00000000011876a8 in bench_pmu_scan (argc=0, argv=0x3fff3a77338)
at bench/pmu-scan.c:183
#14 0x0000000001174556 in run_bench (coll_name=0x14709ba "internals",
bench_name=0x1470700 "pmu-scan", fn=0x1187620 <bench_pmu_scan>,
argc=1, argv=0x3fff3a77338) at builtin-bench.c:229
#15 0x0000000001174a1e in cmd_bench (argc=2, argv=0x3fff3a77330)
at builtin-bench.c:330
...
The root cause is in PMU buildup. The PMUs are constructed via
run_bench()
+--> bench_pmu_scan()
+--> run_pmu_scan()
+--> save_result()
+--> perf_pmus__scan()
+--> pmu_read_sysfs()
+--> perf_pmus__tool_pmu()
perf_pmus__tool_pmu() returns a pointer to a static defined variable:
static struct perf_pmu tool = {
.name = "tool",
.type = PERF_PMU_TYPE_TOOL,
.aliases = LIST_HEAD_INIT(tool.aliases),
.caps = LIST_HEAD_INIT(tool.caps),
.format = LIST_HEAD_INIT(tool.format),
};
and that PMU is added to the list of other_cpus in file
./util/pmus.c, function pmu_read_sysfs().
Later on after the list of PMUs is constructed,
that list is removed again via:
save_result()
+--> perf_pmus__destroy()
+--> perf_pmu__delete()
This works fine until the PMU named "tool" is deleted.
Its name is a constant pointer possibly located in read-only data
section and can not be freed using zfree().
Remedy this and check for dynamic memory allocation for the PMU.
Background: s390 z/VM system do not support PMUs for sampling and
counting. In this case dummy events are created by the perf tool
and the PMUs "tool" and "fake" are created and freed.
Fixes: efe98a7a3977 ("perf pmu: Use zfree() to reduce chances of use after free")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
---
tools/perf/util/pmu.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 6206c8fe2bf9..59cec4d2909e 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -2315,10 +2315,13 @@ void perf_pmu__delete(struct perf_pmu *pmu)
perf_cpu_map__put(pmu->cpus);
- zfree(&pmu->name);
- zfree(&pmu->alias_name);
- zfree(&pmu->id);
- free(pmu);
+ /* Static variables can not be free'ed */
+ if (pmu->type != PERF_PMU_TYPE_TOOL && pmu->type != PERF_PMU_TYPE_FAKE) {
+ zfree(&pmu->alias_name);
+ zfree(&pmu->id);
+ zfree(&pmu->name);
+ free(pmu);
+ }
}
const char *perf_pmu__name_from_config(struct perf_pmu *pmu, u64 config)
--
2.48.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 2/2] perf/bench: Double free of dynamic allocated memory
2025-03-18 9:51 [PATCH 0/2] perf bench: Fix core dumps on s390 z/VM Thomas Richter
2025-03-18 9:51 ` [PATCH 1/2] perf/bench: Fix perf bench internals pmu-scan core dump Thomas Richter
@ 2025-03-18 9:51 ` Thomas Richter
2025-03-18 16:37 ` Ian Rogers
1 sibling, 1 reply; 5+ messages in thread
From: Thomas Richter @ 2025-03-18 9:51 UTC (permalink / raw)
To: linux-kernel, linux-s390, linux-perf-users, acme, namhyung,
irogers, acme
Cc: agordeev, gor, sumanthk, hca, Thomas Richter
On s390 z/VM the command 'perf bench internals pmu-scan'
dumps core, as can be seen:
Output before:
# ./perf bench internals pmu-scan
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
perf: /root/linux/tools/include/linux/refcount.h:131:
refcount_sub_and_test: Assertion `!(new > val)' failed.
Aborted (core dumped)
#
The root cause is in
perf_pmus__scan()
+--> perf_pmu__create_placeholder_core_pmu()
+--> cpu_map__online()
cpu_map__online() has a static variable
static struct perf_cpu_map *online;
if (!online)
online = perf_cpu_map__new_online_cpus();
return online;
which is allocated once when entered for the first time.
However perf_pmu__create_placeholder_core_pmu() is actually called
two times.
First time:
run_pmu_scan()
+--> save_result()
+---> perf_pmus__scan_core()
+--> pmu_read_sysfs()
+--> perf_pmu__create_placeholder_core_pmu()
...
+--> perf_pmus__destroy()
Second time:
run_pmu_scan()
+--> perf_pmus__scan()
+--> pmu_read_sysfs()
+--> perf_pmu__create_placeholder_core_pmu()
...
+--> perf_pmus__destroy()
The second time the already allocated memory pointed to by variable
'online' is returned. However in between the first and second call
of perf_pmu__create_placeholder_core_pmu()
function save_result() also frees all PMUs:
save_result()
+--> perf_pmus__destroy()
+--> perf_pmu__delete()
+--> perf_cpu_map__put()
+--> cpu_map__delete()
cpu_map__delete() deletes the perf_cpu_map pointed to by variable
online, but this static variable is not set to NULL. In the second
invocation of perf_pmu__create_placeholder_core_pmu() the same
memory locattion stored in variable online is returned.
Later on run_pmu_scan() calls perf_pmus__destroy() again and then
cpu_map__delete() frees the PMU "cpu->cpus" a second time causing
the core dump.
Avoid core dump and always allocate the online CPUs.
Output after:
# ./perf bench internals pmu-scan
# Running 'internals/pmu-scan' benchmark:
Computing performance of sysfs PMU event scan for 100 times
Average core PMU scanning took: 7.970 usec (+- 0.147 usec)
Average PMU scanning took: 60.415 usec (+- 3.986 usec)
#
Background: s390 z/VM system do not support PMUs for sampling and
counting. In this case dummy events are created by the perf tool
and the PMUs "tool" and "fake" are created and freed.
Fixes: a0c41caebab2f ("perf pmu: Add CPU map for "cpu" PMUs")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
---
tools/perf/util/cpumap.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 5c329ad614e9..ab9e7a266af9 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -691,12 +691,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
struct perf_cpu_map *cpu_map__online(void) /* thread unsafe */
{
- static struct perf_cpu_map *online;
-
- if (!online)
- online = perf_cpu_map__new_online_cpus(); /* from /sys/devices/system/cpu/online */
-
- return online;
+ return perf_cpu_map__new_online_cpus(); /* from /sys/devices/system/cpu/online */
}
bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b)
--
2.48.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] perf/bench: Fix perf bench internals pmu-scan core dump
2025-03-18 9:51 ` [PATCH 1/2] perf/bench: Fix perf bench internals pmu-scan core dump Thomas Richter
@ 2025-03-18 16:28 ` Ian Rogers
0 siblings, 0 replies; 5+ messages in thread
From: Ian Rogers @ 2025-03-18 16:28 UTC (permalink / raw)
To: Thomas Richter
Cc: linux-kernel, linux-s390, linux-perf-users, acme, namhyung, acme,
agordeev, gor, sumanthk, hca, James Clark
On Tue, Mar 18, 2025 at 2:52 AM Thomas Richter <tmricht@linux.ibm.com> wrote:
>
> On s390 z/VM systems the command 'perf bench internals pmu-scan'
> dumps core, as can be seen:
>
> # ./perf bench internals pmu-scan
> # Running 'internals/pmu-scan' benchmark:
> Computing performance of sysfs PMU event scan for 100 times
> double free or corruption (out)
> Aborted (core dumped)
> # gdb ./perf core.xxxx
> ....
> #9 0x00000000012fb57a in perf_pmu__delete (pmu=0x160e370 <tool>)
> at util/pmu.c:2318
> #10 0x00000000012fbfca in perf_pmus__destroy () at util/pmus.c:103
> #11 0x0000000001186f72 in save_result () at bench/pmu-scan.c:71
> #12 0x00000000011873c2 in run_pmu_scan () at bench/pmu-scan.c:140
> #13 0x00000000011876a8 in bench_pmu_scan (argc=0, argv=0x3fff3a77338)
> at bench/pmu-scan.c:183
> #14 0x0000000001174556 in run_bench (coll_name=0x14709ba "internals",
> bench_name=0x1470700 "pmu-scan", fn=0x1187620 <bench_pmu_scan>,
> argc=1, argv=0x3fff3a77338) at builtin-bench.c:229
> #15 0x0000000001174a1e in cmd_bench (argc=2, argv=0x3fff3a77330)
> at builtin-bench.c:330
> ...
>
> The root cause is in PMU buildup. The PMUs are constructed via
>
> run_bench()
> +--> bench_pmu_scan()
> +--> run_pmu_scan()
> +--> save_result()
> +--> perf_pmus__scan()
> +--> pmu_read_sysfs()
> +--> perf_pmus__tool_pmu()
>
> perf_pmus__tool_pmu() returns a pointer to a static defined variable:
>
> static struct perf_pmu tool = {
> .name = "tool",
> .type = PERF_PMU_TYPE_TOOL,
> .aliases = LIST_HEAD_INIT(tool.aliases),
> .caps = LIST_HEAD_INIT(tool.caps),
> .format = LIST_HEAD_INIT(tool.format),
> };
>
> and that PMU is added to the list of other_cpus in file
> ./util/pmus.c, function pmu_read_sysfs().
>
> Later on after the list of PMUs is constructed,
> that list is removed again via:
>
> save_result()
> +--> perf_pmus__destroy()
> +--> perf_pmu__delete()
>
> This works fine until the PMU named "tool" is deleted.
> Its name is a constant pointer possibly located in read-only data
> section and can not be freed using zfree().
>
> Remedy this and check for dynamic memory allocation for the PMU.
>
> Background: s390 z/VM system do not support PMUs for sampling and
> counting. In this case dummy events are created by the perf tool
> and the PMUs "tool" and "fake" are created and freed.
>
> Fixes: efe98a7a3977 ("perf pmu: Use zfree() to reduce chances of use after free")
> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Ian Rogers <irogers@google.com>
Hi Thomas,
Was this already addressed by James' patch:
https://lore.kernel.org/linux-perf-users/20250226104111.564443-1-james.clark@linaro.org/
Thanks,
Ian
> ---
> tools/perf/util/pmu.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> index 6206c8fe2bf9..59cec4d2909e 100644
> --- a/tools/perf/util/pmu.c
> +++ b/tools/perf/util/pmu.c
> @@ -2315,10 +2315,13 @@ void perf_pmu__delete(struct perf_pmu *pmu)
>
> perf_cpu_map__put(pmu->cpus);
>
> - zfree(&pmu->name);
> - zfree(&pmu->alias_name);
> - zfree(&pmu->id);
> - free(pmu);
> + /* Static variables can not be free'ed */
> + if (pmu->type != PERF_PMU_TYPE_TOOL && pmu->type != PERF_PMU_TYPE_FAKE) {
> + zfree(&pmu->alias_name);
> + zfree(&pmu->id);
> + zfree(&pmu->name);
> + free(pmu);
> + }
> }
>
> const char *perf_pmu__name_from_config(struct perf_pmu *pmu, u64 config)
> --
> 2.48.1
>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] perf/bench: Double free of dynamic allocated memory
2025-03-18 9:51 ` [PATCH 2/2] perf/bench: Double free of dynamic allocated memory Thomas Richter
@ 2025-03-18 16:37 ` Ian Rogers
0 siblings, 0 replies; 5+ messages in thread
From: Ian Rogers @ 2025-03-18 16:37 UTC (permalink / raw)
To: Thomas Richter
Cc: linux-kernel, linux-s390, linux-perf-users, acme, namhyung, acme,
agordeev, gor, sumanthk, hca
On Tue, Mar 18, 2025 at 2:52 AM Thomas Richter <tmricht@linux.ibm.com> wrote:
>
> On s390 z/VM the command 'perf bench internals pmu-scan'
> dumps core, as can be seen:
>
> Output before:
> # ./perf bench internals pmu-scan
> # Running 'internals/pmu-scan' benchmark:
> Computing performance of sysfs PMU event scan for 100 times
> perf: /root/linux/tools/include/linux/refcount.h:131:
> refcount_sub_and_test: Assertion `!(new > val)' failed.
> Aborted (core dumped)
> #
>
> The root cause is in
>
> perf_pmus__scan()
> +--> perf_pmu__create_placeholder_core_pmu()
> +--> cpu_map__online()
>
> cpu_map__online() has a static variable
>
> static struct perf_cpu_map *online;
>
> if (!online)
> online = perf_cpu_map__new_online_cpus();
>
> return online;
>
> which is allocated once when entered for the first time.
>
> However perf_pmu__create_placeholder_core_pmu() is actually called
> two times.
> First time:
> run_pmu_scan()
> +--> save_result()
> +---> perf_pmus__scan_core()
> +--> pmu_read_sysfs()
> +--> perf_pmu__create_placeholder_core_pmu()
> ...
> +--> perf_pmus__destroy()
>
> Second time:
> run_pmu_scan()
> +--> perf_pmus__scan()
> +--> pmu_read_sysfs()
> +--> perf_pmu__create_placeholder_core_pmu()
> ...
> +--> perf_pmus__destroy()
>
> The second time the already allocated memory pointed to by variable
> 'online' is returned. However in between the first and second call
> of perf_pmu__create_placeholder_core_pmu()
> function save_result() also frees all PMUs:
>
> save_result()
> +--> perf_pmus__destroy()
> +--> perf_pmu__delete()
> +--> perf_cpu_map__put()
> +--> cpu_map__delete()
>
> cpu_map__delete() deletes the perf_cpu_map pointed to by variable
> online, but this static variable is not set to NULL. In the second
> invocation of perf_pmu__create_placeholder_core_pmu() the same
> memory locattion stored in variable online is returned.
>
> Later on run_pmu_scan() calls perf_pmus__destroy() again and then
> cpu_map__delete() frees the PMU "cpu->cpus" a second time causing
> the core dump.
>
> Avoid core dump and always allocate the online CPUs.
>
> Output after:
> # ./perf bench internals pmu-scan
> # Running 'internals/pmu-scan' benchmark:
> Computing performance of sysfs PMU event scan for 100 times
> Average core PMU scanning took: 7.970 usec (+- 0.147 usec)
> Average PMU scanning took: 60.415 usec (+- 3.986 usec)
> #
>
> Background: s390 z/VM system do not support PMUs for sampling and
> counting. In this case dummy events are created by the perf tool
> and the PMUs "tool" and "fake" are created and freed.
>
> Fixes: a0c41caebab2f ("perf pmu: Add CPU map for "cpu" PMUs")
> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
> Cc: Ian Rogers <irogers@google.com>
Thanks Thomas, the proposed fix breaks address sanitizer as there are
missing puts on the freshly allocated cpu maps. I think a better fix
would be to "get" the cpu map to increment the reference count rather
than to keep re-reading the online CPUs. In either case we need the
puts on the user side.
Ian
> ---
> tools/perf/util/cpumap.c | 7 +------
> 1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
> index 5c329ad614e9..ab9e7a266af9 100644
> --- a/tools/perf/util/cpumap.c
> +++ b/tools/perf/util/cpumap.c
> @@ -691,12 +691,7 @@ size_t cpu_map__snprint_mask(struct perf_cpu_map *map, char *buf, size_t size)
>
> struct perf_cpu_map *cpu_map__online(void) /* thread unsafe */
> {
> - static struct perf_cpu_map *online;
> -
> - if (!online)
> - online = perf_cpu_map__new_online_cpus(); /* from /sys/devices/system/cpu/online */
> -
> - return online;
> + return perf_cpu_map__new_online_cpus(); /* from /sys/devices/system/cpu/online */
> }
>
> bool aggr_cpu_id__equal(const struct aggr_cpu_id *a, const struct aggr_cpu_id *b)
> --
> 2.48.1
>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-03-18 16:37 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-18 9:51 [PATCH 0/2] perf bench: Fix core dumps on s390 z/VM Thomas Richter
2025-03-18 9:51 ` [PATCH 1/2] perf/bench: Fix perf bench internals pmu-scan core dump Thomas Richter
2025-03-18 16:28 ` Ian Rogers
2025-03-18 9:51 ` [PATCH 2/2] perf/bench: Double free of dynamic allocated memory Thomas Richter
2025-03-18 16:37 ` Ian Rogers
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).