* Re: [PATCH v2] perf sched stats: Fix segmentation faults in diff mode
2026-04-29 17:39 ` [PATCH v2] " Ian Rogers
@ 2026-04-29 20:18 ` sashiko-bot
2026-04-30 13:29 ` James Clark
` (2 subsequent siblings)
3 siblings, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-04-29 20:18 UTC (permalink / raw)
To: Ian Rogers; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 5 potential issues to consider:
- [High] Stale `ds2` pointer leads to invalid domain diff comparisons.
- [High] Missing NULL check for `summary_head->cpu_data` allocation leads to a segmentation fault.
- [High] Memory leaks on error paths in `get_all_cpu_stats()`.
- [High] Failure of `get_all_cpu_stats()` is silently ignored, leading to data corruption.
- [Medium] Commit message falsely claims fixing an uninitialized variable.
--
commit 9defc7ef4fdb95729f7675ddb4788cb55728c905
Author: Ian Rogers <irogers@google.com>
Subject: perf sched stats: Fix segmentation faults in diff mode
The patch addresses multiple segmentation fault vectors and out-of-bounds
reads in perf sched stats diff by adding bounds checks, NULL checks, list
termination checks, and zero-initialization.
> 1. In get_all_cpu_stats(), added assert(!list_empty(head)) to prevent
> unsafe list_first_entry() calls on empty lists, fixed uninitialized
> variable ret, and added list_is_last check when iterating domains to
This isn't a bug, but the commit message claims to fix an uninitialized
variable ret. In the original code, ret was already explicitly initialized
to zero (int ret = 0;), so that part of the message is inaccurate.
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 555247568e7a..e67241415255 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
[ ... ]
> @@ -4170,40 +4170,40 @@ static void summarize_schedstat_domain(struct schedstat_domain *summary_domain,
> */
> static int get_all_cpu_stats(struct list_head *head)
> {
[ ... ]
> - if (cptr) {
> - summary_head = zalloc(sizeof(*summary_head));
> - if (!summary_head)
> - return -ENOMEM;
> + summary_head = zalloc(sizeof(*summary_head));
> + if (!summary_head)
> + return -ENOMEM;
>
> - summary_head->cpu_data = zalloc(sizeof(*cs));
> - memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*cs));
> + summary_head->cpu_data = zalloc(sizeof(*summary_head->cpu_data));
> + memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*summary_head->cpu_data));
If zalloc() for summary_head->cpu_data fails and returns NULL, could this
memcpy() dereference the NULL pointer and cause a crash?
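The guarded pattern the review is asking for can be sketched in plain C. This is a minimal userspace sketch, not the perf code: `zalloc` below is a local calloc wrapper standing in for perf's helper, and `struct cpu_data` is a hypothetical stand-in for `perf_record_schedstat_cpu`.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct cpu_data { int cpu; long stats[4]; };

/* calloc-style wrapper standing in for perf's zalloc() */
static void *zalloc(size_t size) { return calloc(1, size); }

/* Copy src into a freshly zeroed buffer, checking the allocation
 * before memcpy() can dereference a NULL destination. */
static int copy_cpu_data(struct cpu_data **dst, const struct cpu_data *src)
{
	*dst = zalloc(sizeof(**dst));
	if (!*dst)
		return -ENOMEM;	/* bail out before the memcpy() */
	memcpy(*dst, src, sizeof(**dst));
	return 0;
}
```

The point is only the ordering: the NULL check has to sit between the allocation and the copy, which is exactly what the quoted hunk is missing.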
>
> - INIT_LIST_HEAD(&summary_head->domain_head);
> + INIT_LIST_HEAD(&summary_head->domain_head);
>
> - list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
> - tdptr = zalloc(sizeof(*tdptr));
> - if (!tdptr)
> - return -ENOMEM;
> + list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
> + struct schedstat_domain *tdptr = zalloc(sizeof(*tdptr));
>
> - tdptr->domain_data = zalloc(sizeof(*ds));
> - if (!tdptr->domain_data)
> - return -ENOMEM;
> + if (!tdptr)
> + return -ENOMEM;
> +
> + tdptr->domain_data = zalloc(sizeof(*tdptr->domain_data));
> + if (!tdptr->domain_data)
> + return -ENOMEM;
If either of these zalloc() calls fails and the function returns -ENOMEM,
what happens to the memory previously allocated for summary_head and
summary_head->cpu_data? It seems they might leak since they haven't been
attached to the list yet.
Additionally, does the caller of get_all_cpu_stats() properly handle an
-ENOMEM return value? In show_schedstat_data(), if this error is ignored,
the summary node won't be prepended, which might cause the subsequent loop
to treat the first CPU's actual data as the summary.
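The leak described here comes from allocations that are not yet reachable from any list when the function bails out. One conventional shape for the fix is a goto-based cleanup ladder, sketched below with hypothetical names and plain malloc'd nodes rather than the perf structures:

```c
#include <errno.h>
#include <stdlib.h>

struct node { int *payload; struct node *next; };

/* Build a two-node chain; on any allocation failure, unwind
 * everything allocated so far instead of leaking it. */
static int build_chain(struct node **out)
{
	struct node *a = NULL, *b = NULL;

	a = calloc(1, sizeof(*a));
	if (!a)
		goto err;
	a->payload = calloc(1, sizeof(*a->payload));
	if (!a->payload)
		goto err;

	b = calloc(1, sizeof(*b));
	if (!b)
		goto err;
	b->payload = calloc(1, sizeof(*b->payload));
	if (!b->payload)
		goto err;

	a->next = b;
	*out = a;
	return 0;
err:
	if (b) { free(b->payload); free(b); }
	if (a) { free(a->payload); free(a); }
	return -ENOMEM;	/* caller must check this, or the chain is silently missing */
}
```

In the perf case the same idea applies: keep the partially built summary node on a private cleanup list (or free it explicitly) on every -ENOMEM path, and make the caller act on the return value instead of discarding it.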
[ ... ]
> @@ -4302,10 +4339,30 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
> struct domain_info *dinfo1 = NULL, *dinfo2 = NULL;
>
> ds1 = dptr1->domain_data;
[ ... ]
> if (dptr2) {
> ds2 = dptr2->domain_data;
[ ... ]
> @@ -4334,14 +4391,22 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
> print_domain_stats(ds1, ds2, jiffies1, jiffies2);
> print_separator2(SEP_LEN, "", 0);
>
> - if (dptr2)
> - dptr2 = list_next_entry(dptr2, domain_list);
> + if (dptr2) {
> + if (list_is_last(&dptr2->domain_list, &cptr2->domain_head))
> + dptr2 = NULL;
> + else
> + dptr2 = list_next_entry(dptr2, domain_list);
> + }
When dptr2 is explicitly set to NULL at the end of the second file's domain
list, the next iteration of the loop will skip the ds2 = dptr2->domain_data
assignment.
Does this leave ds2 holding the pointer from the previous iteration, causing
print_domain_stats() to perform a diff comparison against a stale, unrelated
domain?
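The stale-pointer hazard is easiest to see in miniature. In this userspace sketch (hypothetical `struct domain`, not the perf types), the per-iteration data pointer is re-derived from the cursor on every pass, so it goes NULL together with the cursor instead of holding the previous element the way `ds2` does in the quoted hunk:

```c
#include <stddef.h>

struct domain { int id; struct domain *next; };

/* Walk l1 fully while l2 may be shorter.  d2 is recomputed from the
 * l2 cursor each pass; once l2 is exhausted, d2 is NULL for every
 * remaining iteration rather than a stale pointer from last time. */
static int diff_walk(struct domain *l1, struct domain *l2)
{
	int compared = 0;

	for (struct domain *d1 = l1; d1; d1 = d1->next) {
		struct domain *d2 = l2;	/* NULL once l2 runs out */

		if (d2) {
			compared++;	/* a real diff of d1 vs d2 goes here */
			l2 = l2->next;
		}
	}
	return compared;
}
```

Mapped back to the patch, the analogous fix is to clear `ds2` (or reassign it from `dptr2`) at the top of each iteration rather than letting it survive from the previous one.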
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260429173931.2700115-1-irogers@google.com?part=1
^ permalink raw reply [flat|nested] 14+ messages in thread

* Re: [PATCH v2] perf sched stats: Fix segmentation faults in diff mode
2026-04-29 17:39 ` [PATCH v2] " Ian Rogers
2026-04-29 20:18 ` sashiko-bot
@ 2026-04-30 13:29 ` James Clark
2026-05-01 15:16 ` Athira Rajeev
2026-05-06 0:00 ` [PATCH v3] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers " Ian Rogers
3 siblings, 0 replies; 14+ messages in thread
From: James Clark @ 2026-04-30 13:29 UTC (permalink / raw)
To: Ian Rogers
Cc: adrian.hunter, linux-kernel, linux-perf-users, mingo, peterz,
venkat88, acme, atrajeev, namhyung
On 29/04/2026 6:39 pm, Ian Rogers wrote:
> Address several segmentation fault vectors in `perf sched stats diff`:
>
> 1. In `get_all_cpu_stats()`, added `assert(!list_empty(head))` to prevent
> unsafe `list_first_entry()` calls on empty lists, fixed uninitialized
> variable `ret`, and added `list_is_last` check when iterating domains to
> prevent out-of-bounds reads during parallel list traversal.
>
> 2. In `show_schedstat_data()`, added NULL checks for `cd_map1` and `cd_map2`
> to gracefully handle invalid or empty data files.
>
> 3. Added parallel iteration termination checks using `list_is_last()` in
> `show_schedstat_data()` for both domain and CPU lists to safely terminate
> at the end of each list when files contain a different number of CPUs
> or domains.
>
> 4. Added CPU bounds checks (`cs1->cpu >= nr1` and `cs2->cpu >= nr2`) in
> `show_schedstat_data()` to prevent out-of-bounds reads from `cd_map1` and
> `cd_map2` when comparing files from machines with different CPU counts.
>
> 5. Added NULL checks for `cd_info1` and `cd_info2` to prevent crashes when
> a CPU has data samples but no corresponding domain info in the header.
>
> 6. Added domain bounds checks (`ds1->domain >= cd_info1->nr_domains` and
> `ds2->domain >= cd_info2->nr_domains`) to prevent out-of-bounds array
> accesses in the domains array.
>
> 7. Added NULL checks for `dinfo1` and `dinfo2` in `show_schedstat_data()`
> to prevent crashes when a domain has no corresponding domain info.
>
> 8. Zero-initialized the `perf_data` array in `perf_sched__schedstat_diff()`
> to prevent stack garbage from causing `perf_data_file__fd()` to attempt
> to use a NULL `fptr` when `use_stdio` happened to be non-zero.
>
> Assisted-by: Gemini:gemini-3.1-pro-preview
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
> v2: Reduce indentation and variable scopes. Hopefully address feedback
> from James Clark.
>
> Previously this patch was part of a large perf script refactor:
> https://lore.kernel.org/lkml/20260425224951.174663-1-irogers@google.com/
> ---
> tools/perf/builtin-sched.c | 144 +++++++++++++++++++++++++++----------
> 1 file changed, 106 insertions(+), 38 deletions(-)
>
Reviewed-by: James Clark <james.clark@linaro.org>
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 555247568e7a..e67241415255 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -4170,40 +4170,40 @@ static void summarize_schedstat_domain(struct schedstat_domain *summary_domain,
> */
> static int get_all_cpu_stats(struct list_head *head)
> {
> - struct schedstat_cpu *cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
> - struct schedstat_cpu *summary_head = NULL;
> - struct perf_record_schedstat_domain *ds;
> - struct perf_record_schedstat_cpu *cs;
> - struct schedstat_domain *dptr, *tdptr;
> + struct schedstat_cpu *cptr, *summary_head;
> + struct schedstat_domain *dptr;
> bool is_last = false;
> int cnt = 1;
> - int ret = 0;
>
> - if (cptr) {
> - summary_head = zalloc(sizeof(*summary_head));
> - if (!summary_head)
> - return -ENOMEM;
> + assert(!list_empty(head));
> + cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
>
> - summary_head->cpu_data = zalloc(sizeof(*cs));
> - memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*cs));
> + summary_head = zalloc(sizeof(*summary_head));
> + if (!summary_head)
> + return -ENOMEM;
>
> - INIT_LIST_HEAD(&summary_head->domain_head);
> + summary_head->cpu_data = zalloc(sizeof(*summary_head->cpu_data));
> + memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*summary_head->cpu_data));
>
> - list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
> - tdptr = zalloc(sizeof(*tdptr));
> - if (!tdptr)
> - return -ENOMEM;
> + INIT_LIST_HEAD(&summary_head->domain_head);
>
> - tdptr->domain_data = zalloc(sizeof(*ds));
> - if (!tdptr->domain_data)
> - return -ENOMEM;
> + list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
> + struct schedstat_domain *tdptr = zalloc(sizeof(*tdptr));
>
> - memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*ds));
> - list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
> - }
> + if (!tdptr)
> + return -ENOMEM;
> +
> + tdptr->domain_data = zalloc(sizeof(*tdptr->domain_data));
> + if (!tdptr->domain_data)
> + return -ENOMEM;
> +
> + memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*tdptr->domain_data));
> + list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
> }
>
> list_for_each_entry(cptr, head, cpu_list) {
> + struct schedstat_domain *tdptr;
> +
> if (list_is_first(&cptr->cpu_list, head))
> continue;
>
> @@ -4212,32 +4212,47 @@ static int get_all_cpu_stats(struct list_head *head)
>
> cnt++;
> summarize_schedstat_cpu(summary_head, cptr, cnt, is_last);
> + if (list_empty(&summary_head->domain_head))
> + continue;
> +
> tdptr = list_first_entry(&summary_head->domain_head, struct schedstat_domain,
> domain_list);
>
> list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
> summarize_schedstat_domain(tdptr, dptr, cnt, is_last);
> + if (list_is_last(&tdptr->domain_list, &summary_head->domain_head)) {
> + tdptr = NULL;
> + break;
> + }
> tdptr = list_next_entry(tdptr, domain_list);
> }
> }
>
> list_add(&summary_head->cpu_list, head);
> - return ret;
> + return 0;
> }
>
> -static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1,
> - struct list_head *head2, struct cpu_domain_map **cd_map2,
> +static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1, int nr1,
> + struct list_head *head2, struct cpu_domain_map **cd_map2, int nr2,
> bool summary_only)
> {
> struct schedstat_cpu *cptr1 = list_first_entry(head1, struct schedstat_cpu, cpu_list);
> struct perf_record_schedstat_domain *ds1 = NULL, *ds2 = NULL;
> - struct perf_record_schedstat_cpu *cs1 = NULL, *cs2 = NULL;
> struct schedstat_domain *dptr1 = NULL, *dptr2 = NULL;
> struct schedstat_cpu *cptr2 = NULL;
> __u64 jiffies1 = 0, jiffies2 = 0;
> bool is_summary = true;
> int ret = 0;
>
> + if (!cd_map1) {
> + pr_err("Error: CPU domain map 1 is missing.\n");
> + return -1;
> + }
> + if (head2 && !cd_map2) {
> + pr_err("Error: CPU domain map 2 is missing.\n");
> + return -1;
> + }
> +
> printf("Description\n");
> print_separator2(SEP_LEN, "", 0);
> printf("%-30s-> %s\n", "DESC", "Description of the field");
> @@ -4267,14 +4282,36 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
>
> list_for_each_entry(cptr1, head1, cpu_list) {
> struct cpu_domain_map *cd_info1 = NULL, *cd_info2 = NULL;
> + struct perf_record_schedstat_cpu *cs1 = cptr1->cpu_data;
> + struct perf_record_schedstat_cpu *cs2 = NULL;
>
> - cs1 = cptr1->cpu_data;
> + dptr2 = NULL;
> + if (cs1->cpu >= (u32)nr1) {
> + pr_err("Error: CPU %d exceeds domain map size %d\n", cs1->cpu, nr1);
> + return -1;
> + }
> cd_info1 = cd_map1[cs1->cpu];
> + if (!cd_info1) {
> + pr_err("Error: CPU %d domain info is missing in map 1.\n",
> + cs1->cpu);
> + return -1;
> + }
> if (cptr2) {
> cs2 = cptr2->cpu_data;
> + if (cs2->cpu >= (u32)nr2) {
> + pr_err("Error: CPU %d exceeds domain map size %d\n", cs2->cpu, nr2);
> + return -1;
> + }
> cd_info2 = cd_map2[cs2->cpu];
> - dptr2 = list_first_entry(&cptr2->domain_head, struct schedstat_domain,
> - domain_list);
> + if (!cd_info2) {
> + pr_err("Error: CPU %d domain info is missing in map 2.\n",
> + cs2->cpu);
> + return -1;
> + }
> + if (!list_empty(&cptr2->domain_head))
> + dptr2 = list_first_entry(&cptr2->domain_head,
> + struct schedstat_domain,
> + domain_list);
> }
>
> if (cs2 && cs1->cpu != cs2->cpu) {
> @@ -4302,10 +4339,30 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
> struct domain_info *dinfo1 = NULL, *dinfo2 = NULL;
>
> ds1 = dptr1->domain_data;
> + if (ds1->domain >= cd_info1->nr_domains) {
> + pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 1.\n",
> + ds1->domain, cd_info1->nr_domains, cs1->cpu);
> + return -1;
> + }
> dinfo1 = cd_info1->domains[ds1->domain];
> + if (!dinfo1) {
> + pr_err("Error: Domain %d info is missing for CPU %d in map 1.\n",
> + ds1->domain, cs1->cpu);
> + return -1;
> + }
> if (dptr2) {
> ds2 = dptr2->domain_data;
> + if (ds2->domain >= cd_info2->nr_domains) {
> + pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 2.\n",
> + ds2->domain, cd_info2->nr_domains, cs2->cpu);
> + return -1;
> + }
> dinfo2 = cd_info2->domains[ds2->domain];
> + if (!dinfo2) {
> + pr_err("Error: Domain %d info is missing for CPU %d in map 2.\n",
> + ds2->domain, cs2->cpu);
> + return -1;
> + }
> }
>
> if (dinfo2 && dinfo1->domain != dinfo2->domain) {
> @@ -4334,14 +4391,22 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
> print_domain_stats(ds1, ds2, jiffies1, jiffies2);
> print_separator2(SEP_LEN, "", 0);
>
> - if (dptr2)
> - dptr2 = list_next_entry(dptr2, domain_list);
> + if (dptr2) {
> + if (list_is_last(&dptr2->domain_list, &cptr2->domain_head))
> + dptr2 = NULL;
> + else
> + dptr2 = list_next_entry(dptr2, domain_list);
> + }
> }
> if (summary_only)
> break;
>
> - if (cptr2)
> - cptr2 = list_next_entry(cptr2, cpu_list);
> + if (cptr2) {
> + if (list_is_last(&cptr2->cpu_list, head2))
> + cptr2 = NULL;
> + else
> + cptr2 = list_next_entry(cptr2, cpu_list);
> + }
>
> is_summary = false;
> }
> @@ -4523,7 +4588,9 @@ static int perf_sched__schedstat_report(struct perf_sched *sched)
> }
>
> cd_map = session->header.env.cpu_domain;
> - err = show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
> + err = show_schedstat_data(&cpu_head, cd_map,
> + session->header.env.nr_cpus_avail,
> + NULL, NULL, 0, false);
> }
>
> out:
> @@ -4538,7 +4605,7 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
> struct cpu_domain_map **cd_map0 = NULL, **cd_map1 = NULL;
> struct list_head cpu_head_ses0, cpu_head_ses1;
> struct perf_session *session[2];
> - struct perf_data data[2];
> + struct perf_data data[2] = {0};
> int ret = 0, err = 0;
> static const char *defaults[] = {
> "perf.data.old",
> @@ -4610,7 +4677,8 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
> goto out_delete_ses0;
> }
>
> - show_schedstat_data(&cpu_head_ses0, cd_map0, &cpu_head_ses1, cd_map1, true);
> + show_schedstat_data(&cpu_head_ses0, cd_map0, session[0]->header.env.nr_cpus_avail,
> + &cpu_head_ses1, cd_map1, session[1]->header.env.nr_cpus_avail, true);
>
> out_delete_ses1:
> free_schedstat(&cpu_head_ses1);
> @@ -4720,7 +4788,7 @@ static int perf_sched__schedstat_live(struct perf_sched *sched,
> goto out;
> }
>
> - show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
> + show_schedstat_data(&cpu_head, cd_map, nr, NULL, NULL, 0, false);
> free_cpu_domain_info(cd_map, sv, nr);
> out:
> free_schedstat(&cpu_head);
^ permalink raw reply [flat|nested] 14+ messages in thread

* Re: [PATCH v2] perf sched stats: Fix segmentation faults in diff mode
2026-04-29 17:39 ` [PATCH v2] " Ian Rogers
2026-04-29 20:18 ` sashiko-bot
2026-04-30 13:29 ` James Clark
@ 2026-05-01 15:16 ` Athira Rajeev
2026-05-06 0:00 ` [PATCH v3] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers " Ian Rogers
3 siblings, 0 replies; 14+ messages in thread
From: Athira Rajeev @ 2026-05-01 15:16 UTC (permalink / raw)
To: Ian Rogers
Cc: acme, james.clark, namhyung, adrian.hunter, linux-kernel,
linux-perf-users, mingo, peterz, venkat88
> On 29 Apr 2026, at 11:09 PM, Ian Rogers <irogers@google.com> wrote:
>
> Address several segmentation fault vectors in `perf sched stats diff`:
>
> 1. In `get_all_cpu_stats()`, added `assert(!list_empty(head))` to prevent
> unsafe `list_first_entry()` calls on empty lists, fixed uninitialized
> variable `ret`, and added `list_is_last` check when iterating domains to
> prevent out-of-bounds reads during parallel list traversal.
>
> 2. In `show_schedstat_data()`, added NULL checks for `cd_map1` and `cd_map2`
> to gracefully handle invalid or empty data files.
>
> 3. Added parallel iteration termination checks using `list_is_last()` in
> `show_schedstat_data()` for both domain and CPU lists to safely terminate
> at the end of each list when files contain a different number of CPUs
> or domains.
>
> 4. Added CPU bounds checks (`cs1->cpu >= nr1` and `cs2->cpu >= nr2`) in
> `show_schedstat_data()` to prevent out-of-bounds reads from `cd_map1` and
> `cd_map2` when comparing files from machines with different CPU counts.
>
> 5. Added NULL checks for `cd_info1` and `cd_info2` to prevent crashes when
> a CPU has data samples but no corresponding domain info in the header.
>
> 6. Added domain bounds checks (`ds1->domain >= cd_info1->nr_domains` and
> `ds2->domain >= cd_info2->nr_domains`) to prevent out-of-bounds array
> accesses in the domains array.
>
> 7. Added NULL checks for `dinfo1` and `dinfo2` in `show_schedstat_data()`
> to prevent crashes when a domain has no corresponding domain info.
>
> 8. Zero-initialized the `perf_data` array in `perf_sched__schedstat_diff()`
> to prevent stack garbage from causing `perf_data_file__fd()` to attempt
> to use a NULL `fptr` when `use_stdio` happened to be non-zero.
>
> Assisted-by: Gemini:gemini-3.1-pro-preview
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
> v2: Reduce indentation and variable scopes. Hopefully address feedback
> from James Clark.
>
> Previously this patch was part of a large perf script refactor:
> https://lore.kernel.org/lkml/20260425224951.174663-1-irogers@google.com/
Tested-by: Athira Rajeev <atrajeev@linux.ibm.com>
Thanks
Athira
> ---
> tools/perf/builtin-sched.c | 144 +++++++++++++++++++++++++++----------
> 1 file changed, 106 insertions(+), 38 deletions(-)
>
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 555247568e7a..e67241415255 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -4170,40 +4170,40 @@ static void summarize_schedstat_domain(struct schedstat_domain *summary_domain,
> */
> static int get_all_cpu_stats(struct list_head *head)
> {
> - struct schedstat_cpu *cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
> - struct schedstat_cpu *summary_head = NULL;
> - struct perf_record_schedstat_domain *ds;
> - struct perf_record_schedstat_cpu *cs;
> - struct schedstat_domain *dptr, *tdptr;
> + struct schedstat_cpu *cptr, *summary_head;
> + struct schedstat_domain *dptr;
> bool is_last = false;
> int cnt = 1;
> - int ret = 0;
>
> - if (cptr) {
> - summary_head = zalloc(sizeof(*summary_head));
> - if (!summary_head)
> - return -ENOMEM;
> + assert(!list_empty(head));
> + cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
>
> - summary_head->cpu_data = zalloc(sizeof(*cs));
> - memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*cs));
> + summary_head = zalloc(sizeof(*summary_head));
> + if (!summary_head)
> + return -ENOMEM;
>
> - INIT_LIST_HEAD(&summary_head->domain_head);
> + summary_head->cpu_data = zalloc(sizeof(*summary_head->cpu_data));
> + memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*summary_head->cpu_data));
>
> - list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
> - tdptr = zalloc(sizeof(*tdptr));
> - if (!tdptr)
> - return -ENOMEM;
> + INIT_LIST_HEAD(&summary_head->domain_head);
>
> - tdptr->domain_data = zalloc(sizeof(*ds));
> - if (!tdptr->domain_data)
> - return -ENOMEM;
> + list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
> + struct schedstat_domain *tdptr = zalloc(sizeof(*tdptr));
>
> - memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*ds));
> - list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
> - }
> + if (!tdptr)
> + return -ENOMEM;
> +
> + tdptr->domain_data = zalloc(sizeof(*tdptr->domain_data));
> + if (!tdptr->domain_data)
> + return -ENOMEM;
> +
> + memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*tdptr->domain_data));
> + list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
> }
>
> list_for_each_entry(cptr, head, cpu_list) {
> + struct schedstat_domain *tdptr;
> +
> if (list_is_first(&cptr->cpu_list, head))
> continue;
>
> @@ -4212,32 +4212,47 @@ static int get_all_cpu_stats(struct list_head *head)
>
> cnt++;
> summarize_schedstat_cpu(summary_head, cptr, cnt, is_last);
> + if (list_empty(&summary_head->domain_head))
> + continue;
> +
> tdptr = list_first_entry(&summary_head->domain_head, struct schedstat_domain,
> domain_list);
>
> list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
> summarize_schedstat_domain(tdptr, dptr, cnt, is_last);
> + if (list_is_last(&tdptr->domain_list, &summary_head->domain_head)) {
> + tdptr = NULL;
> + break;
> + }
> tdptr = list_next_entry(tdptr, domain_list);
> }
> }
>
> list_add(&summary_head->cpu_list, head);
> - return ret;
> + return 0;
> }
>
> -static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1,
> - struct list_head *head2, struct cpu_domain_map **cd_map2,
> +static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1, int nr1,
> + struct list_head *head2, struct cpu_domain_map **cd_map2, int nr2,
> bool summary_only)
> {
> struct schedstat_cpu *cptr1 = list_first_entry(head1, struct schedstat_cpu, cpu_list);
> struct perf_record_schedstat_domain *ds1 = NULL, *ds2 = NULL;
> - struct perf_record_schedstat_cpu *cs1 = NULL, *cs2 = NULL;
> struct schedstat_domain *dptr1 = NULL, *dptr2 = NULL;
> struct schedstat_cpu *cptr2 = NULL;
> __u64 jiffies1 = 0, jiffies2 = 0;
> bool is_summary = true;
> int ret = 0;
>
> + if (!cd_map1) {
> + pr_err("Error: CPU domain map 1 is missing.\n");
> + return -1;
> + }
> + if (head2 && !cd_map2) {
> + pr_err("Error: CPU domain map 2 is missing.\n");
> + return -1;
> + }
> +
> printf("Description\n");
> print_separator2(SEP_LEN, "", 0);
> printf("%-30s-> %s\n", "DESC", "Description of the field");
> @@ -4267,14 +4282,36 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
>
> list_for_each_entry(cptr1, head1, cpu_list) {
> struct cpu_domain_map *cd_info1 = NULL, *cd_info2 = NULL;
> + struct perf_record_schedstat_cpu *cs1 = cptr1->cpu_data;
> + struct perf_record_schedstat_cpu *cs2 = NULL;
>
> - cs1 = cptr1->cpu_data;
> + dptr2 = NULL;
> + if (cs1->cpu >= (u32)nr1) {
> + pr_err("Error: CPU %d exceeds domain map size %d\n", cs1->cpu, nr1);
> + return -1;
> + }
> cd_info1 = cd_map1[cs1->cpu];
> + if (!cd_info1) {
> + pr_err("Error: CPU %d domain info is missing in map 1.\n",
> + cs1->cpu);
> + return -1;
> + }
> if (cptr2) {
> cs2 = cptr2->cpu_data;
> + if (cs2->cpu >= (u32)nr2) {
> + pr_err("Error: CPU %d exceeds domain map size %d\n", cs2->cpu, nr2);
> + return -1;
> + }
> cd_info2 = cd_map2[cs2->cpu];
> - dptr2 = list_first_entry(&cptr2->domain_head, struct schedstat_domain,
> - domain_list);
> + if (!cd_info2) {
> + pr_err("Error: CPU %d domain info is missing in map 2.\n",
> + cs2->cpu);
> + return -1;
> + }
> + if (!list_empty(&cptr2->domain_head))
> + dptr2 = list_first_entry(&cptr2->domain_head,
> + struct schedstat_domain,
> + domain_list);
> }
>
> if (cs2 && cs1->cpu != cs2->cpu) {
> @@ -4302,10 +4339,30 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
> struct domain_info *dinfo1 = NULL, *dinfo2 = NULL;
>
> ds1 = dptr1->domain_data;
> + if (ds1->domain >= cd_info1->nr_domains) {
> + pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 1.\n",
> + ds1->domain, cd_info1->nr_domains, cs1->cpu);
> + return -1;
> + }
> dinfo1 = cd_info1->domains[ds1->domain];
> + if (!dinfo1) {
> + pr_err("Error: Domain %d info is missing for CPU %d in map 1.\n",
> + ds1->domain, cs1->cpu);
> + return -1;
> + }
> if (dptr2) {
> ds2 = dptr2->domain_data;
> + if (ds2->domain >= cd_info2->nr_domains) {
> + pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 2.\n",
> + ds2->domain, cd_info2->nr_domains, cs2->cpu);
> + return -1;
> + }
> dinfo2 = cd_info2->domains[ds2->domain];
> + if (!dinfo2) {
> + pr_err("Error: Domain %d info is missing for CPU %d in map 2.\n",
> + ds2->domain, cs2->cpu);
> + return -1;
> + }
> }
>
> if (dinfo2 && dinfo1->domain != dinfo2->domain) {
> @@ -4334,14 +4391,22 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
> print_domain_stats(ds1, ds2, jiffies1, jiffies2);
> print_separator2(SEP_LEN, "", 0);
>
> - if (dptr2)
> - dptr2 = list_next_entry(dptr2, domain_list);
> + if (dptr2) {
> + if (list_is_last(&dptr2->domain_list, &cptr2->domain_head))
> + dptr2 = NULL;
> + else
> + dptr2 = list_next_entry(dptr2, domain_list);
> + }
> }
> if (summary_only)
> break;
>
> - if (cptr2)
> - cptr2 = list_next_entry(cptr2, cpu_list);
> + if (cptr2) {
> + if (list_is_last(&cptr2->cpu_list, head2))
> + cptr2 = NULL;
> + else
> + cptr2 = list_next_entry(cptr2, cpu_list);
> + }
>
> is_summary = false;
> }
> @@ -4523,7 +4588,9 @@ static int perf_sched__schedstat_report(struct perf_sched *sched)
> }
>
> cd_map = session->header.env.cpu_domain;
> - err = show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
> + err = show_schedstat_data(&cpu_head, cd_map,
> + session->header.env.nr_cpus_avail,
> + NULL, NULL, 0, false);
> }
>
> out:
> @@ -4538,7 +4605,7 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
> struct cpu_domain_map **cd_map0 = NULL, **cd_map1 = NULL;
> struct list_head cpu_head_ses0, cpu_head_ses1;
> struct perf_session *session[2];
> - struct perf_data data[2];
> + struct perf_data data[2] = {0};
> int ret = 0, err = 0;
> static const char *defaults[] = {
> "perf.data.old",
> @@ -4610,7 +4677,8 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
> goto out_delete_ses0;
> }
>
> - show_schedstat_data(&cpu_head_ses0, cd_map0, &cpu_head_ses1, cd_map1, true);
> + show_schedstat_data(&cpu_head_ses0, cd_map0, session[0]->header.env.nr_cpus_avail,
> + &cpu_head_ses1, cd_map1, session[1]->header.env.nr_cpus_avail, true);
>
> out_delete_ses1:
> free_schedstat(&cpu_head_ses1);
> @@ -4720,7 +4788,7 @@ static int perf_sched__schedstat_live(struct perf_sched *sched,
> goto out;
> }
>
> - show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
> + show_schedstat_data(&cpu_head, cd_map, nr, NULL, NULL, 0, false);
> free_cpu_domain_info(cd_map, sv, nr);
> out:
> free_schedstat(&cpu_head);
> --
> 2.54.0.545.g6539524ca2-goog
>
^ permalink raw reply [flat|nested] 14+ messages in thread

* [PATCH v3] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers in diff mode
2026-04-29 17:39 ` [PATCH v2] " Ian Rogers
` (2 preceding siblings ...)
2026-05-01 15:16 ` Athira Rajeev
@ 2026-05-06 0:00 ` Ian Rogers
2026-05-06 0:28 ` sashiko-bot
2026-05-06 4:10 ` [PATCH v4] " Ian Rogers
3 siblings, 2 replies; 14+ messages in thread
From: Ian Rogers @ 2026-05-06 0:00 UTC (permalink / raw)
To: irogers, acme, atrajeev, james.clark, namhyung
Cc: adrian.hunter, linux-kernel, linux-perf-users, mingo, peterz,
venkat88
Address multiple segmentation fault vectors, out-of-bounds reads, and
memory leaks in perf sched stats by adding bounds checks, NULL checks,
proper error propagation, and robust memory cleanup.
1. In get_all_cpu_stats(), added assert(!list_empty(head)) to prevent
unsafe list_first_entry() calls on empty lists, added a missing NULL
check for summary_head->cpu_data allocation, and implemented a cleanup
ladder using a temporary list to prevent memory leaks on error paths.
2. In free_schedstat(), fixed memory leaks by ensuring internal domain_data
and cpu_data pointers are freed.
3. In show_schedstat_data(), fixed a stale pointer issue where ds2 retained
its value from a previous iteration when dptr2 became NULL, and added
proper propagation of errors from get_all_cpu_stats().
4. Propagated show_schedstat_data() errors up to perf_sched__schedstat_diff()
and perf_sched__schedstat_live() to prevent output corruption on failure.
5. In show_schedstat_data(), added NULL checks for cd_map1 and cd_map2
to gracefully handle invalid or empty data files.
6. Added parallel iteration termination checks using list_is_last() in
show_schedstat_data() for both domain and CPU lists to safely terminate
at the end of each list when files contain a different number of CPUs
or domains.
7. Added CPU bounds checks (cs1->cpu >= nr1 and cs2->cpu >= nr2) in
show_schedstat_data() to prevent out-of-bounds reads from cd_map1 and
cd_map2 when comparing files from machines with different CPU counts.
8. Added NULL checks for cd_info1 and cd_info2 to prevent crashes when
a CPU has data samples but no corresponding domain info in the header.
9. Added domain bounds checks (ds1->domain >= cd_info1->nr_domains and
ds2->domain >= cd_info2->nr_domains) to prevent out-of-bounds array
accesses in the domains array.
10. Added NULL checks for dinfo1 and dinfo2 in show_schedstat_data()
to prevent crashes when a domain has no corresponding domain info.
11. Zero-initialized the perf_data array in perf_sched__schedstat_diff()
to prevent stack garbage from causing perf_data_file__fd() to attempt
to use a NULL fptr when use_stdio happened to be non-zero.
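Item 11's hazard can be reproduced in miniature: an on-stack struct carries indeterminate bytes until written, so flag fields read before explicit setup may hold garbage. This is a hypothetical userspace sketch; `struct data_file` merely mimics the shape of perf's `struct perf_data`, it is not that type.

```c
struct data_file { int use_stdio; void *fptr; };

/* If use_stdio is stack garbage (non-zero) while fptr was never set,
 * a stdio-path consumer would dereference a NULL FILE pointer.
 * Zero-initializing at the declaration makes the flag reliably 0. */
static int is_stdio_mode(const struct data_file *f)
{
	return f->use_stdio != 0;
}
```

With `struct data_file f = {0};` the flag is guaranteed clear before any setup code runs, which is what the `data[2] = {0}` change provides in the patch.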
Assisted-by: Gemini:gemini-3.1-pro-preview
Signed-off-by: Ian Rogers <irogers@google.com>
---
v3:
Address sashiko feedback:
- Fix memory leaks in free_schedstat() by freeing internal data.
- Fix missing NULL check for summary_head->cpu_data in get_all_cpu_stats().
- Fix memory leaks in get_all_cpu_stats() on error paths using a temporary cleanup list.
- Fix stale ds2 pointer comparison in show_schedstat_data() by resetting it to NULL.
- Add error propagation from get_all_cpu_stats() and show_schedstat_data().
- Correct commit message to remove inaccurate claim about uninitialized variable 'ret'.
- Add forward declaration for free_schedstat() to fix compilation error.
v2: Had tested-by Athira Rajeev and reviewed-by James Clark, removed
from v3 given the number of changes.
https://lore.kernel.org/linux-perf-users/20260429173931.2700115-1-irogers@google.com/
---
tools/perf/builtin-sched.c | 165 +++++++++++++++++++++++++++++--------
1 file changed, 131 insertions(+), 34 deletions(-)
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 555247568e7a..e39a30056dc5 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -3947,6 +3947,8 @@ static struct schedstat_domain *domain_second_pass;
static bool after_workload_flag;
static bool verbose_field;
+static void free_schedstat(struct list_head *head);
+
static void store_schedstat_cpu_diff(struct schedstat_cpu *after_workload)
{
struct perf_record_schedstat_cpu *before = cpu_second_pass->cpu_data;
@@ -4170,37 +4172,50 @@ static void summarize_schedstat_domain(struct schedstat_domain *summary_domain,
*/
static int get_all_cpu_stats(struct list_head *head)
{
- struct schedstat_cpu *cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
- struct schedstat_cpu *summary_head = NULL;
- struct perf_record_schedstat_domain *ds;
- struct perf_record_schedstat_cpu *cs;
+ struct schedstat_cpu *cptr, *summary_head = NULL;
struct schedstat_domain *dptr, *tdptr;
bool is_last = false;
int cnt = 1;
int ret = 0;
+ struct list_head tmp_cleanup_list;
- if (cptr) {
- summary_head = zalloc(sizeof(*summary_head));
- if (!summary_head)
- return -ENOMEM;
+ assert(!list_empty(head));
+ cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
- summary_head->cpu_data = zalloc(sizeof(*cs));
- memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*cs));
+ INIT_LIST_HEAD(&tmp_cleanup_list);
- INIT_LIST_HEAD(&summary_head->domain_head);
+ summary_head = zalloc(sizeof(*summary_head));
+ if (!summary_head)
+ return -ENOMEM;
- list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
- tdptr = zalloc(sizeof(*tdptr));
- if (!tdptr)
- return -ENOMEM;
+ INIT_LIST_HEAD(&summary_head->domain_head);
+ INIT_LIST_HEAD(&summary_head->cpu_list);
+ list_add(&summary_head->cpu_list, &tmp_cleanup_list);
- tdptr->domain_data = zalloc(sizeof(*ds));
- if (!tdptr->domain_data)
- return -ENOMEM;
+ summary_head->cpu_data = zalloc(sizeof(*summary_head->cpu_data));
+ if (!summary_head->cpu_data) {
+ ret = -ENOMEM;
+ goto out_cleanup;
+ }
+ memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*summary_head->cpu_data));
- memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*ds));
- list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
+ list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
+ tdptr = zalloc(sizeof(*tdptr));
+ if (!tdptr) {
+ ret = -ENOMEM;
+ goto out_cleanup;
}
+ INIT_LIST_HEAD(&tdptr->domain_list);
+
+ tdptr->domain_data = zalloc(sizeof(*tdptr->domain_data));
+ if (!tdptr->domain_data) {
+ free(tdptr);
+ ret = -ENOMEM;
+ goto out_cleanup;
+ }
+
+ memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*tdptr->domain_data));
+ list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
}
list_for_each_entry(cptr, head, cpu_list) {
@@ -4212,32 +4227,52 @@ static int get_all_cpu_stats(struct list_head *head)
cnt++;
summarize_schedstat_cpu(summary_head, cptr, cnt, is_last);
+ if (list_empty(&summary_head->domain_head))
+ continue;
+
tdptr = list_first_entry(&summary_head->domain_head, struct schedstat_domain,
domain_list);
list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
summarize_schedstat_domain(tdptr, dptr, cnt, is_last);
+ if (list_is_last(&tdptr->domain_list, &summary_head->domain_head)) {
+ tdptr = NULL;
+ break;
+ }
tdptr = list_next_entry(tdptr, domain_list);
}
}
+ list_del_init(&summary_head->cpu_list);
list_add(&summary_head->cpu_list, head);
+ return 0;
+
+out_cleanup:
+ free_schedstat(&tmp_cleanup_list);
return ret;
}
-static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1,
- struct list_head *head2, struct cpu_domain_map **cd_map2,
+static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1, int nr1,
+ struct list_head *head2, struct cpu_domain_map **cd_map2, int nr2,
bool summary_only)
{
struct schedstat_cpu *cptr1 = list_first_entry(head1, struct schedstat_cpu, cpu_list);
struct perf_record_schedstat_domain *ds1 = NULL, *ds2 = NULL;
- struct perf_record_schedstat_cpu *cs1 = NULL, *cs2 = NULL;
struct schedstat_domain *dptr1 = NULL, *dptr2 = NULL;
struct schedstat_cpu *cptr2 = NULL;
__u64 jiffies1 = 0, jiffies2 = 0;
bool is_summary = true;
int ret = 0;
+ if (!cd_map1) {
+ pr_err("Error: CPU domain map 1 is missing.\n");
+ return -1;
+ }
+ if (head2 && !cd_map2) {
+ pr_err("Error: CPU domain map 2 is missing.\n");
+ return -1;
+ }
+
printf("Description\n");
print_separator2(SEP_LEN, "", 0);
printf("%-30s-> %s\n", "DESC", "Description of the field");
@@ -4260,21 +4295,47 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
printf("\n");
ret = get_all_cpu_stats(head1);
+ if (ret)
+ return ret;
if (cptr2) {
ret = get_all_cpu_stats(head2);
+ if (ret)
+ return ret;
cptr2 = list_first_entry(head2, struct schedstat_cpu, cpu_list);
}
list_for_each_entry(cptr1, head1, cpu_list) {
struct cpu_domain_map *cd_info1 = NULL, *cd_info2 = NULL;
+ struct perf_record_schedstat_cpu *cs1 = cptr1->cpu_data;
+ struct perf_record_schedstat_cpu *cs2 = NULL;
- cs1 = cptr1->cpu_data;
+ dptr2 = NULL;
+ if (cs1->cpu >= (u32)nr1) {
+ pr_err("Error: CPU %d exceeds domain map size %d\n", cs1->cpu, nr1);
+ return -1;
+ }
cd_info1 = cd_map1[cs1->cpu];
+ if (!cd_info1) {
+ pr_err("Error: CPU %d domain info is missing in map 1.\n",
+ cs1->cpu);
+ return -1;
+ }
if (cptr2) {
cs2 = cptr2->cpu_data;
+ if (cs2->cpu >= (u32)nr2) {
+ pr_err("Error: CPU %d exceeds domain map size %d\n", cs2->cpu, nr2);
+ return -1;
+ }
cd_info2 = cd_map2[cs2->cpu];
- dptr2 = list_first_entry(&cptr2->domain_head, struct schedstat_domain,
- domain_list);
+ if (!cd_info2) {
+ pr_err("Error: CPU %d domain info is missing in map 2.\n",
+ cs2->cpu);
+ return -1;
+ }
+ if (!list_empty(&cptr2->domain_head))
+ dptr2 = list_first_entry(&cptr2->domain_head,
+ struct schedstat_domain,
+ domain_list);
}
if (cs2 && cs1->cpu != cs2->cpu) {
@@ -4302,10 +4363,31 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
struct domain_info *dinfo1 = NULL, *dinfo2 = NULL;
ds1 = dptr1->domain_data;
+ ds2 = NULL;
+ if (ds1->domain >= cd_info1->nr_domains) {
+ pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 1.\n",
+ ds1->domain, cd_info1->nr_domains, cs1->cpu);
+ return -1;
+ }
dinfo1 = cd_info1->domains[ds1->domain];
+ if (!dinfo1) {
+ pr_err("Error: Domain %d info is missing for CPU %d in map 1.\n",
+ ds1->domain, cs1->cpu);
+ return -1;
+ }
if (dptr2) {
ds2 = dptr2->domain_data;
+ if (ds2->domain >= cd_info2->nr_domains) {
+ pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 2.\n",
+ ds2->domain, cd_info2->nr_domains, cs2->cpu);
+ return -1;
+ }
dinfo2 = cd_info2->domains[ds2->domain];
+ if (!dinfo2) {
+ pr_err("Error: Domain %d info is missing for CPU %d in map 2.\n",
+ ds2->domain, cs2->cpu);
+ return -1;
+ }
}
if (dinfo2 && dinfo1->domain != dinfo2->domain) {
@@ -4334,14 +4416,22 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
print_domain_stats(ds1, ds2, jiffies1, jiffies2);
print_separator2(SEP_LEN, "", 0);
- if (dptr2)
- dptr2 = list_next_entry(dptr2, domain_list);
+ if (dptr2) {
+ if (list_is_last(&dptr2->domain_list, &cptr2->domain_head))
+ dptr2 = NULL;
+ else
+ dptr2 = list_next_entry(dptr2, domain_list);
+ }
}
if (summary_only)
break;
- if (cptr2)
- cptr2 = list_next_entry(cptr2, cpu_list);
+ if (cptr2) {
+ if (list_is_last(&cptr2->cpu_list, head2))
+ cptr2 = NULL;
+ else
+ cptr2 = list_next_entry(cptr2, cpu_list);
+ }
is_summary = false;
}
@@ -4473,9 +4563,11 @@ static void free_schedstat(struct list_head *head)
list_for_each_entry_safe(cptr, n2, head, cpu_list) {
list_for_each_entry_safe(dptr, n1, &cptr->domain_head, domain_list) {
list_del_init(&dptr->domain_list);
+ free(dptr->domain_data);
free(dptr);
}
list_del_init(&cptr->cpu_list);
+ free(cptr->cpu_data);
free(cptr);
}
}
@@ -4523,7 +4615,9 @@ static int perf_sched__schedstat_report(struct perf_sched *sched)
}
cd_map = session->header.env.cpu_domain;
- err = show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
+ err = show_schedstat_data(&cpu_head, cd_map,
+ session->header.env.nr_cpus_avail,
+ NULL, NULL, 0, false);
}
out:
@@ -4538,7 +4632,7 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
struct cpu_domain_map **cd_map0 = NULL, **cd_map1 = NULL;
struct list_head cpu_head_ses0, cpu_head_ses1;
struct perf_session *session[2];
- struct perf_data data[2];
+ struct perf_data data[2] = {0};
int ret = 0, err = 0;
static const char *defaults[] = {
"perf.data.old",
@@ -4610,7 +4704,10 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
goto out_delete_ses0;
}
- show_schedstat_data(&cpu_head_ses0, cd_map0, &cpu_head_ses1, cd_map1, true);
+ ret = show_schedstat_data(&cpu_head_ses0, cd_map0, session[0]->header.env.nr_cpus_avail,
+ &cpu_head_ses1, cd_map1, session[1]->header.env.nr_cpus_avail, true);
+ if (ret)
+ goto out_delete_ses1;
out_delete_ses1:
free_schedstat(&cpu_head_ses1);
@@ -4720,7 +4817,7 @@ static int perf_sched__schedstat_live(struct perf_sched *sched,
goto out;
}
- show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
+ err = show_schedstat_data(&cpu_head, cd_map, nr, NULL, NULL, 0, false);
free_cpu_domain_info(cd_map, sv, nr);
out:
free_schedstat(&cpu_head);
--
2.54.0.545.g6539524ca2-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread

* Re: [PATCH v3] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers in diff mode
2026-05-06 0:00 ` [PATCH v3] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers " Ian Rogers
@ 2026-05-06 0:28 ` sashiko-bot
2026-05-06 4:10 ` [PATCH v4] " Ian Rogers
1 sibling, 0 replies; 14+ messages in thread
From: sashiko-bot @ 2026-05-06 0:28 UTC (permalink / raw)
To: Ian Rogers; +Cc: linux-perf-users
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Medium] Resource and memory leak of the second perf session when the first session's data is empty.
- [Medium] Stranded elements in the global `cpu_head` list are leaked when `perf_session__process_events()` fails.
--
commit 2f5196e9db7b207b2b2f6bf8c143c68c7b431e3b
Author: Ian Rogers <irogers@google.com>
Subject: perf sched stats: Fix segmentation faults, memory leaks, and stale pointers in diff mode
This commit addresses multiple crash vectors and memory leaks in perf sched
stats diff mode by adding bounds checks, NULL checks, proper error
propagation, and robust memory cleanup.
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
[ ... ]
> @@ -4632,7 +4704,10 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
> err = perf_session__process_events(session[0]);
> if (err)
> goto out_delete_ses0;
Is this code leaking the events added to the global cpu_head list?
The schedstat callbacks dynamically allocate and add parsed elements to the
global cpu_head list during perf_session__process_events(). However, the
transfer to cpu_head_ses0 via list_replace_init() hasn't happened yet at
this point. Since out_delete_ses0 only frees cpu_head_ses0, it looks like
any items added to cpu_head before the error occurred would be stranded.
The same pattern appears to exist for session[1] and cpu_head_ses1.
[ ... ]
> if (list_empty(&cpu_head_ses1)) {
> pr_err("Data is not available\n");
> ret = -1;
> goto out_delete_ses1;
> }
>
> if (list_empty(&cpu_head_ses0)) {
> pr_err("Data is not available\n");
> ret = -1;
> goto out_delete_ses0;
> }
Does jumping to out_delete_ses0 here bypass the cleanup for the second
session?
Since session[1] has already been successfully opened, parsed, and its
events transferred into cpu_head_ses1, skipping the out_delete_ses1 block
would mean free_schedstat(&cpu_head_ses1) and
perf_session__delete(session[1]) are never called.
> - show_schedstat_data(&cpu_head_ses0, cd_map0, &cpu_head_ses1, cd_map1, true);
> + ret = show_schedstat_data(&cpu_head_ses0, cd_map0, session[0]->header.env.nr_cpus_avail,
> + &cpu_head_ses1, cd_map1, session[1]->header.env.nr_cpus_avail, true);
> + if (ret)
> + goto out_delete_ses1;
>
> out_delete_ses1:
> free_schedstat(&cpu_head_ses1);
> if (!IS_ERR(session[1]))
> perf_session__delete(session[1]);
>
> out_delete_ses0:
> free_schedstat(&cpu_head_ses0);
> if (!IS_ERR(session[0]))
> perf_session__delete(session[0]);
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260506000018.3113599-1-irogers@google.com?part=1
^ permalink raw reply [flat|nested] 14+ messages in thread

* [PATCH v4] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers in diff mode
2026-05-06 0:00 ` [PATCH v3] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers " Ian Rogers
2026-05-06 0:28 ` sashiko-bot
@ 2026-05-06 4:10 ` Ian Rogers
2026-05-06 6:45 ` Ian Rogers
2026-05-06 9:07 ` James Clark
1 sibling, 2 replies; 14+ messages in thread
From: Ian Rogers @ 2026-05-06 4:10 UTC (permalink / raw)
To: irogers, acme, atrajeev, james.clark, namhyung
Cc: adrian.hunter, linux-kernel, linux-perf-users, mingo, peterz,
venkat88
The patch addresses multiple segmentation fault vectors, out-of-bounds
reads, and memory leaks in perf sched stats by adding bounds checks,
NULL checks, proper error propagation, and robust memory cleanup.
1. In get_all_cpu_stats(), added assert(!list_empty(head)) to prevent
unsafe list_first_entry() calls on empty lists, added a missing NULL
check for summary_head->cpu_data allocation, and implemented a cleanup
ladder using a temporary list to prevent memory leaks on error paths.
2. In free_schedstat(), fixed memory leaks by ensuring internal domain_data
and cpu_data pointers are freed.
3. In show_schedstat_data(), fixed a stale pointer issue where ds2 retained
its value from a previous iteration when dptr2 became NULL, and added
proper propagation of errors from get_all_cpu_stats().
4. Propagated show_schedstat_data() errors up to perf_sched__schedstat_diff()
and perf_sched__schedstat_live() to prevent output corruption on failure.
5. In show_schedstat_data(), added NULL checks for cd_map1 and cd_map2
to gracefully handle invalid or empty data files.
6. Added parallel iteration termination checks using list_is_last() in
show_schedstat_data() for both domain and CPU lists to safely terminate
at the end of each list when files contain a different number of CPUs
or domains.
7. Added CPU bounds checks (cs1->cpu >= nr1 and cs2->cpu >= nr2) in
show_schedstat_data() to prevent out-of-bounds reads from cd_map1 and
cd_map2 when comparing files from machines with different CPU counts.
8. Added NULL checks for cd_info1 and cd_info2 to prevent crashes when
a CPU has data samples but no corresponding domain info in the header.
9. Added domain bounds checks (ds1->domain >= cd_info1->nr_domains and
ds2->domain >= cd_info2->nr_domains) to prevent out-of-bounds array
accesses in the domains array.
10. Added NULL checks for dinfo1 and dinfo2 in show_schedstat_data()
to prevent crashes when a domain has no corresponding domain info.
11. Zero-initialized the perf_data array in perf_sched__schedstat_diff()
to prevent stack garbage from causing perf_data_file__fd() to attempt
to use a NULL fptr when use_stdio happened to be non-zero.
Assisted-by: Gemini:gemini-3.1-pro-preview
Signed-off-by: Ian Rogers <irogers@google.com>
---
v4:
Address further sashiko feedback:
- Fix memory leaks of global cpu_head list when perf_session__process_events() fails.
- Fix resource leak of second session by correcting cleanup order when first session's data is empty.
v3:
Address sashiko feedback:
- Fix memory leaks in free_schedstat() by freeing internal data.
- Fix missing NULL check for summary_head->cpu_data in get_all_cpu_stats().
- Fix memory leaks in get_all_cpu_stats() on error paths using a temporary cleanup list.
- Fix stale ds2 pointer comparison in show_schedstat_data() by resetting it to NULL.
- Add error propagation from get_all_cpu_stats() and show_schedstat_data().
- Correct commit message to remove inaccurate claim about uninitialized variable 'ret'.
- Add forward declaration for free_schedstat() to fix compilation error.
v2: Had tested-by Athira Rajeev and reviewed-by James Clark, removed
from v3 given the number of changes.
https://lore.kernel.org/linux-perf-users/20260429173931.2700115-1-irogers@google.com/
---
tools/perf/builtin-sched.c | 175 +++++++++++++++++++++++++++++--------
1 file changed, 138 insertions(+), 37 deletions(-)
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 555247568e7a..fd679b106582 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -3947,6 +3947,8 @@ static struct schedstat_domain *domain_second_pass;
static bool after_workload_flag;
static bool verbose_field;
+static void free_schedstat(struct list_head *head);
+
static void store_schedstat_cpu_diff(struct schedstat_cpu *after_workload)
{
struct perf_record_schedstat_cpu *before = cpu_second_pass->cpu_data;
@@ -4170,37 +4172,50 @@ static void summarize_schedstat_domain(struct schedstat_domain *summary_domain,
*/
static int get_all_cpu_stats(struct list_head *head)
{
- struct schedstat_cpu *cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
- struct schedstat_cpu *summary_head = NULL;
- struct perf_record_schedstat_domain *ds;
- struct perf_record_schedstat_cpu *cs;
+ struct schedstat_cpu *cptr, *summary_head = NULL;
struct schedstat_domain *dptr, *tdptr;
bool is_last = false;
int cnt = 1;
int ret = 0;
+ struct list_head tmp_cleanup_list;
- if (cptr) {
- summary_head = zalloc(sizeof(*summary_head));
- if (!summary_head)
- return -ENOMEM;
+ assert(!list_empty(head));
+ cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
- summary_head->cpu_data = zalloc(sizeof(*cs));
- memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*cs));
+ INIT_LIST_HEAD(&tmp_cleanup_list);
- INIT_LIST_HEAD(&summary_head->domain_head);
+ summary_head = zalloc(sizeof(*summary_head));
+ if (!summary_head)
+ return -ENOMEM;
- list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
- tdptr = zalloc(sizeof(*tdptr));
- if (!tdptr)
- return -ENOMEM;
+ INIT_LIST_HEAD(&summary_head->domain_head);
+ INIT_LIST_HEAD(&summary_head->cpu_list);
+ list_add(&summary_head->cpu_list, &tmp_cleanup_list);
- tdptr->domain_data = zalloc(sizeof(*ds));
- if (!tdptr->domain_data)
- return -ENOMEM;
+ summary_head->cpu_data = zalloc(sizeof(*summary_head->cpu_data));
+ if (!summary_head->cpu_data) {
+ ret = -ENOMEM;
+ goto out_cleanup;
+ }
+ memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*summary_head->cpu_data));
+
+ list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
+ tdptr = zalloc(sizeof(*tdptr));
+ if (!tdptr) {
+ ret = -ENOMEM;
+ goto out_cleanup;
+ }
+ INIT_LIST_HEAD(&tdptr->domain_list);
- memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*ds));
- list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
+ tdptr->domain_data = zalloc(sizeof(*tdptr->domain_data));
+ if (!tdptr->domain_data) {
+ free(tdptr);
+ ret = -ENOMEM;
+ goto out_cleanup;
}
+
+ memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*tdptr->domain_data));
+ list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
}
list_for_each_entry(cptr, head, cpu_list) {
@@ -4212,32 +4227,52 @@ static int get_all_cpu_stats(struct list_head *head)
cnt++;
summarize_schedstat_cpu(summary_head, cptr, cnt, is_last);
+ if (list_empty(&summary_head->domain_head))
+ continue;
+
tdptr = list_first_entry(&summary_head->domain_head, struct schedstat_domain,
domain_list);
list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
summarize_schedstat_domain(tdptr, dptr, cnt, is_last);
+ if (list_is_last(&tdptr->domain_list, &summary_head->domain_head)) {
+ tdptr = NULL;
+ break;
+ }
tdptr = list_next_entry(tdptr, domain_list);
}
}
+ list_del_init(&summary_head->cpu_list);
list_add(&summary_head->cpu_list, head);
+ return 0;
+
+out_cleanup:
+ free_schedstat(&tmp_cleanup_list);
return ret;
}
-static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1,
- struct list_head *head2, struct cpu_domain_map **cd_map2,
+static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1, int nr1,
+ struct list_head *head2, struct cpu_domain_map **cd_map2, int nr2,
bool summary_only)
{
struct schedstat_cpu *cptr1 = list_first_entry(head1, struct schedstat_cpu, cpu_list);
struct perf_record_schedstat_domain *ds1 = NULL, *ds2 = NULL;
- struct perf_record_schedstat_cpu *cs1 = NULL, *cs2 = NULL;
struct schedstat_domain *dptr1 = NULL, *dptr2 = NULL;
struct schedstat_cpu *cptr2 = NULL;
__u64 jiffies1 = 0, jiffies2 = 0;
bool is_summary = true;
int ret = 0;
+ if (!cd_map1) {
+ pr_err("Error: CPU domain map 1 is missing.\n");
+ return -1;
+ }
+ if (head2 && !cd_map2) {
+ pr_err("Error: CPU domain map 2 is missing.\n");
+ return -1;
+ }
+
printf("Description\n");
print_separator2(SEP_LEN, "", 0);
printf("%-30s-> %s\n", "DESC", "Description of the field");
@@ -4260,21 +4295,47 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
printf("\n");
ret = get_all_cpu_stats(head1);
+ if (ret)
+ return ret;
if (cptr2) {
ret = get_all_cpu_stats(head2);
+ if (ret)
+ return ret;
cptr2 = list_first_entry(head2, struct schedstat_cpu, cpu_list);
}
list_for_each_entry(cptr1, head1, cpu_list) {
struct cpu_domain_map *cd_info1 = NULL, *cd_info2 = NULL;
+ struct perf_record_schedstat_cpu *cs1 = cptr1->cpu_data;
+ struct perf_record_schedstat_cpu *cs2 = NULL;
- cs1 = cptr1->cpu_data;
+ dptr2 = NULL;
+ if (cs1->cpu >= (u32)nr1) {
+ pr_err("Error: CPU %d exceeds domain map size %d\n", cs1->cpu, nr1);
+ return -1;
+ }
cd_info1 = cd_map1[cs1->cpu];
+ if (!cd_info1) {
+ pr_err("Error: CPU %d domain info is missing in map 1.\n",
+ cs1->cpu);
+ return -1;
+ }
if (cptr2) {
cs2 = cptr2->cpu_data;
+ if (cs2->cpu >= (u32)nr2) {
+ pr_err("Error: CPU %d exceeds domain map size %d\n", cs2->cpu, nr2);
+ return -1;
+ }
cd_info2 = cd_map2[cs2->cpu];
- dptr2 = list_first_entry(&cptr2->domain_head, struct schedstat_domain,
- domain_list);
+ if (!cd_info2) {
+ pr_err("Error: CPU %d domain info is missing in map 2.\n",
+ cs2->cpu);
+ return -1;
+ }
+ if (!list_empty(&cptr2->domain_head))
+ dptr2 = list_first_entry(&cptr2->domain_head,
+ struct schedstat_domain,
+ domain_list);
}
if (cs2 && cs1->cpu != cs2->cpu) {
@@ -4302,10 +4363,31 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
struct domain_info *dinfo1 = NULL, *dinfo2 = NULL;
ds1 = dptr1->domain_data;
+ ds2 = NULL;
+ if (ds1->domain >= cd_info1->nr_domains) {
+ pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 1.\n",
+ ds1->domain, cd_info1->nr_domains, cs1->cpu);
+ return -1;
+ }
dinfo1 = cd_info1->domains[ds1->domain];
+ if (!dinfo1) {
+ pr_err("Error: Domain %d info is missing for CPU %d in map 1.\n",
+ ds1->domain, cs1->cpu);
+ return -1;
+ }
if (dptr2) {
ds2 = dptr2->domain_data;
+ if (ds2->domain >= cd_info2->nr_domains) {
+ pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 2.\n",
+ ds2->domain, cd_info2->nr_domains, cs2->cpu);
+ return -1;
+ }
dinfo2 = cd_info2->domains[ds2->domain];
+ if (!dinfo2) {
+ pr_err("Error: Domain %d info is missing for CPU %d in map 2.\n",
+ ds2->domain, cs2->cpu);
+ return -1;
+ }
}
if (dinfo2 && dinfo1->domain != dinfo2->domain) {
@@ -4334,14 +4416,22 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
print_domain_stats(ds1, ds2, jiffies1, jiffies2);
print_separator2(SEP_LEN, "", 0);
- if (dptr2)
- dptr2 = list_next_entry(dptr2, domain_list);
+ if (dptr2) {
+ if (list_is_last(&dptr2->domain_list, &cptr2->domain_head))
+ dptr2 = NULL;
+ else
+ dptr2 = list_next_entry(dptr2, domain_list);
+ }
}
if (summary_only)
break;
- if (cptr2)
- cptr2 = list_next_entry(cptr2, cpu_list);
+ if (cptr2) {
+ if (list_is_last(&cptr2->cpu_list, head2))
+ cptr2 = NULL;
+ else
+ cptr2 = list_next_entry(cptr2, cpu_list);
+ }
is_summary = false;
}
@@ -4473,9 +4563,11 @@ static void free_schedstat(struct list_head *head)
list_for_each_entry_safe(cptr, n2, head, cpu_list) {
list_for_each_entry_safe(dptr, n1, &cptr->domain_head, domain_list) {
list_del_init(&dptr->domain_list);
+ free(dptr->domain_data);
free(dptr);
}
list_del_init(&cptr->cpu_list);
+ free(cptr->cpu_data);
free(cptr);
}
}
@@ -4523,7 +4615,9 @@ static int perf_sched__schedstat_report(struct perf_sched *sched)
}
cd_map = session->header.env.cpu_domain;
- err = show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
+ err = show_schedstat_data(&cpu_head, cd_map,
+ session->header.env.nr_cpus_avail,
+ NULL, NULL, 0, false);
}
out:
@@ -4538,7 +4632,7 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
struct cpu_domain_map **cd_map0 = NULL, **cd_map1 = NULL;
struct list_head cpu_head_ses0, cpu_head_ses1;
struct perf_session *session[2];
- struct perf_data data[2];
+ struct perf_data data[2] = {0};
int ret = 0, err = 0;
static const char *defaults[] = {
"perf.data.old",
@@ -4573,8 +4667,10 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
}
err = perf_session__process_events(session[0]);
- if (err)
+ if (err) {
+ free_schedstat(&cpu_head);
goto out_delete_ses0;
+ }
cd_map0 = session[0]->header.env.cpu_domain;
list_replace_init(&cpu_head, &cpu_head_ses0);
@@ -4590,8 +4686,10 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
}
err = perf_session__process_events(session[1]);
- if (err)
+ if (err) {
+ free_schedstat(&cpu_head);
goto out_delete_ses1;
+ }
cd_map1 = session[1]->header.env.cpu_domain;
list_replace_init(&cpu_head, &cpu_head_ses1);
@@ -4607,10 +4705,13 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
if (list_empty(&cpu_head_ses0)) {
pr_err("Data is not available\n");
ret = -1;
- goto out_delete_ses0;
+ goto out_delete_ses1;
}
- show_schedstat_data(&cpu_head_ses0, cd_map0, &cpu_head_ses1, cd_map1, true);
+ ret = show_schedstat_data(&cpu_head_ses0, cd_map0, session[0]->header.env.nr_cpus_avail,
+ &cpu_head_ses1, cd_map1, session[1]->header.env.nr_cpus_avail, true);
+ if (ret)
+ goto out_delete_ses1;
out_delete_ses1:
free_schedstat(&cpu_head_ses1);
@@ -4720,7 +4821,7 @@ static int perf_sched__schedstat_live(struct perf_sched *sched,
goto out;
}
- show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
+ err = show_schedstat_data(&cpu_head, cd_map, nr, NULL, NULL, 0, false);
free_cpu_domain_info(cd_map, sv, nr);
out:
free_schedstat(&cpu_head);
--
2.54.0.545.g6539524ca2-goog
^ permalink raw reply related [flat|nested] 14+ messages in thread

* Re: [PATCH v4] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers in diff mode
2026-05-06 4:10 ` [PATCH v4] " Ian Rogers
@ 2026-05-06 6:45 ` Ian Rogers
2026-05-06 13:03 ` Arnaldo Carvalho de Melo
2026-05-06 9:07 ` James Clark
1 sibling, 1 reply; 14+ messages in thread
From: Ian Rogers @ 2026-05-06 6:45 UTC (permalink / raw)
To: irogers, acme, atrajeev, james.clark, namhyung
Cc: adrian.hunter, linux-kernel, linux-perf-users, mingo, peterz,
venkat88
On Tue, May 5, 2026 at 9:10 PM Ian Rogers <irogers@google.com> wrote:
>
> The patch addresses multiple segmentation fault vectors, out-of-bounds
> reads, and memory leaks in perf sched stats by adding bounds checks,
> NULL checks, proper error propagation, and robust memory cleanup.
>
> 1. In get_all_cpu_stats(), added assert(!list_empty(head)) to prevent
> unsafe list_first_entry() calls on empty lists, added a missing NULL
> check for summary_head->cpu_data allocation, and implemented a cleanup
> ladder using a temporary list to prevent memory leaks on error paths.
> 2. In free_schedstat(), fixed memory leaks by ensuring internal domain_data
> and cpu_data pointers are freed.
> 3. In show_schedstat_data(), fixed a stale pointer issue where ds2 retained
> its value from a previous iteration when dptr2 became NULL, and added
> proper propagation of errors from get_all_cpu_stats().
> 4. Propagated show_schedstat_data() errors up to perf_sched__schedstat_diff()
> and perf_sched__schedstat_live() to prevent output corruption on failure.
> 5. In show_schedstat_data(), added NULL checks for cd_map1 and cd_map2
> to gracefully handle invalid or empty data files.
> 6. Added parallel iteration termination checks using list_is_last() in
> show_schedstat_data() for both domain and CPU lists to safely terminate
> at the end of each list when files contain a different number of CPUs
> or domains.
> 7. Added CPU bounds checks (cs1->cpu >= nr1 and cs2->cpu >= nr2) in
> show_schedstat_data() to prevent out-of-bounds reads from cd_map1 and
> cd_map2 when comparing files from machines with different CPU counts.
> 8. Added NULL checks for cd_info1 and cd_info2 to prevent crashes when
> a CPU has data samples but no corresponding domain info in the header.
> 9. Added domain bounds checks (ds1->domain >= cd_info1->nr_domains and
> ds2->domain >= cd_info2->nr_domains) to prevent out-of-bounds array
> accesses in the domains array.
> 10. Added NULL checks for dinfo1 and dinfo2 in show_schedstat_data()
> to prevent crashes when a domain has no corresponding domain info.
> 11. Zero-initialized the perf_data array in perf_sched__schedstat_diff()
> to prevent stack garbage from causing perf_data_file__fd() to attempt
> to use a NULL fptr when use_stdio happened to be non-zero.
>
> Assisted-by: Gemini:gemini-3.1-pro-preview
> Signed-off-by: Ian Rogers <irogers@google.com>
This version passes sashiko's scrutiny:
https://sashiko.dev/#/patchset/20260506041004.3196084-1-irogers%40google.com
Thanks,
Ian
> ---
> v4:
> Address further sashiko feedback:
> - Fix memory leaks of global cpu_head list when perf_session__process_events() fails.
> - Fix resource leak of second session by correcting cleanup order when first session's data is empty.
>
> v3:
> Address sashiko feedback:
> - Fix memory leaks in free_schedstat() by freeing internal data.
> - Fix missing NULL check for summary_head->cpu_data in get_all_cpu_stats().
> - Fix memory leaks in get_all_cpu_stats() on error paths using a temporary cleanup list.
> - Fix stale ds2 pointer comparison in show_schedstat_data() by resetting it to NULL.
> - Add error propagation from get_all_cpu_stats() and show_schedstat_data().
> - Correct commit message to remove inaccurate claim about uninitialized variable 'ret'.
> - Add forward declaration for free_schedstat() to fix compilation error.
>
> v2: Had tested-by Athira Rajeev and reviewed-by James Clark, removed
> from v3 given the number of changes.
> https://lore.kernel.org/linux-perf-users/20260429173931.2700115-1-irogers@google.com/
> ---
> tools/perf/builtin-sched.c | 175 +++++++++++++++++++++++++++++--------
> 1 file changed, 138 insertions(+), 37 deletions(-)
>
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 555247568e7a..fd679b106582 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -3947,6 +3947,8 @@ static struct schedstat_domain *domain_second_pass;
> static bool after_workload_flag;
> static bool verbose_field;
>
> +static void free_schedstat(struct list_head *head);
> +
> static void store_schedstat_cpu_diff(struct schedstat_cpu *after_workload)
> {
> struct perf_record_schedstat_cpu *before = cpu_second_pass->cpu_data;
> @@ -4170,37 +4172,50 @@ static void summarize_schedstat_domain(struct schedstat_domain *summary_domain,
> */
> static int get_all_cpu_stats(struct list_head *head)
> {
> - struct schedstat_cpu *cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
> - struct schedstat_cpu *summary_head = NULL;
> - struct perf_record_schedstat_domain *ds;
> - struct perf_record_schedstat_cpu *cs;
> + struct schedstat_cpu *cptr, *summary_head = NULL;
> struct schedstat_domain *dptr, *tdptr;
> bool is_last = false;
> int cnt = 1;
> int ret = 0;
> + struct list_head tmp_cleanup_list;
>
> - if (cptr) {
> - summary_head = zalloc(sizeof(*summary_head));
> - if (!summary_head)
> - return -ENOMEM;
> + assert(!list_empty(head));
> + cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
>
> - summary_head->cpu_data = zalloc(sizeof(*cs));
> - memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*cs));
> + INIT_LIST_HEAD(&tmp_cleanup_list);
>
> - INIT_LIST_HEAD(&summary_head->domain_head);
> + summary_head = zalloc(sizeof(*summary_head));
> + if (!summary_head)
> + return -ENOMEM;
>
> - list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
> - tdptr = zalloc(sizeof(*tdptr));
> - if (!tdptr)
> - return -ENOMEM;
> + INIT_LIST_HEAD(&summary_head->domain_head);
> + INIT_LIST_HEAD(&summary_head->cpu_list);
> + list_add(&summary_head->cpu_list, &tmp_cleanup_list);
>
> - tdptr->domain_data = zalloc(sizeof(*ds));
> - if (!tdptr->domain_data)
> - return -ENOMEM;
> + summary_head->cpu_data = zalloc(sizeof(*summary_head->cpu_data));
> + if (!summary_head->cpu_data) {
> + ret = -ENOMEM;
> + goto out_cleanup;
> + }
> + memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*summary_head->cpu_data));
> +
> + list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
> + tdptr = zalloc(sizeof(*tdptr));
> + if (!tdptr) {
> + ret = -ENOMEM;
> + goto out_cleanup;
> + }
> + INIT_LIST_HEAD(&tdptr->domain_list);
>
> - memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*ds));
> - list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
> + tdptr->domain_data = zalloc(sizeof(*tdptr->domain_data));
> + if (!tdptr->domain_data) {
> + free(tdptr);
> + ret = -ENOMEM;
> + goto out_cleanup;
> }
> +
> + memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*tdptr->domain_data));
> + list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
> }
>
> list_for_each_entry(cptr, head, cpu_list) {
> @@ -4212,32 +4227,52 @@ static int get_all_cpu_stats(struct list_head *head)
>
> cnt++;
> summarize_schedstat_cpu(summary_head, cptr, cnt, is_last);
> + if (list_empty(&summary_head->domain_head))
> + continue;
> +
> tdptr = list_first_entry(&summary_head->domain_head, struct schedstat_domain,
> domain_list);
>
> list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
> summarize_schedstat_domain(tdptr, dptr, cnt, is_last);
> + if (list_is_last(&tdptr->domain_list, &summary_head->domain_head)) {
> + tdptr = NULL;
> + break;
> + }
> tdptr = list_next_entry(tdptr, domain_list);
> }
> }
>
> + list_del_init(&summary_head->cpu_list);
> list_add(&summary_head->cpu_list, head);
> + return 0;
> +
> +out_cleanup:
> + free_schedstat(&tmp_cleanup_list);
> return ret;
> }
>
> -static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1,
> - struct list_head *head2, struct cpu_domain_map **cd_map2,
> +static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1, int nr1,
> + struct list_head *head2, struct cpu_domain_map **cd_map2, int nr2,
> bool summary_only)
> {
> struct schedstat_cpu *cptr1 = list_first_entry(head1, struct schedstat_cpu, cpu_list);
> struct perf_record_schedstat_domain *ds1 = NULL, *ds2 = NULL;
> - struct perf_record_schedstat_cpu *cs1 = NULL, *cs2 = NULL;
> struct schedstat_domain *dptr1 = NULL, *dptr2 = NULL;
> struct schedstat_cpu *cptr2 = NULL;
> __u64 jiffies1 = 0, jiffies2 = 0;
> bool is_summary = true;
> int ret = 0;
>
> + if (!cd_map1) {
> + pr_err("Error: CPU domain map 1 is missing.\n");
> + return -1;
> + }
> + if (head2 && !cd_map2) {
> + pr_err("Error: CPU domain map 2 is missing.\n");
> + return -1;
> + }
> +
> printf("Description\n");
> print_separator2(SEP_LEN, "", 0);
> printf("%-30s-> %s\n", "DESC", "Description of the field");
> @@ -4260,21 +4295,47 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
> printf("\n");
>
> ret = get_all_cpu_stats(head1);
> + if (ret)
> + return ret;
> if (cptr2) {
> ret = get_all_cpu_stats(head2);
> + if (ret)
> + return ret;
> cptr2 = list_first_entry(head2, struct schedstat_cpu, cpu_list);
> }
>
> list_for_each_entry(cptr1, head1, cpu_list) {
> struct cpu_domain_map *cd_info1 = NULL, *cd_info2 = NULL;
> + struct perf_record_schedstat_cpu *cs1 = cptr1->cpu_data;
> + struct perf_record_schedstat_cpu *cs2 = NULL;
>
> - cs1 = cptr1->cpu_data;
> + dptr2 = NULL;
> + if (cs1->cpu >= (u32)nr1) {
> + pr_err("Error: CPU %d exceeds domain map size %d\n", cs1->cpu, nr1);
> + return -1;
> + }
> cd_info1 = cd_map1[cs1->cpu];
> + if (!cd_info1) {
> + pr_err("Error: CPU %d domain info is missing in map 1.\n",
> + cs1->cpu);
> + return -1;
> + }
> if (cptr2) {
> cs2 = cptr2->cpu_data;
> + if (cs2->cpu >= (u32)nr2) {
> + pr_err("Error: CPU %d exceeds domain map size %d\n", cs2->cpu, nr2);
> + return -1;
> + }
> cd_info2 = cd_map2[cs2->cpu];
> - dptr2 = list_first_entry(&cptr2->domain_head, struct schedstat_domain,
> - domain_list);
> + if (!cd_info2) {
> + pr_err("Error: CPU %d domain info is missing in map 2.\n",
> + cs2->cpu);
> + return -1;
> + }
> + if (!list_empty(&cptr2->domain_head))
> + dptr2 = list_first_entry(&cptr2->domain_head,
> + struct schedstat_domain,
> + domain_list);
> }
>
> if (cs2 && cs1->cpu != cs2->cpu) {
> @@ -4302,10 +4363,31 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
> struct domain_info *dinfo1 = NULL, *dinfo2 = NULL;
>
> ds1 = dptr1->domain_data;
> + ds2 = NULL;
> + if (ds1->domain >= cd_info1->nr_domains) {
> + pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 1.\n",
> + ds1->domain, cd_info1->nr_domains, cs1->cpu);
> + return -1;
> + }
> dinfo1 = cd_info1->domains[ds1->domain];
> + if (!dinfo1) {
> + pr_err("Error: Domain %d info is missing for CPU %d in map 1.\n",
> + ds1->domain, cs1->cpu);
> + return -1;
> + }
> if (dptr2) {
> ds2 = dptr2->domain_data;
> + if (ds2->domain >= cd_info2->nr_domains) {
> + pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 2.\n",
> + ds2->domain, cd_info2->nr_domains, cs2->cpu);
> + return -1;
> + }
> dinfo2 = cd_info2->domains[ds2->domain];
> + if (!dinfo2) {
> + pr_err("Error: Domain %d info is missing for CPU %d in map 2.\n",
> + ds2->domain, cs2->cpu);
> + return -1;
> + }
> }
>
> if (dinfo2 && dinfo1->domain != dinfo2->domain) {
> @@ -4334,14 +4416,22 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
> print_domain_stats(ds1, ds2, jiffies1, jiffies2);
> print_separator2(SEP_LEN, "", 0);
>
> - if (dptr2)
> - dptr2 = list_next_entry(dptr2, domain_list);
> + if (dptr2) {
> + if (list_is_last(&dptr2->domain_list, &cptr2->domain_head))
> + dptr2 = NULL;
> + else
> + dptr2 = list_next_entry(dptr2, domain_list);
> + }
> }
> if (summary_only)
> break;
>
> - if (cptr2)
> - cptr2 = list_next_entry(cptr2, cpu_list);
> + if (cptr2) {
> + if (list_is_last(&cptr2->cpu_list, head2))
> + cptr2 = NULL;
> + else
> + cptr2 = list_next_entry(cptr2, cpu_list);
> + }
>
> is_summary = false;
> }
> @@ -4473,9 +4563,11 @@ static void free_schedstat(struct list_head *head)
> list_for_each_entry_safe(cptr, n2, head, cpu_list) {
> list_for_each_entry_safe(dptr, n1, &cptr->domain_head, domain_list) {
> list_del_init(&dptr->domain_list);
> + free(dptr->domain_data);
> free(dptr);
> }
> list_del_init(&cptr->cpu_list);
> + free(cptr->cpu_data);
> free(cptr);
> }
> }
> @@ -4523,7 +4615,9 @@ static int perf_sched__schedstat_report(struct perf_sched *sched)
> }
>
> cd_map = session->header.env.cpu_domain;
> - err = show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
> + err = show_schedstat_data(&cpu_head, cd_map,
> + session->header.env.nr_cpus_avail,
> + NULL, NULL, 0, false);
> }
>
> out:
> @@ -4538,7 +4632,7 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
> struct cpu_domain_map **cd_map0 = NULL, **cd_map1 = NULL;
> struct list_head cpu_head_ses0, cpu_head_ses1;
> struct perf_session *session[2];
> - struct perf_data data[2];
> + struct perf_data data[2] = {0};
> int ret = 0, err = 0;
> static const char *defaults[] = {
> "perf.data.old",
> @@ -4573,8 +4667,10 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
> }
>
> err = perf_session__process_events(session[0]);
> - if (err)
> + if (err) {
> + free_schedstat(&cpu_head);
> goto out_delete_ses0;
> + }
>
> cd_map0 = session[0]->header.env.cpu_domain;
> list_replace_init(&cpu_head, &cpu_head_ses0);
> @@ -4590,8 +4686,10 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
> }
>
> err = perf_session__process_events(session[1]);
> - if (err)
> + if (err) {
> + free_schedstat(&cpu_head);
> goto out_delete_ses1;
> + }
>
> cd_map1 = session[1]->header.env.cpu_domain;
> list_replace_init(&cpu_head, &cpu_head_ses1);
> @@ -4607,10 +4705,13 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
> if (list_empty(&cpu_head_ses0)) {
> pr_err("Data is not available\n");
> ret = -1;
> - goto out_delete_ses0;
> + goto out_delete_ses1;
> }
>
> - show_schedstat_data(&cpu_head_ses0, cd_map0, &cpu_head_ses1, cd_map1, true);
> + ret = show_schedstat_data(&cpu_head_ses0, cd_map0, session[0]->header.env.nr_cpus_avail,
> + &cpu_head_ses1, cd_map1, session[1]->header.env.nr_cpus_avail, true);
> + if (ret)
> + goto out_delete_ses1;
>
> out_delete_ses1:
> free_schedstat(&cpu_head_ses1);
> @@ -4720,7 +4821,7 @@ static int perf_sched__schedstat_live(struct perf_sched *sched,
> goto out;
> }
>
> - show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
> + err = show_schedstat_data(&cpu_head, cd_map, nr, NULL, NULL, 0, false);
> free_cpu_domain_info(cd_map, sv, nr);
> out:
> free_schedstat(&cpu_head);
> --
> 2.54.0.545.g6539524ca2-goog
>
^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [PATCH v4] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers in diff mode
2026-05-06 6:45 ` Ian Rogers
@ 2026-05-06 13:03 ` Arnaldo Carvalho de Melo
2026-05-11 16:20 ` Ian Rogers
0 siblings, 1 reply; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-06 13:03 UTC (permalink / raw)
To: Ian Rogers
Cc: atrajeev, james.clark, namhyung, adrian.hunter, linux-kernel,
linux-perf-users, mingo, peterz, venkat88
On Tue, May 05, 2026 at 11:45:56PM -0700, Ian Rogers wrote:
> On Tue, May 5, 2026 at 9:10 PM Ian Rogers <irogers@google.com> wrote:
> >
> > The patch addresses multiple segmentation fault vectors, out-of-bounds
> > reads, and memory leaks in perf sched stats by adding bounds checks,
> > NULL checks, proper error propagation, and robust memory cleanup.
> >
> > 1. In get_all_cpu_stats(), added assert(!list_empty(head)) to prevent
> > unsafe list_first_entry() calls on empty lists, added a missing NULL
> > check for summary_head->cpu_data allocation, and implemented a cleanup
> > ladder using a temporary list to prevent memory leaks on error paths.
> > 2. In free_schedstat(), fixed memory leaks by ensuring internal domain_data
> > and cpu_data pointers are freed.
> > 3. In show_schedstat_data(), fixed a stale pointer issue where ds2 retained
> > its value from a previous iteration when dptr2 became NULL, and added
> > proper propagation of errors from get_all_cpu_stats().
> > 4. Propagated show_schedstat_data() errors up to perf_sched__schedstat_diff()
> > and perf_sched__schedstat_live() to prevent output corruption on failure.
> > 5. In show_schedstat_data(), added NULL checks for cd_map1 and cd_map2
> > to gracefully handle invalid or empty data files.
> > 6. Added parallel iteration termination checks using list_is_last() in
> > show_schedstat_data() for both domain and CPU lists to safely terminate
> > at the end of each list when files contain a different number of CPUs
> > or domains.
> > 7. Added CPU bounds checks (cs1->cpu >= nr1 and cs2->cpu >= nr2) in
> > show_schedstat_data() to prevent out-of-bounds reads from cd_map1 and
> > cd_map2 when comparing files from machines with different CPU counts.
> > 8. Added NULL checks for cd_info1 and cd_info2 to prevent crashes when
> > a CPU has data samples but no corresponding domain info in the header.
> > 9. Added domain bounds checks (ds1->domain >= cd_info1->nr_domains and
> > ds2->domain >= cd_info2->nr_domains) to prevent out-of-bounds array
> > accesses in the domains array.
> > 10. Added NULL checks for dinfo1 and dinfo2 in show_schedstat_data()
> > to prevent crashes when a domain has no corresponding domain info.
> > 11. Zero-initialized the perf_data array in perf_sched__schedstat_diff()
> > to prevent stack garbage from causing perf_data_file__fd() to attempt
> > to use a NULL fptr when use_stdio happened to be non-zero.
> >
> > Assisted-by: Gemini:gemini-3.1-pro-preview
> > Signed-off-by: Ian Rogers <irogers@google.com>
>
> This version passes sashiko's scrutiny:
> https://sashiko.dev/#/patchset/20260506041004.3196084-1-irogers%40google.com
Thanks, applied to perf-tools-next, for v7.2.
- Arnaldo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers in diff mode
2026-05-06 13:03 ` Arnaldo Carvalho de Melo
@ 2026-05-11 16:20 ` Ian Rogers
0 siblings, 0 replies; 14+ messages in thread
From: Ian Rogers @ 2026-05-11 16:20 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: atrajeev, james.clark, namhyung, adrian.hunter, linux-kernel,
linux-perf-users, mingo, peterz, venkat88
On Wed, May 6, 2026 at 6:03 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> On Tue, May 05, 2026 at 11:45:56PM -0700, Ian Rogers wrote:
> > On Tue, May 5, 2026 at 9:10 PM Ian Rogers <irogers@google.com> wrote:
> > >
> > > The patch addresses multiple segmentation fault vectors, out-of-bounds
> > > reads, and memory leaks in perf sched stats by adding bounds checks,
> > > NULL checks, proper error propagation, and robust memory cleanup.
> > >
> > > 1. In get_all_cpu_stats(), added assert(!list_empty(head)) to prevent
> > > unsafe list_first_entry() calls on empty lists, added a missing NULL
> > > check for summary_head->cpu_data allocation, and implemented a cleanup
> > > ladder using a temporary list to prevent memory leaks on error paths.
> > > 2. In free_schedstat(), fixed memory leaks by ensuring internal domain_data
> > > and cpu_data pointers are freed.
> > > 3. In show_schedstat_data(), fixed a stale pointer issue where ds2 retained
> > > its value from a previous iteration when dptr2 became NULL, and added
> > > proper propagation of errors from get_all_cpu_stats().
> > > 4. Propagated show_schedstat_data() errors up to perf_sched__schedstat_diff()
> > > and perf_sched__schedstat_live() to prevent output corruption on failure.
> > > 5. In show_schedstat_data(), added NULL checks for cd_map1 and cd_map2
> > > to gracefully handle invalid or empty data files.
> > > 6. Added parallel iteration termination checks using list_is_last() in
> > > show_schedstat_data() for both domain and CPU lists to safely terminate
> > > at the end of each list when files contain a different number of CPUs
> > > or domains.
> > > 7. Added CPU bounds checks (cs1->cpu >= nr1 and cs2->cpu >= nr2) in
> > > show_schedstat_data() to prevent out-of-bounds reads from cd_map1 and
> > > cd_map2 when comparing files from machines with different CPU counts.
> > > 8. Added NULL checks for cd_info1 and cd_info2 to prevent crashes when
> > > a CPU has data samples but no corresponding domain info in the header.
> > > 9. Added domain bounds checks (ds1->domain >= cd_info1->nr_domains and
> > > ds2->domain >= cd_info2->nr_domains) to prevent out-of-bounds array
> > > accesses in the domains array.
> > > 10. Added NULL checks for dinfo1 and dinfo2 in show_schedstat_data()
> > > to prevent crashes when a domain has no corresponding domain info.
> > > 11. Zero-initialized the perf_data array in perf_sched__schedstat_diff()
> > > to prevent stack garbage from causing perf_data_file__fd() to attempt
> > > to use a NULL fptr when use_stdio happened to be non-zero.
> > >
> > > Assisted-by: Gemini:gemini-3.1-pro-preview
> > > Signed-off-by: Ian Rogers <irogers@google.com>
> >
> > This version passes sashiko's scrutiny:
> > https://sashiko.dev/#/patchset/20260506041004.3196084-1-irogers%40google.com
>
> Thanks, applied to perf-tools-next, for v7.2.
Thanks Arnaldo! I don't see this in:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/log/?h=perf-tools-next
just wanted to make sure it didn't accidentally get lost.
Thanks,
Ian
> - Arnaldo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers in diff mode
2026-05-06 4:10 ` [PATCH v4] " Ian Rogers
2026-05-06 6:45 ` Ian Rogers
@ 2026-05-06 9:07 ` James Clark
1 sibling, 0 replies; 14+ messages in thread
From: James Clark @ 2026-05-06 9:07 UTC (permalink / raw)
To: Ian Rogers
Cc: adrian.hunter, linux-kernel, linux-perf-users, mingo, peterz,
venkat88, acme, atrajeev, namhyung
On 06/05/2026 5:10 am, Ian Rogers wrote:
> The patch addresses multiple segmentation fault vectors, out-of-bounds
> reads, and memory leaks in perf sched stats by adding bounds checks,
> NULL checks, proper error propagation, and robust memory cleanup.
>
> 1. In get_all_cpu_stats(), added assert(!list_empty(head)) to prevent
> unsafe list_first_entry() calls on empty lists, added a missing NULL
> check for summary_head->cpu_data allocation, and implemented a cleanup
> ladder using a temporary list to prevent memory leaks on error paths.
> 2. In free_schedstat(), fixed memory leaks by ensuring internal domain_data
> and cpu_data pointers are freed.
> 3. In show_schedstat_data(), fixed a stale pointer issue where ds2 retained
> its value from a previous iteration when dptr2 became NULL, and added
> proper propagation of errors from get_all_cpu_stats().
> 4. Propagated show_schedstat_data() errors up to perf_sched__schedstat_diff()
> and perf_sched__schedstat_live() to prevent output corruption on failure.
> 5. In show_schedstat_data(), added NULL checks for cd_map1 and cd_map2
> to gracefully handle invalid or empty data files.
> 6. Added parallel iteration termination checks using list_is_last() in
> show_schedstat_data() for both domain and CPU lists to safely terminate
> at the end of each list when files contain a different number of CPUs
> or domains.
> 7. Added CPU bounds checks (cs1->cpu >= nr1 and cs2->cpu >= nr2) in
> show_schedstat_data() to prevent out-of-bounds reads from cd_map1 and
> cd_map2 when comparing files from machines with different CPU counts.
> 8. Added NULL checks for cd_info1 and cd_info2 to prevent crashes when
> a CPU has data samples but no corresponding domain info in the header.
> 9. Added domain bounds checks (ds1->domain >= cd_info1->nr_domains and
> ds2->domain >= cd_info2->nr_domains) to prevent out-of-bounds array
> accesses in the domains array.
> 10. Added NULL checks for dinfo1 and dinfo2 in show_schedstat_data()
> to prevent crashes when a domain has no corresponding domain info.
> 11. Zero-initialized the perf_data array in perf_sched__schedstat_diff()
> to prevent stack garbage from causing perf_data_file__fd() to attempt
> to use a NULL fptr when use_stdio happened to be non-zero.
>
> Assisted-by: Gemini:gemini-3.1-pro-preview
> Signed-off-by: Ian Rogers <irogers@google.com>
>
> ---
> v4:
> Address further sashiko feedback:
> - Fix memory leaks of global cpu_head list when perf_session__process_events() fails.
> - Fix resource leak of second session by correcting cleanup order when first session's data is empty.
>
> v3:
> Address sashiko feedback:
> - Fix memory leaks in free_schedstat() by freeing internal data.
> - Fix missing NULL check for summary_head->cpu_data in get_all_cpu_stats().
> - Fix memory leaks in get_all_cpu_stats() on error paths using a temporary cleanup list.
> - Fix stale ds2 pointer comparison in show_schedstat_data() by resetting it to NULL.
> - Add error propagation from get_all_cpu_stats() and show_schedstat_data().
> - Correct commit message to remove inaccurate claim about uninitialized variable 'ret'.
> - Add forward declaration for free_schedstat() to fix compilation error.
>
> v2: Had tested-by Athira Rajeev and reviewed-by James Clark, removed
> from v3 given the number of changes.
> https://lore.kernel.org/linux-perf-users/20260429173931.2700115-1-irogers@google.com/
> ---
> tools/perf/builtin-sched.c | 175 +++++++++++++++++++++++++++++--------
> 1 file changed, 138 insertions(+), 37 deletions(-)
>
Reviewed-by: James Clark <james.clark@linaro.org>
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 555247568e7a..fd679b106582 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -3947,6 +3947,8 @@ static struct schedstat_domain *domain_second_pass;
> static bool after_workload_flag;
> static bool verbose_field;
>
> +static void free_schedstat(struct list_head *head);
> +
> static void store_schedstat_cpu_diff(struct schedstat_cpu *after_workload)
> {
> struct perf_record_schedstat_cpu *before = cpu_second_pass->cpu_data;
> @@ -4170,37 +4172,50 @@ static void summarize_schedstat_domain(struct schedstat_domain *summary_domain,
> */
> static int get_all_cpu_stats(struct list_head *head)
> {
> - struct schedstat_cpu *cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
> - struct schedstat_cpu *summary_head = NULL;
> - struct perf_record_schedstat_domain *ds;
> - struct perf_record_schedstat_cpu *cs;
> + struct schedstat_cpu *cptr, *summary_head = NULL;
> struct schedstat_domain *dptr, *tdptr;
> bool is_last = false;
> int cnt = 1;
> int ret = 0;
> + struct list_head tmp_cleanup_list;
>
> - if (cptr) {
> - summary_head = zalloc(sizeof(*summary_head));
> - if (!summary_head)
> - return -ENOMEM;
> + assert(!list_empty(head));
> + cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
>
> - summary_head->cpu_data = zalloc(sizeof(*cs));
> - memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*cs));
> + INIT_LIST_HEAD(&tmp_cleanup_list);
>
> - INIT_LIST_HEAD(&summary_head->domain_head);
> + summary_head = zalloc(sizeof(*summary_head));
> + if (!summary_head)
> + return -ENOMEM;
>
> - list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
> - tdptr = zalloc(sizeof(*tdptr));
> - if (!tdptr)
> - return -ENOMEM;
> + INIT_LIST_HEAD(&summary_head->domain_head);
> + INIT_LIST_HEAD(&summary_head->cpu_list);
> + list_add(&summary_head->cpu_list, &tmp_cleanup_list);
>
> - tdptr->domain_data = zalloc(sizeof(*ds));
> - if (!tdptr->domain_data)
> - return -ENOMEM;
> + summary_head->cpu_data = zalloc(sizeof(*summary_head->cpu_data));
> + if (!summary_head->cpu_data) {
> + ret = -ENOMEM;
> + goto out_cleanup;
> + }
> + memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*summary_head->cpu_data));
> +
> + list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
> + tdptr = zalloc(sizeof(*tdptr));
> + if (!tdptr) {
> + ret = -ENOMEM;
> + goto out_cleanup;
> + }
> + INIT_LIST_HEAD(&tdptr->domain_list);
>
> - memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*ds));
> - list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
> + tdptr->domain_data = zalloc(sizeof(*tdptr->domain_data));
> + if (!tdptr->domain_data) {
> + free(tdptr);
> + ret = -ENOMEM;
> + goto out_cleanup;
> }
> +
> + memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*tdptr->domain_data));
> + list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
> }
>
> list_for_each_entry(cptr, head, cpu_list) {
> @@ -4212,32 +4227,52 @@ static int get_all_cpu_stats(struct list_head *head)
>
> cnt++;
> summarize_schedstat_cpu(summary_head, cptr, cnt, is_last);
> + if (list_empty(&summary_head->domain_head))
> + continue;
> +
> tdptr = list_first_entry(&summary_head->domain_head, struct schedstat_domain,
> domain_list);
>
> list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
> summarize_schedstat_domain(tdptr, dptr, cnt, is_last);
> + if (list_is_last(&tdptr->domain_list, &summary_head->domain_head)) {
> + tdptr = NULL;
> + break;
> + }
> tdptr = list_next_entry(tdptr, domain_list);
> }
> }
>
> + list_del_init(&summary_head->cpu_list);
> list_add(&summary_head->cpu_list, head);
> + return 0;
> +
> +out_cleanup:
> + free_schedstat(&tmp_cleanup_list);
> return ret;
> }
>
> -static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1,
> - struct list_head *head2, struct cpu_domain_map **cd_map2,
> +static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1, int nr1,
> + struct list_head *head2, struct cpu_domain_map **cd_map2, int nr2,
> bool summary_only)
> {
> struct schedstat_cpu *cptr1 = list_first_entry(head1, struct schedstat_cpu, cpu_list);
> struct perf_record_schedstat_domain *ds1 = NULL, *ds2 = NULL;
> - struct perf_record_schedstat_cpu *cs1 = NULL, *cs2 = NULL;
> struct schedstat_domain *dptr1 = NULL, *dptr2 = NULL;
> struct schedstat_cpu *cptr2 = NULL;
> __u64 jiffies1 = 0, jiffies2 = 0;
> bool is_summary = true;
> int ret = 0;
>
> + if (!cd_map1) {
> + pr_err("Error: CPU domain map 1 is missing.\n");
> + return -1;
> + }
> + if (head2 && !cd_map2) {
> + pr_err("Error: CPU domain map 2 is missing.\n");
> + return -1;
> + }
> +
> printf("Description\n");
> print_separator2(SEP_LEN, "", 0);
> printf("%-30s-> %s\n", "DESC", "Description of the field");
> @@ -4260,21 +4295,47 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
> printf("\n");
>
> ret = get_all_cpu_stats(head1);
> + if (ret)
> + return ret;
> if (cptr2) {
> ret = get_all_cpu_stats(head2);
> + if (ret)
> + return ret;
> cptr2 = list_first_entry(head2, struct schedstat_cpu, cpu_list);
> }
>
> list_for_each_entry(cptr1, head1, cpu_list) {
> struct cpu_domain_map *cd_info1 = NULL, *cd_info2 = NULL;
> + struct perf_record_schedstat_cpu *cs1 = cptr1->cpu_data;
> + struct perf_record_schedstat_cpu *cs2 = NULL;
>
> - cs1 = cptr1->cpu_data;
> + dptr2 = NULL;
> + if (cs1->cpu >= (u32)nr1) {
> + pr_err("Error: CPU %d exceeds domain map size %d\n", cs1->cpu, nr1);
> + return -1;
> + }
> cd_info1 = cd_map1[cs1->cpu];
> + if (!cd_info1) {
> + pr_err("Error: CPU %d domain info is missing in map 1.\n",
> + cs1->cpu);
> + return -1;
> + }
> if (cptr2) {
> cs2 = cptr2->cpu_data;
> + if (cs2->cpu >= (u32)nr2) {
> + pr_err("Error: CPU %d exceeds domain map size %d\n", cs2->cpu, nr2);
> + return -1;
> + }
> cd_info2 = cd_map2[cs2->cpu];
> - dptr2 = list_first_entry(&cptr2->domain_head, struct schedstat_domain,
> - domain_list);
> + if (!cd_info2) {
> + pr_err("Error: CPU %d domain info is missing in map 2.\n",
> + cs2->cpu);
> + return -1;
> + }
> + if (!list_empty(&cptr2->domain_head))
> + dptr2 = list_first_entry(&cptr2->domain_head,
> + struct schedstat_domain,
> + domain_list);
> }
>
> if (cs2 && cs1->cpu != cs2->cpu) {
> @@ -4302,10 +4363,31 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
> struct domain_info *dinfo1 = NULL, *dinfo2 = NULL;
>
> ds1 = dptr1->domain_data;
> + ds2 = NULL;
> + if (ds1->domain >= cd_info1->nr_domains) {
> + pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 1.\n",
> + ds1->domain, cd_info1->nr_domains, cs1->cpu);
> + return -1;
> + }
> dinfo1 = cd_info1->domains[ds1->domain];
> + if (!dinfo1) {
> + pr_err("Error: Domain %d info is missing for CPU %d in map 1.\n",
> + ds1->domain, cs1->cpu);
> + return -1;
> + }
> if (dptr2) {
> ds2 = dptr2->domain_data;
> + if (ds2->domain >= cd_info2->nr_domains) {
> + pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 2.\n",
> + ds2->domain, cd_info2->nr_domains, cs2->cpu);
> + return -1;
> + }
> dinfo2 = cd_info2->domains[ds2->domain];
> + if (!dinfo2) {
> + pr_err("Error: Domain %d info is missing for CPU %d in map 2.\n",
> + ds2->domain, cs2->cpu);
> + return -1;
> + }
> }
>
> if (dinfo2 && dinfo1->domain != dinfo2->domain) {
> @@ -4334,14 +4416,22 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
> print_domain_stats(ds1, ds2, jiffies1, jiffies2);
> print_separator2(SEP_LEN, "", 0);
>
> - if (dptr2)
> - dptr2 = list_next_entry(dptr2, domain_list);
> + if (dptr2) {
> + if (list_is_last(&dptr2->domain_list, &cptr2->domain_head))
> + dptr2 = NULL;
> + else
> + dptr2 = list_next_entry(dptr2, domain_list);
> + }
> }
> if (summary_only)
> break;
>
> - if (cptr2)
> - cptr2 = list_next_entry(cptr2, cpu_list);
> + if (cptr2) {
> + if (list_is_last(&cptr2->cpu_list, head2))
> + cptr2 = NULL;
> + else
> + cptr2 = list_next_entry(cptr2, cpu_list);
> + }
>
> is_summary = false;
> }
> @@ -4473,9 +4563,11 @@ static void free_schedstat(struct list_head *head)
> list_for_each_entry_safe(cptr, n2, head, cpu_list) {
> list_for_each_entry_safe(dptr, n1, &cptr->domain_head, domain_list) {
> list_del_init(&dptr->domain_list);
> + free(dptr->domain_data);
> free(dptr);
> }
> list_del_init(&cptr->cpu_list);
> + free(cptr->cpu_data);
> free(cptr);
> }
> }
> @@ -4523,7 +4615,9 @@ static int perf_sched__schedstat_report(struct perf_sched *sched)
> }
>
> cd_map = session->header.env.cpu_domain;
> - err = show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
> + err = show_schedstat_data(&cpu_head, cd_map,
> + session->header.env.nr_cpus_avail,
> + NULL, NULL, 0, false);
> }
>
> out:
> @@ -4538,7 +4632,7 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
> struct cpu_domain_map **cd_map0 = NULL, **cd_map1 = NULL;
> struct list_head cpu_head_ses0, cpu_head_ses1;
> struct perf_session *session[2];
> - struct perf_data data[2];
> + struct perf_data data[2] = {0};
> int ret = 0, err = 0;
> static const char *defaults[] = {
> "perf.data.old",
> @@ -4573,8 +4667,10 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
> }
>
> err = perf_session__process_events(session[0]);
> - if (err)
> + if (err) {
> + free_schedstat(&cpu_head);
> goto out_delete_ses0;
> + }
>
> cd_map0 = session[0]->header.env.cpu_domain;
> list_replace_init(&cpu_head, &cpu_head_ses0);
> @@ -4590,8 +4686,10 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
> }
>
> err = perf_session__process_events(session[1]);
> - if (err)
> + if (err) {
> + free_schedstat(&cpu_head);
> goto out_delete_ses1;
> + }
>
> cd_map1 = session[1]->header.env.cpu_domain;
> list_replace_init(&cpu_head, &cpu_head_ses1);
> @@ -4607,10 +4705,13 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
> if (list_empty(&cpu_head_ses0)) {
> pr_err("Data is not available\n");
> ret = -1;
> - goto out_delete_ses0;
> + goto out_delete_ses1;
> }
>
> - show_schedstat_data(&cpu_head_ses0, cd_map0, &cpu_head_ses1, cd_map1, true);
> + ret = show_schedstat_data(&cpu_head_ses0, cd_map0, session[0]->header.env.nr_cpus_avail,
> + &cpu_head_ses1, cd_map1, session[1]->header.env.nr_cpus_avail, true);
> + if (ret)
> + goto out_delete_ses1;
>
> out_delete_ses1:
> free_schedstat(&cpu_head_ses1);
> @@ -4720,7 +4821,7 @@ static int perf_sched__schedstat_live(struct perf_sched *sched,
> goto out;
> }
>
> - show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
> + err = show_schedstat_data(&cpu_head, cd_map, nr, NULL, NULL, 0, false);
> free_cpu_domain_info(cd_map, sv, nr);
> out:
> free_schedstat(&cpu_head);
^ permalink raw reply [flat|nested] 14+ messages in thread