public inbox for linux-kernel@vger.kernel.org
* [PATCH v1] perf sched stats: Fix segmentation faults in diff mode
@ 2026-04-28  7:08 Ian Rogers
  2026-04-29 12:30 ` James Clark
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Ian Rogers @ 2026-04-28  7:08 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Adrian Hunter, James Clark, linux-perf-users,
	linux-kernel, Athira Rajeev, Venkat
  Cc: Ian Rogers

Address several segmentation fault vectors in `perf sched stats diff`:

1. When processing invalid or empty data files, the CPU domain maps may
   be NULL. Added NULL checks for `cd_map1` and `cd_map2` in
   `show_schedstat_data()` to fail gracefully.

2. When files contain a different number of CPUs or domains, the parallel
   list iteration in `show_schedstat_data()` could wrap around the list
   heads and dereference invalid memory. Added `list_is_last` checks
   to safely terminate iteration at the end of each list.

3. When summarizing CPU statistics in `get_all_cpu_stats()`, parallel list
   iteration over domains could similarly wrap around if a CPU has more
   domains than the first CPU. Added `list_is_last` check to prevent this.

4. Added bounds checks for `cs1->cpu` and `cs2->cpu` against `nr1` and
   `nr2` (passed from `env->nr_cpus_avail`) to prevent out-of-bounds
   reads from `cd_map1` and `cd_map2` when processing data from machines
   with different CPU configurations.

5. Added NULL checks for `cd_info1` and `cd_info2` in `show_schedstat_data()`
   to prevent crashes when a CPU has samples in the data file but no
   corresponding domain info in the header (which leaves the map entry NULL).

6. Added NULL checks for `dinfo1` and `dinfo2` in `show_schedstat_data()`
   to prevent crashes when a domain is present in the list but has no
   corresponding info in the CPU domain map (which leaves the entry NULL).

7. Zero-initialized the `perf_data` array in `perf_sched__schedstat_diff()`
   to prevent stack garbage from causing `perf_data_file__fd()` to attempt
   to use a NULL `fptr` when `use_stdio` happened to be non-zero.

Assisted-by: Gemini:gemini-3.1-pro-preview
Signed-off-by: Ian Rogers <irogers@google.com>
---
Previously this patch was part of a large perf script refactor:
https://lore.kernel.org/lkml/20260425224951.174663-1-irogers@google.com/
---
 tools/perf/builtin-sched.c | 95 +++++++++++++++++++++++++++++++-------
 1 file changed, 79 insertions(+), 16 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 555247568e7a..9ecda631b3ab 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -4212,12 +4212,20 @@ static int get_all_cpu_stats(struct list_head *head)
 
 		cnt++;
 		summarize_schedstat_cpu(summary_head, cptr, cnt, is_last);
-		tdptr = list_first_entry(&summary_head->domain_head, struct schedstat_domain,
-					 domain_list);
+		if (!list_empty(&summary_head->domain_head))
+			tdptr = list_first_entry(&summary_head->domain_head, struct schedstat_domain,
+						 domain_list);
+		else
+			tdptr = NULL;
 
 		list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
-			summarize_schedstat_domain(tdptr, dptr, cnt, is_last);
-			tdptr = list_next_entry(tdptr, domain_list);
+			if (tdptr) {
+				summarize_schedstat_domain(tdptr, dptr, cnt, is_last);
+				if (list_is_last(&tdptr->domain_list, &summary_head->domain_head))
+					tdptr = NULL;
+				else
+					tdptr = list_next_entry(tdptr, domain_list);
+			}
 		}
 	}
 
@@ -4225,8 +4233,8 @@ static int get_all_cpu_stats(struct list_head *head)
 	return ret;
 }
 
-static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1,
-			       struct list_head *head2, struct cpu_domain_map **cd_map2,
+static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1, int nr1,
+			       struct list_head *head2, struct cpu_domain_map **cd_map2, int nr2,
 			       bool summary_only)
 {
 	struct schedstat_cpu *cptr1 = list_first_entry(head1, struct schedstat_cpu, cpu_list);
@@ -4238,6 +4246,15 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
 	bool is_summary = true;
 	int ret = 0;
 
+	if (!cd_map1) {
+		pr_err("Error: CPU domain map 1 is missing.\n");
+		return -1;
+	}
+	if (head2 && !cd_map2) {
+		pr_err("Error: CPU domain map 2 is missing.\n");
+		return -1;
+	}
+
 	printf("Description\n");
 	print_separator2(SEP_LEN, "", 0);
 	printf("%-30s-> %s\n", "DESC", "Description of the field");
@@ -4269,12 +4286,33 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
 		struct cpu_domain_map *cd_info1 = NULL, *cd_info2 = NULL;
 
 		cs1 = cptr1->cpu_data;
+		cs2 = NULL;
+		dptr2 = NULL;
+		if (cs1->cpu >= (u32)nr1) {
+			pr_err("Error: CPU %d exceeds domain map size %d\n", cs1->cpu, nr1);
+			return -1;
+		}
 		cd_info1 = cd_map1[cs1->cpu];
+		if (!cd_info1) {
+			pr_err("Error: CPU %d domain info is missing in map 1.\n", cs1->cpu);
+			return -1;
+		}
 		if (cptr2) {
 			cs2 = cptr2->cpu_data;
+			if (cs2->cpu >= (u32)nr2) {
+				pr_err("Error: CPU %d exceeds domain map size %d\n", cs2->cpu, nr2);
+				return -1;
+			}
 			cd_info2 = cd_map2[cs2->cpu];
-			dptr2 = list_first_entry(&cptr2->domain_head, struct schedstat_domain,
-						 domain_list);
+			if (!cd_info2) {
+				pr_err("Error: CPU %d domain info is missing in map 2.\n", cs2->cpu);
+				return -1;
+			}
+			if (!list_empty(&cptr2->domain_head))
+				dptr2 = list_first_entry(&cptr2->domain_head, struct schedstat_domain,
+							 domain_list);
+			else
+				dptr2 = NULL;
 		}
 
 		if (cs2 && cs1->cpu != cs2->cpu) {
@@ -4302,10 +4340,26 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
 			struct domain_info *dinfo1 = NULL, *dinfo2 = NULL;
 
 			ds1 = dptr1->domain_data;
+			if (ds1->domain >= cd_info1->nr_domains) {
+				pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 1.\n", ds1->domain, cd_info1->nr_domains, cs1->cpu);
+				return -1;
+			}
 			dinfo1 = cd_info1->domains[ds1->domain];
+			if (!dinfo1) {
+				pr_err("Error: Domain %d info is missing for CPU %d in map 1.\n", ds1->domain, cs1->cpu);
+				return -1;
+			}
 			if (dptr2) {
 				ds2 = dptr2->domain_data;
+				if (ds2->domain >= cd_info2->nr_domains) {
+					pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 2.\n", ds2->domain, cd_info2->nr_domains, cs2->cpu);
+					return -1;
+				}
 				dinfo2 = cd_info2->domains[ds2->domain];
+				if (!dinfo2) {
+					pr_err("Error: Domain %d info is missing for CPU %d in map 2.\n", ds2->domain, cs2->cpu);
+					return -1;
+				}
 			}
 
 			if (dinfo2 && dinfo1->domain != dinfo2->domain) {
@@ -4334,14 +4388,22 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
 			print_domain_stats(ds1, ds2, jiffies1, jiffies2);
 			print_separator2(SEP_LEN, "", 0);
 
-			if (dptr2)
-				dptr2 = list_next_entry(dptr2, domain_list);
+			if (dptr2) {
+				if (list_is_last(&dptr2->domain_list, &cptr2->domain_head))
+					dptr2 = NULL;
+				else
+					dptr2 = list_next_entry(dptr2, domain_list);
+			}
 		}
 		if (summary_only)
 			break;
 
-		if (cptr2)
-			cptr2 = list_next_entry(cptr2, cpu_list);
+		if (cptr2) {
+			if (list_is_last(&cptr2->cpu_list, head2))
+				cptr2 = NULL;
+			else
+				cptr2 = list_next_entry(cptr2, cpu_list);
+		}
 
 		is_summary = false;
 	}
@@ -4523,7 +4585,7 @@ static int perf_sched__schedstat_report(struct perf_sched *sched)
 		}
 
 		cd_map = session->header.env.cpu_domain;
-		err = show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
+		err = show_schedstat_data(&cpu_head, cd_map, session->header.env.nr_cpus_avail, NULL, NULL, 0, false);
 	}
 
 out:
@@ -4538,7 +4600,7 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
 	struct cpu_domain_map **cd_map0 = NULL, **cd_map1 = NULL;
 	struct list_head cpu_head_ses0, cpu_head_ses1;
 	struct perf_session *session[2];
-	struct perf_data data[2];
+	struct perf_data data[2] = {0};
 	int ret = 0, err = 0;
 	static const char *defaults[] = {
 		"perf.data.old",
@@ -4610,7 +4672,8 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
 		goto out_delete_ses0;
 	}
 
-	show_schedstat_data(&cpu_head_ses0, cd_map0, &cpu_head_ses1, cd_map1, true);
+	show_schedstat_data(&cpu_head_ses0, cd_map0, session[0]->header.env.nr_cpus_avail,
+			    &cpu_head_ses1, cd_map1, session[1]->header.env.nr_cpus_avail, true);
 
 out_delete_ses1:
 	free_schedstat(&cpu_head_ses1);
@@ -4720,7 +4783,7 @@ static int perf_sched__schedstat_live(struct perf_sched *sched,
 		goto out;
 	}
 
-	show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
+	show_schedstat_data(&cpu_head, cd_map, nr, NULL, NULL, 0, false);
 	free_cpu_domain_info(cd_map, sv, nr);
 out:
 	free_schedstat(&cpu_head);
-- 
2.54.0.545.g6539524ca2-goog



* Re: [PATCH v1] perf sched stats: Fix segmentation faults in diff mode
  2026-04-28  7:08 [PATCH v1] perf sched stats: Fix segmentation faults in diff mode Ian Rogers
@ 2026-04-29 12:30 ` James Clark
  2026-04-29 14:01 ` Athira Rajeev
  2026-04-29 17:39 ` [PATCH v2] " Ian Rogers
  2 siblings, 0 replies; 11+ messages in thread
From: James Clark @ 2026-04-29 12:30 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Adrian Hunter, linux-perf-users, linux-kernel,
	Athira Rajeev, Venkat



On 28/04/2026 8:08 am, Ian Rogers wrote:
> [...]
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 555247568e7a..9ecda631b3ab 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -4212,12 +4212,20 @@ static int get_all_cpu_stats(struct list_head *head)
>   
>   		cnt++;
>   		summarize_schedstat_cpu(summary_head, cptr, cnt, is_last);
> -		tdptr = list_first_entry(&summary_head->domain_head, struct schedstat_domain,
> -					 domain_list);
> +		if (!list_empty(&summary_head->domain_head))
> +			tdptr = list_first_entry(&summary_head->domain_head, struct schedstat_domain,
> +						 domain_list);
> +		else
> +			tdptr = NULL;
>   
>   		list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
> -			summarize_schedstat_domain(tdptr, dptr, cnt, is_last);
> -			tdptr = list_next_entry(tdptr, domain_list);
> +			if (tdptr) {

Minor nit, but would it be better to check "if (tdptr)" before 
list_for_each_entry() and not do the loop at all if tdptr is NULL? It 
implies that there is some useful side effect of list_for_each_entry() 
even if the body is a nop, but I can't see any.
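
In C-like pseudocode, the restructure being suggested might look like
this (a sketch of the shape only, not the actual builtin-sched.c
change): hoist the NULL check so the loop is skipped entirely when the
summary list is empty.

```
if (tdptr) {
	list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
		summarize_schedstat_domain(tdptr, dptr, cnt, is_last);
		if (list_is_last(&tdptr->domain_list, &summary_head->domain_head))
			break;
		tdptr = list_next_entry(tdptr, domain_list);
	}
}
```

A break here would match the original behaviour, since once tdptr
becomes NULL the remaining iterations are no-ops anyway.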

> +				summarize_schedstat_domain(tdptr, dptr, cnt, is_last);
> +				if (list_is_last(&tdptr->domain_list, &summary_head->domain_head))
> +					tdptr = NULL;
> +				else
> +					tdptr = list_next_entry(tdptr, domain_list);
> +			}
>   		}
>   	}
>   
> @@ -4225,8 +4233,8 @@ static int get_all_cpu_stats(struct list_head *head)
>   	return ret;
>   }
>   
> -static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1,
> -			       struct list_head *head2, struct cpu_domain_map **cd_map2,
> +static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1, int nr1,
> +			       struct list_head *head2, struct cpu_domain_map **cd_map2, int nr2,
>   			       bool summary_only)
>   {
>   	struct schedstat_cpu *cptr1 = list_first_entry(head1, struct schedstat_cpu, cpu_list);
> @@ -4238,6 +4246,15 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
>   	bool is_summary = true;
>   	int ret = 0;
>   
> +	if (!cd_map1) {
> +		pr_err("Error: CPU domain map 1 is missing.\n");
> +		return -1;
> +	}
> +	if (head2 && !cd_map2) {
> +		pr_err("Error: CPU domain map 2 is missing.\n");
> +		return -1;
> +	}
> +
>   	printf("Description\n");
>   	print_separator2(SEP_LEN, "", 0);
>   	printf("%-30s-> %s\n", "DESC", "Description of the field");
> @@ -4269,12 +4286,33 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
>   		struct cpu_domain_map *cd_info1 = NULL, *cd_info2 = NULL;
>   
>   		cs1 = cptr1->cpu_data;
> +		cs2 = NULL;

cs1 and cs2 are only used inside this loop, probably makes sense to 
define them and initialize them both here.

> +		dptr2 = NULL;
> +		if (cs1->cpu >= (u32)nr1) {
> +			pr_err("Error: CPU %d exceeds domain map size %d\n", cs1->cpu, nr1);
> +			return -1;
> +		}
>   		cd_info1 = cd_map1[cs1->cpu];
> +		if (!cd_info1) {
> +			pr_err("Error: CPU %d domain info is missing in map 1.\n", cs1->cpu);
> +			return -1;
> +		}
>   		if (cptr2) {
>   			cs2 = cptr2->cpu_data;
> +			if (cs2->cpu >= (u32)nr2) {
> +				pr_err("Error: CPU %d exceeds domain map size %d\n", cs2->cpu, nr2);
> +				return -1;
> +			}
>   			cd_info2 = cd_map2[cs2->cpu];
> -			dptr2 = list_first_entry(&cptr2->domain_head, struct schedstat_domain,
> -						 domain_list);
> +			if (!cd_info2) {
> +				pr_err("Error: CPU %d domain info is missing in map 2.\n", cs2->cpu);
> +				return -1;
> +			}
> +			if (!list_empty(&cptr2->domain_head))
> +				dptr2 = list_first_entry(&cptr2->domain_head, struct schedstat_domain,
> +							 domain_list);
> +			else
> +				dptr2 = NULL;

dptr2 is already NULL initialized, and not that far away from here so 
this is probably a bit too defensive.



* Re: [PATCH v1] perf sched stats: Fix segmentation faults in diff mode
  2026-04-28  7:08 [PATCH v1] perf sched stats: Fix segmentation faults in diff mode Ian Rogers
  2026-04-29 12:30 ` James Clark
@ 2026-04-29 14:01 ` Athira Rajeev
  2026-04-29 17:39 ` [PATCH v2] " Ian Rogers
  2 siblings, 0 replies; 11+ messages in thread
From: Athira Rajeev @ 2026-04-29 14:01 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Namhyung Kim, Adrian Hunter, James Clark, linux-perf-users,
	linux-kernel, Venkat



> On 28 Apr 2026, at 12:38 PM, Ian Rogers <irogers@google.com> wrote:
> 
> [...]
> @@ -4538,7 +4600,7 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
>  	struct cpu_domain_map **cd_map0 = NULL, **cd_map1 = NULL;
>  	struct list_head cpu_head_ses0, cpu_head_ses1;
>  	struct perf_session *session[2];
> -	struct perf_data data[2];
> +	struct perf_data data[2] = {0};

Hi Ian

Thanks for these fixes.

I had sent a fix for only the above change (the perf_data initialization): https://lore.kernel.org/linux-perf-users/20260422173545.73144-1-atrajeev@linux.ibm.com/
Since that change is already covered in this patch, I tested with this patch and verified it covers the fix.

Before the patch:

# for i in {0..20}; do ./perf test "perf sched stats tests"; done
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : FAILED!
 92: perf sched stats tests                                          : FAILED!
 92: perf sched stats tests                                          : FAILED!


After this fix

# for i in {0..20}; do ./perf test "perf sched stats tests"; done
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok
 92: perf sched stats tests                                          : Ok


Tested-by: Athira Rajeev <atrajeev@linux.ibm.com>

Thanks
Athira



* [PATCH v2] perf sched stats: Fix segmentation faults in diff mode
  2026-04-28  7:08 [PATCH v1] perf sched stats: Fix segmentation faults in diff mode Ian Rogers
  2026-04-29 12:30 ` James Clark
  2026-04-29 14:01 ` Athira Rajeev
@ 2026-04-29 17:39 ` Ian Rogers
  2026-04-30 13:29   ` James Clark
                     ` (2 more replies)
  2 siblings, 3 replies; 11+ messages in thread
From: Ian Rogers @ 2026-04-29 17:39 UTC (permalink / raw)
  To: acme, atrajeev, james.clark, namhyung
  Cc: adrian.hunter, linux-kernel, linux-perf-users, mingo, peterz,
	venkat88, Ian Rogers

Address several segmentation fault vectors in `perf sched stats diff`:

1. In `get_all_cpu_stats()`, added `assert(!list_empty(head))` to prevent
   unsafe `list_first_entry()` calls on empty lists, fixed uninitialized
   variable `ret`, and added `list_is_last` check when iterating domains to
   prevent out-of-bounds reads during parallel list traversal.

2. In `show_schedstat_data()`, added NULL checks for `cd_map1` and `cd_map2`
   to gracefully handle invalid or empty data files.

3. Added parallel iteration termination checks using `list_is_last()` in
   `show_schedstat_data()` for both domain and CPU lists to safely terminate
   at the end of each list when files contain a different number of CPUs
   or domains.

4. Added CPU bounds checks (`cs1->cpu >= nr1` and `cs2->cpu >= nr2`) in
   `show_schedstat_data()` to prevent out-of-bounds reads from `cd_map1` and
   `cd_map2` when comparing files from machines with different CPU counts.

5. Added NULL checks for `cd_info1` and `cd_info2` to prevent crashes when
   a CPU has data samples but no corresponding domain info in the header.

6. Added domain bounds checks (`ds1->domain >= cd_info1->nr_domains` and
   `ds2->domain >= cd_info2->nr_domains`) to prevent out-of-bounds array
   accesses in the domains array.

7. Added NULL checks for `dinfo1` and `dinfo2` in `show_schedstat_data()`
   to prevent crashes when a domain has no corresponding domain info.

8. Zero-initialized the `perf_data` array in `perf_sched__schedstat_diff()`
   to prevent stack garbage from causing `perf_data_file__fd()` to attempt
   to use a NULL `fptr` when `use_stdio` happened to be non-zero.
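
The `list_is_last()` pattern behind fixes 1 and 3 can be sketched outside the kernel like so. This is a minimal, self-contained stand-in for list.h (the helpers and names below are illustrative, not perf's actual code): on a circular kernel-style list, advancing a cursor past the last entry lands on the sentinel head, whose container is not a valid entry, so a parallel traversal must explicitly drop the shorter list's cursor.

```c
#include <stddef.h>

/* Circular doubly linked list with a sentinel head, in the style of the
 * kernel's list.h (heavily simplified; helpers here are illustrative). */
struct list_head { struct list_head *next, *prev; };
struct item { int val; struct list_head list; };

#define to_item(ptr) \
	((struct item *)((char *)(ptr) - offsetof(struct item, list)))

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static int list_is_last(const struct list_head *n, const struct list_head *h)
{
	return n->next == h;
}

/* Pair up entries of 'a' and 'b' (here: summing vals). The cursor over
 * 'b' is set to NULL once its last entry has been consumed; without that
 * check, stepping to ib->list.next would reinterpret the sentinel head
 * as a struct item and read garbage. */
static int sum_pairs(struct list_head *a, struct list_head *b)
{
	struct item *ib = (b->next == b) ? NULL : to_item(b->next);
	int sum = 0;

	for (struct list_head *p = a->next; p != a; p = p->next) {
		sum += to_item(p)->val;
		if (ib) {
			sum += ib->val;
			ib = list_is_last(&ib->list, b) ? NULL
				: to_item(ib->list.next);
		}
	}
	return sum;
}
```

With a = {1, 2, 3} and b = {10, 20}, the traversal pairs 1+10 and 2+20, then processes 3 alone, rather than wrapping onto b's list head.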

Assisted-by: Gemini:gemini-3.1-pro-preview
Signed-off-by: Ian Rogers <irogers@google.com>
---
v2: Reduce indentation and variable scopes. Hopefully address feedback
    from James Clark.

Previously this patch was part of a large perf script refactor:
https://lore.kernel.org/lkml/20260425224951.174663-1-irogers@google.com/
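
As a side note on fix 8, the hazard closed by `= {0}` can be sketched with a hypothetical struct (the field names below are made up for illustration, not perf's real `struct perf_data` layout): an uninitialized aggregate on the stack may carry a nonzero flag that steers callers onto a pointer that was never set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a perf_data-like struct. */
struct fake_data {
	bool use_stdio;	/* stack garbage here could read as true */
	void *fptr;	/* only meaningful when use_stdio is set */
	int fd;
};

/* Returns the stdio stream when requested; NULL means "use d->fd".
 * If 'd' were an uninitialized stack variable, use_stdio could be
 * nonzero while fptr is garbage or NULL. Zero-initializing ("= {0}")
 * makes the fd path the deterministic default until fields are set. */
static void *pick_stream(const struct fake_data *d)
{
	return d->use_stdio ? d->fptr : NULL;
}
```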
---
 tools/perf/builtin-sched.c | 144 +++++++++++++++++++++++++++----------
 1 file changed, 106 insertions(+), 38 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 555247568e7a..e67241415255 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -4170,40 +4170,40 @@ static void summarize_schedstat_domain(struct schedstat_domain *summary_domain,
  */
 static int get_all_cpu_stats(struct list_head *head)
 {
-	struct schedstat_cpu *cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
-	struct schedstat_cpu *summary_head = NULL;
-	struct perf_record_schedstat_domain *ds;
-	struct perf_record_schedstat_cpu *cs;
-	struct schedstat_domain *dptr, *tdptr;
+	struct schedstat_cpu *cptr, *summary_head;
+	struct schedstat_domain *dptr;
 	bool is_last = false;
 	int cnt = 1;
-	int ret = 0;
 
-	if (cptr) {
-		summary_head = zalloc(sizeof(*summary_head));
-		if (!summary_head)
-			return -ENOMEM;
+	assert(!list_empty(head));
+	cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
 
-		summary_head->cpu_data = zalloc(sizeof(*cs));
-		memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*cs));
+	summary_head = zalloc(sizeof(*summary_head));
+	if (!summary_head)
+		return -ENOMEM;
 
-		INIT_LIST_HEAD(&summary_head->domain_head);
+	summary_head->cpu_data = zalloc(sizeof(*summary_head->cpu_data));
+	memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*summary_head->cpu_data));
 
-		list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
-			tdptr = zalloc(sizeof(*tdptr));
-			if (!tdptr)
-				return -ENOMEM;
+	INIT_LIST_HEAD(&summary_head->domain_head);
 
-			tdptr->domain_data = zalloc(sizeof(*ds));
-			if (!tdptr->domain_data)
-				return -ENOMEM;
+	list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
+		struct schedstat_domain *tdptr = zalloc(sizeof(*tdptr));
 
-			memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*ds));
-			list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
-		}
+		if (!tdptr)
+			return -ENOMEM;
+
+		tdptr->domain_data = zalloc(sizeof(*tdptr->domain_data));
+		if (!tdptr->domain_data)
+			return -ENOMEM;
+
+		memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*tdptr->domain_data));
+		list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
 	}
 
 	list_for_each_entry(cptr, head, cpu_list) {
+		struct schedstat_domain *tdptr;
+
 		if (list_is_first(&cptr->cpu_list, head))
 			continue;
 
@@ -4212,32 +4212,47 @@ static int get_all_cpu_stats(struct list_head *head)
 
 		cnt++;
 		summarize_schedstat_cpu(summary_head, cptr, cnt, is_last);
+		if (list_empty(&summary_head->domain_head))
+			continue;
+
 		tdptr = list_first_entry(&summary_head->domain_head, struct schedstat_domain,
 					 domain_list);
 
 		list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
 			summarize_schedstat_domain(tdptr, dptr, cnt, is_last);
+			if (list_is_last(&tdptr->domain_list, &summary_head->domain_head)) {
+				tdptr = NULL;
+				break;
+			}
 			tdptr = list_next_entry(tdptr, domain_list);
 		}
 	}
 
 	list_add(&summary_head->cpu_list, head);
-	return ret;
+	return 0;
 }
 
-static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1,
-			       struct list_head *head2, struct cpu_domain_map **cd_map2,
+static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1, int nr1,
+			       struct list_head *head2, struct cpu_domain_map **cd_map2, int nr2,
 			       bool summary_only)
 {
 	struct schedstat_cpu *cptr1 = list_first_entry(head1, struct schedstat_cpu, cpu_list);
 	struct perf_record_schedstat_domain *ds1 = NULL, *ds2 = NULL;
-	struct perf_record_schedstat_cpu *cs1 = NULL, *cs2 = NULL;
 	struct schedstat_domain *dptr1 = NULL, *dptr2 = NULL;
 	struct schedstat_cpu *cptr2 = NULL;
 	__u64 jiffies1 = 0, jiffies2 = 0;
 	bool is_summary = true;
 	int ret = 0;
 
+	if (!cd_map1) {
+		pr_err("Error: CPU domain map 1 is missing.\n");
+		return -1;
+	}
+	if (head2 && !cd_map2) {
+		pr_err("Error: CPU domain map 2 is missing.\n");
+		return -1;
+	}
+
 	printf("Description\n");
 	print_separator2(SEP_LEN, "", 0);
 	printf("%-30s-> %s\n", "DESC", "Description of the field");
@@ -4267,14 +4282,36 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
 
 	list_for_each_entry(cptr1, head1, cpu_list) {
 		struct cpu_domain_map *cd_info1 = NULL, *cd_info2 = NULL;
+		struct perf_record_schedstat_cpu *cs1 = cptr1->cpu_data;
+		struct perf_record_schedstat_cpu *cs2 = NULL;
 
-		cs1 = cptr1->cpu_data;
+		dptr2 = NULL;
+		if (cs1->cpu >= (u32)nr1) {
+			pr_err("Error: CPU %d exceeds domain map size %d\n", cs1->cpu, nr1);
+			return -1;
+		}
 		cd_info1 = cd_map1[cs1->cpu];
+		if (!cd_info1) {
+			pr_err("Error: CPU %d domain info is missing in map 1.\n",
+			       cs1->cpu);
+			return -1;
+		}
 		if (cptr2) {
 			cs2 = cptr2->cpu_data;
+			if (cs2->cpu >= (u32)nr2) {
+				pr_err("Error: CPU %d exceeds domain map size %d\n", cs2->cpu, nr2);
+				return -1;
+			}
 			cd_info2 = cd_map2[cs2->cpu];
-			dptr2 = list_first_entry(&cptr2->domain_head, struct schedstat_domain,
-						 domain_list);
+			if (!cd_info2) {
+				pr_err("Error: CPU %d domain info is missing in map 2.\n",
+				       cs2->cpu);
+				return -1;
+			}
+			if (!list_empty(&cptr2->domain_head))
+				dptr2 = list_first_entry(&cptr2->domain_head,
+							 struct schedstat_domain,
+							 domain_list);
 		}
 
 		if (cs2 && cs1->cpu != cs2->cpu) {
@@ -4302,10 +4339,30 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
 			struct domain_info *dinfo1 = NULL, *dinfo2 = NULL;
 
 			ds1 = dptr1->domain_data;
+			if (ds1->domain >= cd_info1->nr_domains) {
+				pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 1.\n",
+				       ds1->domain, cd_info1->nr_domains, cs1->cpu);
+				return -1;
+			}
 			dinfo1 = cd_info1->domains[ds1->domain];
+			if (!dinfo1) {
+				pr_err("Error: Domain %d info is missing for CPU %d in map 1.\n",
+				       ds1->domain, cs1->cpu);
+				return -1;
+			}
 			if (dptr2) {
 				ds2 = dptr2->domain_data;
+				if (ds2->domain >= cd_info2->nr_domains) {
+					pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 2.\n",
+					       ds2->domain, cd_info2->nr_domains, cs2->cpu);
+					return -1;
+				}
 				dinfo2 = cd_info2->domains[ds2->domain];
+				if (!dinfo2) {
+					pr_err("Error: Domain %d info is missing for CPU %d in map 2.\n",
+					       ds2->domain, cs2->cpu);
+					return -1;
+				}
 			}
 
 			if (dinfo2 && dinfo1->domain != dinfo2->domain) {
@@ -4334,14 +4391,22 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
 			print_domain_stats(ds1, ds2, jiffies1, jiffies2);
 			print_separator2(SEP_LEN, "", 0);
 
-			if (dptr2)
-				dptr2 = list_next_entry(dptr2, domain_list);
+			if (dptr2) {
+				if (list_is_last(&dptr2->domain_list, &cptr2->domain_head))
+					dptr2 = NULL;
+				else
+					dptr2 = list_next_entry(dptr2, domain_list);
+			}
 		}
 		if (summary_only)
 			break;
 
-		if (cptr2)
-			cptr2 = list_next_entry(cptr2, cpu_list);
+		if (cptr2) {
+			if (list_is_last(&cptr2->cpu_list, head2))
+				cptr2 = NULL;
+			else
+				cptr2 = list_next_entry(cptr2, cpu_list);
+		}
 
 		is_summary = false;
 	}
@@ -4523,7 +4588,9 @@ static int perf_sched__schedstat_report(struct perf_sched *sched)
 		}
 
 		cd_map = session->header.env.cpu_domain;
-		err = show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
+		err = show_schedstat_data(&cpu_head, cd_map,
+					  session->header.env.nr_cpus_avail,
+					  NULL, NULL, 0, false);
 	}
 
 out:
@@ -4538,7 +4605,7 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
 	struct cpu_domain_map **cd_map0 = NULL, **cd_map1 = NULL;
 	struct list_head cpu_head_ses0, cpu_head_ses1;
 	struct perf_session *session[2];
-	struct perf_data data[2];
+	struct perf_data data[2] = {0};
 	int ret = 0, err = 0;
 	static const char *defaults[] = {
 		"perf.data.old",
@@ -4610,7 +4677,8 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
 		goto out_delete_ses0;
 	}
 
-	show_schedstat_data(&cpu_head_ses0, cd_map0, &cpu_head_ses1, cd_map1, true);
+	show_schedstat_data(&cpu_head_ses0, cd_map0, session[0]->header.env.nr_cpus_avail,
+			    &cpu_head_ses1, cd_map1, session[1]->header.env.nr_cpus_avail, true);
 
 out_delete_ses1:
 	free_schedstat(&cpu_head_ses1);
@@ -4720,7 +4788,7 @@ static int perf_sched__schedstat_live(struct perf_sched *sched,
 		goto out;
 	}
 
-	show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
+	show_schedstat_data(&cpu_head, cd_map, nr, NULL, NULL, 0, false);
 	free_cpu_domain_info(cd_map, sv, nr);
 out:
 	free_schedstat(&cpu_head);
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] perf sched stats: Fix segmentation faults in diff mode
  2026-04-29 17:39 ` [PATCH v2] " Ian Rogers
@ 2026-04-30 13:29   ` James Clark
  2026-05-01 15:16   ` Athira Rajeev
  2026-05-06  0:00   ` [PATCH v3] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers " Ian Rogers
  2 siblings, 0 replies; 11+ messages in thread
From: James Clark @ 2026-04-30 13:29 UTC (permalink / raw)
  To: Ian Rogers
  Cc: adrian.hunter, linux-kernel, linux-perf-users, mingo, peterz,
	venkat88, acme, atrajeev, namhyung



On 29/04/2026 6:39 pm, Ian Rogers wrote:
> Address several segmentation fault vectors in `perf sched stats diff`:
> 
> 1. In `get_all_cpu_stats()`, added `assert(!list_empty(head))` to prevent
>     unsafe `list_first_entry()` calls on empty lists, fixed uninitialized
>     variable `ret`, and added `list_is_last` check when iterating domains to
>     prevent out-of-bounds reads during parallel list traversal.
> 
> 2. In `show_schedstat_data()`, added NULL checks for `cd_map1` and `cd_map2`
>     to gracefully handle invalid or empty data files.
> 
> 3. Added parallel iteration termination checks using `list_is_last()` in
>     `show_schedstat_data()` for both domain and CPU lists to safely terminate
>     at the end of each list when files contain a different number of CPUs
>     or domains.
> 
> 4. Added CPU bounds checks (`cs1->cpu >= nr1` and `cs2->cpu >= nr2`) in
>     `show_schedstat_data()` to prevent out-of-bounds reads from `cd_map1` and
>     `cd_map2` when comparing files from machines with different CPU counts.
> 
> 5. Added NULL checks for `cd_info1` and `cd_info2` to prevent crashes when
>     a CPU has data samples but no corresponding domain info in the header.
> 
> 6. Added domain bounds checks (`ds1->domain >= cd_info1->nr_domains` and
>     `ds2->domain >= cd_info2->nr_domains`) to prevent out-of-bounds array
>     accesses in the domains array.
> 
> 7. Added NULL checks for `dinfo1` and `dinfo2` in `show_schedstat_data()`
>     to prevent crashes when a domain has no corresponding domain info.
> 
> 8. Zero-initialized the `perf_data` array in `perf_sched__schedstat_diff()`
>     to prevent stack garbage from causing `perf_data_file__fd()` to attempt
>     to use a NULL `fptr` when `use_stdio` happened to be non-zero.
> 
> Assisted-by: Gemini:gemini-3.1-pro-preview
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
> v2: Reduce indentation and variable scopes. Hopefully address feedback
>      from James Clark.
> 
> Previously this patch was part of a large perf script refactor:
> https://lore.kernel.org/lkml/20260425224951.174663-1-irogers@google.com/
> ---
>   tools/perf/builtin-sched.c | 144 +++++++++++++++++++++++++++----------
>   1 file changed, 106 insertions(+), 38 deletions(-)
> 

Reviewed-by: James Clark <james.clark@linaro.org>



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] perf sched stats: Fix segmentation faults in diff mode
  2026-04-29 17:39 ` [PATCH v2] " Ian Rogers
  2026-04-30 13:29   ` James Clark
@ 2026-05-01 15:16   ` Athira Rajeev
  2026-05-06  0:00   ` [PATCH v3] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers " Ian Rogers
  2 siblings, 0 replies; 11+ messages in thread
From: Athira Rajeev @ 2026-05-01 15:16 UTC (permalink / raw)
  To: Ian Rogers
  Cc: acme, james.clark, namhyung, adrian.hunter, linux-kernel,
	linux-perf-users, mingo, peterz, venkat88



> On 29 Apr 2026, at 11:09 PM, Ian Rogers <irogers@google.com> wrote:
> 
> Address several segmentation fault vectors in `perf sched stats diff`:
> 
> 1. In `get_all_cpu_stats()`, added `assert(!list_empty(head))` to prevent
>   unsafe `list_first_entry()` calls on empty lists, fixed uninitialized
>   variable `ret`, and added `list_is_last` check when iterating domains to
>   prevent out-of-bounds reads during parallel list traversal.
> 
> 2. In `show_schedstat_data()`, added NULL checks for `cd_map1` and `cd_map2`
>   to gracefully handle invalid or empty data files.
> 
> 3. Added parallel iteration termination checks using `list_is_last()` in
>   `show_schedstat_data()` for both domain and CPU lists to safely terminate
>   at the end of each list when files contain a different number of CPUs
>   or domains.
> 
> 4. Added CPU bounds checks (`cs1->cpu >= nr1` and `cs2->cpu >= nr2`) in
>   `show_schedstat_data()` to prevent out-of-bounds reads from `cd_map1` and
>   `cd_map2` when comparing files from machines with different CPU counts.
> 
> 5. Added NULL checks for `cd_info1` and `cd_info2` to prevent crashes when
>   a CPU has data samples but no corresponding domain info in the header.
> 
> 6. Added domain bounds checks (`ds1->domain >= cd_info1->nr_domains` and
>   `ds2->domain >= cd_info2->nr_domains`) to prevent out-of-bounds array
>   accesses in the domains array.
> 
> 7. Added NULL checks for `dinfo1` and `dinfo2` in `show_schedstat_data()`
>   to prevent crashes when a domain has no corresponding domain info.
> 
> 8. Zero-initialized the `perf_data` array in `perf_sched__schedstat_diff()`
>   to prevent stack garbage from causing `perf_data_file__fd()` to attempt
>   to use a NULL `fptr` when `use_stdio` happened to be non-zero.
> 
> Assisted-by: Gemini:gemini-3.1-pro-preview
> Signed-off-by: Ian Rogers <irogers@google.com>
> ---
> v2: Reduce indentation and variable scopes. Hopefully address feedback
>    from James Clark.
> 
> Previously this patch was part of a large perf script refactor:
> https://lore.kernel.org/lkml/20260425224951.174663-1-irogers@google.com/

Tested-by: Athira Rajeev <atrajeev@linux.ibm.com>

Thanks
Athira


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers in diff mode
  2026-04-29 17:39 ` [PATCH v2] " Ian Rogers
  2026-04-30 13:29   ` James Clark
  2026-05-01 15:16   ` Athira Rajeev
@ 2026-05-06  0:00   ` Ian Rogers
  2026-05-06  4:10     ` [PATCH v4] " Ian Rogers
  2 siblings, 1 reply; 11+ messages in thread
From: Ian Rogers @ 2026-05-06  0:00 UTC (permalink / raw)
  To: irogers, acme, atrajeev, james.clark, namhyung
  Cc: adrian.hunter, linux-kernel, linux-perf-users, mingo, peterz,
	venkat88

The patch addresses multiple segmentation fault vectors, out-of-bounds
reads, and memory leaks in perf sched stats by adding bounds checks,
NULL checks, proper error propagation, and robust memory cleanup.

1. In get_all_cpu_stats(), added assert(!list_empty(head)) to prevent
   unsafe list_first_entry() calls on empty lists, added a missing NULL
   check for summary_head->cpu_data allocation, and implemented a cleanup
   ladder using a temporary list to prevent memory leaks on error paths.
2. In free_schedstat(), fixed memory leaks by ensuring internal domain_data
   and cpu_data pointers are freed.
3. In show_schedstat_data(), fixed a stale pointer issue where ds2 retained
   its value from a previous iteration when dptr2 became NULL, and added
   proper propagation of errors from get_all_cpu_stats().
4. Propagated show_schedstat_data() errors up to perf_sched__schedstat_diff()
   and perf_sched__schedstat_live() to prevent output corruption on failure.
5. In show_schedstat_data(), added NULL checks for cd_map1 and cd_map2
   to gracefully handle invalid or empty data files.
6. Added parallel iteration termination checks using list_is_last() in
   show_schedstat_data() for both domain and CPU lists to safely terminate
   at the end of each list when files contain a different number of CPUs
   or domains.
7. Added CPU bounds checks (cs1->cpu >= nr1 and cs2->cpu >= nr2) in
   show_schedstat_data() to prevent out-of-bounds reads from cd_map1 and
   cd_map2 when comparing files from machines with different CPU counts.
8. Added NULL checks for cd_info1 and cd_info2 to prevent crashes when
   a CPU has data samples but no corresponding domain info in the header.
9. Added domain bounds checks (ds1->domain >= cd_info1->nr_domains and
   ds2->domain >= cd_info2->nr_domains) to prevent out-of-bounds array
   accesses in the domains array.
10. Added NULL checks for dinfo1 and dinfo2 in show_schedstat_data()
    to prevent crashes when a domain has no corresponding domain info.
11. Zero-initialized the perf_data array in perf_sched__schedstat_diff()
    to prevent stack garbage from causing perf_data_file__fd() to attempt
    to use a NULL fptr when use_stdio happened to be non-zero.

Assisted-by: Gemini:gemini-3.1-pro-preview
Signed-off-by: Ian Rogers <irogers@google.com>

---
v3:
Address sashiko feedback:
 - Fix memory leaks in free_schedstat() by freeing internal data.
 - Fix missing NULL check for summary_head->cpu_data in get_all_cpu_stats().
 - Fix memory leaks in get_all_cpu_stats() on error paths using a temporary cleanup list.
 - Fix stale ds2 pointer comparison in show_schedstat_data() by resetting it to NULL.
 - Add error propagation from get_all_cpu_stats() and show_schedstat_data().
 - Correct commit message to remove inaccurate claim about uninitialized variable 'ret'.
 - Add forward declaration for free_schedstat() to fix compilation error.

v2: Carried a Tested-by from Athira Rajeev and a Reviewed-by from
    James Clark; both tags were dropped in v3 given the number of changes.
https://lore.kernel.org/linux-perf-users/20260429173931.2700115-1-irogers@google.com/
---
 tools/perf/builtin-sched.c | 165 +++++++++++++++++++++++++++++--------
 1 file changed, 131 insertions(+), 34 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 555247568e7a..e39a30056dc5 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -3947,6 +3947,8 @@ static struct schedstat_domain *domain_second_pass;
 static bool after_workload_flag;
 static bool verbose_field;
 
+static void free_schedstat(struct list_head *head);
+
 static void store_schedstat_cpu_diff(struct schedstat_cpu *after_workload)
 {
 	struct perf_record_schedstat_cpu *before = cpu_second_pass->cpu_data;
@@ -4170,37 +4172,50 @@ static void summarize_schedstat_domain(struct schedstat_domain *summary_domain,
  */
 static int get_all_cpu_stats(struct list_head *head)
 {
-	struct schedstat_cpu *cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
-	struct schedstat_cpu *summary_head = NULL;
-	struct perf_record_schedstat_domain *ds;
-	struct perf_record_schedstat_cpu *cs;
+	struct schedstat_cpu *cptr, *summary_head = NULL;
 	struct schedstat_domain *dptr, *tdptr;
 	bool is_last = false;
 	int cnt = 1;
 	int ret = 0;
+	struct list_head tmp_cleanup_list;
 
-	if (cptr) {
-		summary_head = zalloc(sizeof(*summary_head));
-		if (!summary_head)
-			return -ENOMEM;
+	assert(!list_empty(head));
+	cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
 
-		summary_head->cpu_data = zalloc(sizeof(*cs));
-		memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*cs));
+	INIT_LIST_HEAD(&tmp_cleanup_list);
 
-		INIT_LIST_HEAD(&summary_head->domain_head);
+	summary_head = zalloc(sizeof(*summary_head));
+	if (!summary_head)
+		return -ENOMEM;
 
-		list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
-			tdptr = zalloc(sizeof(*tdptr));
-			if (!tdptr)
-				return -ENOMEM;
+	INIT_LIST_HEAD(&summary_head->domain_head);
+	INIT_LIST_HEAD(&summary_head->cpu_list);
+	list_add(&summary_head->cpu_list, &tmp_cleanup_list);
 
-			tdptr->domain_data = zalloc(sizeof(*ds));
-			if (!tdptr->domain_data)
-				return -ENOMEM;
+	summary_head->cpu_data = zalloc(sizeof(*summary_head->cpu_data));
+	if (!summary_head->cpu_data) {
+		ret = -ENOMEM;
+		goto out_cleanup;
+	}
+	memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*summary_head->cpu_data));
 
-			memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*ds));
-			list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
+	list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
+		tdptr = zalloc(sizeof(*tdptr));
+		if (!tdptr) {
+			ret = -ENOMEM;
+			goto out_cleanup;
 		}
+		INIT_LIST_HEAD(&tdptr->domain_list);
+
+		tdptr->domain_data = zalloc(sizeof(*tdptr->domain_data));
+		if (!tdptr->domain_data) {
+			free(tdptr);
+			ret = -ENOMEM;
+			goto out_cleanup;
+		}
+
+		memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*tdptr->domain_data));
+		list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
 	}
 
 	list_for_each_entry(cptr, head, cpu_list) {
@@ -4212,32 +4227,52 @@ static int get_all_cpu_stats(struct list_head *head)
 
 		cnt++;
 		summarize_schedstat_cpu(summary_head, cptr, cnt, is_last);
+		if (list_empty(&summary_head->domain_head))
+			continue;
+
 		tdptr = list_first_entry(&summary_head->domain_head, struct schedstat_domain,
 					 domain_list);
 
 		list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
 			summarize_schedstat_domain(tdptr, dptr, cnt, is_last);
+			if (list_is_last(&tdptr->domain_list, &summary_head->domain_head)) {
+				tdptr = NULL;
+				break;
+			}
 			tdptr = list_next_entry(tdptr, domain_list);
 		}
 	}
 
+	list_del_init(&summary_head->cpu_list);
 	list_add(&summary_head->cpu_list, head);
+	return 0;
+
+out_cleanup:
+	free_schedstat(&tmp_cleanup_list);
 	return ret;
 }
 
-static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1,
-			       struct list_head *head2, struct cpu_domain_map **cd_map2,
+static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1, int nr1,
+			       struct list_head *head2, struct cpu_domain_map **cd_map2, int nr2,
 			       bool summary_only)
 {
 	struct schedstat_cpu *cptr1 = list_first_entry(head1, struct schedstat_cpu, cpu_list);
 	struct perf_record_schedstat_domain *ds1 = NULL, *ds2 = NULL;
-	struct perf_record_schedstat_cpu *cs1 = NULL, *cs2 = NULL;
 	struct schedstat_domain *dptr1 = NULL, *dptr2 = NULL;
 	struct schedstat_cpu *cptr2 = NULL;
 	__u64 jiffies1 = 0, jiffies2 = 0;
 	bool is_summary = true;
 	int ret = 0;
 
+	if (!cd_map1) {
+		pr_err("Error: CPU domain map 1 is missing.\n");
+		return -1;
+	}
+	if (head2 && !cd_map2) {
+		pr_err("Error: CPU domain map 2 is missing.\n");
+		return -1;
+	}
+
 	printf("Description\n");
 	print_separator2(SEP_LEN, "", 0);
 	printf("%-30s-> %s\n", "DESC", "Description of the field");
@@ -4260,21 +4295,47 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
 	printf("\n");
 
 	ret = get_all_cpu_stats(head1);
+	if (ret)
+		return ret;
 	if (cptr2) {
 		ret = get_all_cpu_stats(head2);
+		if (ret)
+			return ret;
 		cptr2 = list_first_entry(head2, struct schedstat_cpu, cpu_list);
 	}
 
 	list_for_each_entry(cptr1, head1, cpu_list) {
 		struct cpu_domain_map *cd_info1 = NULL, *cd_info2 = NULL;
+		struct perf_record_schedstat_cpu *cs1 = cptr1->cpu_data;
+		struct perf_record_schedstat_cpu *cs2 = NULL;
 
-		cs1 = cptr1->cpu_data;
+		dptr2 = NULL;
+		if (cs1->cpu >= (u32)nr1) {
+			pr_err("Error: CPU %d exceeds domain map size %d\n", cs1->cpu, nr1);
+			return -1;
+		}
 		cd_info1 = cd_map1[cs1->cpu];
+		if (!cd_info1) {
+			pr_err("Error: CPU %d domain info is missing in map 1.\n",
+			       cs1->cpu);
+			return -1;
+		}
 		if (cptr2) {
 			cs2 = cptr2->cpu_data;
+			if (cs2->cpu >= (u32)nr2) {
+				pr_err("Error: CPU %d exceeds domain map size %d\n", cs2->cpu, nr2);
+				return -1;
+			}
 			cd_info2 = cd_map2[cs2->cpu];
-			dptr2 = list_first_entry(&cptr2->domain_head, struct schedstat_domain,
-						 domain_list);
+			if (!cd_info2) {
+				pr_err("Error: CPU %d domain info is missing in map 2.\n",
+				       cs2->cpu);
+				return -1;
+			}
+			if (!list_empty(&cptr2->domain_head))
+				dptr2 = list_first_entry(&cptr2->domain_head,
+							 struct schedstat_domain,
+							 domain_list);
 		}
 
 		if (cs2 && cs1->cpu != cs2->cpu) {
@@ -4302,10 +4363,31 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
 			struct domain_info *dinfo1 = NULL, *dinfo2 = NULL;
 
 			ds1 = dptr1->domain_data;
+			ds2 = NULL;
+			if (ds1->domain >= cd_info1->nr_domains) {
+				pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 1.\n",
+				       ds1->domain, cd_info1->nr_domains, cs1->cpu);
+				return -1;
+			}
 			dinfo1 = cd_info1->domains[ds1->domain];
+			if (!dinfo1) {
+				pr_err("Error: Domain %d info is missing for CPU %d in map 1.\n",
+				       ds1->domain, cs1->cpu);
+				return -1;
+			}
 			if (dptr2) {
 				ds2 = dptr2->domain_data;
+				if (ds2->domain >= cd_info2->nr_domains) {
+					pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 2.\n",
+					       ds2->domain, cd_info2->nr_domains, cs2->cpu);
+					return -1;
+				}
 				dinfo2 = cd_info2->domains[ds2->domain];
+				if (!dinfo2) {
+					pr_err("Error: Domain %d info is missing for CPU %d in map 2.\n",
+					       ds2->domain, cs2->cpu);
+					return -1;
+				}
 			}
 
 			if (dinfo2 && dinfo1->domain != dinfo2->domain) {
@@ -4334,14 +4416,22 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
 			print_domain_stats(ds1, ds2, jiffies1, jiffies2);
 			print_separator2(SEP_LEN, "", 0);
 
-			if (dptr2)
-				dptr2 = list_next_entry(dptr2, domain_list);
+			if (dptr2) {
+				if (list_is_last(&dptr2->domain_list, &cptr2->domain_head))
+					dptr2 = NULL;
+				else
+					dptr2 = list_next_entry(dptr2, domain_list);
+			}
 		}
 		if (summary_only)
 			break;
 
-		if (cptr2)
-			cptr2 = list_next_entry(cptr2, cpu_list);
+		if (cptr2) {
+			if (list_is_last(&cptr2->cpu_list, head2))
+				cptr2 = NULL;
+			else
+				cptr2 = list_next_entry(cptr2, cpu_list);
+		}
 
 		is_summary = false;
 	}
@@ -4473,9 +4563,11 @@ static void free_schedstat(struct list_head *head)
 	list_for_each_entry_safe(cptr, n2, head, cpu_list) {
 		list_for_each_entry_safe(dptr, n1, &cptr->domain_head, domain_list) {
 			list_del_init(&dptr->domain_list);
+			free(dptr->domain_data);
 			free(dptr);
 		}
 		list_del_init(&cptr->cpu_list);
+		free(cptr->cpu_data);
 		free(cptr);
 	}
 }
@@ -4523,7 +4615,9 @@ static int perf_sched__schedstat_report(struct perf_sched *sched)
 		}
 
 		cd_map = session->header.env.cpu_domain;
-		err = show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
+		err = show_schedstat_data(&cpu_head, cd_map,
+					  session->header.env.nr_cpus_avail,
+					  NULL, NULL, 0, false);
 	}
 
 out:
@@ -4538,7 +4632,7 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
 	struct cpu_domain_map **cd_map0 = NULL, **cd_map1 = NULL;
 	struct list_head cpu_head_ses0, cpu_head_ses1;
 	struct perf_session *session[2];
-	struct perf_data data[2];
+	struct perf_data data[2] = {0};
 	int ret = 0, err = 0;
 	static const char *defaults[] = {
 		"perf.data.old",
@@ -4610,7 +4704,10 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
 		goto out_delete_ses0;
 	}
 
-	show_schedstat_data(&cpu_head_ses0, cd_map0, &cpu_head_ses1, cd_map1, true);
+	ret = show_schedstat_data(&cpu_head_ses0, cd_map0, session[0]->header.env.nr_cpus_avail,
+				  &cpu_head_ses1, cd_map1, session[1]->header.env.nr_cpus_avail, true);
+	if (ret)
+		goto out_delete_ses1;
 
 out_delete_ses1:
 	free_schedstat(&cpu_head_ses1);
@@ -4720,7 +4817,7 @@ static int perf_sched__schedstat_live(struct perf_sched *sched,
 		goto out;
 	}
 
-	show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
+	err = show_schedstat_data(&cpu_head, cd_map, nr, NULL, NULL, 0, false);
 	free_cpu_domain_info(cd_map, sv, nr);
 out:
 	free_schedstat(&cpu_head);
-- 
2.54.0.545.g6539524ca2-goog



* [PATCH v4] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers in diff mode
  2026-05-06  0:00   ` [PATCH v3] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers " Ian Rogers
@ 2026-05-06  4:10     ` Ian Rogers
  2026-05-06  6:45       ` Ian Rogers
  2026-05-06  9:07       ` James Clark
  0 siblings, 2 replies; 11+ messages in thread
From: Ian Rogers @ 2026-05-06  4:10 UTC (permalink / raw)
  To: irogers, acme, atrajeev, james.clark, namhyung
  Cc: adrian.hunter, linux-kernel, linux-perf-users, mingo, peterz,
	venkat88

The patch addresses multiple segmentation fault vectors, out-of-bounds
reads, and memory leaks in perf sched stats by adding bounds checks,
NULL checks, proper error propagation, and robust memory cleanup.

1. In get_all_cpu_stats(), added assert(!list_empty(head)) to prevent
   unsafe list_first_entry() calls on empty lists, added a missing NULL
   check for summary_head->cpu_data allocation, and implemented a cleanup
   ladder using a temporary list to prevent memory leaks on error paths.
2. In free_schedstat(), fixed memory leaks by ensuring internal domain_data
   and cpu_data pointers are freed.
3. In show_schedstat_data(), fixed a stale pointer issue where ds2 retained
   its value from a previous iteration when dptr2 became NULL, and added
   proper propagation of errors from get_all_cpu_stats().
4. Propagated show_schedstat_data() errors up to perf_sched__schedstat_diff()
   and perf_sched__schedstat_live() to prevent output corruption on failure.
5. In show_schedstat_data(), added NULL checks for cd_map1 and cd_map2
   to gracefully handle invalid or empty data files.
6. Added parallel iteration termination checks using list_is_last() in
   show_schedstat_data() for both domain and CPU lists to safely terminate
   at the end of each list when files contain a different number of CPUs
   or domains.
7. Added CPU bounds checks (cs1->cpu >= nr1 and cs2->cpu >= nr2) in
   show_schedstat_data() to prevent out-of-bounds reads from cd_map1 and
   cd_map2 when comparing files from machines with different CPU counts.
8. Added NULL checks for cd_info1 and cd_info2 to prevent crashes when
   a CPU has data samples but no corresponding domain info in the header.
9. Added domain bounds checks (ds1->domain >= cd_info1->nr_domains and
   ds2->domain >= cd_info2->nr_domains) to prevent out-of-bounds array
   accesses in the domains array.
10. Added NULL checks for dinfo1 and dinfo2 in show_schedstat_data()
    to prevent crashes when a domain has no corresponding domain info.
11. Zero-initialized the perf_data array in perf_sched__schedstat_diff()
    to prevent stack garbage from causing perf_data_file__fd() to attempt
    to use a NULL fptr when use_stdio happened to be non-zero.

Assisted-by: Gemini:gemini-3.1-pro-preview
Signed-off-by: Ian Rogers <irogers@google.com>

---
v4:
Address further sashiko feedback:
 - Fix memory leaks of global cpu_head list when perf_session__process_events() fails.
 - Fix resource leak of second session by correcting cleanup order when first session's data is empty.

v3:
Address sashiko feedback:
 - Fix memory leaks in free_schedstat() by freeing internal data.
 - Fix missing NULL check for summary_head->cpu_data in get_all_cpu_stats().
 - Fix memory leaks in get_all_cpu_stats() on error paths using a temporary cleanup list.
 - Fix stale ds2 pointer comparison in show_schedstat_data() by resetting it to NULL.
 - Add error propagation from get_all_cpu_stats() and show_schedstat_data().
 - Correct commit message to remove inaccurate claim about uninitialized variable 'ret'.
 - Add forward declaration for free_schedstat() to fix compilation error.

v2: Carried a Tested-by from Athira Rajeev and a Reviewed-by from
    James Clark; both tags were dropped in v3 given the number of changes.
https://lore.kernel.org/linux-perf-users/20260429173931.2700115-1-irogers@google.com/
---
 tools/perf/builtin-sched.c | 175 +++++++++++++++++++++++++++++--------
 1 file changed, 138 insertions(+), 37 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 555247568e7a..fd679b106582 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -3947,6 +3947,8 @@ static struct schedstat_domain *domain_second_pass;
 static bool after_workload_flag;
 static bool verbose_field;
 
+static void free_schedstat(struct list_head *head);
+
 static void store_schedstat_cpu_diff(struct schedstat_cpu *after_workload)
 {
 	struct perf_record_schedstat_cpu *before = cpu_second_pass->cpu_data;
@@ -4170,37 +4172,50 @@ static void summarize_schedstat_domain(struct schedstat_domain *summary_domain,
  */
 static int get_all_cpu_stats(struct list_head *head)
 {
-	struct schedstat_cpu *cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
-	struct schedstat_cpu *summary_head = NULL;
-	struct perf_record_schedstat_domain *ds;
-	struct perf_record_schedstat_cpu *cs;
+	struct schedstat_cpu *cptr, *summary_head = NULL;
 	struct schedstat_domain *dptr, *tdptr;
 	bool is_last = false;
 	int cnt = 1;
 	int ret = 0;
+	struct list_head tmp_cleanup_list;
 
-	if (cptr) {
-		summary_head = zalloc(sizeof(*summary_head));
-		if (!summary_head)
-			return -ENOMEM;
+	assert(!list_empty(head));
+	cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
 
-		summary_head->cpu_data = zalloc(sizeof(*cs));
-		memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*cs));
+	INIT_LIST_HEAD(&tmp_cleanup_list);
 
-		INIT_LIST_HEAD(&summary_head->domain_head);
+	summary_head = zalloc(sizeof(*summary_head));
+	if (!summary_head)
+		return -ENOMEM;
 
-		list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
-			tdptr = zalloc(sizeof(*tdptr));
-			if (!tdptr)
-				return -ENOMEM;
+	INIT_LIST_HEAD(&summary_head->domain_head);
+	INIT_LIST_HEAD(&summary_head->cpu_list);
+	list_add(&summary_head->cpu_list, &tmp_cleanup_list);
 
-			tdptr->domain_data = zalloc(sizeof(*ds));
-			if (!tdptr->domain_data)
-				return -ENOMEM;
+	summary_head->cpu_data = zalloc(sizeof(*summary_head->cpu_data));
+	if (!summary_head->cpu_data) {
+		ret = -ENOMEM;
+		goto out_cleanup;
+	}
+	memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*summary_head->cpu_data));
+
+	list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
+		tdptr = zalloc(sizeof(*tdptr));
+		if (!tdptr) {
+			ret = -ENOMEM;
+			goto out_cleanup;
+		}
+		INIT_LIST_HEAD(&tdptr->domain_list);
 
-			memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*ds));
-			list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
+		tdptr->domain_data = zalloc(sizeof(*tdptr->domain_data));
+		if (!tdptr->domain_data) {
+			free(tdptr);
+			ret = -ENOMEM;
+			goto out_cleanup;
 		}
+
+		memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*tdptr->domain_data));
+		list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
 	}
 
 	list_for_each_entry(cptr, head, cpu_list) {
@@ -4212,32 +4227,52 @@ static int get_all_cpu_stats(struct list_head *head)
 
 		cnt++;
 		summarize_schedstat_cpu(summary_head, cptr, cnt, is_last);
+		if (list_empty(&summary_head->domain_head))
+			continue;
+
 		tdptr = list_first_entry(&summary_head->domain_head, struct schedstat_domain,
 					 domain_list);
 
 		list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
 			summarize_schedstat_domain(tdptr, dptr, cnt, is_last);
+			if (list_is_last(&tdptr->domain_list, &summary_head->domain_head)) {
+				tdptr = NULL;
+				break;
+			}
 			tdptr = list_next_entry(tdptr, domain_list);
 		}
 	}
 
+	list_del_init(&summary_head->cpu_list);
 	list_add(&summary_head->cpu_list, head);
+	return 0;
+
+out_cleanup:
+	free_schedstat(&tmp_cleanup_list);
 	return ret;
 }
 
-static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1,
-			       struct list_head *head2, struct cpu_domain_map **cd_map2,
+static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1, int nr1,
+			       struct list_head *head2, struct cpu_domain_map **cd_map2, int nr2,
 			       bool summary_only)
 {
 	struct schedstat_cpu *cptr1 = list_first_entry(head1, struct schedstat_cpu, cpu_list);
 	struct perf_record_schedstat_domain *ds1 = NULL, *ds2 = NULL;
-	struct perf_record_schedstat_cpu *cs1 = NULL, *cs2 = NULL;
 	struct schedstat_domain *dptr1 = NULL, *dptr2 = NULL;
 	struct schedstat_cpu *cptr2 = NULL;
 	__u64 jiffies1 = 0, jiffies2 = 0;
 	bool is_summary = true;
 	int ret = 0;
 
+	if (!cd_map1) {
+		pr_err("Error: CPU domain map 1 is missing.\n");
+		return -1;
+	}
+	if (head2 && !cd_map2) {
+		pr_err("Error: CPU domain map 2 is missing.\n");
+		return -1;
+	}
+
 	printf("Description\n");
 	print_separator2(SEP_LEN, "", 0);
 	printf("%-30s-> %s\n", "DESC", "Description of the field");
@@ -4260,21 +4295,47 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
 	printf("\n");
 
 	ret = get_all_cpu_stats(head1);
+	if (ret)
+		return ret;
 	if (cptr2) {
 		ret = get_all_cpu_stats(head2);
+		if (ret)
+			return ret;
 		cptr2 = list_first_entry(head2, struct schedstat_cpu, cpu_list);
 	}
 
 	list_for_each_entry(cptr1, head1, cpu_list) {
 		struct cpu_domain_map *cd_info1 = NULL, *cd_info2 = NULL;
+		struct perf_record_schedstat_cpu *cs1 = cptr1->cpu_data;
+		struct perf_record_schedstat_cpu *cs2 = NULL;
 
-		cs1 = cptr1->cpu_data;
+		dptr2 = NULL;
+		if (cs1->cpu >= (u32)nr1) {
+			pr_err("Error: CPU %d exceeds domain map size %d\n", cs1->cpu, nr1);
+			return -1;
+		}
 		cd_info1 = cd_map1[cs1->cpu];
+		if (!cd_info1) {
+			pr_err("Error: CPU %d domain info is missing in map 1.\n",
+			       cs1->cpu);
+			return -1;
+		}
 		if (cptr2) {
 			cs2 = cptr2->cpu_data;
+			if (cs2->cpu >= (u32)nr2) {
+				pr_err("Error: CPU %d exceeds domain map size %d\n", cs2->cpu, nr2);
+				return -1;
+			}
 			cd_info2 = cd_map2[cs2->cpu];
-			dptr2 = list_first_entry(&cptr2->domain_head, struct schedstat_domain,
-						 domain_list);
+			if (!cd_info2) {
+				pr_err("Error: CPU %d domain info is missing in map 2.\n",
+				       cs2->cpu);
+				return -1;
+			}
+			if (!list_empty(&cptr2->domain_head))
+				dptr2 = list_first_entry(&cptr2->domain_head,
+							 struct schedstat_domain,
+							 domain_list);
 		}
 
 		if (cs2 && cs1->cpu != cs2->cpu) {
@@ -4302,10 +4363,31 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
 			struct domain_info *dinfo1 = NULL, *dinfo2 = NULL;
 
 			ds1 = dptr1->domain_data;
+			ds2 = NULL;
+			if (ds1->domain >= cd_info1->nr_domains) {
+				pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 1.\n",
+				       ds1->domain, cd_info1->nr_domains, cs1->cpu);
+				return -1;
+			}
 			dinfo1 = cd_info1->domains[ds1->domain];
+			if (!dinfo1) {
+				pr_err("Error: Domain %d info is missing for CPU %d in map 1.\n",
+				       ds1->domain, cs1->cpu);
+				return -1;
+			}
 			if (dptr2) {
 				ds2 = dptr2->domain_data;
+				if (ds2->domain >= cd_info2->nr_domains) {
+					pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 2.\n",
+					       ds2->domain, cd_info2->nr_domains, cs2->cpu);
+					return -1;
+				}
 				dinfo2 = cd_info2->domains[ds2->domain];
+				if (!dinfo2) {
+					pr_err("Error: Domain %d info is missing for CPU %d in map 2.\n",
+					       ds2->domain, cs2->cpu);
+					return -1;
+				}
 			}
 
 			if (dinfo2 && dinfo1->domain != dinfo2->domain) {
@@ -4334,14 +4416,22 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
 			print_domain_stats(ds1, ds2, jiffies1, jiffies2);
 			print_separator2(SEP_LEN, "", 0);
 
-			if (dptr2)
-				dptr2 = list_next_entry(dptr2, domain_list);
+			if (dptr2) {
+				if (list_is_last(&dptr2->domain_list, &cptr2->domain_head))
+					dptr2 = NULL;
+				else
+					dptr2 = list_next_entry(dptr2, domain_list);
+			}
 		}
 		if (summary_only)
 			break;
 
-		if (cptr2)
-			cptr2 = list_next_entry(cptr2, cpu_list);
+		if (cptr2) {
+			if (list_is_last(&cptr2->cpu_list, head2))
+				cptr2 = NULL;
+			else
+				cptr2 = list_next_entry(cptr2, cpu_list);
+		}
 
 		is_summary = false;
 	}
@@ -4473,9 +4563,11 @@ static void free_schedstat(struct list_head *head)
 	list_for_each_entry_safe(cptr, n2, head, cpu_list) {
 		list_for_each_entry_safe(dptr, n1, &cptr->domain_head, domain_list) {
 			list_del_init(&dptr->domain_list);
+			free(dptr->domain_data);
 			free(dptr);
 		}
 		list_del_init(&cptr->cpu_list);
+		free(cptr->cpu_data);
 		free(cptr);
 	}
 }
@@ -4523,7 +4615,9 @@ static int perf_sched__schedstat_report(struct perf_sched *sched)
 		}
 
 		cd_map = session->header.env.cpu_domain;
-		err = show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
+		err = show_schedstat_data(&cpu_head, cd_map,
+					  session->header.env.nr_cpus_avail,
+					  NULL, NULL, 0, false);
 	}
 
 out:
@@ -4538,7 +4632,7 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
 	struct cpu_domain_map **cd_map0 = NULL, **cd_map1 = NULL;
 	struct list_head cpu_head_ses0, cpu_head_ses1;
 	struct perf_session *session[2];
-	struct perf_data data[2];
+	struct perf_data data[2] = {0};
 	int ret = 0, err = 0;
 	static const char *defaults[] = {
 		"perf.data.old",
@@ -4573,8 +4667,10 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
 	}
 
 	err = perf_session__process_events(session[0]);
-	if (err)
+	if (err) {
+		free_schedstat(&cpu_head);
 		goto out_delete_ses0;
+	}
 
 	cd_map0 = session[0]->header.env.cpu_domain;
 	list_replace_init(&cpu_head, &cpu_head_ses0);
@@ -4590,8 +4686,10 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
 	}
 
 	err = perf_session__process_events(session[1]);
-	if (err)
+	if (err) {
+		free_schedstat(&cpu_head);
 		goto out_delete_ses1;
+	}
 
 	cd_map1 = session[1]->header.env.cpu_domain;
 	list_replace_init(&cpu_head, &cpu_head_ses1);
@@ -4607,10 +4705,13 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
 	if (list_empty(&cpu_head_ses0)) {
 		pr_err("Data is not available\n");
 		ret = -1;
-		goto out_delete_ses0;
+		goto out_delete_ses1;
 	}
 
-	show_schedstat_data(&cpu_head_ses0, cd_map0, &cpu_head_ses1, cd_map1, true);
+	ret = show_schedstat_data(&cpu_head_ses0, cd_map0, session[0]->header.env.nr_cpus_avail,
+				  &cpu_head_ses1, cd_map1, session[1]->header.env.nr_cpus_avail, true);
+	if (ret)
+		goto out_delete_ses1;
 
 out_delete_ses1:
 	free_schedstat(&cpu_head_ses1);
@@ -4720,7 +4821,7 @@ static int perf_sched__schedstat_live(struct perf_sched *sched,
 		goto out;
 	}
 
-	show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
+	err = show_schedstat_data(&cpu_head, cd_map, nr, NULL, NULL, 0, false);
 	free_cpu_domain_info(cd_map, sv, nr);
 out:
 	free_schedstat(&cpu_head);
-- 
2.54.0.545.g6539524ca2-goog



* Re: [PATCH v4] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers in diff mode
  2026-05-06  4:10     ` [PATCH v4] " Ian Rogers
@ 2026-05-06  6:45       ` Ian Rogers
  2026-05-06 13:03         ` Arnaldo Carvalho de Melo
  2026-05-06  9:07       ` James Clark
  1 sibling, 1 reply; 11+ messages in thread
From: Ian Rogers @ 2026-05-06  6:45 UTC (permalink / raw)
  To: irogers, acme, atrajeev, james.clark, namhyung
  Cc: adrian.hunter, linux-kernel, linux-perf-users, mingo, peterz,
	venkat88

On Tue, May 5, 2026 at 9:10 PM Ian Rogers <irogers@google.com> wrote:
>
> The patch addresses multiple segmentation fault vectors, out-of-bounds
> reads, and memory leaks in perf sched stats by adding bounds checks,
> NULL checks, proper error propagation, and robust memory cleanup.
>
> 1. In get_all_cpu_stats(), added assert(!list_empty(head)) to prevent
>    unsafe list_first_entry() calls on empty lists, added a missing NULL
>    check for summary_head->cpu_data allocation, and implemented a cleanup
>    ladder using a temporary list to prevent memory leaks on error paths.
> 2. In free_schedstat(), fixed memory leaks by ensuring internal domain_data
>    and cpu_data pointers are freed.
> 3. In show_schedstat_data(), fixed a stale pointer issue where ds2 retained
>    its value from a previous iteration when dptr2 became NULL, and added
>    proper propagation of errors from get_all_cpu_stats().
> 4. Propagated show_schedstat_data() errors up to perf_sched__schedstat_diff()
>    and perf_sched__schedstat_live() to prevent output corruption on failure.
> 5. In show_schedstat_data(), added NULL checks for cd_map1 and cd_map2
>    to gracefully handle invalid or empty data files.
> 6. Added parallel iteration termination checks using list_is_last() in
>    show_schedstat_data() for both domain and CPU lists to safely terminate
>    at the end of each list when files contain a different number of CPUs
>    or domains.
> 7. Added CPU bounds checks (cs1->cpu >= nr1 and cs2->cpu >= nr2) in
>    show_schedstat_data() to prevent out-of-bounds reads from cd_map1 and
>    cd_map2 when comparing files from machines with different CPU counts.
> 8. Added NULL checks for cd_info1 and cd_info2 to prevent crashes when
>    a CPU has data samples but no corresponding domain info in the header.
> 9. Added domain bounds checks (ds1->domain >= cd_info1->nr_domains and
>    ds2->domain >= cd_info2->nr_domains) to prevent out-of-bounds array
>    accesses in the domains array.
> 10. Added NULL checks for dinfo1 and dinfo2 in show_schedstat_data()
>     to prevent crashes when a domain has no corresponding domain info.
> 11. Zero-initialized the perf_data array in perf_sched__schedstat_diff()
>     to prevent stack garbage from causing perf_data_file__fd() to attempt
>     to use a NULL fptr when use_stdio happened to be non-zero.
>
> Assisted-by: Gemini:gemini-3.1-pro-preview
> Signed-off-by: Ian Rogers <irogers@google.com>

This version passes sashiko's scrutiny:
https://sashiko.dev/#/patchset/20260506041004.3196084-1-irogers%40google.com

Thanks,
Ian

> ---
> v4:
> Address further sashiko feedback:
>  - Fix memory leaks of global cpu_head list when perf_session__process_events() fails.
>  - Fix resource leak of second session by correcting cleanup order when first session's data is empty.
>
> v3:
> Address sashiko feedback:
>  - Fix memory leaks in free_schedstat() by freeing internal data.
>  - Fix missing NULL check for summary_head->cpu_data in get_all_cpu_stats().
>  - Fix memory leaks in get_all_cpu_stats() on error paths using a temporary cleanup list.
>  - Fix stale ds2 pointer comparison in show_schedstat_data() by resetting it to NULL.
>  - Add error propagation from get_all_cpu_stats() and show_schedstat_data().
>  - Correct commit message to remove inaccurate claim about uninitialized variable 'ret'.
>  - Add forward declaration for free_schedstat() to fix compilation error.
>
> v2: Had tested-by Athira Rajeev and reviewed-by James Clark, removed
>     from v3 given the number of changes.
> https://lore.kernel.org/linux-perf-users/20260429173931.2700115-1-irogers@google.com/
> ---
>  tools/perf/builtin-sched.c | 175 +++++++++++++++++++++++++++++--------
>  1 file changed, 138 insertions(+), 37 deletions(-)
>
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 555247568e7a..fd679b106582 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -3947,6 +3947,8 @@ static struct schedstat_domain *domain_second_pass;
>  static bool after_workload_flag;
>  static bool verbose_field;
>
> +static void free_schedstat(struct list_head *head);
> +
>  static void store_schedstat_cpu_diff(struct schedstat_cpu *after_workload)
>  {
>         struct perf_record_schedstat_cpu *before = cpu_second_pass->cpu_data;
> @@ -4170,37 +4172,50 @@ static void summarize_schedstat_domain(struct schedstat_domain *summary_domain,
>   */
>  static int get_all_cpu_stats(struct list_head *head)
>  {
> -       struct schedstat_cpu *cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
> -       struct schedstat_cpu *summary_head = NULL;
> -       struct perf_record_schedstat_domain *ds;
> -       struct perf_record_schedstat_cpu *cs;
> +       struct schedstat_cpu *cptr, *summary_head = NULL;
>         struct schedstat_domain *dptr, *tdptr;
>         bool is_last = false;
>         int cnt = 1;
>         int ret = 0;
> +       struct list_head tmp_cleanup_list;
>
> -       if (cptr) {
> -               summary_head = zalloc(sizeof(*summary_head));
> -               if (!summary_head)
> -                       return -ENOMEM;
> +       assert(!list_empty(head));
> +       cptr = list_first_entry(head, struct schedstat_cpu, cpu_list);
>
> -               summary_head->cpu_data = zalloc(sizeof(*cs));
> -               memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*cs));
> +       INIT_LIST_HEAD(&tmp_cleanup_list);
>
> -               INIT_LIST_HEAD(&summary_head->domain_head);
> +       summary_head = zalloc(sizeof(*summary_head));
> +       if (!summary_head)
> +               return -ENOMEM;
>
> -               list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
> -                       tdptr = zalloc(sizeof(*tdptr));
> -                       if (!tdptr)
> -                               return -ENOMEM;
> +       INIT_LIST_HEAD(&summary_head->domain_head);
> +       INIT_LIST_HEAD(&summary_head->cpu_list);
> +       list_add(&summary_head->cpu_list, &tmp_cleanup_list);
>
> -                       tdptr->domain_data = zalloc(sizeof(*ds));
> -                       if (!tdptr->domain_data)
> -                               return -ENOMEM;
> +       summary_head->cpu_data = zalloc(sizeof(*summary_head->cpu_data));
> +       if (!summary_head->cpu_data) {
> +               ret = -ENOMEM;
> +               goto out_cleanup;
> +       }
> +       memcpy(summary_head->cpu_data, cptr->cpu_data, sizeof(*summary_head->cpu_data));
> +
> +       list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
> +               tdptr = zalloc(sizeof(*tdptr));
> +               if (!tdptr) {
> +                       ret = -ENOMEM;
> +                       goto out_cleanup;
> +               }
> +               INIT_LIST_HEAD(&tdptr->domain_list);
>
> -                       memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*ds));
> -                       list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
> +               tdptr->domain_data = zalloc(sizeof(*tdptr->domain_data));
> +               if (!tdptr->domain_data) {
> +                       free(tdptr);
> +                       ret = -ENOMEM;
> +                       goto out_cleanup;
>                 }
> +
> +               memcpy(tdptr->domain_data, dptr->domain_data, sizeof(*tdptr->domain_data));
> +               list_add_tail(&tdptr->domain_list, &summary_head->domain_head);
>         }
>
>         list_for_each_entry(cptr, head, cpu_list) {
> @@ -4212,32 +4227,52 @@ static int get_all_cpu_stats(struct list_head *head)
>
>                 cnt++;
>                 summarize_schedstat_cpu(summary_head, cptr, cnt, is_last);
> +               if (list_empty(&summary_head->domain_head))
> +                       continue;
> +
>                 tdptr = list_first_entry(&summary_head->domain_head, struct schedstat_domain,
>                                          domain_list);
>
>                 list_for_each_entry(dptr, &cptr->domain_head, domain_list) {
>                         summarize_schedstat_domain(tdptr, dptr, cnt, is_last);
> +                       if (list_is_last(&tdptr->domain_list, &summary_head->domain_head)) {
> +                               tdptr = NULL;
> +                               break;
> +                       }
>                         tdptr = list_next_entry(tdptr, domain_list);
>                 }
>         }
>
> +       list_del_init(&summary_head->cpu_list);
>         list_add(&summary_head->cpu_list, head);
> +       return 0;
> +
> +out_cleanup:
> +       free_schedstat(&tmp_cleanup_list);
>         return ret;
>  }
>
> -static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1,
> -                              struct list_head *head2, struct cpu_domain_map **cd_map2,
> +static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **cd_map1, int nr1,
> +                              struct list_head *head2, struct cpu_domain_map **cd_map2, int nr2,
>                                bool summary_only)
>  {
>         struct schedstat_cpu *cptr1 = list_first_entry(head1, struct schedstat_cpu, cpu_list);
>         struct perf_record_schedstat_domain *ds1 = NULL, *ds2 = NULL;
> -       struct perf_record_schedstat_cpu *cs1 = NULL, *cs2 = NULL;
>         struct schedstat_domain *dptr1 = NULL, *dptr2 = NULL;
>         struct schedstat_cpu *cptr2 = NULL;
>         __u64 jiffies1 = 0, jiffies2 = 0;
>         bool is_summary = true;
>         int ret = 0;
>
> +       if (!cd_map1) {
> +               pr_err("Error: CPU domain map 1 is missing.\n");
> +               return -1;
> +       }
> +       if (head2 && !cd_map2) {
> +               pr_err("Error: CPU domain map 2 is missing.\n");
> +               return -1;
> +       }
> +
>         printf("Description\n");
>         print_separator2(SEP_LEN, "", 0);
>         printf("%-30s-> %s\n", "DESC", "Description of the field");
> @@ -4260,21 +4295,47 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
>         printf("\n");
>
>         ret = get_all_cpu_stats(head1);
> +       if (ret)
> +               return ret;
>         if (cptr2) {
>                 ret = get_all_cpu_stats(head2);
> +               if (ret)
> +                       return ret;
>                 cptr2 = list_first_entry(head2, struct schedstat_cpu, cpu_list);
>         }
>
>         list_for_each_entry(cptr1, head1, cpu_list) {
>                 struct cpu_domain_map *cd_info1 = NULL, *cd_info2 = NULL;
> +               struct perf_record_schedstat_cpu *cs1 = cptr1->cpu_data;
> +               struct perf_record_schedstat_cpu *cs2 = NULL;
>
> -               cs1 = cptr1->cpu_data;
> +               dptr2 = NULL;
> +               if (cs1->cpu >= (u32)nr1) {
> +                       pr_err("Error: CPU %d exceeds domain map size %d\n", cs1->cpu, nr1);
> +                       return -1;
> +               }
>                 cd_info1 = cd_map1[cs1->cpu];
> +               if (!cd_info1) {
> +                       pr_err("Error: CPU %d domain info is missing in map 1.\n",
> +                              cs1->cpu);
> +                       return -1;
> +               }
>                 if (cptr2) {
>                         cs2 = cptr2->cpu_data;
> +                       if (cs2->cpu >= (u32)nr2) {
> +                               pr_err("Error: CPU %d exceeds domain map size %d\n", cs2->cpu, nr2);
> +                               return -1;
> +                       }
>                         cd_info2 = cd_map2[cs2->cpu];
> -                       dptr2 = list_first_entry(&cptr2->domain_head, struct schedstat_domain,
> -                                                domain_list);
> +                       if (!cd_info2) {
> +                               pr_err("Error: CPU %d domain info is missing in map 2.\n",
> +                                      cs2->cpu);
> +                               return -1;
> +                       }
> +                       if (!list_empty(&cptr2->domain_head))
> +                               dptr2 = list_first_entry(&cptr2->domain_head,
> +                                                        struct schedstat_domain,
> +                                                        domain_list);
>                 }
>
>                 if (cs2 && cs1->cpu != cs2->cpu) {
> @@ -4302,10 +4363,31 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
>                         struct domain_info *dinfo1 = NULL, *dinfo2 = NULL;
>
>                         ds1 = dptr1->domain_data;
> +                       ds2 = NULL;
> +                       if (ds1->domain >= cd_info1->nr_domains) {
> +                               pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 1.\n",
> +                                      ds1->domain, cd_info1->nr_domains, cs1->cpu);
> +                               return -1;
> +                       }
>                         dinfo1 = cd_info1->domains[ds1->domain];
> +                       if (!dinfo1) {
> +                               pr_err("Error: Domain %d info is missing for CPU %d in map 1.\n",
> +                                      ds1->domain, cs1->cpu);
> +                               return -1;
> +                       }
>                         if (dptr2) {
>                                 ds2 = dptr2->domain_data;
> +                               if (ds2->domain >= cd_info2->nr_domains) {
> +                                       pr_err("Error: Domain %d exceeds max domains %d for CPU %d in map 2.\n",
> +                                              ds2->domain, cd_info2->nr_domains, cs2->cpu);
> +                                       return -1;
> +                               }
>                                 dinfo2 = cd_info2->domains[ds2->domain];
> +                               if (!dinfo2) {
> +                                       pr_err("Error: Domain %d info is missing for CPU %d in map 2.\n",
> +                                              ds2->domain, cs2->cpu);
> +                                       return -1;
> +                               }
>                         }
>
>                         if (dinfo2 && dinfo1->domain != dinfo2->domain) {
> @@ -4334,14 +4416,22 @@ static int show_schedstat_data(struct list_head *head1, struct cpu_domain_map **
>                         print_domain_stats(ds1, ds2, jiffies1, jiffies2);
>                         print_separator2(SEP_LEN, "", 0);
>
> -                       if (dptr2)
> -                               dptr2 = list_next_entry(dptr2, domain_list);
> +                       if (dptr2) {
> +                               if (list_is_last(&dptr2->domain_list, &cptr2->domain_head))
> +                                       dptr2 = NULL;
> +                               else
> +                                       dptr2 = list_next_entry(dptr2, domain_list);
> +                       }
>                 }
>                 if (summary_only)
>                         break;
>
> -               if (cptr2)
> -                       cptr2 = list_next_entry(cptr2, cpu_list);
> +               if (cptr2) {
> +                       if (list_is_last(&cptr2->cpu_list, head2))
> +                               cptr2 = NULL;
> +                       else
> +                               cptr2 = list_next_entry(cptr2, cpu_list);
> +               }
>
>                 is_summary = false;
>         }
> @@ -4473,9 +4563,11 @@ static void free_schedstat(struct list_head *head)
>         list_for_each_entry_safe(cptr, n2, head, cpu_list) {
>                 list_for_each_entry_safe(dptr, n1, &cptr->domain_head, domain_list) {
>                         list_del_init(&dptr->domain_list);
> +                       free(dptr->domain_data);
>                         free(dptr);
>                 }
>                 list_del_init(&cptr->cpu_list);
> +               free(cptr->cpu_data);
>                 free(cptr);
>         }
>  }
> @@ -4523,7 +4615,9 @@ static int perf_sched__schedstat_report(struct perf_sched *sched)
>                 }
>
>                 cd_map = session->header.env.cpu_domain;
> -               err = show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
> +               err = show_schedstat_data(&cpu_head, cd_map,
> +                                         session->header.env.nr_cpus_avail,
> +                                         NULL, NULL, 0, false);
>         }
>
>  out:
> @@ -4538,7 +4632,7 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
>         struct cpu_domain_map **cd_map0 = NULL, **cd_map1 = NULL;
>         struct list_head cpu_head_ses0, cpu_head_ses1;
>         struct perf_session *session[2];
> -       struct perf_data data[2];
> +       struct perf_data data[2] = {0};
>         int ret = 0, err = 0;
>         static const char *defaults[] = {
>                 "perf.data.old",
> @@ -4573,8 +4667,10 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
>         }
>
>         err = perf_session__process_events(session[0]);
> -       if (err)
> +       if (err) {
> +               free_schedstat(&cpu_head);
>                 goto out_delete_ses0;
> +       }
>
>         cd_map0 = session[0]->header.env.cpu_domain;
>         list_replace_init(&cpu_head, &cpu_head_ses0);
> @@ -4590,8 +4686,10 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
>         }
>
>         err = perf_session__process_events(session[1]);
> -       if (err)
> +       if (err) {
> +               free_schedstat(&cpu_head);
>                 goto out_delete_ses1;
> +       }
>
>         cd_map1 = session[1]->header.env.cpu_domain;
>         list_replace_init(&cpu_head, &cpu_head_ses1);
> @@ -4607,10 +4705,13 @@ static int perf_sched__schedstat_diff(struct perf_sched *sched,
>         if (list_empty(&cpu_head_ses0)) {
>                 pr_err("Data is not available\n");
>                 ret = -1;
> -               goto out_delete_ses0;
> +               goto out_delete_ses1;
>         }
>
> -       show_schedstat_data(&cpu_head_ses0, cd_map0, &cpu_head_ses1, cd_map1, true);
> +       ret = show_schedstat_data(&cpu_head_ses0, cd_map0, session[0]->header.env.nr_cpus_avail,
> +                                 &cpu_head_ses1, cd_map1, session[1]->header.env.nr_cpus_avail, true);
> +       if (ret)
> +               goto out_delete_ses1;
>
>  out_delete_ses1:
>         free_schedstat(&cpu_head_ses1);
> @@ -4720,7 +4821,7 @@ static int perf_sched__schedstat_live(struct perf_sched *sched,
>                 goto out;
>         }
>
> -       show_schedstat_data(&cpu_head, cd_map, NULL, NULL, false);
> +       err = show_schedstat_data(&cpu_head, cd_map, nr, NULL, NULL, 0, false);
>         free_cpu_domain_info(cd_map, sv, nr);
>  out:
>         free_schedstat(&cpu_head);
> --
> 2.54.0.545.g6539524ca2-goog
>

^ permalink raw reply	[flat|nested] 11+ messages in thread
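Item 11's one-character-looking fix (`struct perf_data data[2] = {0};`) rests on C's aggregate-initialization rule: elements without an explicit initializer are initialized as if they had static storage duration, i.e. zeroed (C11 6.7.9p21), whereas an uninitialized automatic array holds indeterminate stack bytes. A minimal sketch with a hypothetical two-field stand-in for struct perf_data (the real structure in tools/perf/util/data.h has more members):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fields that matter here. */
struct fake_data {
	void	*fptr;		/* a FILE * in the real structure */
	int	 use_stdio;	/* when non-zero, fptr gets dereferenced */
};

/*
 * "= {0}" explicitly initializes only data[0].fptr; every remaining
 * member of every element is then implicitly zero-initialized, so
 * use_stdio can never read as leftover non-zero stack garbage.
 */
static void init_data(struct fake_data out[2])
{
	struct fake_data data[2] = {0};

	out[0] = data[0];
	out[1] = data[1];
}
```

Without the initializer, whether use_stdio happens to be non-zero depends on whatever the stack frame previously held, which is exactly the intermittent-crash signature the commit message describes.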

* Re: [PATCH v4] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers in diff mode
  2026-05-06  4:10     ` [PATCH v4] " Ian Rogers
  2026-05-06  6:45       ` Ian Rogers
@ 2026-05-06  9:07       ` James Clark
  1 sibling, 0 replies; 11+ messages in thread
From: James Clark @ 2026-05-06  9:07 UTC (permalink / raw)
  To: Ian Rogers
  Cc: adrian.hunter, linux-kernel, linux-perf-users, mingo, peterz,
	venkat88, acme, atrajeev, namhyung



On 06/05/2026 5:10 am, Ian Rogers wrote:
> The patch addresses multiple segmentation fault vectors, out-of-bounds
> reads, and memory leaks in perf sched stats by adding bounds checks,
> NULL checks, proper error propagation, and robust memory cleanup.
> 
> 1. In get_all_cpu_stats(), added assert(!list_empty(head)) to prevent
>     unsafe list_first_entry() calls on empty lists, added a missing NULL
>     check for summary_head->cpu_data allocation, and implemented a cleanup
>     ladder using a temporary list to prevent memory leaks on error paths.
> 2. In free_schedstat(), fixed memory leaks by ensuring internal domain_data
>     and cpu_data pointers are freed.
> 3. In show_schedstat_data(), fixed a stale pointer issue where ds2 retained
>     its value from a previous iteration when dptr2 became NULL, and added
>     proper propagation of errors from get_all_cpu_stats().
> 4. Propagated show_schedstat_data() errors up to perf_sched__schedstat_diff()
>     and perf_sched__schedstat_live() to prevent output corruption on failure.
> 5. In show_schedstat_data(), added NULL checks for cd_map1 and cd_map2
>     to gracefully handle invalid or empty data files.
> 6. Added parallel iteration termination checks using list_is_last() in
>     show_schedstat_data() for both domain and CPU lists to safely terminate
>     at the end of each list when files contain a different number of CPUs
>     or domains.
> 7. Added CPU bounds checks (cs1->cpu >= nr1 and cs2->cpu >= nr2) in
>     show_schedstat_data() to prevent out-of-bounds reads from cd_map1 and
>     cd_map2 when comparing files from machines with different CPU counts.
> 8. Added NULL checks for cd_info1 and cd_info2 to prevent crashes when
>     a CPU has data samples but no corresponding domain info in the header.
> 9. Added domain bounds checks (ds1->domain >= cd_info1->nr_domains and
>     ds2->domain >= cd_info2->nr_domains) to prevent out-of-bounds array
>     accesses in the domains array.
> 10. Added NULL checks for dinfo1 and dinfo2 in show_schedstat_data()
>      to prevent crashes when a domain has no corresponding domain info.
> 11. Zero-initialized the perf_data array in perf_sched__schedstat_diff()
>      to prevent stack garbage from causing perf_data_file__fd() to attempt
>      to use a NULL fptr when use_stdio happened to be non-zero.
> 
> Assisted-by: Gemini:gemini-3.1-pro-preview
> Signed-off-by: Ian Rogers <irogers@google.com>
> 
> ---
> v4:
> Address further sashiko feedback:
>   - Fix memory leaks of global cpu_head list when perf_session__process_events() fails.
>   - Fix resource leak of second session by correcting cleanup order when first session's data is empty.
> 
> v3:
> Address sashiko feedback:
>   - Fix memory leaks in free_schedstat() by freeing internal data.
>   - Fix missing NULL check for summary_head->cpu_data in get_all_cpu_stats().
>   - Fix memory leaks in get_all_cpu_stats() on error paths using a temporary cleanup list.
>   - Fix stale ds2 pointer comparison in show_schedstat_data() by resetting it to NULL.
>   - Add error propagation from get_all_cpu_stats() and show_schedstat_data().
>   - Correct commit message to remove inaccurate claim about uninitialized variable 'ret'.
>   - Add forward declaration for free_schedstat() to fix compilation error.
> 
> v2: Had tested-by Athira Rajeev and reviewed-by James Clark, removed
>      from v3 given the number of changes.
> https://lore.kernel.org/linux-perf-users/20260429173931.2700115-1-irogers@google.com/
> ---
>   tools/perf/builtin-sched.c | 175 +++++++++++++++++++++++++++++--------
>   1 file changed, 138 insertions(+), 37 deletions(-)
> 

Reviewed-by: James Clark <james.clark@linaro.org>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v4] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers in diff mode
  2026-05-06  6:45       ` Ian Rogers
@ 2026-05-06 13:03         ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2026-05-06 13:03 UTC (permalink / raw)
  To: Ian Rogers
  Cc: atrajeev, james.clark, namhyung, adrian.hunter, linux-kernel,
	linux-perf-users, mingo, peterz, venkat88

On Tue, May 05, 2026 at 11:45:56PM -0700, Ian Rogers wrote:
> On Tue, May 5, 2026 at 9:10 PM Ian Rogers <irogers@google.com> wrote:
> >
> > The patch addresses multiple segmentation fault vectors, out-of-bounds
> > reads, and memory leaks in perf sched stats by adding bounds checks,
> > NULL checks, proper error propagation, and robust memory cleanup.
> >
> > 1. In get_all_cpu_stats(), added assert(!list_empty(head)) to prevent
> >    unsafe list_first_entry() calls on empty lists, added a missing NULL
> >    check for summary_head->cpu_data allocation, and implemented a cleanup
> >    ladder using a temporary list to prevent memory leaks on error paths.
> > 2. In free_schedstat(), fixed memory leaks by ensuring internal domain_data
> >    and cpu_data pointers are freed.
> > 3. In show_schedstat_data(), fixed a stale pointer issue where ds2 retained
> >    its value from a previous iteration when dptr2 became NULL, and added
> >    proper propagation of errors from get_all_cpu_stats().
> > 4. Propagated show_schedstat_data() errors up to perf_sched__schedstat_diff()
> >    and perf_sched__schedstat_live() to prevent output corruption on failure.
> > 5. In show_schedstat_data(), added NULL checks for cd_map1 and cd_map2
> >    to gracefully handle invalid or empty data files.
> > 6. Added parallel iteration termination checks using list_is_last() in
> >    show_schedstat_data() for both domain and CPU lists to safely terminate
> >    at the end of each list when files contain a different number of CPUs
> >    or domains.
> > 7. Added CPU bounds checks (cs1->cpu >= nr1 and cs2->cpu >= nr2) in
> >    show_schedstat_data() to prevent out-of-bounds reads from cd_map1 and
> >    cd_map2 when comparing files from machines with different CPU counts.
> > 8. Added NULL checks for cd_info1 and cd_info2 to prevent crashes when
> >    a CPU has data samples but no corresponding domain info in the header.
> > 9. Added domain bounds checks (ds1->domain >= cd_info1->nr_domains and
> >    ds2->domain >= cd_info2->nr_domains) to prevent out-of-bounds array
> >    accesses in the domains array.
> > 10. Added NULL checks for dinfo1 and dinfo2 in show_schedstat_data()
> >     to prevent crashes when a domain has no corresponding domain info.
> > 11. Zero-initialized the perf_data array in perf_sched__schedstat_diff()
> >     to prevent stack garbage from causing perf_data_file__fd() to attempt
> >     to use a NULL fptr when use_stdio happened to be non-zero.
> >
> > Assisted-by: Gemini:gemini-3.1-pro-preview
> > Signed-off-by: Ian Rogers <irogers@google.com>
> 
> This version passes sashiko's scrutiny:
> https://sashiko.dev/#/patchset/20260506041004.3196084-1-irogers%40google.com

Thanks, applied to perf-tools-next, for v7.2.

- Arnaldo

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2026-05-06 13:03 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-04-28  7:08 [PATCH v1] perf sched stats: Fix segmentation faults in diff mode Ian Rogers
2026-04-29 12:30 ` James Clark
2026-04-29 14:01 ` Athira Rajeev
2026-04-29 17:39 ` [PATCH v2] " Ian Rogers
2026-04-30 13:29   ` James Clark
2026-05-01 15:16   ` Athira Rajeev
2026-05-06  0:00   ` [PATCH v3] perf sched stats: Fix segmentation faults, memory leaks, and stale pointers " Ian Rogers
2026-05-06  4:10     ` [PATCH v4] " Ian Rogers
2026-05-06  6:45       ` Ian Rogers
2026-05-06 13:03         ` Arnaldo Carvalho de Melo
2026-05-06  9:07       ` James Clark

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox