All of lore.kernel.org
 help / color / mirror / Atom feed
From: Shrikanth Hegde <sshegde@linux.ibm.com>
To: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Nathan Chancellor <nathan@kernel.org>,
	Valentin Schneider <vschneid@redhat.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Chen Yu <yu.c.chen@intel.com>,
	linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	"Gautham R. Shenoy" <gautham.shenoy@amd.com>,
	x86@kernel.org
Subject: Re: [PATCH] sched/topology: Initialize sd_span after assignment to *sd
Date: Mon, 23 Mar 2026 14:38:53 +0530	[thread overview]
Message-ID: <2c5e1ed7-cd7c-49e3-8cb0-ebe7365ba315@linux.ibm.com> (raw)
In-Reply-To: <20260321163852.11102-1-kprateek.nayak@amd.com>



On 3/21/26 10:08 PM, K Prateek Nayak wrote:
> Nathan reported a kernel panic on his ARM builds after commit
> 8e8e23dea43e ("sched/topology: Compute sd_weight considering cpuset
> partitions") which was root caused to the compiler zeroing out the first
> few bytes of sd->span.
> 
> During the debug [1], it was discovered that, on some configs,
> offsetof(struct sched_domain, span) at 292 was less than
> sizeof(struct sched_domain) at 296 resulting in:
> 
>    *sd = { ... }
> 
> assignment clearing out first 4 bytes of sd->span which was initialized
> before.
> 
> The official GCC specification for "Arrays of Length Zero" [2] says:
> 
>    Although the size of a zero-length array is zero, an array member of
>    this kind may increase the size of the enclosing type as a result of
>    tail padding.
> 
> which means the relative offset of the variable length array at the end
> of the sturct can indeed be less than sizeof() the struct as a result of
> tail padding thus overwriting that data of the flexible array that
> overlapped with the padding whenever the struct is initialized as whole.
> 
> Partially revert commit 8e8e23dea43e ("sched/topology: Compute sd_weight
> considering cpuset partitions") to initialize sd_span after the fixed
> memebers of sd.
> 
> Use
> 
>    cpumask_weight_and(cpu_map, tl->mask(tl, cpu))
> 
> to calculate span_weight before initializing the sd_span.
> cpumask_and_weight() is of same complexity as cpumask_and() and the
> additional overhead is negligible.
> 
> While at it, also initialize sd->span_weight in sd_init() since
> sd_weight now captures the cpu_map constraints. Fixup the
> sd->span_weight whenever sd_span is fixed up by the generic topology
> layer.
> 


This description is a bit confusing. Fixup happens naturally since
cpu_map now reflects the changes right?

Maybe mention about that removal in build_sched_domains?

> Reported-by: Nathan Chancellor <nathan@kernel.org>
> Closes: https://lore.kernel.org/all/20260320235824.GA1176840@ax162/
> Fixes: 8e8e23dea43e ("sched/topology: Compute sd_weight considering cpuset partitions")
> Link: https://lore.kernel.org/all/a8c125fd-960d-4b35-b640-95a33584eb08@amd.com/ [1]
> Link: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2]
> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
> ---
> Nathan, can you please check if this fixes the issue you are observing -
> it at least fixed one that I'm observing ;-)
> 
> Peter, if you would like to keep revert and enhancements separate, let
> me know and I'll spin a v2.
> ---
>   kernel/sched/topology.c | 16 ++++++++++------
>   1 file changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 43150591914b..721ed9b883b8 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1669,17 +1669,13 @@ sd_init(struct sched_domain_topology_level *tl,
>   	struct cpumask *sd_span;
>   	u64 now = sched_clock();
>   
> -	sd_span = sched_domain_span(sd);
> -	cpumask_and(sd_span, cpu_map, tl->mask(tl, cpu));
> -	sd_weight = cpumask_weight(sd_span);
> -	sd_id = cpumask_first(sd_span);
> +	sd_weight = cpumask_weight_and(cpu_map, tl->mask(tl, cpu));
>   
>   	if (tl->sd_flags)
>   		sd_flags = (*tl->sd_flags)();
>   	if (WARN_ONCE(sd_flags & ~TOPOLOGY_SD_FLAGS,
>   		      "wrong sd_flags in topology description\n"))
>   		sd_flags &= TOPOLOGY_SD_FLAGS;
> -	sd_flags |= asym_cpu_capacity_classify(sd_span, cpu_map);
>   
>   	*sd = (struct sched_domain){
>   		.min_interval		= sd_weight,
> @@ -1715,8 +1711,15 @@ sd_init(struct sched_domain_topology_level *tl,
>   		.last_decay_max_lb_cost	= jiffies,
>   		.child			= child,
>   		.name			= tl->name,
> +		.span_weight		= sd_weight,
>   	};
>   
> +	sd_span = sched_domain_span(sd);
> +	cpumask_and(sd_span, cpu_map, tl->mask(tl, cpu));
> +	sd_id = cpumask_first(sd_span);
> +
> +	sd->flags |= asym_cpu_capacity_classify(sd_span, cpu_map);
> +
>   	WARN_ONCE((sd->flags & (SD_SHARE_CPUCAPACITY | SD_ASYM_CPUCAPACITY)) ==
>   		  (SD_SHARE_CPUCAPACITY | SD_ASYM_CPUCAPACITY),
>   		  "CPU capacity asymmetry not supported on SMT\n");
> @@ -2518,6 +2521,8 @@ static struct sched_domain *build_sched_domain(struct sched_domain_topology_leve
>   			cpumask_or(sched_domain_span(sd),
>   				   sched_domain_span(sd),
>   				   sched_domain_span(child));
> +
> +			sd->span_weight = cpumask_weight(sched_domain_span(sd));
>   		}
>   
>   	}
> @@ -2697,7 +2702,6 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
>   	/* Build the groups for the domains */
>   	for_each_cpu(i, cpu_map) {
>   		for (sd = *per_cpu_ptr(d.sd, i); sd; sd = sd->parent) {
> -			sd->span_weight = cpumask_weight(sched_domain_span(sd));
>   			if (sd->flags & SD_NUMA) {
>   				if (build_overlap_sched_groups(sd, i))
>   					goto error;
> 
> base-commit: fe7171d0d5dfbe189e41db99580ebacafc3c09ce


Other than nits in changelog:
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>


PS: b4 am -Q was quite confused which patch to pick for 0001.
may since it was a reply to the thread. Not sure. So i pulled
each patch separate and applied.

  reply	other threads:[~2026-03-23  9:09 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-12  4:44 [PATCH v4 0/9] sched/topology: Optimize sd->shared allocation K Prateek Nayak
2026-03-12  4:44 ` [PATCH v4 1/9] sched/topology: Compute sd_weight considering cpuset partitions K Prateek Nayak
2026-03-12  9:34   ` Peter Zijlstra
2026-03-12  9:59     ` K Prateek Nayak
2026-03-12 10:01       ` Peter Zijlstra
2026-03-12 10:09         ` K Prateek Nayak
2026-03-18  8:08   ` [tip: sched/core] " tip-bot2 for K Prateek Nayak
2026-03-20 23:58     ` Nathan Chancellor
2026-03-21  3:36       ` K Prateek Nayak
2026-03-21  7:33         ` Chen, Yu C
2026-03-21  7:47           ` Chen, Yu C
2026-03-21  8:59             ` K Prateek Nayak
2026-03-21  9:45               ` K Prateek Nayak
2026-03-21 10:13                 ` K Prateek Nayak
2026-03-21 12:48                   ` Chen, Yu C
2026-03-24  2:54                     ` K Prateek Nayak
2026-03-21 14:13                   ` Shrikanth Hegde
2026-03-21 15:14                     ` K Prateek Nayak
2026-03-21 16:38       ` [PATCH] sched/topology: Initialize sd_span after assignment to *sd K Prateek Nayak
2026-03-23  9:08         ` Shrikanth Hegde [this message]
2026-03-23 17:34           ` K Prateek Nayak
2026-03-23  9:36         ` Peter Zijlstra
2026-03-23 13:24           ` Jon Hunter
2026-03-23 15:36           ` Chen, Yu C
2026-03-23 17:24           ` K Prateek Nayak
2026-03-23 22:41           ` Nathan Chancellor
2026-03-24  9:10           ` [tip: sched/core] sched/topology: Fix sched_domain_span() tip-bot2 for Peter Zijlstra
2026-03-12  4:44 ` [PATCH v4 2/9] sched/topology: Extract "imb_numa_nr" calculation into a separate helper K Prateek Nayak
2026-03-12 13:37   ` kernel test robot
2026-03-12 15:42     ` K Prateek Nayak
2026-03-12 16:02       ` Peter Zijlstra
2026-03-16  0:18   ` Dietmar Eggemann
2026-03-16  3:41     ` K Prateek Nayak
2026-03-16  8:24       ` Dietmar Eggemann
2026-03-16  8:50         ` K Prateek Nayak
2026-03-18  8:08   ` [tip: sched/core] " tip-bot2 for K Prateek Nayak
2026-03-12  4:44 ` [PATCH v4 3/9] sched/topology: Allocate per-CPU sched_domain_shared in s_data K Prateek Nayak
2026-03-18  8:08   ` [tip: sched/core] " tip-bot2 for K Prateek Nayak
2026-03-12  4:44 ` [PATCH v4 4/9] sched/topology: Switch to assigning "sd->shared" from s_data K Prateek Nayak
2026-03-18  8:08   ` [tip: sched/core] " tip-bot2 for K Prateek Nayak
2026-03-12  4:44 ` [PATCH v4 5/9] sched/topology: Remove sched_domain_shared allocation with sd_data K Prateek Nayak
2026-03-18  8:08   ` [tip: sched/core] " tip-bot2 for K Prateek Nayak
2026-03-12  4:44 ` [PATCH v4 6/9] sched/core: Check for rcu_read_lock_any_held() in idle_get_state() K Prateek Nayak
2026-03-12  9:46   ` Peter Zijlstra
2026-03-12 10:06     ` K Prateek Nayak
2026-03-18  8:08   ` [tip: sched/core] " tip-bot2 for K Prateek Nayak
2026-03-12  4:44 ` [PATCH v4 7/9] sched/fair: Remove superfluous rcu_read_lock() in the wakeup path K Prateek Nayak
2026-03-15 23:36   ` Dietmar Eggemann
2026-03-16  3:19     ` K Prateek Nayak
2026-03-18  8:08     ` [tip: sched/core] PM: EM: Switch to rcu_dereference_all() in " tip-bot2 for Dietmar Eggemann
2026-03-18  8:08   ` [tip: sched/core] sched/fair: Remove superfluous rcu_read_lock() in the " tip-bot2 for K Prateek Nayak
2026-03-12  4:44 ` [PATCH v4 8/9] sched/fair: Simplify the entry condition for update_idle_cpu_scan() K Prateek Nayak
2026-03-18  8:08   ` [tip: sched/core] " tip-bot2 for K Prateek Nayak
2026-03-12  4:44 ` [PATCH v4 9/9] sched/fair: Simplify SIS_UTIL handling in select_idle_cpu() K Prateek Nayak
2026-03-18  8:08   ` [tip: sched/core] " tip-bot2 for K Prateek Nayak
2026-03-16  0:22 ` [PATCH v4 0/9] sched/topology: Optimize sd->shared allocation Dietmar Eggemann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2c5e1ed7-cd7c-49e3-8cb0-ebe7365ba315@linux.ibm.com \
    --to=sshegde@linux.ibm.com \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=gautham.shenoy@amd.com \
    --cc=juri.lelli@redhat.com \
    --cc=kprateek.nayak@amd.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=nathan@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=vincent.guittot@linaro.org \
    --cc=vschneid@redhat.com \
    --cc=x86@kernel.org \
    --cc=yu.c.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.