From: Chen Yu <yu.c.chen@intel.com>
To: kprateek.nayak@amd.com
Cc: srikar@linux.ibm.com, venkat88@linux.ibm.com,
maddy@linux.ibm.com, sshegde@linux.ibm.com,
riteshh@linux.ibm.com, chleroy@kernel.org,
tim.c.chen@linux.intel.com, peterz@infradead.org,
linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
linux-sched@vger.kernel.org, Chen Yu <yu.c.chen@intel.com>
Subject: Re: [BUG] sched/cache: "Make LLC id continuous" causes NULL cpumask
Date: Tue, 26 May 2026 22:08:56 +0800 [thread overview]
Message-ID: <20260526140856.139657-1-yu.c.chen@intel.com> (raw)
In-Reply-To: <058664ab-0982-4c13-9d4b-caa2f7616b0f@amd.com>
Hi Prateek,
On Tue, 26 May 2026 11:23:59 +0530, K Prateek Nayak <kprateek.nayak@amd.com> wrote:
> Hello Srikar,
>
> On 5/26/2026 10:28 AM, Srikar Dronamraju wrote:
> > L2 Cache reported here is for SMT8 Core aka CACHE domain.
>
> Apart for the scheduler, nothing in tree currently cares about
> cpu_coregroup_mask() except for drivers/base/arch_topology.c but
> Power doesn't select GENERIC_ARCH_TOPOLOGY.
>
> Why can't Power have an internal mask for MC domain (tl_mc_mask) and
> the scheduler can use cpu_coregroup_mask() for the actual LLc? (The L2
> mask in this case.)
>
> Power anyways adds its own topology via set_sched_topology() so the
> default_topology from kernel/sched/topology.c remains unused.
>
> ...
>
> > Shouldnt cache-aware scheduling be worried about cpuset partitions too.
> > If a cpuset has subset of LLC cores, then we should Scheduler assume it can
> > control complete LLC?
>
> Well, the scheduling takes care of partitions and the cache aware
> scheduling bits take care of looking at the full system perspective
> for stats aggregation and pointing to a particular LLc.
>
> We don't compare llc_id across cpusets so we keeping one unique llc_id
> per H/W LLC instance is feasible and it enables us to keep llc_id space
> limited for optimizing cache-aware scheduling.
>
> Now if we have threads of same process across partitions, we'll
> still aggregate the utilization numbers across the full LLC but
> the load balancers at individual partitions will make a call on
> the aggregation.
>
> --
> Thanks and Regards,
> Prateek
>
>
I suppose what you suggested looks like below:
powerpc/smp: make cpu_coregroup_mask() return the LLC
On pSeries shared LPARs(or coregroup_enabled is false on
Power9 and earlier) the hemisphere map is not allocated, so
build_sched_domains() dereferences a NULL cpumask and crashes.
The generic scheduler expects cpu_coregroup_mask() to span the LLC.
On powerpc the LLC is the L2. Return cpu_l2_cache_mask() instead of
the hemisphere map. Use a coregroup_map() helper for the in-file
hemisphere users, and a powerpc_tl_mc_mask() wrapper for the MC
sched-domain level.
Fixes: b5ea300a17e3 ("sched/cache: Make LLC id continuous")
Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
arch/powerpc/kernel/smp.c | 35 +++++++++++++++++++++++------------
1 file changed, 23 insertions(+), 12 deletions(-)
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1040,11 +1040,22 @@ static const struct cpumask *tl_smallcore_smt_mask(struct sched_domain_topology_
}
#endif
+static inline struct cpumask *coregroup_map(int cpu)
+{
+ return per_cpu(cpu_coregroup_map, cpu);
+}
+
struct cpumask *cpu_coregroup_mask(int cpu)
{
- return per_cpu(cpu_coregroup_map, cpu);
+ return cpu_l2_cache_mask(cpu);
+}
+
+static const struct cpumask *
+powerpc_tl_mc_mask(struct sched_domain_topology_level *tl, int cpu)
+{
+ return coregroup_map(cpu);
}
static bool has_coregroup_support(void)
{
if (is_shared_processor())
@@ -1155,7 +1166,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
cpumask_set_cpu(boot_cpuid, cpu_core_mask(boot_cpuid));
if (has_coregroup_support())
- cpumask_set_cpu(boot_cpuid, cpu_coregroup_mask(boot_cpuid));
+ cpumask_set_cpu(boot_cpuid, coregroup_map(boot_cpuid));
init_big_cores();
if (has_big_cores) {
@@ -1520,8 +1531,8 @@ static void remove_cpu_from_masks(int cpu)
set_cpus_unrelated(cpu, i, cpu_core_mask);
if (has_coregroup_support()) {
- for_each_cpu(i, cpu_coregroup_mask(cpu))
- set_cpus_unrelated(cpu, i, cpu_coregroup_mask);
+ for_each_cpu(i, coregroup_map(cpu))
+ set_cpus_unrelated(cpu, i, coregroup_map);
}
}
#endif
@@ -1553,7 +1564,7 @@ static void update_coregroup_mask(int cpu, cpumask_var_t *mask)
if (!*mask) {
/* Assume only siblings are part of this CPU's coregroup */
for_each_cpu(i, submask_fn(cpu))
- set_cpus_related(cpu, i, cpu_coregroup_mask);
+ set_cpus_related(cpu, i, coregroup_map);
return;
}
@@ -1561,18 +1572,18 @@ static void update_coregroup_mask(int cpu, cpumask_var_t *mask)
cpumask_and(*mask, cpu_online_mask, cpu_node_mask(cpu));
/* Update coregroup mask with all the CPUs that are part of submask */
- or_cpumasks_related(cpu, cpu, submask_fn, cpu_coregroup_mask);
+ or_cpumasks_related(cpu, cpu, submask_fn, coregroup_map);
/* Skip all CPUs already part of coregroup mask */
- cpumask_andnot(*mask, *mask, cpu_coregroup_mask(cpu));
+ cpumask_andnot(*mask, *mask, coregroup_map(cpu));
for_each_cpu(i, *mask) {
/* Skip all CPUs not part of this coregroup */
if (coregroup_id == cpu_to_coregroup_id(i)) {
- or_cpumasks_related(cpu, i, submask_fn, cpu_coregroup_mask);
+ or_cpumasks_related(cpu, i, submask_fn, coregroup_map);
cpumask_andnot(*mask, *mask, submask_fn(i));
} else {
- cpumask_andnot(*mask, *mask, cpu_coregroup_mask(i));
+ cpumask_andnot(*mask, *mask, coregroup_map(i));
}
}
}
@@ -1733,7 +1744,7 @@ static void __init build_sched_topology(void)
if (has_coregroup_support()) {
powerpc_topology[i++] =
- SDTL_INIT(tl_mc_mask, powerpc_shared_proc_flags, MC);
+ SDTL_INIT(powerpc_tl_mc_mask, powerpc_shared_proc_flags, MC);
}
powerpc_topology[i++] = SDTL_INIT(tl_pkg_mask, powerpc_shared_proc_flags, PKG);
--
2.43.0
Thanks,
Yu
next prev parent reply other threads:[~2026-05-26 14:18 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-25 14:07 [BUG] sched/cache: "Make LLC id continuous" causes NULL cpumask dereference in build_sched_domains on POWER9 Venkat Rao Bagalkote
2026-05-25 15:35 ` Chen, Yu C
2026-05-25 16:16 ` K Prateek Nayak
2026-05-26 3:14 ` Chen, Yu C
2026-05-26 3:14 ` Srikar Dronamraju
2026-05-26 4:08 ` Chen, Yu C
2026-05-26 4:58 ` Srikar Dronamraju
2026-05-26 5:53 ` K Prateek Nayak
2026-05-26 14:08 ` Chen Yu [this message]
2026-05-27 7:01 ` [BUG] sched/cache: "Make LLC id continuous" causes NULL cpumask Shrikanth Hegde
2026-05-27 16:05 ` Chen, Yu C
2026-05-27 18:07 ` Shrikanth Hegde
2026-05-28 4:58 ` Shrikanth Hegde
2026-05-28 9:12 ` Chen, Yu C
2026-05-28 10:26 ` Shrikanth Hegde
2026-05-28 15:54 ` Srikar Dronamraju
2026-05-28 15:58 ` Srikar Dronamraju
2026-05-27 16:30 ` K Prateek Nayak
2026-05-26 5:24 ` [BUG] sched/cache: "Make LLC id continuous" causes NULL cpumask dereference in build_sched_domains on POWER9 Venkat Rao Bagalkote
2026-05-27 7:05 ` Shrikanth Hegde
2026-05-28 16:01 ` Srikar Dronamraju
2026-05-28 6:54 ` Ritesh Harjani
2026-05-28 16:06 ` Srikar Dronamraju
2026-05-28 11:27 ` Shrikanth Hegde
2026-05-28 13:21 ` Chen, Yu C
2026-05-28 15:06 ` Ritesh Harjani
2026-05-28 15:56 ` Srikar Dronamraju
2026-05-28 16:31 ` Shrikanth Hegde
2026-05-28 16:44 ` Srikar Dronamraju
2026-05-29 3:58 ` Shrikanth Hegde
2026-05-29 6:59 ` Venkat Rao Bagalkote
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260526140856.139657-1-yu.c.chen@intel.com \
--to=yu.c.chen@intel.com \
--cc=chleroy@kernel.org \
--cc=kprateek.nayak@amd.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-sched@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=maddy@linux.ibm.com \
--cc=peterz@infradead.org \
--cc=riteshh@linux.ibm.com \
--cc=srikar@linux.ibm.com \
--cc=sshegde@linux.ibm.com \
--cc=tim.c.chen@linux.intel.com \
--cc=venkat88@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox