From: Shrikanth Hegde <sshegde@linux.ibm.com>
To: "Chen, Yu C" <yu.c.chen@intel.com>, kprateek.nayak@amd.com
Cc: srikar@linux.ibm.com, venkat88@linux.ibm.com,
maddy@linux.ibm.com, riteshh@linux.ibm.com, chleroy@kernel.org,
tim.c.chen@linux.intel.com, peterz@infradead.org,
linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
linux-sched@vger.kernel.org
Subject: Re: [BUG] sched/cache: "Make LLC id continuous" causes NULL cpumask
Date: Thu, 28 May 2026 10:28:03 +0530
Message-ID: <508a604b-2d30-47f4-a63c-36778a1d47c7@linux.ibm.com>
In-Reply-To: <42a19577-fe56-44b4-b0ed-37cde6d03ff5@linux.ibm.com>

On 5/27/26 11:37 PM, Shrikanth Hegde wrote:
> Hi Chen, Prateek.
>
> On 5/27/26 9:35 PM, Chen, Yu C wrote:
>> Hi Shrikanth,
>>
>> On 5/27/2026 3:01 PM, Shrikanth Hegde wrote:
>>> Hi Chen, Prateek.
>>>
>>> I got back to work today, sorry for the delay.
>>> I am trying to go through the mails.
>>> Apologies in case I have missed any bits.
>>>
>>
>> Thanks for taking a look at this!
>>
>>> On 5/26/26 7:38 PM, Chen Yu wrote:
>>>> Hi Prateek,
>>>>
>>>> On Tue, 26 May 2026 11:23:59 +0530, K Prateek Nayak
>>>> <kprateek.nayak@amd.com> wrote:
>>>>> Hello Srikar,
>>>>>
>>>>> On 5/26/2026 10:28 AM, Srikar Dronamraju wrote:
>>>>>> L2 Cache reported here is for SMT8 Core aka CACHE domain.
>>>>>
>>>>> Apart from the scheduler, nothing in tree currently cares about
>>>>> cpu_coregroup_mask() except for drivers/base/arch_topology.c, but
>>>>> Power doesn't select GENERIC_ARCH_TOPOLOGY.
>>>>>
>>>>> Why can't Power have an internal mask for the MC domain (tl_mc_mask),
>>>>> so that the scheduler can use cpu_coregroup_mask() for the actual LLC?
>>>>> (The L2 mask in this case.)
>>>
>>> This seems wrong. There is no requirement that coregroup_mask
>>> (MC domain) has to point at the LLC domain.
>>>
>>> For example, on Shared LPAR, there is no MC domain and the LLC is at
>>> the SMT core level.
>>> In that case, making coregroup_mask point at the SMT mask is wrong.
>>>
>>
>> On Shared LPAR, highest_flag_domain(SD_SHARE_LLC) selected the
>> SMT domain (L2 shared) prior to commit b5ea300a17e3.
>> Prateek suggested changing cpu_coregroup_mask() to use
>> cpu_l2_cache_mask(), which makes the LLC mask cover the same range.
>> sd_llc, size and grouping remain unchanged. Only sd_llc_id becomes
>> contiguous, which aligns with the intent of this commit.
>>
>> But yes, the naming is confusing. cpu_coregroup_mask suggests a
>> "group of cores", but after the change, it only covers threads
>> within a single SMT core.
>>
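To illustrate the "only sd_llc_id becomes contiguous" point, here is a
minimal user-space sketch. It assumes the older id was the first CPU
number in the LLC span and that the contiguous scheme hands out 0..N-1
in order of first appearance. The toy_* names, the 16-CPU limit and the
plain bitmask layout are hypothetical and are not kernel code.

```c
#include <stdint.h>

#define TOY_NR_CPUS 16

/* Lowest CPU number set in mask, or -1 if the mask is empty. */
static int toy_first_cpu(uint32_t mask)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (mask & (UINT32_C(1) << cpu))
			return cpu;
	return -1;
}

/*
 * llc_of[cpu] is a bitmask of the CPUs sharing an LLC with cpu.
 * Assumes every non-empty mask contains its own CPU and that all
 * members of one LLC carry the same mask.
 *
 * sparse_id[cpu]: first CPU in the span (ids may have gaps).
 * dense_id[cpu]:  ids 0..N-1, one per distinct LLC.
 * Returns N, the number of distinct LLCs.
 */
int toy_assign_llc_ids(const uint32_t *llc_of, int *sparse_id, int *dense_id)
{
	int next = 0;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		int first = toy_first_cpu(llc_of[cpu]);

		sparse_id[cpu] = first;
		if (first < 0)
			dense_id[cpu] = -1;		/* no LLC mask */
		else if (first == cpu)
			dense_id[cpu] = next++;		/* new LLC seen */
		else
			dense_id[cpu] = dense_id[first]; /* reuse leader's id */
	}
	return next;
}
```

With two SMT8 cores (masks 0x00FF and 0xFF00), the sparse ids are 0 and
8 while the dense ids are 0 and 1. The grouping is identical in both
schemes; only the numbering changes.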
>
> Yes. Though it might achieve the same effect, keeping it explicit may
> help in maintaining it better.
>
> On PowerPC, there are subtleties: on Shared LPAR the MC domain per se
> doesn't exist, and Power9 and earlier have different topologies.
> Maybe one could figure it all out and simplify them into a few masks.
> But yes, it is slightly different as of today.
>
>>> If we need a mask that points to the LLC, which the arch has to return,
>>> then we would need a new API, say cpu_llc_mask, that can point
>>> accordingly.
>>>
>>
>> Do you mean something like this?
>> https://lore.kernel.org/lkml/8d14c844-b4a8-4af6-acab-2cfdd42225be@intel.com/
>
> Yes. This is something I prefer, but without making it always point to
> the L2 mask. The diff below makes it boot on a Shared Processor LPAR,
> where I see a panic without it.
>
> Need to add proper comments where appropriate.
>
> diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
> index 66ed5fe1b718..bd1db3b1dbb0 100644
> --- a/arch/powerpc/include/asm/topology.h
> +++ b/arch/powerpc/include/asm/topology.h
> @@ -131,6 +131,9 @@ static inline int cpu_to_coregroup_id(int cpu)
> #ifdef CONFIG_SMP
> #include <asm/cputable.h>
>
> +const struct cpumask *arch_llc_mask(int cpu);
> +#define arch_llc_mask arch_llc_mask
> +
> struct cpumask *cpu_coregroup_mask(int cpu);
> const struct cpumask *cpu_die_mask(int cpu);
> int cpu_die_id(int cpu);
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index 3467f86fd78f..26c15c786c55 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -1101,6 +1101,13 @@ const struct cpumask *cpu_die_mask(int cpu)
> }
> EXPORT_SYMBOL_GPL(cpu_die_mask);
>
> +const struct cpumask *arch_llc_mask(int cpu)
> +{
> +	if (has_coregroup_support())
> +		return cpu_coregroup_mask(cpu);
> +	return cpu_smallcore_mask(cpu);

This function body needs to change, since the LLC is not at the MC level,
and I didn't account for Power9.
The rest of the structure is the direction I would prefer.
It will also help future architectures account for their specific needs.
What do you think?

> +}
> +
> int cpu_die_id(int cpu)
> {
> if (has_coregroup_support())
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index df2ceb54c970..3b5155121276 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -2063,7 +2063,11 @@ const struct cpumask *tl_mc_mask(struct sched_domain_topology_level *tl, int cpu
> return cpu_coregroup_mask(cpu);
> }
>
> +#ifndef arch_llc_mask
> #define llc_mask(cpu) cpu_coregroup_mask(cpu)
> +#else
> +#define llc_mask(cpu) arch_llc_mask(cpu)
> +#endif
>
> #else
> #define llc_mask(cpu) cpumask_of(cpu)
>
> (One more subtlety: the crash would be seen only with NR_CPUS=8192,
> because CPUMASK_OFFSTACK=y there, but that's a different concern
> altogether.)
>
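The override pattern in the diff above (declare the hook, define a
macro of the same name, let generic code test it with #ifndef) can be
tried out in user space. This is only a sketch: struct toy_mask, the
toy_* objects and toy_llc_mask_name() are hypothetical stand-ins, and
the arch_llc_mask() body is a placeholder, not a proposal for the real
PowerPC logic.

```c
/* Toy stand-in for struct cpumask; not the kernel type. */
struct toy_mask {
	const char *name;
};

static const struct toy_mask toy_coregroup = { "coregroup" };
static const struct toy_mask toy_arch_llc = { "arch-llc" };

/* Generic default, playing the role of cpu_coregroup_mask(). */
const struct toy_mask *cpu_coregroup_mask(int cpu)
{
	(void)cpu;
	return &toy_coregroup;
}

/*
 * "Arch header" side: provide the hook, then define a macro with the
 * same name so that generic code can detect it with #ifndef.
 */
const struct toy_mask *arch_llc_mask(int cpu)
{
	(void)cpu;
	return &toy_arch_llc;	/* placeholder body */
}
#define arch_llc_mask arch_llc_mask

/* "Generic" side: same shape as the kernel/sched/topology.c hunk. */
#ifndef arch_llc_mask
#define llc_mask(cpu) cpu_coregroup_mask(cpu)
#else
#define llc_mask(cpu) arch_llc_mask(cpu)
#endif

/* Hypothetical helper so the selected mask can be inspected. */
const char *toy_llc_mask_name(int cpu)
{
	return llc_mask(cpu)->name;
}
```

Here toy_llc_mask_name(0) picks the arch-provided mask. Removing the
#define makes llc_mask() fall back to cpu_coregroup_mask(), which is
what architectures that do not provide the hook would get.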
>
>>
>>> I don't like mixing MC domain and LLC into one bit.
>>>
>>
>> [ ... ]
>>
>>>> struct cpumask *cpu_coregroup_mask(int cpu)
>>>> {
>>>> - return per_cpu(cpu_coregroup_map, cpu);
>>>> + return cpu_l2_cache_mask(cpu);
>>>> +}
>>>
>>> This looks wrong to me too. In different hardware topologies
>>> there may be a distinction between the coregroup and L2 masks.
>>>
>>> Let me go through the code and see if there is a better way.
>>>
>>
>> Sure, please go ahead - I'm on board with the direction
>> you settle on.
>>
>> thanks,
>> Chenyu
>>
>
Thread overview: 31+ messages
2026-05-25 14:07 [BUG] sched/cache: "Make LLC id continuous" causes NULL cpumask dereference in build_sched_domains on POWER9 Venkat Rao Bagalkote
2026-05-25 15:35 ` Chen, Yu C
2026-05-25 16:16 ` K Prateek Nayak
2026-05-26 3:14 ` Chen, Yu C
2026-05-26 3:14 ` Srikar Dronamraju
2026-05-26 4:08 ` Chen, Yu C
2026-05-26 4:58 ` Srikar Dronamraju
2026-05-26 5:53 ` K Prateek Nayak
2026-05-26 14:08 ` [BUG] sched/cache: "Make LLC id continuous" causes NULL cpumask Chen Yu
2026-05-27 7:01 ` Shrikanth Hegde
2026-05-27 16:05 ` Chen, Yu C
2026-05-27 18:07 ` Shrikanth Hegde
2026-05-28 4:58 ` Shrikanth Hegde [this message]
2026-05-28 9:12 ` Chen, Yu C
2026-05-28 10:26 ` Shrikanth Hegde
2026-05-28 15:54 ` Srikar Dronamraju
2026-05-28 15:58 ` Srikar Dronamraju
2026-05-27 16:30 ` K Prateek Nayak
2026-05-26 5:24 ` [BUG] sched/cache: "Make LLC id continuous" causes NULL cpumask dereference in build_sched_domains on POWER9 Venkat Rao Bagalkote
2026-05-27 7:05 ` Shrikanth Hegde
2026-05-28 16:01 ` Srikar Dronamraju
2026-05-28 6:54 ` Ritesh Harjani
2026-05-28 16:06 ` Srikar Dronamraju
2026-05-28 11:27 ` Shrikanth Hegde
2026-05-28 13:21 ` Chen, Yu C
2026-05-28 15:06 ` Ritesh Harjani
2026-05-28 15:56 ` Srikar Dronamraju
2026-05-28 16:31 ` Shrikanth Hegde
2026-05-28 16:44 ` Srikar Dronamraju
2026-05-29 3:58 ` Shrikanth Hegde
2026-05-29 6:59 ` Venkat Rao Bagalkote