From: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
To: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
K Prateek Nayak <kprateek.nayak@amd.com>,
"Gautham R . Shenoy" <gautham.shenoy@amd.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Chen Yu <yu.c.chen@intel.com>, Juri Lelli <juri.lelli@redhat.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
Hillf Danton <hdanton@sina.com>,
Shrikanth Hegde <sshegde@linux.ibm.com>,
Jianyong Wu <jianyong.wu@outlook.com>,
Yangyu Chen <cyy@cyyself.name>,
Tingyin Duan <tingyin.duan@gmail.com>,
Vern Hao <vernhao@tencent.com>, Vern Hao <haoxing990@gmail.com>,
Len Brown <len.brown@intel.com>, Aubrey Li <aubrey.li@intel.com>,
Zhao Liu <zhao1.liu@intel.com>, Chen Yu <yu.chen.surf@gmail.com>,
Adam Li <adamli@os.amperecomputing.com>,
Aaron Lu <ziqianlu@bytedance.com>,
Tim Chen <tim.c.chen@intel.com>, Josh Don <joshdon@google.com>,
Gavin Guo <gavinguo@igalia.com>,
Qais Yousef <qyousef@layalina.io>,
Libo Chen <libchen@purestorage.com>,
linux-kernel@vger.kernel.org,
Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Subject: Re: [PATCH v3 04/21] sched/cache: Make LLC id continuous
Date: Sat, 14 Feb 2026 23:23:35 +0530 [thread overview]
Message-ID: <437fef08-cabe-461f-a2d2-4bc385e9d513@linux.ibm.com> (raw)
In-Reply-To: <60a05a3f50d14a7bf3b968f62cca87893c5c552c.1770760558.git.tim.c.chen@linux.intel.com>
On 11/02/26 03:48, Tim Chen wrote:
> From: Chen Yu <yu.c.chen@intel.com>
>
> Introduce an index mapping between CPUs and their LLCs. This provides
> a continuous per LLC index needed for cache-aware load balancing in
> later patches.
>
> The existing per_cpu llc_id usually points to the first CPU of the
> LLC domain, which is sparse and unsuitable as an array index. Using
> llc_id directly would waste memory.
>
> With the new mapping, CPUs in the same LLC share a continuous id:
>
> per_cpu(llc_id, CPU=0...15) = 0
> per_cpu(llc_id, CPU=16...31) = 1
> per_cpu(llc_id, CPU=32...47) = 2
> ...
>
> Once a CPU has been assigned an llc_id, this ID persists even when
> the CPU is taken offline and brought back online, which can facilitate
> the management of the ID.
tl_max_llcs is never reset across multiple invocations of build_sched_domains().
While this preserves LLC IDs across normal CPU hotplug events, I'm wondering about
scenarios where the hardware topology itself changes, such as physically removing
or replacing CPU sockets.
Example scenario:
 1. Boot with 3 LLCs: IDs {0,1,2}, tl_max_llcs = 3
 2. A physical hardware change removes LLC 1
 3. New hardware added at a different position gets ID = 3
 4. After multiple such events: the system has 4 LLCs but IDs {0,2,5,7}, tl_max_llcs = 8
This creates gaps in the ID space. I understand this trade-off might be intentional,
since physical topology changes are rare, and resetting tl_max_llcs and all sd_llc_id
values would force IDs to be rebuilt on every invocation of build_sched_domains().
Still, I'd like to know your thoughts on the overhead of resetting tl_max_llcs and
sd_llc_id on each invocation of build_sched_domains() so that a dense mapping is
always maintained.
Thanks,
Vineeth
>
> Co-developed-by: Tim Chen <tim.c.chen@linux.intel.com>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> Co-developed-by: K Prateek Nayak <kprateek.nayak@amd.com>
> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> ---
>
> Notes:
> v2->v3:
> Allocate the LLC id according to the topology level data directly, rather
> than calculating from the sched domain. This simplifies the code.
> (Peter Zijlstra, K Prateek Nayak)
>
> kernel/sched/topology.c | 47 ++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 44 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index cf643a5ddedd..ca46b5cf7f78 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -20,6 +20,7 @@ void sched_domains_mutex_unlock(void)
> /* Protected by sched_domains_mutex: */
> static cpumask_var_t sched_domains_tmpmask;
> static cpumask_var_t sched_domains_tmpmask2;
> +static int tl_max_llcs;
>
> static int __init sched_debug_setup(char *str)
> {
> @@ -658,7 +659,7 @@ static void destroy_sched_domains(struct sched_domain *sd)
> */
> DEFINE_PER_CPU(struct sched_domain __rcu *, sd_llc);
> DEFINE_PER_CPU(int, sd_llc_size);
> -DEFINE_PER_CPU(int, sd_llc_id);
> +DEFINE_PER_CPU(int, sd_llc_id) = -1;
> DEFINE_PER_CPU(int, sd_share_id);
> DEFINE_PER_CPU(struct sched_domain_shared __rcu *, sd_llc_shared);
> DEFINE_PER_CPU(struct sched_domain __rcu *, sd_numa);
> @@ -684,7 +685,6 @@ static void update_top_cache_domain(int cpu)
>
> rcu_assign_pointer(per_cpu(sd_llc, cpu), sd);
> per_cpu(sd_llc_size, cpu) = size;
> - per_cpu(sd_llc_id, cpu) = id;
> rcu_assign_pointer(per_cpu(sd_llc_shared, cpu), sds);
>
> sd = lowest_flag_domain(cpu, SD_CLUSTER);
> @@ -2567,10 +2567,18 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
>
> /* Set up domains for CPUs specified by the cpu_map: */
> for_each_cpu(i, cpu_map) {
> - struct sched_domain_topology_level *tl;
> + struct sched_domain_topology_level *tl, *tl_llc = NULL;
> + int lid;
>
> sd = NULL;
> for_each_sd_topology(tl) {
> + int flags = 0;
> +
> + if (tl->sd_flags)
> + flags = (*tl->sd_flags)();
> +
> + if (flags & SD_SHARE_LLC)
> + tl_llc = tl;
>
> sd = build_sched_domain(tl, cpu_map, attr, sd, i);
>
> @@ -2581,6 +2589,39 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
> if (cpumask_equal(cpu_map, sched_domain_span(sd)))
> break;
> }
> +
> + lid = per_cpu(sd_llc_id, i);
> + if (lid == -1) {
> + int j;
> +
> + /*
> + * Assign the llc_id to the CPUs that do not
> + * have an LLC.
> + */
> + if (!tl_llc) {
> + per_cpu(sd_llc_id, i) = tl_max_llcs++;
> +
> + continue;
> + }
> +
> + /* try to reuse the llc_id of its siblings */
> + for_each_cpu(j, tl_llc->mask(tl_llc, i)) {
> + if (i == j)
> + continue;
> +
> + lid = per_cpu(sd_llc_id, j);
> +
> + if (lid != -1) {
> + per_cpu(sd_llc_id, i) = lid;
> +
> + break;
> + }
> + }
> +
> + /* a new LLC is detected */
> + if (lid == -1)
> + per_cpu(sd_llc_id, i) = tl_max_llcs++;
> + }
> }
>
> if (WARN_ON(!topology_span_sane(cpu_map)))