From: Shrikanth Hegde <sshegde@linux.ibm.com>
To: Tim Chen <tim.c.chen@linux.intel.com>, Chen Yu <yu.c.chen@intel.com>
Cc: Juri Lelli <juri.lelli@redhat.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
Tim Chen <tim.c.chen@intel.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Libo Chen <libo.chen@oracle.com>,
Abel Wu <wuyun.abel@bytedance.com>,
Madadi Vineeth Reddy <vineethr@linux.ibm.com>,
Hillf Danton <hdanton@sina.com>, Len Brown <len.brown@intel.com>,
linux-kernel@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
K Prateek Nayak <kprateek.nayak@amd.com>,
"Gautham R . Shenoy" <gautham.shenoy@amd.com>
Subject: Re: [RFC patch v3 14/20] sched: Introduce update_llc_busiest() to deal with groups having preferred LLC tasks
Date: Fri, 4 Jul 2025 01:22:09 +0530
Message-ID: <736d41f0-1eb4-4420-ab67-e88fc7e31bda@linux.ibm.com>
In-Reply-To: <e5b77a2e33a6a98de0468c999e8c94d226b8e6d3.1750268218.git.tim.c.chen@linux.intel.com>
On 6/18/25 23:58, Tim Chen wrote:
> The load balancer attempts to identify the busiest sched_group with
> the highest load and migrates some tasks to a less busy sched_group
> to distribute the load across different CPUs.
>
> When cache-aware scheduling is enabled, the busiest sched_group is
> defined as the one with the highest number of tasks preferring to run
> on the destination LLC. If the busiest group has llc_balance tag,
> the cache aware load balance will be launched.
>
> Introduce the helper function update_llc_busiest() to identify
> such sched group with most tasks preferring the destination LLC.
>
> Co-developed-by: Chen Yu <yu.c.chen@intel.com>
> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
> kernel/sched/fair.c | 36 +++++++++++++++++++++++++++++++++++-
> 1 file changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 48a090c6e885..ab3d1239d6e4 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10848,12 +10848,36 @@ static inline bool llc_balance(struct lb_env *env, struct sg_lb_stats *sgs,
>
>  	return false;
>  }
> +
> +static bool update_llc_busiest(struct lb_env *env,
> +			       struct sg_lb_stats *busiest,
> +			       struct sg_lb_stats *sgs)
> +{
> +	int idx;
> +
> +	/* Only the candidate with llc_balance need to be taken care of */
> +	if (!sgs->group_llc_balance)
> +		return false;
> +
> +	/*
> +	 * There are more tasks that want to run on dst_cpu's LLC.
> +	 */
> +	idx = llc_idx(env->dst_cpu);
> +	return sgs->nr_pref_llc[idx] > busiest->nr_pref_llc[idx];
> +}
>  #else
>  static inline bool llc_balance(struct lb_env *env, struct sg_lb_stats *sgs,
>  			       struct sched_group *group)
>  {
>  	return false;
>  }
> +
> +static bool update_llc_busiest(struct lb_env *env,
> +			       struct sg_lb_stats *busiest,
> +			       struct sg_lb_stats *sgs)
> +{
> +	return false;
> +}
>  #endif
>
>  static inline long sibling_imbalance(struct lb_env *env,
> @@ -11085,6 +11109,14 @@ static bool update_sd_pick_busiest(struct lb_env *env,
>  	     sds->local_stat.group_type != group_has_spare))
>  		return false;
>
> +	/* deal with prefer LLC load balance, if failed, fall into normal load balance */
> +	if (update_llc_busiest(env, busiest, sgs))
> +		return true;
> +
> +	/* if there is already a busy group, skip the normal load balance */
> +	if (busiest->group_llc_balance)
> +		return false;
> +
> +
A group could be group_overloaded and also have group_llc_balance set, right?
In that case the priorities based on group_type are not followed, are they?
>  	if (sgs->group_type > busiest->group_type)
>  		return true;
>
> @@ -11991,9 +12023,11 @@ static struct sched_group *sched_balance_find_src_group(struct lb_env *env)
>  	/*
>  	 * Try to move all excess tasks to a sibling domain of the busiest
>  	 * group's child domain.
> +	 * Also do so if we can move some tasks that prefer the local LLC.
>  	 */
>  	if (sds.prefer_sibling && local->group_type == group_has_spare &&
> -	    sibling_imbalance(env, &sds, busiest, local) > 1)
> +	    (busiest->group_llc_balance ||
> +	     sibling_imbalance(env, &sds, busiest, local) > 1))
>  		goto force_balance;
>
>  	if (busiest->group_type != group_overloaded) {
Also, load balancing triggered by LLC preference could be very tricky to debug.
Are any stats being added to schedstat or sched/debug?