public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Vincent Guittot <vincent.guittot@linaro.org>
To: Qais Yousef <qyousef@layalina.io>
Cc: Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] sched/fair: Fix impossible migrate_util scenario in load balance
Date: Tue, 18 Jul 2023 14:48:20 +0200	[thread overview]
Message-ID: <ZLaKFFjY6NWaJdOq@vingu-book> (raw)
In-Reply-To: <20230716014125.139577-1-qyousef@layalina.io>

Le dimanche 16 juil. 2023 à 02:41:25 (+0100), Qais Yousef a écrit :
> We've seen cases while running geekbench that an idle little core never
> pulls a task from a bigger overloaded cluster for 100s of ms and
> sometimes over a second.
> 
> It turned out that the load balance identifies this as a migrate_util
> type since the local group (little cluster) has a spare capacity and
> will try to pull a task. But the little cluster capacity is very small
> nowadays (around 200 or less) and if two busy tasks are stuck on a mid
> core which has a capacity of over 700, this means the util of each tasks
> will be around 350+ range. Which is always bigger than the spare
> capacity of the little group with a single idle core.
> 
> When trying to detach_tasks() we bail out then because of the comparison
> of:
> 
> 	if (util > env->imbalance)
> 		goto next;
> 
> In calculate_imbalance() we convert a migrate_util into migrate_task
> type if the CPU trying to do the pull is idle. But we only do this if
> env->imbalance is 0; which I can't understand. AFAICT env->imbalance
> contains the local group's spare capacity. If it is 0, this means it's
> fully busy.
> 
> Removing this condition fixes the problem, but since I can't fully
> understand why it checks for 0, sending this as RFC. It could be a typo
> and meant to check for
> 
> 	env->imbalance != 0
> 
> instead?
> 
> Signed-off-by: Qais Yousef (Google) <qyousef@layalina.io>
> ---
>  kernel/sched/fair.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index a80a73909dc2..682d9d6a8691 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10288,7 +10288,7 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
>  			 * waiting task in this overloaded busiest group. Let's
>  			 * try to pull it.
>  			 */
> -			if (env->idle != CPU_NOT_IDLE && env->imbalance == 0) {
> +			if (env->idle != CPU_NOT_IDLE) {

With this change you completely skip migrate_util for idle and newly idle case
and this would be too aggressive.

We can do something similar to migrate_load in detach_tasks():

---
 kernel/sched/fair.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d3df5b1642a6..64111ac7e137 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8834,7 +8834,13 @@ static int detach_tasks(struct lb_env *env)
 		case migrate_util:
 			util = task_util_est(p);

-			if (util > env->imbalance)
+			/*
+			 * Make sure that we don't migrate too much utilization.
+			 * Nevertheless, let relax the constraint if
+			 * scheduler fails to find a good waiting task to
+			 * migrate.
+			 */
+			if (shr_bound(util, env->sd->nr_balance_failed) > env->imbalance)
 				goto next;

 			env->imbalance -= util;
--



>  				env->migration_type = migrate_task;
>  				env->imbalance = 1;
>  			}
> -- 
> 2.25.1
> 

  reply	other threads:[~2023-07-18 12:48 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-16  1:41 [RFC PATCH] sched/fair: Fix impossible migrate_util scenario in load balance Qais Yousef
2023-07-18 12:48 ` Vincent Guittot [this message]
2023-07-18 16:18   ` Qais Yousef
2023-07-18 16:31     ` Vincent Guittot
2023-07-18 17:25       ` Qais Yousef
2023-07-20 12:31         ` Vincent Guittot
2023-07-21 10:57           ` Qais Yousef
2023-07-21 13:52             ` Vincent Guittot
2023-07-21 22:04               ` Qais Yousef
2023-07-24 12:58                 ` Dietmar Eggemann
2023-07-24 16:10                   ` Qais Yousef
2023-07-24 17:54                     ` Dietmar Eggemann
2023-07-24 21:11                       ` Qais Yousef

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZLaKFFjY6NWaJdOq@vingu-book \
    --to=vincent.guittot@linaro.org \
    --cc=dietmar.eggemann@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=qyousef@layalina.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox