public inbox for linux-kernel@vger.kernel.org
* [PATCH v2] sched/fair: reduce long-tail newly idle balance cost
@ 2021-02-24  8:15 Aubrey Li
  2021-03-16  4:27 ` Li, Aubrey
  2021-03-23 15:08 ` [tip: sched/core] sched/fair: Reduce " tip-bot2 for Aubrey Li
  0 siblings, 2 replies; 5+ messages in thread
From: Aubrey Li @ 2021-02-24  8:15 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, mgorman, bristot
  Cc: linux-kernel, Aubrey Li, Andi Kleen, Tim Chen,
	Srinivas Pandruvada, Rafael J . Wysocki, Aubrey Li

A long-tail load balance cost is observed on the newly idle path.
It is caused by a race window between the first nr_running check
of the busiest runqueue and its nr_running recheck in detach_tasks.

Before the busiest runqueue is locked, its tasks can be pulled by
other CPUs, so nr_running of the busiest runqueue can drop to 1, or
even to 0 if the running task becomes idle. detach_tasks then breaks
out with the LBF_ALL_PINNED flag still set, which triggers a
load_balance redo at the same sched_domain level.

In order to find the new busiest sched_group and CPU, load balance will
recompute and update the various load statistics, which eventually leads
to the long-tail load balance cost.

This patch clears LBF_ALL_PINNED flag for this race condition, and hence
reduces the long-tail cost of newly idle balance.

Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
---
 kernel/sched/fair.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 04a3ce2..5c67804 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7675,6 +7675,15 @@ static int detach_tasks(struct lb_env *env)
 
 	lockdep_assert_held(&env->src_rq->lock);
 
+	/*
+	 * Source run queue has been emptied by another CPU, clear
+	 * LBF_ALL_PINNED flag as we will not test any task.
+	 */
+	if (env->src_rq->nr_running <= 1) {
+		env->flags &= ~LBF_ALL_PINNED;
+		return 0;
+	}
+
 	if (env->imbalance <= 0)
 		return 0;
 
-- 
2.7.4
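
The race described in the changelog can be sketched in userspace C. This
is only a toy model under stated assumptions, not the kernel code: the
toy_rq and toy_lb_env types, the toy_detach_tasks() and toy_needs_redo()
helpers, the with_patch switch, and the flag value used here are all
hypothetical names invented for illustration. The model assumes the
caller set LBF_ALL_PINNED before taking the runqueue lock, and it treats
"flag still set after the detach attempt" as the condition that sends
load_balance back to redo (the real redo path checks more than this).

```c
#include <stdbool.h>

/* Flag value chosen for this sketch only. */
#define LBF_ALL_PINNED 0x01

/* Simplified stand-ins for the kernel's struct rq and struct lb_env. */
struct toy_rq {
	unsigned int nr_running;
};

struct toy_lb_env {
	struct toy_rq *src_rq;
	unsigned int flags;
	long imbalance;
};

/*
 * Model of detach_tasks(). With with_patch set, the early exit from the
 * patch runs first: at most one task is left, no task will be tested,
 * so the "all pinned" assumption is dropped before returning.
 */
static int toy_detach_tasks(struct toy_lb_env *env, bool with_patch)
{
	if (with_patch && env->src_rq->nr_running <= 1) {
		env->flags &= ~LBF_ALL_PINNED;
		return 0;
	}

	if (env->imbalance <= 0)
		return 0;

	/*
	 * Pre-patch behaviour on the idle path: the task loop bails out
	 * before testing any task, so the caller's flag survives.
	 */
	if (env->src_rq->nr_running <= 1)
		return 0;

	/* A task is examined; this sketch assumes it is movable. */
	env->flags &= ~LBF_ALL_PINNED;
	env->src_rq->nr_running--;
	return 1;
}

/*
 * Returns true when LBF_ALL_PINNED is still set after the detach
 * attempt, which this model treats as "the caller would redo".
 */
static bool toy_needs_redo(unsigned int nr_running_at_lock, bool with_patch)
{
	struct toy_rq rq = { .nr_running = nr_running_at_lock };
	struct toy_lb_env env = {
		.src_rq = &rq,
		/* Set by the caller before locking, when nr_running was > 1. */
		.flags = LBF_ALL_PINNED,
		.imbalance = 1,
	};

	toy_detach_tasks(&env, with_patch);
	return (env.flags & LBF_ALL_PINNED) != 0;
}
```

In this model, toy_needs_redo(1, false) reports a redo because the stale
flag survives, while toy_needs_redo(1, true) does not. A runqueue that
still holds several tasks behaves the same with or without the early
exit.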



* Re: [PATCH v2] sched/fair: reduce long-tail newly idle balance cost
  2021-02-24  8:15 [PATCH v2] sched/fair: reduce long-tail newly idle balance cost Aubrey Li
@ 2021-03-16  4:27 ` Li, Aubrey
  2021-03-23 13:44   ` Vincent Guittot
  2021-03-23 15:08 ` [tip: sched/core] sched/fair: Reduce " tip-bot2 for Aubrey Li
  1 sibling, 1 reply; 5+ messages in thread
From: Li, Aubrey @ 2021-03-16  4:27 UTC (permalink / raw)
  To: Aubrey Li, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, mgorman, bristot
  Cc: linux-kernel, Andi Kleen, Tim Chen, Srinivas Pandruvada,
	Rafael J . Wysocki

On 2021/2/24 16:15, Aubrey Li wrote:
> A long-tail load balance cost is observed on the newly idle path,
> this is caused by a race window between the first nr_running check
> of the busiest runqueue and its nr_running recheck in detach_tasks.
> 
> Before the busiest runqueue is locked, the tasks on the busiest
> runqueue could be pulled by other CPUs and nr_running of the busiest
> runqueue becomes 1 or even 0 if the running task becomes idle, this
> causes detach_tasks to break with LBF_ALL_PINNED flag set, and triggers
> load_balance redo at the same sched_domain level.
> 
> In order to find the new busiest sched_group and CPU, load balance will
> recompute and update the various load statistics, which eventually leads
> to the long-tail load balance cost.
> 
> This patch clears LBF_ALL_PINNED flag for this race condition, and hence
> reduces the long-tail cost of newly idle balance.

Ping...

> 
> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
> ---
>  kernel/sched/fair.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 04a3ce2..5c67804 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7675,6 +7675,15 @@ static int detach_tasks(struct lb_env *env)
>  
>  	lockdep_assert_held(&env->src_rq->lock);
>  
> +	/*
> +	 * Source run queue has been emptied by another CPU, clear
> +	 * LBF_ALL_PINNED flag as we will not test any task.
> +	 */
> +	if (env->src_rq->nr_running <= 1) {
> +		env->flags &= ~LBF_ALL_PINNED;
> +		return 0;
> +	}
> +
>  	if (env->imbalance <= 0)
>  		return 0;
>  
> 



* Re: [PATCH v2] sched/fair: reduce long-tail newly idle balance cost
  2021-03-16  4:27 ` Li, Aubrey
@ 2021-03-23 13:44   ` Vincent Guittot
  2021-03-23 14:49     ` Peter Zijlstra
  0 siblings, 1 reply; 5+ messages in thread
From: Vincent Guittot @ 2021-03-23 13:44 UTC (permalink / raw)
  To: Li, Aubrey
  Cc: Aubrey Li, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel, Andi Kleen, Tim Chen,
	Srinivas Pandruvada, Rafael J . Wysocki

Hi Aubrey,

On Tue, 16 Mar 2021 at 05:27, Li, Aubrey <aubrey.li@linux.intel.com> wrote:
>
> On 2021/2/24 16:15, Aubrey Li wrote:
> > A long-tail load balance cost is observed on the newly idle path,
> > this is caused by a race window between the first nr_running check
> > of the busiest runqueue and its nr_running recheck in detach_tasks.
> >
> > Before the busiest runqueue is locked, the tasks on the busiest
> > runqueue could be pulled by other CPUs and nr_running of the busiest
> > runqueue becomes 1 or even 0 if the running task becomes idle, this
> > causes detach_tasks to break with LBF_ALL_PINNED flag set, and triggers
> > load_balance redo at the same sched_domain level.
> >
> > In order to find the new busiest sched_group and CPU, load balance will
> > recompute and update the various load statistics, which eventually leads
> > to the long-tail load balance cost.
> >
> > This patch clears LBF_ALL_PINNED flag for this race condition, and hence
> > reduces the long-tail cost of newly idle balance.
>
> Ping...

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>

>
> >
> > Cc: Vincent Guittot <vincent.guittot@linaro.org>
> > Cc: Mel Gorman <mgorman@techsingularity.net>
> > Cc: Andi Kleen <ak@linux.intel.com>
> > Cc: Tim Chen <tim.c.chen@linux.intel.com>
> > Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
> > ---
> >  kernel/sched/fair.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 04a3ce2..5c67804 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7675,6 +7675,15 @@ static int detach_tasks(struct lb_env *env)
> >
> >       lockdep_assert_held(&env->src_rq->lock);
> >
> > +     /*
> > +      * Source run queue has been emptied by another CPU, clear
> > +      * LBF_ALL_PINNED flag as we will not test any task.
> > +      */
> > +     if (env->src_rq->nr_running <= 1) {
> > +             env->flags &= ~LBF_ALL_PINNED;
> > +             return 0;
> > +     }
> > +
> >       if (env->imbalance <= 0)
> >               return 0;
> >
> >
>


* Re: [PATCH v2] sched/fair: reduce long-tail newly idle balance cost
  2021-03-23 13:44   ` Vincent Guittot
@ 2021-03-23 14:49     ` Peter Zijlstra
  0 siblings, 0 replies; 5+ messages in thread
From: Peter Zijlstra @ 2021-03-23 14:49 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Li, Aubrey, Aubrey Li, Ingo Molnar, Juri Lelli, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, linux-kernel, Andi Kleen, Tim Chen,
	Srinivas Pandruvada, Rafael J . Wysocki

On Tue, Mar 23, 2021 at 02:44:57PM +0100, Vincent Guittot wrote:
> Hi Aubrey,
> 
> On Tue, 16 Mar 2021 at 05:27, Li, Aubrey <aubrey.li@linux.intel.com> wrote:
> >
> > On 2021/2/24 16:15, Aubrey Li wrote:
> > > A long-tail load balance cost is observed on the newly idle path,
> > > this is caused by a race window between the first nr_running check
> > > of the busiest runqueue and its nr_running recheck in detach_tasks.
> > >
> > > Before the busiest runqueue is locked, the tasks on the busiest
> > > runqueue could be pulled by other CPUs and nr_running of the busiest
> > > runqueue becomes 1 or even 0 if the running task becomes idle, this
> > > causes detach_tasks to break with LBF_ALL_PINNED flag set, and triggers
> > > load_balance redo at the same sched_domain level.
> > >
> > > In order to find the new busiest sched_group and CPU, load balance will
> > > recompute and update the various load statistics, which eventually leads
> > > to the long-tail load balance cost.
> > >
> > > This patch clears LBF_ALL_PINNED flag for this race condition, and hence
> > > reduces the long-tail cost of newly idle balance.
> >
> > Ping...
> 
> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>

Thanks!


* [tip: sched/core] sched/fair: Reduce long-tail newly idle balance cost
  2021-02-24  8:15 [PATCH v2] sched/fair: reduce long-tail newly idle balance cost Aubrey Li
  2021-03-16  4:27 ` Li, Aubrey
@ 2021-03-23 15:08 ` tip-bot2 for Aubrey Li
  1 sibling, 0 replies; 5+ messages in thread
From: tip-bot2 for Aubrey Li @ 2021-03-23 15:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Aubrey Li, Peter Zijlstra (Intel), Vincent Guittot, x86,
	linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     acb4decc1e900468d51b33c5f1ee445278e716a7
Gitweb:        https://git.kernel.org/tip/acb4decc1e900468d51b33c5f1ee445278e716a7
Author:        Aubrey Li <aubrey.li@intel.com>
AuthorDate:    Wed, 24 Feb 2021 16:15:49 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 23 Mar 2021 16:01:59 +01:00

sched/fair: Reduce long-tail newly idle balance cost

A long-tail load balance cost is observed on the newly idle path,
this is caused by a race window between the first nr_running check
of the busiest runqueue and its nr_running recheck in detach_tasks.

Before the busiest runqueue is locked, the tasks on the busiest
runqueue could be pulled by other CPUs and nr_running of the busiest
runqueue becomes 1 or even 0 if the running task becomes idle, this
causes detach_tasks to break with LBF_ALL_PINNED flag set, and triggers
load_balance redo at the same sched_domain level.

In order to find the new busiest sched_group and CPU, load balance will
recompute and update the various load statistics, which eventually leads
to the long-tail load balance cost.

This patch clears LBF_ALL_PINNED flag for this race condition, and hence
reduces the long-tail cost of newly idle balance.

Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/1614154549-116078-1-git-send-email-aubrey.li@intel.com
---
 kernel/sched/fair.c |  9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index aaa0dfa..6d73bdb 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7687,6 +7687,15 @@ static int detach_tasks(struct lb_env *env)
 
 	lockdep_assert_held(&env->src_rq->lock);
 
+	/*
+	 * Source run queue has been emptied by another CPU, clear
+	 * LBF_ALL_PINNED flag as we will not test any task.
+	 */
+	if (env->src_rq->nr_running <= 1) {
+		env->flags &= ~LBF_ALL_PINNED;
+		return 0;
+	}
+
 	if (env->imbalance <= 0)
 		return 0;
 
