* [PATCH v2] sched/fair: reduce long-tail newly idle balance cost
@ 2021-02-24 8:15 Aubrey Li
2021-03-16 4:27 ` Li, Aubrey
2021-03-23 15:08 ` [tip: sched/core] sched/fair: Reduce " tip-bot2 for Aubrey Li
0 siblings, 2 replies; 5+ messages in thread
From: Aubrey Li @ 2021-02-24 8:15 UTC (permalink / raw)
To: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot
Cc: linux-kernel, Aubrey Li, Andi Kleen, Tim Chen,
Srinivas Pandruvada, Rafael J . Wysocki, Aubrey Li
A long-tail load balance cost is observed on the newly idle path.
It is caused by a race window between the first nr_running check
of the busiest runqueue and its nr_running recheck in detach_tasks.
Before the busiest runqueue is locked, the tasks on the busiest
runqueue could be pulled by other CPUs, and nr_running of the busiest
runqueue becomes 1, or even 0 if the running task becomes idle. This
causes detach_tasks to break with the LBF_ALL_PINNED flag set, and
triggers a load_balance redo at the same sched_domain level.
In order to find the new busiest sched_group and CPU, load balance will
recompute and update the various load statistics, which eventually leads
to the long-tail load balance cost.
This patch clears LBF_ALL_PINNED flag for this race condition, and hence
reduces the long-tail cost of newly idle balance.
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
---
kernel/sched/fair.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 04a3ce2..5c67804 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7675,6 +7675,15 @@ static int detach_tasks(struct lb_env *env)
 
 	lockdep_assert_held(&env->src_rq->lock);
 
+	/*
+	 * Source run queue has been emptied by another CPU, clear
+	 * LBF_ALL_PINNED flag as we will not test any task.
+	 */
+	if (env->src_rq->nr_running <= 1) {
+		env->flags &= ~LBF_ALL_PINNED;
+		return 0;
+	}
+
 	if (env->imbalance <= 0)
 		return 0;
--
2.7.4
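For readers following the thread, the effect of the early exit can be sketched in user space. This is only a simplified model, not the kernel code: `rq_model`, `lb_env_model` and `detach_tasks_model()` are hypothetical stand-ins, the `patched` switch exists only for comparison, the flag value is assumed, and task migration is reduced to "detach one migratable task".

```c
#include <assert.h>
#include <stdbool.h>

#define LBF_ALL_PINNED 0x01 /* kernel flag name; the value is assumed here */

/* Hypothetical, stripped-down stand-ins for struct rq and struct lb_env. */
struct rq_model {
	unsigned int nr_running; /* tasks currently on the source runqueue */
	bool locked;             /* stands in for holding src_rq->lock */
};

struct lb_env_model {
	struct rq_model *src_rq;
	unsigned int flags;
	long imbalance;
};

/*
 * Model of the top of detach_tasks(). Returns the number of tasks
 * detached. "patched" selects the behaviour with the new early exit.
 */
static int detach_tasks_model(struct lb_env_model *env, bool patched)
{
	assert(env->src_rq->locked); /* models lockdep_assert_held() */

	/* New check: the queue was drained before we took the lock. */
	if (patched && env->src_rq->nr_running <= 1) {
		env->flags &= ~LBF_ALL_PINNED;
		return 0;
	}

	if (env->imbalance <= 0)
		return 0;

	/*
	 * Old behaviour: no task is examined, so nothing ever clears
	 * LBF_ALL_PINNED and the caller sees "all tasks pinned".
	 */
	if (env->src_rq->nr_running <= 1)
		return 0;

	/* Assumption: the first examined task is allowed to migrate. */
	env->flags &= ~LBF_ALL_PINNED;
	env->src_rq->nr_running--;
	return 1;
}
```

Calling `detach_tasks_model()` with `flags = LBF_ALL_PINNED` and `nr_running = 1` returns 0 in both modes, but only the patched mode returns with the flag cleared, which is what lets the caller skip the redo.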
* Re: [PATCH v2] sched/fair: reduce long-tail newly idle balance cost
2021-02-24 8:15 [PATCH v2] sched/fair: reduce long-tail newly idle balance cost Aubrey Li
@ 2021-03-16 4:27 ` Li, Aubrey
2021-03-23 13:44 ` Vincent Guittot
2021-03-23 15:08 ` [tip: sched/core] sched/fair: Reduce " tip-bot2 for Aubrey Li
1 sibling, 1 reply; 5+ messages in thread
From: Li, Aubrey @ 2021-03-16 4:27 UTC (permalink / raw)
To: Aubrey Li, mingo, peterz, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot
Cc: linux-kernel, Andi Kleen, Tim Chen, Srinivas Pandruvada,
Rafael J . Wysocki
On 2021/2/24 16:15, Aubrey Li wrote:
> A long-tail load balance cost is observed on the newly idle path.
> It is caused by a race window between the first nr_running check
> of the busiest runqueue and its nr_running recheck in detach_tasks.
>
> Before the busiest runqueue is locked, the tasks on the busiest
> runqueue could be pulled by other CPUs, and nr_running of the busiest
> runqueue becomes 1, or even 0 if the running task becomes idle. This
> causes detach_tasks to break with the LBF_ALL_PINNED flag set, and
> triggers a load_balance redo at the same sched_domain level.
>
> In order to find the new busiest sched_group and CPU, load balance will
> recompute and update the various load statistics, which eventually leads
> to the long-tail load balance cost.
>
> This patch clears LBF_ALL_PINNED flag for this race condition, and hence
> reduces the long-tail cost of newly idle balance.
Ping...
>
> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
> ---
> kernel/sched/fair.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 04a3ce2..5c67804 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7675,6 +7675,15 @@ static int detach_tasks(struct lb_env *env)
>
>  	lockdep_assert_held(&env->src_rq->lock);
>
> +	/*
> +	 * Source run queue has been emptied by another CPU, clear
> +	 * LBF_ALL_PINNED flag as we will not test any task.
> +	 */
> +	if (env->src_rq->nr_running <= 1) {
> +		env->flags &= ~LBF_ALL_PINNED;
> +		return 0;
> +	}
> +
>  	if (env->imbalance <= 0)
>  		return 0;
>
>
* Re: [PATCH v2] sched/fair: reduce long-tail newly idle balance cost
2021-03-16 4:27 ` Li, Aubrey
@ 2021-03-23 13:44 ` Vincent Guittot
2021-03-23 14:49 ` Peter Zijlstra
0 siblings, 1 reply; 5+ messages in thread
From: Vincent Guittot @ 2021-03-23 13:44 UTC (permalink / raw)
To: Li, Aubrey
Cc: Aubrey Li, Ingo Molnar, Peter Zijlstra, Juri Lelli,
Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
Daniel Bristot de Oliveira, linux-kernel, Andi Kleen, Tim Chen,
Srinivas Pandruvada, Rafael J . Wysocki
Hi Aubrey,
On Tue, 16 Mar 2021 at 05:27, Li, Aubrey <aubrey.li@linux.intel.com> wrote:
>
> On 2021/2/24 16:15, Aubrey Li wrote:
> > A long-tail load balance cost is observed on the newly idle path.
> > It is caused by a race window between the first nr_running check
> > of the busiest runqueue and its nr_running recheck in detach_tasks.
> >
> > Before the busiest runqueue is locked, the tasks on the busiest
> > runqueue could be pulled by other CPUs, and nr_running of the busiest
> > runqueue becomes 1, or even 0 if the running task becomes idle. This
> > causes detach_tasks to break with the LBF_ALL_PINNED flag set, and
> > triggers a load_balance redo at the same sched_domain level.
> >
> > In order to find the new busiest sched_group and CPU, load balance will
> > recompute and update the various load statistics, which eventually leads
> > to the long-tail load balance cost.
> >
> > This patch clears LBF_ALL_PINNED flag for this race condition, and hence
> > reduces the long-tail cost of newly idle balance.
>
> Ping...
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
>
> >
> > Cc: Vincent Guittot <vincent.guittot@linaro.org>
> > Cc: Mel Gorman <mgorman@techsingularity.net>
> > Cc: Andi Kleen <ak@linux.intel.com>
> > Cc: Tim Chen <tim.c.chen@linux.intel.com>
> > Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
> > ---
> > kernel/sched/fair.c | 9 +++++++++
> > 1 file changed, 9 insertions(+)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 04a3ce2..5c67804 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7675,6 +7675,15 @@ static int detach_tasks(struct lb_env *env)
> >
> >  	lockdep_assert_held(&env->src_rq->lock);
> >
> > +	/*
> > +	 * Source run queue has been emptied by another CPU, clear
> > +	 * LBF_ALL_PINNED flag as we will not test any task.
> > +	 */
> > +	if (env->src_rq->nr_running <= 1) {
> > +		env->flags &= ~LBF_ALL_PINNED;
> > +		return 0;
> > +	}
> > +
> >  	if (env->imbalance <= 0)
> >  		return 0;
> >
> >
>
* Re: [PATCH v2] sched/fair: reduce long-tail newly idle balance cost
2021-03-23 13:44 ` Vincent Guittot
@ 2021-03-23 14:49 ` Peter Zijlstra
0 siblings, 0 replies; 5+ messages in thread
From: Peter Zijlstra @ 2021-03-23 14:49 UTC (permalink / raw)
To: Vincent Guittot
Cc: Li, Aubrey, Aubrey Li, Ingo Molnar, Juri Lelli, Dietmar Eggemann,
Steven Rostedt, Ben Segall, Mel Gorman,
Daniel Bristot de Oliveira, linux-kernel, Andi Kleen, Tim Chen,
Srinivas Pandruvada, Rafael J . Wysocki
On Tue, Mar 23, 2021 at 02:44:57PM +0100, Vincent Guittot wrote:
> Hi Aubrey,
>
> On Tue, 16 Mar 2021 at 05:27, Li, Aubrey <aubrey.li@linux.intel.com> wrote:
> >
> > On 2021/2/24 16:15, Aubrey Li wrote:
> > > A long-tail load balance cost is observed on the newly idle path.
> > > It is caused by a race window between the first nr_running check
> > > of the busiest runqueue and its nr_running recheck in detach_tasks.
> > >
> > > Before the busiest runqueue is locked, the tasks on the busiest
> > > runqueue could be pulled by other CPUs, and nr_running of the busiest
> > > runqueue becomes 1, or even 0 if the running task becomes idle. This
> > > causes detach_tasks to break with the LBF_ALL_PINNED flag set, and
> > > triggers a load_balance redo at the same sched_domain level.
> > >
> > > In order to find the new busiest sched_group and CPU, load balance will
> > > recompute and update the various load statistics, which eventually leads
> > > to the long-tail load balance cost.
> > >
> > > This patch clears LBF_ALL_PINNED flag for this race condition, and hence
> > > reduces the long-tail cost of newly idle balance.
> >
> > Ping...
>
> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Thanks!
* [tip: sched/core] sched/fair: Reduce long-tail newly idle balance cost
2021-02-24 8:15 [PATCH v2] sched/fair: reduce long-tail newly idle balance cost Aubrey Li
2021-03-16 4:27 ` Li, Aubrey
@ 2021-03-23 15:08 ` tip-bot2 for Aubrey Li
1 sibling, 0 replies; 5+ messages in thread
From: tip-bot2 for Aubrey Li @ 2021-03-23 15:08 UTC (permalink / raw)
To: linux-tip-commits
Cc: Aubrey Li, Peter Zijlstra (Intel), Vincent Guittot, x86,
linux-kernel
The following commit has been merged into the sched/core branch of tip:
Commit-ID: acb4decc1e900468d51b33c5f1ee445278e716a7
Gitweb: https://git.kernel.org/tip/acb4decc1e900468d51b33c5f1ee445278e716a7
Author: Aubrey Li <aubrey.li@intel.com>
AuthorDate: Wed, 24 Feb 2021 16:15:49 +08:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 23 Mar 2021 16:01:59 +01:00
sched/fair: Reduce long-tail newly idle balance cost
A long-tail load balance cost is observed on the newly idle path.
It is caused by a race window between the first nr_running check
of the busiest runqueue and its nr_running recheck in detach_tasks.
Before the busiest runqueue is locked, the tasks on the busiest
runqueue could be pulled by other CPUs, and nr_running of the busiest
runqueue becomes 1, or even 0 if the running task becomes idle. This
causes detach_tasks to break with the LBF_ALL_PINNED flag set, and
triggers a load_balance redo at the same sched_domain level.
In order to find the new busiest sched_group and CPU, load balance will
recompute and update the various load statistics, which eventually leads
to the long-tail load balance cost.
This patch clears LBF_ALL_PINNED flag for this race condition, and hence
reduces the long-tail cost of newly idle balance.
Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/1614154549-116078-1-git-send-email-aubrey.li@intel.com
---
kernel/sched/fair.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index aaa0dfa..6d73bdb 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7687,6 +7687,15 @@ static int detach_tasks(struct lb_env *env)
 
 	lockdep_assert_held(&env->src_rq->lock);
 
+	/*
+	 * Source run queue has been emptied by another CPU, clear
+	 * LBF_ALL_PINNED flag as we will not test any task.
+	 */
+	if (env->src_rq->nr_running <= 1) {
+		env->flags &= ~LBF_ALL_PINNED;
+		return 0;
+	}
+
 	if (env->imbalance <= 0)
 		return 0;
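The caller-side cost described in the changelog can be illustrated with a rough user-space model of the redo loop. This is a sketch under stated assumptions, not load_balance() itself: `cpu_model`, `NR_CPUS_MODEL` and `count_stat_passes()` are hypothetical, each loop iteration stands for one full recomputation of the load statistics, the unlocked snapshot is kept fixed across redos, and a CPU whose queue really has more than one task is assumed to yield a migratable task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS_MODEL  4
#define LBF_ALL_PINNED 0x01 /* kernel flag name; the value is assumed here */

/* Hypothetical per-CPU view used only by this model. */
struct cpu_model {
	unsigned int seen_nr_running; /* value read before taking the lock */
	unsigned int real_nr_running; /* value found once the lock is held */
};

/* Returns how many statistics passes one balance attempt needs. */
static int count_stat_passes(const struct cpu_model *cpus, bool patched)
{
	bool allowed[NR_CPUS_MODEL];
	int passes = 0;

	assert(cpus != NULL);
	for (int i = 0; i < NR_CPUS_MODEL; i++)
		allowed[i] = true;

	for (;;) {
		int busiest = -1;
		unsigned int flags = 0;

		passes++; /* one recomputation of the load statistics */

		/* Unlocked scan: pick the busiest CPU still in the mask. */
		for (int i = 0; i < NR_CPUS_MODEL; i++) {
			if (!allowed[i] || cpus[i].seen_nr_running <= 1)
				continue;
			if (busiest < 0 || cpus[i].seen_nr_running >
					   cpus[busiest].seen_nr_running)
				busiest = i;
		}
		if (busiest < 0)
			return passes; /* nothing left to pull from */

		flags |= LBF_ALL_PINNED;

		/* Lock taken: the queue may have been drained meanwhile. */
		if (cpus[busiest].real_nr_running > 1)
			return passes; /* a task is pulled, balance is done */

		if (patched)
			flags &= ~LBF_ALL_PINNED; /* the new early exit */

		if (!(flags & LBF_ALL_PINNED))
			return passes; /* no redo at this domain level */

		allowed[busiest] = false; /* drop this CPU and redo */
	}
}
```

With a snapshot of `{3, 2, 1, 0}` runnable tasks and every queue drained to at most one task by the time it is locked, the model needs three passes without the early exit and one pass with it.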