linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue
@ 2025-03-03 10:52 Xuewen Yan
  2025-03-03 10:52 ` [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE Xuewen Yan
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Xuewen Yan @ 2025-03-03 10:52 UTC (permalink / raw)
  To: vincent.guittot, peterz, mingo, juri.lelli, dietmar.eggemann,
	rostedt, bsegall, mgorman, vschneid
  Cc: linux-kernel, ke.wang, di.shen, xuewen.yan94

The delayed-dequeue feature keeps a sleeping sched_entity enqueued until its
lag has elapsed. As a result, it also stays visible in rq->nr_running.
However, some users of nr_running should not count
sched-delayed tasks.
This series fixes those cases by adding a helper function that returns the
number of sched-delayed tasks. Where the number of really runnable
tasks is needed, we subtract the number of delayed tasks.
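
The accounting described above can be sketched in user space. This is only a
simplified model, not kernel code: struct toy_rq and every toy_* function are
hypothetical names made up for illustration, and the assumption is that a
delayed dequeue lowers the runnable count while leaving the queued counts
untouched.

```c
#include <stdbool.h>

/* Toy stand-in for the runqueue counters (hypothetical, not the kernel struct). */
struct toy_rq {
	unsigned int nr_running;    /* all queued tasks, sched-delayed ones included */
	unsigned int h_nr_queued;   /* fair tasks queued, sched-delayed ones included */
	unsigned int h_nr_runnable; /* fair tasks that can actually run */
};

/* Same arithmetic as the helper proposed in this series. */
static inline unsigned int toy_h_nr_delayed(const struct toy_rq *rq)
{
	return rq->h_nr_queued - rq->h_nr_runnable;
}

/* nr_running with the sched-delayed tasks subtracted. */
static inline unsigned int toy_nr_really_runnable(const struct toy_rq *rq)
{
	return rq->nr_running - toy_h_nr_delayed(rq);
}

/* A fair task becomes runnable. */
static inline void toy_enqueue(struct toy_rq *rq)
{
	rq->nr_running++;
	rq->h_nr_queued++;
	rq->h_nr_runnable++;
}

/* A fair task goes to sleep; with a delayed dequeue it stays queued. */
static inline void toy_sleep(struct toy_rq *rq, bool delayed)
{
	rq->h_nr_runnable--;
	if (!delayed) {
		rq->h_nr_queued--;
		rq->nr_running--;
	}
}
```

With two tasks enqueued and one of them put to sleep with a delayed dequeue,
nr_running in this model stays at 2 while toy_nr_really_runnable() reports 1.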

Changes since v1:
- add cover letter
- add helper function
- add more fixes

Xuewen Yan (3):
  sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE
  sched/fair: Do not consider the sched-delayed task when yield
  sched: Do not consider the delayed task when cpu is about to enter
    idle

 kernel/sched/core.c  |  2 +-
 kernel/sched/fair.c  | 10 +++++++---
 kernel/sched/sched.h |  5 +++++
 3 files changed, 13 insertions(+), 4 deletions(-)

-- 
2.25.1


* [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE
  2025-03-03 10:52 [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue Xuewen Yan
@ 2025-03-03 10:52 ` Xuewen Yan
  2025-03-19  9:05   ` Tianchen Ding
                     ` (2 more replies)
  2025-03-03 10:52 ` [RFC PATCH V2 2/3] sched/fair: Do not consider the sched-delayed task when yield Xuewen Yan
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 12+ messages in thread
From: Xuewen Yan @ 2025-03-03 10:52 UTC (permalink / raw)
  To: vincent.guittot, peterz, mingo, juri.lelli, dietmar.eggemann,
	rostedt, bsegall, mgorman, vschneid
  Cc: linux-kernel, ke.wang, di.shen, xuewen.yan94

The delayed-dequeue feature keeps a sleeping task enqueued until its
lag has elapsed. As a result, it also stays visible in rq->nr_running.
So in wake_affine_idle(), we should use the number of really running tasks
on the rq to decide whether to place the woken task on the
current CPU.
To that end, add a helper function that returns the number of delayed tasks.

Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
V2:
- add helper function (Vincent)
---
 kernel/sched/fair.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1c0ef435a7aa..a354f29c4f6f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7181,6 +7181,11 @@ static bool dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	return true;
 }
 
+static inline unsigned int cfs_h_nr_delayed(struct rq *rq)
+{
+	return (rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable);
+}
+
 #ifdef CONFIG_SMP
 
 /* Working cpumask for: sched_balance_rq(), sched_balance_newidle(). */
@@ -7342,8 +7347,12 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
 	if (available_idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
 		return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;
 
-	if (sync && cpu_rq(this_cpu)->nr_running == 1)
-		return this_cpu;
+	if (sync) {
+		struct rq *rq = cpu_rq(this_cpu);
+
+		if ((rq->nr_running - cfs_h_nr_delayed(rq)) == 1)
+			return this_cpu;
+	}
 
 	if (available_idle_cpu(prev_cpu))
 		return prev_cpu;
-- 
2.25.1


* [RFC PATCH V2 2/3] sched/fair: Do not consider the sched-delayed task when yield
  2025-03-03 10:52 [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue Xuewen Yan
  2025-03-03 10:52 ` [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE Xuewen Yan
@ 2025-03-03 10:52 ` Xuewen Yan
  2025-03-03 10:52 ` [RFC PATCH V2 3/3] sched: Do not consider the delayed task when cpu is about to enter idle Xuewen Yan
  2025-03-03 12:00 ` [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue Peter Zijlstra
  3 siblings, 0 replies; 12+ messages in thread
From: Xuewen Yan @ 2025-03-03 10:52 UTC (permalink / raw)
  To: vincent.guittot, peterz, mingo, juri.lelli, dietmar.eggemann,
	rostedt, bsegall, mgorman, vschneid
  Cc: linux-kernel, ke.wang, di.shen, xuewen.yan94

When a task calls sched_yield() and it is the only task on the rq,
there is no need to yield. However, rq->nr_running now includes
sched-delayed tasks, which are not actually runnable.
So subtract the sched-delayed tasks when checking nr_running.
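
As a rough illustration of the check described above, the following
user-space sketch compares the old and the proposed early-return condition.
struct yield_rq and the yield_* functions are hypothetical names used only
for this example; the real check lives in yield_task_fair().

```c
#include <stdbool.h>

/* Hypothetical model: only the two counts that matter for the check. */
struct yield_rq {
	unsigned int nr_running; /* queued tasks, sched-delayed ones included */
	unsigned int nr_delayed; /* sched-delayed tasks among them */
};

/* Old condition: yield is skipped only if exactly one task is queued. */
static inline bool yield_is_noop_old(const struct yield_rq *rq)
{
	return rq->nr_running == 1;
}

/* Proposed condition: ignore sched-delayed tasks when counting. */
static inline bool yield_is_noop_new(const struct yield_rq *rq)
{
	return (rq->nr_running - rq->nr_delayed) == 1;
}
```

With the yielding task plus one sched-delayed task queued (nr_running = 2,
nr_delayed = 1), only the proposed condition treats the yield as a no-op.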

Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a354f29c4f6f..8797f6872155 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8992,7 +8992,7 @@ static void yield_task_fair(struct rq *rq)
 	/*
 	 * Are we the only task in the tree?
 	 */
-	if (unlikely(rq->nr_running == 1))
+	if (unlikely((rq->nr_running - cfs_h_nr_delayed(rq)) == 1))
 		return;
 
 	clear_buddies(cfs_rq, se);
-- 
2.25.1


* [RFC PATCH V2 3/3] sched: Do not consider the delayed task when cpu is about to enter idle
  2025-03-03 10:52 [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue Xuewen Yan
  2025-03-03 10:52 ` [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE Xuewen Yan
  2025-03-03 10:52 ` [RFC PATCH V2 2/3] sched/fair: Do not consider the sched-delayed task when yield Xuewen Yan
@ 2025-03-03 10:52 ` Xuewen Yan
  2025-03-03 12:00 ` [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue Peter Zijlstra
  3 siblings, 0 replies; 12+ messages in thread
From: Xuewen Yan @ 2025-03-03 10:52 UTC (permalink / raw)
  To: vincent.guittot, peterz, mingo, juri.lelli, dietmar.eggemann,
	rostedt, bsegall, mgorman, vschneid
  Cc: linux-kernel, ke.wang, di.shen, xuewen.yan94

When one task is sched-delayed and another task is descheduling,
using nr_running to determine whether the CPU is about to go idle
may be incorrect.
For example:
task-A is sched_delayed, task-B is descheduling:
1. before schedule():
   rq->nr_running=2, task-A->on_rq=1, task-B->on_rq=1
2. after block_task(B):
   rq->nr_running=1, task-A->on_rq=1, task-B->on_rq=0
3. after pick_next_task(), because task-A is dequeued when it is picked:
   rq->nr_running=0, task-A->on_rq=0, task-B->on_rq=0

ttwu_queue_cond() expects nr_running to be 0 after step 2,
but it is still 1 because task-A is counted.
So subtract the number of delayed tasks when checking rq->nr_running.
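
The three steps above can be replayed with a small user-space model. This is
a sketch under simplifying assumptions: struct idle_rq and the idle_*
functions are hypothetical, and idle_pick_next_task() only models the case
where nothing but sched-delayed tasks is left on the runqueue.

```c
#include <stdbool.h>

/* Hypothetical model of the counts used in the walk-through. */
struct idle_rq {
	unsigned int nr_running; /* queued tasks, sched-delayed ones included */
	unsigned int nr_delayed; /* sched-delayed tasks among them */
};

/* Step 2: the descheduling task is blocked and leaves the runqueue. */
static inline void idle_block_task(struct idle_rq *rq)
{
	rq->nr_running--;
}

/* Step 3: picking finds only delayed tasks and dequeues them. */
static inline void idle_pick_next_task(struct idle_rq *rq)
{
	rq->nr_running -= rq->nr_delayed;
	rq->nr_delayed = 0;
}

/* Old condition: the CPU counts as going idle only if nothing is queued. */
static inline bool idle_cond_old(const struct idle_rq *rq)
{
	return !rq->nr_running;
}

/* Proposed condition: sched-delayed tasks are not counted. */
static inline bool idle_cond_new(const struct idle_rq *rq)
{
	return !(rq->nr_running - rq->nr_delayed);
}
```

Starting from nr_running = 2 with one delayed task, the old condition is
false after step 2 and only becomes true after step 3, while the proposed
condition is already true after step 2.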

Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
 kernel/sched/core.c  | 2 +-
 kernel/sched/fair.c  | 5 -----
 kernel/sched/sched.h | 5 +++++
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 67189907214d..6569f220c2fb 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3963,7 +3963,7 @@ static inline bool ttwu_queue_cond(struct task_struct *p, int cpu)
 	 * p->on_cpu can be whatever, we've done the dequeue, so
 	 * the wakee has been accounted out of ->nr_running.
 	 */
-	if (!cpu_rq(cpu)->nr_running)
+	if (!(cpu_rq(cpu)->nr_running - cfs_h_nr_delayed(cpu_rq(cpu))))
 		return true;
 
 	return false;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8797f6872155..29ee1ce17036 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7181,11 +7181,6 @@ static bool dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	return true;
 }
 
-static inline unsigned int cfs_h_nr_delayed(struct rq *rq)
-{
-	return (rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable);
-}
-
 #ifdef CONFIG_SMP
 
 /* Working cpumask for: sched_balance_rq(), sched_balance_newidle(). */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index c8512a9fb022..3996b0c5c332 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3609,6 +3609,11 @@ static inline bool is_per_cpu_kthread(struct task_struct *p)
 }
 #endif
 
+static inline unsigned int cfs_h_nr_delayed(struct rq *rq)
+{
+	return (rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable);
+}
+
 extern void swake_up_all_locked(struct swait_queue_head *q);
 extern void __prepare_to_swait(struct swait_queue_head *q, struct swait_queue *wait);
 
-- 
2.25.1


* Re: [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue
  2025-03-03 10:52 [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue Xuewen Yan
                   ` (2 preceding siblings ...)
  2025-03-03 10:52 ` [RFC PATCH V2 3/3] sched: Do not consider the delayed task when cpu is about to enter idle Xuewen Yan
@ 2025-03-03 12:00 ` Peter Zijlstra
  2025-03-04  1:56   ` Xuewen Yan
  3 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2025-03-03 12:00 UTC (permalink / raw)
  To: Xuewen Yan
  Cc: vincent.guittot, mingo, juri.lelli, dietmar.eggemann, rostedt,
	bsegall, mgorman, vschneid, linux-kernel, ke.wang, di.shen,
	xuewen.yan94

On Mon, Mar 03, 2025 at 06:52:38PM +0800, Xuewen Yan wrote:
> The delayed-dequeue feature keeps a sleeping sched_entity enqueued until its
> lag has elapsed. As a result, it also stays visible in rq->nr_running.
> However, some users of nr_running should not count
> sched-delayed tasks.
> This series fixes those cases by adding a helper function that returns the
> number of sched-delayed tasks. Where the number of really runnable
> tasks is needed, we subtract the number of delayed tasks.
> 

Is there an actual performance improvement? Because when a runqueue
loses competition, delayed tasks very quickly dissipate.

* Re: [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue
  2025-03-03 12:00 ` [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue Peter Zijlstra
@ 2025-03-04  1:56   ` Xuewen Yan
  2025-03-05  8:17     ` Vincent Guittot
  0 siblings, 1 reply; 12+ messages in thread
From: Xuewen Yan @ 2025-03-04  1:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Xuewen Yan, vincent.guittot, mingo, juri.lelli, dietmar.eggemann,
	rostedt, bsegall, mgorman, vschneid, linux-kernel, ke.wang,
	di.shen

Hi Peter

On Mon, Mar 3, 2025 at 8:00 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Mar 03, 2025 at 06:52:38PM +0800, Xuewen Yan wrote:
> > The delayed-dequeue feature keeps a sleeping sched_entity enqueued until its
> > lag has elapsed. As a result, it also stays visible in rq->nr_running.
> > However, some users of nr_running should not count
> > sched-delayed tasks.
> > This series fixes those cases by adding a helper function that returns the
> > number of sched-delayed tasks. Where the number of really runnable
> > tasks is needed, we subtract the number of delayed tasks.
> >
>
> Is there an actual performance improvement? Because when a runqueue
> loses competition, delayed tasks very quickly dissipate.

At the moment, I don't have very detailed test data. I've been
studying delay-dequeue carefully recently, and these are the issues I
feel might need modification as I go through the code.

Thanks!

BR

* Re: [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue
  2025-03-04  1:56   ` Xuewen Yan
@ 2025-03-05  8:17     ` Vincent Guittot
  2025-03-06  7:33       ` Xuewen Yan
  0 siblings, 1 reply; 12+ messages in thread
From: Vincent Guittot @ 2025-03-05  8:17 UTC (permalink / raw)
  To: Xuewen Yan
  Cc: Peter Zijlstra, Xuewen Yan, mingo, juri.lelli, dietmar.eggemann,
	rostedt, bsegall, mgorman, vschneid, linux-kernel, ke.wang,
	di.shen

On Tue, 4 Mar 2025 at 02:56, Xuewen Yan <xuewen.yan94@gmail.com> wrote:
>
> Hi Peter
>
> On Mon, Mar 3, 2025 at 8:00 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Mon, Mar 03, 2025 at 06:52:38PM +0800, Xuewen Yan wrote:
> > > The delayed-dequeue feature keeps a sleeping sched_entity enqueued until its
> > > lag has elapsed. As a result, it also stays visible in rq->nr_running.
> > > However, some users of nr_running should not count
> > > sched-delayed tasks.
> > > This series fixes those cases by adding a helper function that returns the
> > > number of sched-delayed tasks. Where the number of really runnable
> > > tasks is needed, we subtract the number of delayed tasks.
> > >
> >
> > Is there an actual performance improvement? Because when a runqueue
> > loses competition, delayed tasks very quickly dissipate.
>
> At the moment, I don't have very detailed test data. I've been
> studying delay-dequeue carefully recently, and these are the issues I
> feel might need modification as I go through the code.

Patch 1 makes sense to me, but I'm less convinced by patches 2 and 3. As
Peter also mentioned, the state where cpu_rq(cpu)->nr_running ==
cfs_h_nr_delayed(cpu_rq(cpu)) is really transient, as the delayed tasks will
be picked as soon as the last runnable task is dequeued.

>
> Thanks!
>
> BR

* Re: [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue
  2025-03-05  8:17     ` Vincent Guittot
@ 2025-03-06  7:33       ` Xuewen Yan
  0 siblings, 0 replies; 12+ messages in thread
From: Xuewen Yan @ 2025-03-06  7:33 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Peter Zijlstra, Xuewen Yan, mingo, juri.lelli, dietmar.eggemann,
	rostedt, bsegall, mgorman, vschneid, linux-kernel, ke.wang,
	di.shen

Hi Vincent,

On Wed, Mar 5, 2025 at 4:17 PM Vincent Guittot
<vincent.guittot@linaro.org> wrote:
>
> On Tue, 4 Mar 2025 at 02:56, Xuewen Yan <xuewen.yan94@gmail.com> wrote:
> >
> > Hi Peter
> >
> > On Mon, Mar 3, 2025 at 8:00 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Mon, Mar 03, 2025 at 06:52:38PM +0800, Xuewen Yan wrote:
> > > > The delayed-dequeue feature keeps a sleeping sched_entity enqueued until its
> > > > lag has elapsed. As a result, it also stays visible in rq->nr_running.
> > > > However, some users of nr_running should not count
> > > > sched-delayed tasks.
> > > > This series fixes those cases by adding a helper function that returns the
> > > > number of sched-delayed tasks. Where the number of really runnable
> > > > tasks is needed, we subtract the number of delayed tasks.
> > > >
> > >
> > > Is there an actual performance improvement? Because when a runqueue
> > > loses competition, delayed tasks very quickly dissipate.
> >
> > At the moment, I don't have very detailed test data. I've been
> > studying delay-dequeue carefully recently, and these are the issues I
> > feel might need modification as I go through the code.
>
> Patch 1 makes sense to me, but I'm less convinced by patches 2 and 3. As
> Peter also mentioned, the state where cpu_rq(cpu)->nr_running ==
> cfs_h_nr_delayed(cpu_rq(cpu)) is really transient, as the delayed tasks will
> be picked as soon as the last runnable task is dequeued.
>

Thanks for the comments. Based on your and Peter's explanations, it
seems that patch 2 and patch 3 might not have any significant impact at
the moment.
I will also test patch 2 and patch 3 later.
Thank you again!

BR

> >
> > Thanks!
> >
> > BR

* Re: [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE
  2025-03-03 10:52 ` [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE Xuewen Yan
@ 2025-03-19  9:05   ` Tianchen Ding
  2025-03-19  9:34   ` Vincent Guittot
  2025-05-21 12:06   ` [tip: sched/core] sched/fair: Fixup wake_up_sync() vs DELAYED_DEQUEUE tip-bot2 for Xuewen Yan
  2 siblings, 0 replies; 12+ messages in thread
From: Tianchen Ding @ 2025-03-19  9:05 UTC (permalink / raw)
  To: Xuewen Yan
  Cc: vincent.guittot, peterz, mingo, juri.lelli, dietmar.eggemann,
	rostedt, bsegall, mgorman, vschneid, linux-kernel, ke.wang,
	di.shen, xuewen.yan94

Hi Xuewen,

On 3/3/25 6:52 PM, Xuewen Yan wrote:
> The delayed-dequeue feature keeps a sleeping task enqueued until its
> lag has elapsed. As a result, it also stays visible in rq->nr_running.
> So in wake_affine_idle(), we should use the number of really running tasks
> on the rq to decide whether to place the woken task on the
> current CPU.
> To that end, add a helper function that returns the number of delayed tasks.
> 
> Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>

We noticed that your patch fixes a regression in lmbench lat_ctx that was
introduced by DELAY_DEQUEUE.

Here's the performance data from running
`./lat_ctx -P $(nproc) 96`
on an Intel SPR server with 192 CPUs (smaller is better):

DELAY_DEQUEUE                 9.71
NO_DELAY_DEQUEUE              4.02
DELAY_DEQUEUE + this_patch    3.86

Also on an aarch64 server with 128 CPUs:

DELAY_DEQUEUE                 14.82
NO_DELAY_DEQUEUE               5.62
DELAY_DEQUEUE + this_patch     4.66


We found the lmbench lat_ctx regression when enabling DELAY_DEQUEUE, with 
cpu-migrations increasing more than 100 times, higher nr_wakeups_migrate, 
nr_wakeups_remote, nr_wakeups_affine, nr_wakeups_affine_attempts and lower 
nr_wakeups_local.

We think this benchmark prefers the waker and wakee staying on the same CPU,
but WA_IDLE failed to achieve this because of sched_delayed noise. So your
patch does fix it.
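
To make that reasoning concrete, here is a simplified user-space sketch of
only the sync branch of the wake-affine decision, before and after the patch.
struct wa_rq and the wa_* functions are hypothetical names for illustration;
the scenario assumed is a waker running on this_cpu while a previously
sleeping task is still queued there as sched-delayed.

```c
#include <stdbool.h>

/* Hypothetical model of the waker's runqueue. */
struct wa_rq {
	unsigned int nr_running; /* queued tasks, sched-delayed ones included */
	unsigned int nr_delayed; /* sched-delayed tasks among them */
};

/* Before the patch: a delayed task makes the waker's CPU look busy. */
static inline bool wa_sync_stays_local_old(const struct wa_rq *this_rq, bool sync)
{
	return sync && this_rq->nr_running == 1;
}

/* After the patch: delayed tasks are subtracted before the comparison. */
static inline bool wa_sync_stays_local_new(const struct wa_rq *this_rq, bool sync)
{
	return sync && (this_rq->nr_running - this_rq->nr_delayed) == 1;
}
```

In this model, a sync wakeup with nr_running = 2 and nr_delayed = 1 keeps the
wakee on the waker's CPU only with the patched check, which would be
consistent with the drop in migrations reported above.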

Feel free to add
Reviewed-and-tested-by: Tianchen Ding <dtcccc@linux.alibaba.com>

Thanks.

* Re: [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE
  2025-03-03 10:52 ` [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE Xuewen Yan
  2025-03-19  9:05   ` Tianchen Ding
@ 2025-03-19  9:34   ` Vincent Guittot
  2025-05-20  6:00     ` Xuewen Yan
  2025-05-21 12:06   ` [tip: sched/core] sched/fair: Fixup wake_up_sync() vs DELAYED_DEQUEUE tip-bot2 for Xuewen Yan
  2 siblings, 1 reply; 12+ messages in thread
From: Vincent Guittot @ 2025-03-19  9:34 UTC (permalink / raw)
  To: Xuewen Yan
  Cc: peterz, mingo, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, vschneid, linux-kernel, ke.wang, di.shen, xuewen.yan94

On Mon, 3 Mar 2025 at 11:56, Xuewen Yan <xuewen.yan@unisoc.com> wrote:
>
> The delayed-dequeue feature keeps a sleeping task enqueued until its
> lag has elapsed. As a result, it also stays visible in rq->nr_running.
> So in wake_affine_idle(), we should use the number of really running tasks
> on the rq to decide whether to place the woken task on the
> current CPU.
> To that end, add a helper function that returns the number of delayed tasks.
>
> Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>

> ---
> V2:
> - add helper function (Vincent)
> ---
>  kernel/sched/fair.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 1c0ef435a7aa..a354f29c4f6f 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7181,6 +7181,11 @@ static bool dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
>         return true;
>  }
>
> +static inline unsigned int cfs_h_nr_delayed(struct rq *rq)
> +{
> +       return (rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable);
> +}
> +
>  #ifdef CONFIG_SMP
>
>  /* Working cpumask for: sched_balance_rq(), sched_balance_newidle(). */
> @@ -7342,8 +7347,12 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
>         if (available_idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
>                 return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;
>
> -       if (sync && cpu_rq(this_cpu)->nr_running == 1)
> -               return this_cpu;
> +       if (sync) {
> +               struct rq *rq = cpu_rq(this_cpu);
> +
> +               if ((rq->nr_running - cfs_h_nr_delayed(rq)) == 1)
> +                       return this_cpu;
> +       }
>
>         if (available_idle_cpu(prev_cpu))
>                 return prev_cpu;
> --
> 2.25.1
>

* Re: [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE
  2025-03-19  9:34   ` Vincent Guittot
@ 2025-05-20  6:00     ` Xuewen Yan
  0 siblings, 0 replies; 12+ messages in thread
From: Xuewen Yan @ 2025-05-20  6:00 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: Xuewen Yan, peterz, mingo, juri.lelli, dietmar.eggemann, rostedt,
	bsegall, mgorman, vschneid, linux-kernel, ke.wang, di.shen

Hi Vincent,

Sorry to ask, but may I know if this patch can be merged into the mainline?

Thanks!

On Wed, Mar 19, 2025 at 5:35 PM Vincent Guittot
<vincent.guittot@linaro.org> wrote:
>
> On Mon, 3 Mar 2025 at 11:56, Xuewen Yan <xuewen.yan@unisoc.com> wrote:
> >
> > The delayed-dequeue feature keeps a sleeping task enqueued until its
> > lag has elapsed. As a result, it also stays visible in rq->nr_running.
> > So in wake_affine_idle(), we should use the number of really running tasks
> > on the rq to decide whether to place the woken task on the
> > current CPU.
> > To that end, add a helper function that returns the number of delayed tasks.
> >
> > Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
> > Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
>
> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
>
> > ---
> > V2:
> > - add helper function (Vincent)
> > ---
> >  kernel/sched/fair.c | 13 +++++++++++--
> >  1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 1c0ef435a7aa..a354f29c4f6f 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7181,6 +7181,11 @@ static bool dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> >         return true;
> >  }
> >
> > +static inline unsigned int cfs_h_nr_delayed(struct rq *rq)
> > +{
> > +       return (rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable);
> > +}
> > +
> >  #ifdef CONFIG_SMP
> >
> >  /* Working cpumask for: sched_balance_rq(), sched_balance_newidle(). */
> > @@ -7342,8 +7347,12 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
> >         if (available_idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
> >                 return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;
> >
> > -       if (sync && cpu_rq(this_cpu)->nr_running == 1)
> > -               return this_cpu;
> > +       if (sync) {
> > +               struct rq *rq = cpu_rq(this_cpu);
> > +
> > +               if ((rq->nr_running - cfs_h_nr_delayed(rq)) == 1)
> > +                       return this_cpu;
> > +       }
> >
> >         if (available_idle_cpu(prev_cpu))
> >                 return prev_cpu;
> > --
> > 2.25.1
> >

* [tip: sched/core] sched/fair: Fixup wake_up_sync() vs DELAYED_DEQUEUE
  2025-03-03 10:52 ` [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE Xuewen Yan
  2025-03-19  9:05   ` Tianchen Ding
  2025-03-19  9:34   ` Vincent Guittot
@ 2025-05-21 12:06   ` tip-bot2 for Xuewen Yan
  2 siblings, 0 replies; 12+ messages in thread
From: tip-bot2 for Xuewen Yan @ 2025-05-21 12:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Xuewen Yan, Peter Zijlstra (Intel), Vincent Guittot, x86,
	linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     aa3ee4f0b7541382c9f6f43f7408d73a5d4f4042
Gitweb:        https://git.kernel.org/tip/aa3ee4f0b7541382c9f6f43f7408d73a5d4f4042
Author:        Xuewen Yan <xuewen.yan@unisoc.com>
AuthorDate:    Mon, 03 Mar 2025 18:52:39 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Wed, 21 May 2025 13:57:37 +02:00

sched/fair: Fixup wake_up_sync() vs DELAYED_DEQUEUE

The delayed-dequeue feature keeps a sleeping task enqueued until its
lag has elapsed. As a result, it also stays visible in rq->nr_running.
So in wake_affine_idle(), we should use the number of really running tasks
on the rq to decide whether to place the woken task on the
current CPU.
To that end, add a helper function that returns the number of delayed tasks.

Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
Reviewed-and-tested-by: Tianchen Ding <dtcccc@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20250303105241.17251-2-xuewen.yan@unisoc.com
---
 kernel/sched/fair.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index eb5a257..b00f167 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7193,6 +7193,11 @@ static bool dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	return true;
 }
 
+static inline unsigned int cfs_h_nr_delayed(struct rq *rq)
+{
+	return (rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable);
+}
+
 #ifdef CONFIG_SMP
 
 /* Working cpumask for: sched_balance_rq(), sched_balance_newidle(). */
@@ -7354,8 +7359,12 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
 	if (available_idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
 		return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;
 
-	if (sync && cpu_rq(this_cpu)->nr_running == 1)
-		return this_cpu;
+	if (sync) {
+		struct rq *rq = cpu_rq(this_cpu);
+
+		if ((rq->nr_running - cfs_h_nr_delayed(rq)) == 1)
+			return this_cpu;
+	}
 
 	if (available_idle_cpu(prev_cpu))
 		return prev_cpu;
