* [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue
@ 2025-03-03 10:52 Xuewen Yan
2025-03-03 10:52 ` [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE Xuewen Yan
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Xuewen Yan @ 2025-03-03 10:52 UTC (permalink / raw)
To: vincent.guittot, peterz, mingo, juri.lelli, dietmar.eggemann,
rostedt, bsegall, mgorman, vschneid
Cc: linux-kernel, ke.wang, di.shen, xuewen.yan94
The delayed-dequeue feature keeps a sleeping sched_entity enqueued until its
lag has elapsed. As a result, it also stays visible in rq->nr_running.
However, some users of nr_running should not count sched-delayed tasks.
This series fixes those cases by adding a helper function that returns the
number of sched-delayed tasks. Where the number of really runnable tasks is
needed, we subtract the sched-delayed tasks from nr_running.
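A minimal userspace sketch of the accounting described above, assuming (as the helper in this series implies) that h_nr_queued counts sched-delayed tasks while h_nr_runnable does not. The `*_model` names are hypothetical and exist only for illustration; this is not kernel code.

```c
#include <assert.h>

/* Userspace model only; the field names mirror the ones used by the series. */
struct cfs_rq_model {
	unsigned int h_nr_queued;	/* queued CFS tasks, sched-delayed ones included */
	unsigned int h_nr_runnable;	/* queued CFS tasks that can actually run */
};

struct rq_model {
	unsigned int nr_running;	/* every task accounted on the runqueue */
	struct cfs_rq_model cfs;
};

/* Same arithmetic as the cfs_h_nr_delayed() helper added by the series. */
static inline unsigned int model_nr_delayed(const struct rq_model *rq)
{
	/* The model assumes delayed tasks are a subset of queued tasks. */
	assert(rq->cfs.h_nr_queued >= rq->cfs.h_nr_runnable);
	return rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable;
}

/* Tasks that can really run: nr_running minus the sched-delayed tasks. */
static inline unsigned int model_nr_really_running(const struct rq_model *rq)
{
	return rq->nr_running - model_nr_delayed(rq);
}
```

With three queued CFS tasks of which two are sched-delayed, the model reports one really runnable task even though nr_running is 3.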
Changes since v1:
- add cover letter
- add helper function
- add more fixes
Xuewen Yan (3):
sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE
sched/fair: Do not consider the sched-delayed task when yield
sched: Do not consider the delayed task when cpu is about to enter
idle
kernel/sched/core.c | 2 +-
kernel/sched/fair.c | 10 +++++++---
kernel/sched/sched.h | 5 +++++
3 files changed, 13 insertions(+), 4 deletions(-)
--
2.25.1
* [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE
2025-03-03 10:52 [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue Xuewen Yan
@ 2025-03-03 10:52 ` Xuewen Yan
2025-03-19 9:05 ` Tianchen Ding
` (2 more replies)
2025-03-03 10:52 ` [RFC PATCH V2 2/3] sched/fair: Do not consider the sched-delayed task when yield Xuewen Yan
` (2 subsequent siblings)
3 siblings, 3 replies; 12+ messages in thread
From: Xuewen Yan @ 2025-03-03 10:52 UTC (permalink / raw)
To: vincent.guittot, peterz, mingo, juri.lelli, dietmar.eggemann,
rostedt, bsegall, mgorman, vschneid
Cc: linux-kernel, ke.wang, di.shen, xuewen.yan94
The delayed-dequeue feature keeps a sleeping task enqueued until its
lag has elapsed. As a result, it also stays visible in rq->nr_running.
So in wake_affine_idle(), we should use the number of really running
tasks on the rq to decide whether to place the woken task on the
current CPU.
Also add a helper function that returns the number of sched-delayed tasks.
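To make the effect on the sync wakeup shortcut concrete, here is a small userspace sketch comparing the check before and after this patch. The struct and function names are hypothetical stand-ins, and nr_delayed stands for the value the new helper would return; it is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two counters the check depends on. */
struct wa_rq_model {
	unsigned int nr_running;	/* includes sched-delayed tasks */
	unsigned int nr_delayed;	/* what cfs_h_nr_delayed() would return */
};

/* Check before the patch: any sched-delayed task defeats the shortcut. */
static inline bool sync_keeps_this_cpu_old(const struct wa_rq_model *rq, bool sync)
{
	return sync && rq->nr_running == 1;
}

/* Check after the patch: only really running tasks are counted. */
static inline bool sync_keeps_this_cpu_new(const struct wa_rq_model *rq, bool sync)
{
	assert(rq->nr_running >= rq->nr_delayed);
	return sync && (rq->nr_running - rq->nr_delayed) == 1;
}
```

With the waker running and one sched-delayed task still queued (nr_running = 2, nr_delayed = 1), the old check rejects the current CPU while the new check accepts it.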
Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
V2:
- add helper function (Vincent)
---
kernel/sched/fair.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1c0ef435a7aa..a354f29c4f6f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7181,6 +7181,11 @@ static bool dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	return true;
 }

+static inline unsigned int cfs_h_nr_delayed(struct rq *rq)
+{
+	return (rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable);
+}
+
 #ifdef CONFIG_SMP

 /* Working cpumask for: sched_balance_rq(), sched_balance_newidle(). */
@@ -7342,8 +7347,12 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
 	if (available_idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
 		return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;

-	if (sync && cpu_rq(this_cpu)->nr_running == 1)
-		return this_cpu;
+	if (sync) {
+		struct rq *rq = cpu_rq(this_cpu);
+
+		if ((rq->nr_running - cfs_h_nr_delayed(rq)) == 1)
+			return this_cpu;
+	}

 	if (available_idle_cpu(prev_cpu))
 		return prev_cpu;
--
2.25.1
* [RFC PATCH V2 2/3] sched/fair: Do not consider the sched-delayed task when yield
2025-03-03 10:52 [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue Xuewen Yan
2025-03-03 10:52 ` [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE Xuewen Yan
@ 2025-03-03 10:52 ` Xuewen Yan
2025-03-03 10:52 ` [RFC PATCH V2 3/3] sched: Do not consider the delayed task when cpu is about to enter idle Xuewen Yan
2025-03-03 12:00 ` [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue Peter Zijlstra
3 siblings, 0 replies; 12+ messages in thread
From: Xuewen Yan @ 2025-03-03 10:52 UTC (permalink / raw)
To: vincent.guittot, peterz, mingo, juri.lelli, dietmar.eggemann,
rostedt, bsegall, mgorman, vschneid
Cc: linux-kernel, ke.wang, di.shen, xuewen.yan94
When a task calls sched_yield() and it is the only task on the rq,
there is no need to yield. However, rq->nr_running currently includes
sched-delayed tasks, which are not really runnable.
So subtract the sched-delayed tasks when checking nr_running.
Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
kernel/sched/fair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a354f29c4f6f..8797f6872155 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8992,7 +8992,7 @@ static void yield_task_fair(struct rq *rq)
 	/*
 	 * Are we the only task in the tree?
 	 */
-	if (unlikely(rq->nr_running == 1))
+	if (unlikely((rq->nr_running - cfs_h_nr_delayed(rq)) == 1))
 		return;

 	clear_buddies(cfs_rq, se);
--
2.25.1
* [RFC PATCH V2 3/3] sched: Do not consider the delayed task when cpu is about to enter idle
2025-03-03 10:52 [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue Xuewen Yan
2025-03-03 10:52 ` [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE Xuewen Yan
2025-03-03 10:52 ` [RFC PATCH V2 2/3] sched/fair: Do not consider the sched-delayed task when yield Xuewen Yan
@ 2025-03-03 10:52 ` Xuewen Yan
2025-03-03 12:00 ` [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue Peter Zijlstra
3 siblings, 0 replies; 12+ messages in thread
From: Xuewen Yan @ 2025-03-03 10:52 UTC (permalink / raw)
To: vincent.guittot, peterz, mingo, juri.lelli, dietmar.eggemann,
rostedt, bsegall, mgorman, vschneid
Cc: linux-kernel, ke.wang, di.shen, xuewen.yan94
When one task is sched-delayed and another task is
descheduling, using nr_running to decide whether the
CPU is about to go idle may be incorrect.
For example:
task-A is sched_delayed, task-B is descheduling:
1. before schedule():
    rq-nr-running=2, task-A->on_rq=1; task-B->on_rq=1;
2. after block_task(B):
    rq-nr-running=1, task-A->on_rq=1; task-B->on_rq=0;
3. after pick_next_task(), because task-A is dequeued there:
    rq-nr-running=0, task-A->on_rq=0; task-B->on_rq=0;
ttwu_queue_cond() expects nr_running to be 0 after
step 2, but at that point it is still 1.
So subtract the number of sched-delayed tasks when checking rq->nr_running.
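The three steps above can be replayed in a small userspace model. This is only a sketch: the struct and the model_* functions are hypothetical and reduce block_task() and the dequeue done on the pick path to their effect on the two counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the counters involved in the example. */
struct idle_rq_model {
	unsigned int nr_running;	/* includes sched-delayed tasks */
	unsigned int nr_delayed;	/* what cfs_h_nr_delayed() would return */
};

/* Check before the patch: a queued sched-delayed task hides the idle state. */
static inline bool looks_idle_old(const struct idle_rq_model *rq)
{
	return !rq->nr_running;
}

/* Check after the patch: sched-delayed tasks are subtracted first. */
static inline bool looks_idle_new(const struct idle_rq_model *rq)
{
	assert(rq->nr_running >= rq->nr_delayed);
	return !(rq->nr_running - rq->nr_delayed);
}

/* Step 2: task-B blocks and is accounted out of nr_running. */
static inline void model_block_task(struct idle_rq_model *rq)
{
	assert(rq->nr_running > rq->nr_delayed);
	rq->nr_running--;
}

/* Step 3: the pick path dequeues the sched-delayed task-A. */
static inline void model_dequeue_delayed(struct idle_rq_model *rq)
{
	assert(rq->nr_delayed > 0 && rq->nr_running > 0);
	rq->nr_delayed--;
	rq->nr_running--;
}
```

Starting from nr_running = 2 and nr_delayed = 1, the old check only reports an idle runqueue after step 3, while the new check already reports it after step 2.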
Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
kernel/sched/core.c | 2 +-
kernel/sched/fair.c | 5 -----
kernel/sched/sched.h | 5 +++++
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 67189907214d..6569f220c2fb 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3963,7 +3963,7 @@ static inline bool ttwu_queue_cond(struct task_struct *p, int cpu)
 	 * p->on_cpu can be whatever, we've done the dequeue, so
 	 * the wakee has been accounted out of ->nr_running.
 	 */
-	if (!cpu_rq(cpu)->nr_running)
+	if (!(cpu_rq(cpu)->nr_running - cfs_h_nr_delayed(cpu_rq(cpu))))
 		return true;

 	return false;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8797f6872155..29ee1ce17036 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7181,11 +7181,6 @@ static bool dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	return true;
 }

-static inline unsigned int cfs_h_nr_delayed(struct rq *rq)
-{
-	return (rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable);
-}
-
 #ifdef CONFIG_SMP

 /* Working cpumask for: sched_balance_rq(), sched_balance_newidle(). */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index c8512a9fb022..3996b0c5c332 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3609,6 +3609,11 @@ static inline bool is_per_cpu_kthread(struct task_struct *p)
 }
 #endif

+static inline unsigned int cfs_h_nr_delayed(struct rq *rq)
+{
+	return (rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable);
+}
+
 extern void swake_up_all_locked(struct swait_queue_head *q);
 extern void __prepare_to_swait(struct swait_queue_head *q, struct swait_queue *wait);
--
2.25.1
* Re: [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue
2025-03-03 10:52 [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue Xuewen Yan
` (2 preceding siblings ...)
2025-03-03 10:52 ` [RFC PATCH V2 3/3] sched: Do not consider the delayed task when cpu is about to enter idle Xuewen Yan
@ 2025-03-03 12:00 ` Peter Zijlstra
2025-03-04 1:56 ` Xuewen Yan
3 siblings, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2025-03-03 12:00 UTC (permalink / raw)
To: Xuewen Yan
Cc: vincent.guittot, mingo, juri.lelli, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, linux-kernel, ke.wang, di.shen,
xuewen.yan94
On Mon, Mar 03, 2025 at 06:52:38PM +0800, Xuewen Yan wrote:
> The delayed-dequeue feature keeps a sleeping sched_entity enqueued until its
> lag has elapsed. As a result, it also stays visible in rq->nr_running.
> However, some users of nr_running should not count sched-delayed tasks.
> This series fixes those cases by adding a helper function that returns the
> number of sched-delayed tasks. Where the number of really runnable tasks is
> needed, we subtract the sched-delayed tasks from nr_running.
>
Is there an actual performance improvement? Because when a runqueue
loses competition, delayed tasks very quickly dissipate.
* Re: [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue
2025-03-03 12:00 ` [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue Peter Zijlstra
@ 2025-03-04 1:56 ` Xuewen Yan
2025-03-05 8:17 ` Vincent Guittot
0 siblings, 1 reply; 12+ messages in thread
From: Xuewen Yan @ 2025-03-04 1:56 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Xuewen Yan, vincent.guittot, mingo, juri.lelli, dietmar.eggemann,
rostedt, bsegall, mgorman, vschneid, linux-kernel, ke.wang,
di.shen
Hi Peter
On Mon, Mar 3, 2025 at 8:00 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Mar 03, 2025 at 06:52:38PM +0800, Xuewen Yan wrote:
> > The delayed-dequeue feature keeps a sleeping sched_entity enqueued until its
> > lag has elapsed. As a result, it also stays visible in rq->nr_running.
> > However, some users of nr_running should not count sched-delayed tasks.
> > This series fixes those cases by adding a helper function that returns the
> > number of sched-delayed tasks. Where the number of really runnable tasks is
> > needed, we subtract the sched-delayed tasks from nr_running.
> >
>
> Is there an actual performance improvement? Because when a runqueue
> loses competition, delayed tasks very quickly dissipate.
At the moment, I don't have very detailed test data. I've been
studying delay-dequeue carefully recently, and these are the issues I
feel might need modification as I go through the code.
Thanks!
BR
* Re: [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue
2025-03-04 1:56 ` Xuewen Yan
@ 2025-03-05 8:17 ` Vincent Guittot
2025-03-06 7:33 ` Xuewen Yan
0 siblings, 1 reply; 12+ messages in thread
From: Vincent Guittot @ 2025-03-05 8:17 UTC (permalink / raw)
To: Xuewen Yan
Cc: Peter Zijlstra, Xuewen Yan, mingo, juri.lelli, dietmar.eggemann,
rostedt, bsegall, mgorman, vschneid, linux-kernel, ke.wang,
di.shen
On Tue, 4 Mar 2025 at 02:56, Xuewen Yan <xuewen.yan94@gmail.com> wrote:
>
> Hi Peter
>
> On Mon, Mar 3, 2025 at 8:00 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Mon, Mar 03, 2025 at 06:52:38PM +0800, Xuewen Yan wrote:
> > > The delayed-dequeue feature keeps a sleeping sched_entity enqueued until its
> > > lag has elapsed. As a result, it also stays visible in rq->nr_running.
> > > However, some users of nr_running should not count sched-delayed tasks.
> > > This series fixes those cases by adding a helper function that returns the
> > > number of sched-delayed tasks. Where the number of really runnable tasks is
> > > needed, we subtract the sched-delayed tasks from nr_running.
> > >
> >
> > Is there an actual performance improvement? Because when a runqueue
> > loses competition, delayed tasks very quickly dissipate.
>
> At the moment, I don't have very detailed test data. I've been
> studying delay-dequeue carefully recently, and these are the issues I
> feel might need modification as I go through the code.
Patch 1 makes sense to me, but I'm less convinced by patches 2 and 3. As
Peter also mentioned, the state where cpu_rq(cpu)->nr_running ==
cfs_h_nr_delayed(cpu_rq(cpu)) is really transient, as the delayed tasks
will be picked as soon as the last runnable task is dequeued.
>
> Thanks!
>
> BR
* Re: [RFC PATCH V2 0/3] sched/fair: Fix nr-running vs delayed-dequeue
2025-03-05 8:17 ` Vincent Guittot
@ 2025-03-06 7:33 ` Xuewen Yan
0 siblings, 0 replies; 12+ messages in thread
From: Xuewen Yan @ 2025-03-06 7:33 UTC (permalink / raw)
To: Vincent Guittot
Cc: Peter Zijlstra, Xuewen Yan, mingo, juri.lelli, dietmar.eggemann,
rostedt, bsegall, mgorman, vschneid, linux-kernel, ke.wang,
di.shen
Hi Vincent,
On Wed, Mar 5, 2025 at 4:17 PM Vincent Guittot
<vincent.guittot@linaro.org> wrote:
>
> On Tue, 4 Mar 2025 at 02:56, Xuewen Yan <xuewen.yan94@gmail.com> wrote:
> >
> > Hi Peter
> >
> > On Mon, Mar 3, 2025 at 8:00 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Mon, Mar 03, 2025 at 06:52:38PM +0800, Xuewen Yan wrote:
> > > > The delayed-dequeue feature keeps a sleeping sched_entity enqueued until its
> > > > lag has elapsed. As a result, it also stays visible in rq->nr_running.
> > > > However, some users of nr_running should not count sched-delayed tasks.
> > > > This series fixes those cases by adding a helper function that returns the
> > > > number of sched-delayed tasks. Where the number of really runnable tasks is
> > > > needed, we subtract the sched-delayed tasks from nr_running.
> > > >
> > >
> > > Is there an actual performance improvement? Because when a runqueue
> > > loses competition, delayed tasks very quickly dissipate.
> >
> > At the moment, I don't have very detailed test data. I've been
> > studying delay-dequeue carefully recently, and these are the issues I
> > feel might need modification as I go through the code.
>
> Patch 1 makes sense to me, but I'm less convinced by patches 2 and 3. As
> Peter also mentioned, the state where cpu_rq(cpu)->nr_running ==
> cfs_h_nr_delayed(cpu_rq(cpu)) is really transient, as the delayed tasks
> will be picked as soon as the last runnable task is dequeued.
>
Thanks for the comments. Based on your and Peter's explanation, it
seems that patch 2 and patch 3 might not have any significant impact at
the moment.
I will also test patch 2 and patch 3 later.
Thank you again!
BR
> >
> > Thanks!
> >
> > BR
* Re: [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE
2025-03-03 10:52 ` [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE Xuewen Yan
@ 2025-03-19 9:05 ` Tianchen Ding
2025-03-19 9:34 ` Vincent Guittot
2025-05-21 12:06 ` [tip: sched/core] sched/fair: Fixup wake_up_sync() vs DELAYED_DEQUEUE tip-bot2 for Xuewen Yan
2 siblings, 0 replies; 12+ messages in thread
From: Tianchen Ding @ 2025-03-19 9:05 UTC (permalink / raw)
To: Xuewen Yan
Cc: vincent.guittot, peterz, mingo, juri.lelli, dietmar.eggemann,
rostedt, bsegall, mgorman, vschneid, linux-kernel, ke.wang,
di.shen, xuewen.yan94
Hi Xuewen,
On 3/3/25 6:52 PM, Xuewen Yan wrote:
> The delayed-dequeue feature keeps a sleeping task enqueued until its
> lag has elapsed. As a result, it also stays visible in rq->nr_running.
> So in wake_affine_idle(), we should use the number of really running
> tasks on the rq to decide whether to place the woken task on the
> current CPU.
> Also add a helper function that returns the number of sched-delayed tasks.
>
> Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
We noticed that your patch can fix a regression introduced by DELAY_DEQUEUE
in lmbench lat_ctx.
Here's the performance data running
`./lat_ctx -P $(nproc) 96`
on an Intel SPR server with 192 CPUs (smaller is better):
  DELAY_DEQUEUE                 9.71
  NO_DELAY_DEQUEUE              4.02
  DELAY_DEQUEUE + this_patch    3.86
Also on an aarch64 server with 128 CPUs:
  DELAY_DEQUEUE                14.82
  NO_DELAY_DEQUEUE              5.62
  DELAY_DEQUEUE + this_patch    4.66
We found the lmbench lat_ctx regression when enabling DELAY_DEQUEUE, with
cpu-migrations increasing more than 100 times, higher nr_wakeups_migrate,
nr_wakeups_remote, nr_wakeups_affine, nr_wakeups_affine_attempts and lower
nr_wakeups_local.
We think this benchmark prefers the waker and wakee staying on the same CPU,
but WA_IDLE failed to achieve this due to sched_delay noise. So your patch
does fix it.
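As a quick worked check of the numbers reported above, the sketch below computes how much lower the patched latency is relative to plain DELAY_DEQUEUE. The struct and function names are hypothetical; the values are copied from the table in this message.

```c
#include <assert.h>
#include <stdio.h>

/* One machine's lat_ctx results from the table above; lower is better. */
struct lat_ctx_result {
	const char *machine;
	double delay_dequeue;
	double no_delay_dequeue;
	double patched;
};

static const struct lat_ctx_result lat_ctx_results[] = {
	{ "Intel SPR, 192 CPUs", 9.71, 4.02, 3.86 },
	{ "aarch64, 128 CPUs", 14.82, 5.62, 4.66 },
};

/* Ratio of two latencies; a value above 1.0 means `after` is lower. */
static inline double latency_ratio(double before, double after)
{
	assert(after > 0.0);
	return before / after;
}

/* Print the improvement of the patched kernel over both baselines. */
static inline void print_lat_ctx_ratios(void)
{
	size_t n = sizeof(lat_ctx_results) / sizeof(lat_ctx_results[0]);

	for (size_t i = 0; i < n; i++) {
		const struct lat_ctx_result *r = &lat_ctx_results[i];

		printf("%s: %.2fx vs DELAY_DEQUEUE, %.2fx vs NO_DELAY_DEQUEUE\n",
		       r->machine,
		       latency_ratio(r->delay_dequeue, r->patched),
		       latency_ratio(r->no_delay_dequeue, r->patched));
	}
}
```

On these figures the patched kernel's latency is about 2.5 times lower than plain DELAY_DEQUEUE on the Intel machine and about 3.2 times lower on the aarch64 machine, and it is slightly below NO_DELAY_DEQUEUE on both.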
Feel free to add
Reviewed-and-tested-by: Tianchen Ding <dtcccc@linux.alibaba.com>
Thanks.
* Re: [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE
2025-03-03 10:52 ` [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE Xuewen Yan
2025-03-19 9:05 ` Tianchen Ding
@ 2025-03-19 9:34 ` Vincent Guittot
2025-05-20 6:00 ` Xuewen Yan
2025-05-21 12:06 ` [tip: sched/core] sched/fair: Fixup wake_up_sync() vs DELAYED_DEQUEUE tip-bot2 for Xuewen Yan
2 siblings, 1 reply; 12+ messages in thread
From: Vincent Guittot @ 2025-03-19 9:34 UTC (permalink / raw)
To: Xuewen Yan
Cc: peterz, mingo, juri.lelli, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, linux-kernel, ke.wang, di.shen, xuewen.yan94
On Mon, 3 Mar 2025 at 11:56, Xuewen Yan <xuewen.yan@unisoc.com> wrote:
>
> The delayed-dequeue feature keeps a sleeping task enqueued until its
> lag has elapsed. As a result, it also stays visible in rq->nr_running.
> So in wake_affine_idle(), we should use the number of really running
> tasks on the rq to decide whether to place the woken task on the
> current CPU.
> Also add a helper function that returns the number of sched-delayed tasks.
>
> Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
> V2:
> - add helper function (Vincent)
> ---
> kernel/sched/fair.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 1c0ef435a7aa..a354f29c4f6f 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7181,6 +7181,11 @@ static bool dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> return true;
> }
>
> +static inline unsigned int cfs_h_nr_delayed(struct rq *rq)
> +{
> + return (rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable);
> +}
> +
> #ifdef CONFIG_SMP
>
> /* Working cpumask for: sched_balance_rq(), sched_balance_newidle(). */
> @@ -7342,8 +7347,12 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
> if (available_idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
> return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;
>
> - if (sync && cpu_rq(this_cpu)->nr_running == 1)
> - return this_cpu;
> + if (sync) {
> + struct rq *rq = cpu_rq(this_cpu);
> +
> + if ((rq->nr_running - cfs_h_nr_delayed(rq)) == 1)
> + return this_cpu;
> + }
>
> if (available_idle_cpu(prev_cpu))
> return prev_cpu;
> --
> 2.25.1
>
* Re: [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE
2025-03-19 9:34 ` Vincent Guittot
@ 2025-05-20 6:00 ` Xuewen Yan
0 siblings, 0 replies; 12+ messages in thread
From: Xuewen Yan @ 2025-05-20 6:00 UTC (permalink / raw)
To: Vincent Guittot
Cc: Xuewen Yan, peterz, mingo, juri.lelli, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, linux-kernel, ke.wang, di.shen
Hi Vincent,
Sorry to ask, but may I know if this patch can be merged into the mainline?
Thanks!
On Wed, Mar 19, 2025 at 5:35 PM Vincent Guittot
<vincent.guittot@linaro.org> wrote:
>
> On Mon, 3 Mar 2025 at 11:56, Xuewen Yan <xuewen.yan@unisoc.com> wrote:
> >
> > The delayed-dequeue feature keeps a sleeping task enqueued until its
> > lag has elapsed. As a result, it also stays visible in rq->nr_running.
> > So in wake_affine_idle(), we should use the number of really running
> > tasks on the rq to decide whether to place the woken task on the
> > current CPU.
> > Also add a helper function that returns the number of sched-delayed tasks.
> >
> > Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
> > Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
>
> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
>
> > ---
> > V2:
> > - add helper function (Vincent)
> > ---
> > kernel/sched/fair.c | 13 +++++++++++--
> > 1 file changed, 11 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 1c0ef435a7aa..a354f29c4f6f 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7181,6 +7181,11 @@ static bool dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> > return true;
> > }
> >
> > +static inline unsigned int cfs_h_nr_delayed(struct rq *rq)
> > +{
> > + return (rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable);
> > +}
> > +
> > #ifdef CONFIG_SMP
> >
> > /* Working cpumask for: sched_balance_rq(), sched_balance_newidle(). */
> > @@ -7342,8 +7347,12 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
> > if (available_idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
> > return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;
> >
> > - if (sync && cpu_rq(this_cpu)->nr_running == 1)
> > - return this_cpu;
> > + if (sync) {
> > + struct rq *rq = cpu_rq(this_cpu);
> > +
> > + if ((rq->nr_running - cfs_h_nr_delayed(rq)) == 1)
> > + return this_cpu;
> > + }
> >
> > if (available_idle_cpu(prev_cpu))
> > return prev_cpu;
> > --
> > 2.25.1
> >
* [tip: sched/core] sched/fair: Fixup wake_up_sync() vs DELAYED_DEQUEUE
2025-03-03 10:52 ` [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE Xuewen Yan
2025-03-19 9:05 ` Tianchen Ding
2025-03-19 9:34 ` Vincent Guittot
@ 2025-05-21 12:06 ` tip-bot2 for Xuewen Yan
2 siblings, 0 replies; 12+ messages in thread
From: tip-bot2 for Xuewen Yan @ 2025-05-21 12:06 UTC (permalink / raw)
To: linux-tip-commits
Cc: Xuewen Yan, Peter Zijlstra (Intel), Vincent Guittot, x86,
linux-kernel
The following commit has been merged into the sched/core branch of tip:
Commit-ID: aa3ee4f0b7541382c9f6f43f7408d73a5d4f4042
Gitweb: https://git.kernel.org/tip/aa3ee4f0b7541382c9f6f43f7408d73a5d4f4042
Author: Xuewen Yan <xuewen.yan@unisoc.com>
AuthorDate: Mon, 03 Mar 2025 18:52:39 +08:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Wed, 21 May 2025 13:57:37 +02:00
sched/fair: Fixup wake_up_sync() vs DELAYED_DEQUEUE
Delayed dequeued feature keeps a sleeping task enqueued until its
lag has elapsed. As a result, it stays also visible in rq->nr_running.
So when in wake_affine_idle(), we should use the real running-tasks
in rq to check whether we should place the wake-up task to
current cpu.
On the other hand, add a helper function to return the nr-delayed.
Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
Reviewed-and-tested-by: Tianchen Ding <dtcccc@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20250303105241.17251-2-xuewen.yan@unisoc.com
---
kernel/sched/fair.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index eb5a257..b00f167 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7193,6 +7193,11 @@ static bool dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	return true;
 }

+static inline unsigned int cfs_h_nr_delayed(struct rq *rq)
+{
+	return (rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable);
+}
+
 #ifdef CONFIG_SMP

 /* Working cpumask for: sched_balance_rq(), sched_balance_newidle(). */
@@ -7354,8 +7359,12 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
 	if (available_idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
 		return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;

-	if (sync && cpu_rq(this_cpu)->nr_running == 1)
-		return this_cpu;
+	if (sync) {
+		struct rq *rq = cpu_rq(this_cpu);
+
+		if ((rq->nr_running - cfs_h_nr_delayed(rq)) == 1)
+			return this_cpu;
+	}

 	if (available_idle_cpu(prev_cpu))
 		return prev_cpu;