* [PATCH] sched/fair: Fix wakeup_preempt_fair for not waking up task
@ 2026-04-29 16:41 Vincent Guittot
2026-04-30 6:16 ` Furkan Çalışkan
2026-05-01 14:49 ` Peter Zijlstra
0 siblings, 2 replies; 5+ messages in thread
From: Vincent Guittot @ 2026-04-29 16:41 UTC
To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, kprateek.nayak, linux-kernel, qyousef
Cc: Vincent Guittot
The assumption that p is always enqueued and not delayed is only true for
wakeup. If p was moved while sched_delayed, pick_next_entity() will dequeue
it during the attach and the cfs_rq might become empty.
Fixes: ac8e69e69363 ("sched/fair: Fix wakeup_preempt_fair() vs delayed dequeue")
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
I have triggered this while running my latency stress test on a new platform.
kernel/sched/fair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 728965851842..99fb524c4922 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9147,7 +9147,7 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
 	 * Because p is enqueued, nse being null can only mean that we
 	 * dequeued a delayed task.
 	 */
-	if (!nse)
+	if (!nse && (wake_flags & WF_TTWU))
 		goto pick;
 
 	if (sched_feat(RUN_TO_PARITY))
--
2.43.0
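For readers following the one-line change, the guard before and after the patch can be sketched in userspace C. This is only an illustration, not kernel code: the toy_ function names are hypothetical, and TOY_WF_TTWU is a placeholder bit standing in for the kernel's WF_TTWU flag. The sketch models nothing beyond which calls take the "goto pick" branch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit standing in for WF_TTWU; the value is illustrative. */
#define TOY_WF_TTWU 0x08

/* Old guard: any NULL nse takes the "goto pick" branch. */
bool toy_takes_pick_path_old(const void *nse)
{
	return nse == NULL;
}

/*
 * New guard: a NULL nse takes the "goto pick" branch only when the
 * call comes from a wakeup, i.e. the wakeup bit is set in wake_flags.
 */
bool toy_takes_pick_path_new(const void *nse, int wake_flags)
{
	return nse == NULL && (wake_flags & TOY_WF_TTWU) != 0;
}
```

Under these assumptions, a NULL nse seen without the wakeup bit (for example, when a delayed task is attached after a migration) no longer takes the branch, which is the case the changelog describes.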
* Re: [PATCH] sched/fair: Fix wakeup_preempt_fair for not waking up task
2026-04-29 16:41 [PATCH] sched/fair: Fix wakeup_preempt_fair for not waking up task Vincent Guittot
@ 2026-04-30 6:16 ` Furkan Çalışkan
2026-04-30 7:49 ` K Prateek Nayak
2026-05-01 14:49 ` Peter Zijlstra
1 sibling, 1 reply; 5+ messages in thread
From: Furkan Çalışkan @ 2026-04-30 6:16 UTC
To: Vincent Guittot, mingo, peterz, juri.lelli, dietmar.eggemann,
rostedt, bsegall, mgorman, vschneid, kprateek.nayak, linux-kernel,
qyousef
Hi Vincent,
On 4/29/26 19:41, Vincent Guittot wrote:
> The assumption that p is always enqueued and not delayed, is only true for
> wakeup. If p was moved while sched_delayed, pick_next_entity will dequeue
> it during the attach and the cfs might become empty.
>
> Fixes: ac8e69e69363 ("sched/fair: Fix wakeup_preempt_fair() vs delayed dequeue")
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>
> I have triggered this while running my latency stress test on a new platform.
>
> kernel/sched/fair.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 728965851842..99fb524c4922 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9147,7 +9147,7 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
>  	 * Because p is enqueued, nse being null can only mean that we
>  	 * dequeued a delayed task.
>  	 */
> -	if (!nse)
> +	if (!nse && (wake_flags & WF_TTWU))
>  		goto pick;
>
>  	if (sched_feat(RUN_TO_PARITY))

When a sched_delayed task is migrated (which can only happen via
MIGRATE_LOAD per can_migrate_task()), enqueuing it on the dest CPU
calls wakeup_preempt_fair() immediately, and if the dest CPU is not
busy, pick_next_entity() will likely pick and dequeue it right away.
That is a wasted enqueue+dequeue pair. Could we skip the enqueue when
sched_delayed is set, and defer it to the actual wakeup path?

Thanks
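
The enqueue-then-dequeue sequence discussed above can be modelled roughly in userspace C. This is a toy sketch under stated assumptions: the queue is a small flat array, the pick simply takes the smallest vruntime, and every toy_ name is hypothetical. The real pick_next_entity() is more involved; the sketch only shows how picking a delayed entity can leave the queue empty and return NULL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RQ_CAP 8

struct toy_se {
	int64_t vruntime;
	bool sched_delayed;	/* blocked, but still kept on the queue */
	bool on_rq;
};

struct toy_rq {
	struct toy_se *slots[TOY_RQ_CAP];
	size_t nr;
};

/* Queue an entity; returns false if the toy queue is full. */
bool toy_enqueue(struct toy_rq *rq, struct toy_se *se)
{
	if (rq->nr == TOY_RQ_CAP)
		return false;
	rq->slots[rq->nr++] = se;
	se->on_rq = true;
	return true;
}

/*
 * Pick the entity with the smallest vruntime (a simplification that
 * ignores eligibility and deadlines). A delayed entity is dequeued
 * instead of being returned, so the result can be NULL.
 */
struct toy_se *toy_pick(struct toy_rq *rq)
{
	size_t best = 0;

	if (rq->nr == 0)
		return NULL;
	for (size_t i = 1; i < rq->nr; i++)
		if (rq->slots[i]->vruntime < rq->slots[best]->vruntime)
			best = i;

	struct toy_se *se = rq->slots[best];
	if (!se->sched_delayed)
		return se;

	/* Complete the delayed dequeue: drop the entity from the queue. */
	rq->slots[best] = rq->slots[--rq->nr];
	se->sched_delayed = false;
	se->on_rq = false;
	return NULL;
}
```

In this model, enqueuing a single delayed entity on an empty queue and then picking yields NULL with nothing left queued, which corresponds to the "cfs might become empty" case in the changelog.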
* Re: [PATCH] sched/fair: Fix wakeup_preempt_fair for not waking up task
2026-04-30 6:16 ` Furkan Çalışkan
@ 2026-04-30 7:49 ` K Prateek Nayak
2026-04-30 9:21 ` Furkan Çalışkan
0 siblings, 1 reply; 5+ messages in thread
From: K Prateek Nayak @ 2026-04-30 7:49 UTC
To: Furkan Çalışkan, Vincent Guittot, mingo, peterz,
juri.lelli, dietmar.eggemann, rostedt, bsegall, mgorman, vschneid,
linux-kernel, qyousef
Hello Furkan,
On 4/30/2026 11:46 AM, Furkan Çalışkan wrote:
> On 4/29/26 19:41, Vincent Guittot wrote:
>> The assumption that p is always enqueued and not delayed, is only true for
>> wakeup. If p was moved while sched_delayed, pick_next_entity will dequeue
>> it during the attach and the cfs might become empty.
>>
>> Fixes: ac8e69e69363 ("sched/fair: Fix wakeup_preempt_fair() vs delayed dequeue")
>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
>> ---
>>
>> I have triggered this while running my latency stress test on a new platform.
>>
>> kernel/sched/fair.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 728965851842..99fb524c4922 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -9147,7 +9147,7 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
>>  	 * Because p is enqueued, nse being null can only mean that we
>>  	 * dequeued a delayed task.
>>  	 */
>> -	if (!nse)
>> +	if (!nse && (wake_flags & WF_TTWU))
>>  		goto pick;
>>
>>  	if (sched_feat(RUN_TO_PARITY))
>
> When a sched_delayed task is migrated (which can only happen via
> MIGRATE_LOAD per can_migrate_task()), enqueuing it on the dest cpu will
> call wakeup_preempt_fair immediately, and if the dest cpu is not busy,
> pick_next_entity() will likely pick and dequeue it immediately. So a
> wasted enqueue+dequeue pair. Could we skip the enqueue when
> sched_delayed is set, and defer it to the actual wakeup path?

That requires some consideration - if we are migrating a delayed task
to an idle CPU, we can readily block the delayed task if we don't have
other tasks on the migration list.

If the destination is busy, or if we are migrating a bunch of tasks,
we need to know what the final state of the task_timeline will
be to decide whether it is okay to block them immediately.

We need to know where avg_vruntime() and the deadline end up to know
if the task will get picked immediately, and we cannot do that without
going through place_entity() + __enqueue_entity().

There is also a cgroup implication: the delayed task might not be
picked immediately if it is in a cgroup whose entity is not eligible,
and that requires going through the full enqueue + pick.

--
Thanks and Regards,
Prateek
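
The point about needing the final state of the timeline can be illustrated with a simplified eligibility check. This is a userspace sketch, not the kernel implementation: it assumes an entity is eligible when its vruntime is at or below the weighted average vruntime of everything queued, it ignores the vruntime adjustment that place_entity() performs, and the toy_ names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_entity {
	int64_t vruntime;
	int64_t weight;
};

/*
 * se is eligible iff its vruntime does not exceed the weighted average
 * vruntime of the queue. Written without a division:
 *   sum_i weight_i * (vruntime_i - se->vruntime) >= 0
 * The queue q is expected to contain se itself.
 */
bool toy_entity_eligible(const struct toy_entity *q, size_t n,
			 const struct toy_entity *se)
{
	int64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += q[i].weight * (q[i].vruntime - se->vruntime);
	return sum >= 0;
}
```

With a queue holding vruntimes 100 and 120 at equal weight, the entity at 120 is not eligible. Once a third entity with vruntime 200 and weight 3 is also queued, the same entity becomes eligible, so the answer for one migrated task depends on what else ends up on the destination queue.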
* Re: [PATCH] sched/fair: Fix wakeup_preempt_fair for not waking up task
2026-04-30 7:49 ` K Prateek Nayak
@ 2026-04-30 9:21 ` Furkan Çalışkan
0 siblings, 0 replies; 5+ messages in thread
From: Furkan Çalışkan @ 2026-04-30 9:21 UTC
To: K Prateek Nayak, Vincent Guittot, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, vschneid,
linux-kernel, qyousef
Hi K Prateek,
On 4/30/26 10:49, K Prateek Nayak wrote:
> Hello Furkan,
>
> On 4/30/2026 11:46 AM, Furkan Çalışkan wrote:
>> On 4/29/26 19:41, Vincent Guittot wrote:
>>> The assumption that p is always enqueued and not delayed, is only true for
>>> wakeup. If p was moved while sched_delayed, pick_next_entity will dequeue
>>> it during the attach and the cfs might become empty.
>>>
>>> Fixes: ac8e69e69363 ("sched/fair: Fix wakeup_preempt_fair() vs delayed dequeue")
>>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
>>> ---
>>>
>>> I have triggered this while running my latency stress test on a new platform.
>>>
>>> kernel/sched/fair.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 728965851842..99fb524c4922 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -9147,7 +9147,7 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
>>>  	 * Because p is enqueued, nse being null can only mean that we
>>>  	 * dequeued a delayed task.
>>>  	 */
>>> -	if (!nse)
>>> +	if (!nse && (wake_flags & WF_TTWU))
>>>  		goto pick;
>>>
>>>  	if (sched_feat(RUN_TO_PARITY))
>>
>> When a sched_delayed task is migrated (which can only happen via
>> MIGRATE_LOAD per can_migrate_task()), enqueuing it on the dest cpu will
>> call wakeup_preempt_fair immediately, and if the dest cpu is not busy,
>> pick_next_entity() will likely pick and dequeue it immediately. So a
>> wasted enqueue+dequeue pair. Could we skip the enqueue when
>> sched_delayed is set, and defer it to the actual wakeup path?
>
> That requires some considerations - if we are migrating a delayed task
> to an idle CPU, we can readily block the delayed task if we don't have
> other tasks on the migration list.
>
> If the destination is busy, or if we are migrating a bunch of tasks,
> we need to know what the final state of the task_timeline will
> be to make a decision whether it is okay to block them immediately.
>
> We need to know where the avg_vruntime() and deadline ends up to know
> if the task will get picked immediately and we cannot do that without
> going through place_entity + __enqueue_entity().
>
> There is also cgroup implication where, the delayed task might not be
> picked immediately if it is on a cgroup whose entity is not eligible
> and that requires going through the full enqueue + pick.
>

You're right - skipping the enqueue introduces far more complexity than the
cost of the enqueue+dequeue pair it avoids, since it requires reasoning about
the full migration list, destination CPU state, cgroup eligibility and
avg_vruntime placement.

Thanks for the detailed explanation.
* Re: [PATCH] sched/fair: Fix wakeup_preempt_fair for not waking up task
2026-04-29 16:41 [PATCH] sched/fair: Fix wakeup_preempt_fair for not waking up task Vincent Guittot
2026-04-30 6:16 ` Furkan Çalışkan
@ 2026-05-01 14:49 ` Peter Zijlstra
1 sibling, 0 replies; 5+ messages in thread
From: Peter Zijlstra @ 2026-05-01 14:49 UTC
To: Vincent Guittot
Cc: mingo, juri.lelli, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, kprateek.nayak, linux-kernel, qyousef
On Wed, Apr 29, 2026 at 06:41:02PM +0200, Vincent Guittot wrote:
> The assumption that p is always enqueued and not delayed, is only true for
> wakeup. If p was moved while sched_delayed, pick_next_entity will dequeue
> it during the attach and the cfs might become empty.
Changelog needs more text on why this is a problem.