* [PATCH] sched: fair: Prevent negative lag increase during delayed dequeue
@ 2026-03-27 18:02 Vincent Guittot
2026-03-30 14:06 ` Peter Zijlstra
0 siblings, 1 reply; 5+ messages in thread
From: Vincent Guittot @ 2026-03-27 18:02 UTC (permalink / raw)
To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, kprateek.nayak, linux-kernel
Cc: Vincent Guittot
The delayed dequeue feature aims to reduce the negative lag of a dequeued
task while it sleeps, but it can happen that newly enqueued tasks move
the avg vruntime backward and increase its negative lag.
When the delayed dequeued task wakes up, it has more negative lag than if
it had been dequeued immediately, or than other tasks, delayed or not,
that were dequeued just before these new enqueues.
Ensure that the negative lag of a delayed dequeued task doesn't increase
during its delayed dequeue phase while waiting for its negative lag to
disappear. Similarly, remove any positive lag that the delayed dequeued
task could have gained during this period.
Short slice tasks are particularly impacted on overloaded systems.
Test on snapdragon rb5:
hackbench -T -p -l 16000000 -g 2 1> /dev/null &
cyclictest -t 1 -i 2777 -D 333 --policy=fair --mlock -h 20000 -q
The scheduling latency of cyclictest is:
                           default       tip/sched/core   tip/sched/core
                           slices                          + this patch
cyclictest slice (ms)   |    2.8               8                 8
hackbench slice (ms)    |    2.8              20                20
Total Samples           |  115632          119733            119806
Average (us)            |     364          64 (-82%)         61 (- 5%)
Median (P50) (us)       |      60          56 (- 7%)         56 (  0%)
90th Percentile (us)    |    1166          62 (-95%)         62 (  0%)
99th Percentile (us)    |    4192          73 (-98%)         72 (- 1%)
99.9th Percentile (us)  |    8528        2707 (-68%)       1300 (-52%)
Maximum (us)            |   17735       14273 (-20%)      13525 (- 5%)
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
kernel/sched/fair.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 226509231e67..efa9dfa8c583 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5595,6 +5595,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
if (sched_feat(DELAY_DEQUEUE) && delay &&
!entity_eligible(cfs_rq, se)) {
update_load_avg(cfs_rq, se, 0);
+ update_entity_lag(cfs_rq, se);
set_delayed(se);
return false;
}
@@ -7089,12 +7090,16 @@ requeue_delayed_entity(struct sched_entity *se)
WARN_ON_ONCE(!se->on_rq);
if (sched_feat(DELAY_ZERO)) {
+ s64 vlag, prev_vlag = se->vlag;
update_entity_lag(cfs_rq, se);
- if (se->vlag > 0) {
+ /* prev_vlag < 0 otherwise se would not be delayed */
+ vlag = clamp(se->vlag, prev_vlag, 0);
+
+ if (vlag != se->vlag) {
cfs_rq->nr_queued--;
if (se != cfs_rq->curr)
__dequeue_entity(cfs_rq, se);
- se->vlag = 0;
+ se->vlag = vlag;
place_entity(cfs_rq, se, 0);
if (se != cfs_rq->curr)
__enqueue_entity(cfs_rq, se);
--
2.43.0
* Re: [PATCH] sched: fair: Prevent negative lag increase during delayed dequeue
2026-03-27 18:02 [PATCH] sched: fair: Prevent negative lag increase during delayed dequeue Vincent Guittot
@ 2026-03-30 14:06 ` Peter Zijlstra
2026-03-30 14:46 ` Vincent Guittot
0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2026-03-30 14:06 UTC (permalink / raw)
To: Vincent Guittot
Cc: mingo, juri.lelli, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, kprateek.nayak, linux-kernel
On Fri, Mar 27, 2026 at 07:02:19PM +0100, Vincent Guittot wrote:
> The delayed dequeue feature aims to reduce the negative lag of a dequeued
> task while it sleeps, but it can happen that newly enqueued tasks move
> the avg vruntime backward and increase its negative lag.
> When the delayed dequeued task wakes up, it has more negative lag than if
> it had been dequeued immediately, or than other tasks, delayed or not,
> that were dequeued just before these new enqueues.
>
> Ensure that the negative lag of a delayed dequeued task doesn't increase
> during its delayed dequeue phase while waiting for its negative lag to
> disappear. Similarly, remove any positive lag that the delayed dequeued
> task could have gained during this period.
*groan*, indeed!
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
> kernel/sched/fair.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 226509231e67..efa9dfa8c583 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5595,6 +5595,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
> if (sched_feat(DELAY_DEQUEUE) && delay &&
> !entity_eligible(cfs_rq, se)) {
> update_load_avg(cfs_rq, se, 0);
> + update_entity_lag(cfs_rq, se);
> set_delayed(se);
> return false;
> }
> @@ -7089,12 +7090,16 @@ requeue_delayed_entity(struct sched_entity *se)
> WARN_ON_ONCE(!se->on_rq);
>
> if (sched_feat(DELAY_ZERO)) {
> + s64 vlag, prev_vlag = se->vlag;
> update_entity_lag(cfs_rq, se);
> - if (se->vlag > 0) {
> + /* prev_vlag < 0 otherwise se would not be delayed */
> + vlag = clamp(se->vlag, prev_vlag, 0);
> +
> + if (vlag != se->vlag) {
> cfs_rq->nr_queued--;
> if (se != cfs_rq->curr)
> __dequeue_entity(cfs_rq, se);
> - se->vlag = 0;
> + se->vlag = vlag;
> place_entity(cfs_rq, se, 0);
> if (se != cfs_rq->curr)
> __enqueue_entity(cfs_rq, se);
So I think this should apply irrespective of DELAY_ZERO -- although
since we default to that, it is the most relevant.
Also, I'm thinking we might want a comment with this. We're bound to
forget the details at some point.
Also, there is one other site that has DELAY_ZERO/clear_delayed(), does
that want updating too?
* Re: [PATCH] sched: fair: Prevent negative lag increase during delayed dequeue
2026-03-30 14:06 ` Peter Zijlstra
@ 2026-03-30 14:46 ` Vincent Guittot
2026-03-30 21:32 ` Shubhang Kaushik
0 siblings, 1 reply; 5+ messages in thread
From: Vincent Guittot @ 2026-03-30 14:46 UTC (permalink / raw)
To: Peter Zijlstra
Cc: mingo, juri.lelli, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, kprateek.nayak, linux-kernel
On Mon, 30 Mar 2026 at 16:06, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, Mar 27, 2026 at 07:02:19PM +0100, Vincent Guittot wrote:
> > The delayed dequeue feature aims to reduce the negative lag of a dequeued
> > task while it sleeps, but it can happen that newly enqueued tasks move
> > the avg vruntime backward and increase its negative lag.
> > When the delayed dequeued task wakes up, it has more negative lag than if
> > it had been dequeued immediately, or than other tasks, delayed or not,
> > that were dequeued just before these new enqueues.
> >
> > Ensure that the negative lag of a delayed dequeued task doesn't increase
> > during its delayed dequeue phase while waiting for its negative lag to
> > disappear. Similarly, remove any positive lag that the delayed dequeued
> > task could have gained during this period.
>
> *groan*, indeed!
>
>
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > ---
> > kernel/sched/fair.c | 9 +++++++--
> > 1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 226509231e67..efa9dfa8c583 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -5595,6 +5595,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
> > if (sched_feat(DELAY_DEQUEUE) && delay &&
> > !entity_eligible(cfs_rq, se)) {
> > update_load_avg(cfs_rq, se, 0);
> > + update_entity_lag(cfs_rq, se);
> > set_delayed(se);
> > return false;
> > }
> > @@ -7089,12 +7090,16 @@ requeue_delayed_entity(struct sched_entity *se)
> > WARN_ON_ONCE(!se->on_rq);
> >
> > if (sched_feat(DELAY_ZERO)) {
> > + s64 vlag, prev_vlag = se->vlag;
> > update_entity_lag(cfs_rq, se);
> > - if (se->vlag > 0) {
> > + /* prev_vlag < 0 otherwise se would not be delayed */
> > + vlag = clamp(se->vlag, prev_vlag, 0);
> > +
> > + if (vlag != se->vlag) {
> > cfs_rq->nr_queued--;
> > if (se != cfs_rq->curr)
> > __dequeue_entity(cfs_rq, se);
> > - se->vlag = 0;
> > + se->vlag = vlag;
> > place_entity(cfs_rq, se, 0);
> > if (se != cfs_rq->curr)
> > __enqueue_entity(cfs_rq, se);
>
> So I think this should apply irrespective of DELAY_ZERO -- although
> since we default to that, it is the most relevant.
I can try to link it only to delayed dequeue
>
> Also, I'm thinking we might want a comment with this. We're bound to
> forget the details at some point.
yes
>
> Also, there is one other site that has DELAY_ZERO/clear_delayed(), does
> that want updating too?
My assumption was that the other site is reached when the se is picked
and its lag is positive, but we can sometimes force the dequeue of
delayed entities.
Will add it
* Re: [PATCH] sched: fair: Prevent negative lag increase during delayed dequeue
2026-03-30 14:46 ` Vincent Guittot
@ 2026-03-30 21:32 ` Shubhang Kaushik
2026-03-31 8:54 ` Vincent Guittot
0 siblings, 1 reply; 5+ messages in thread
From: Shubhang Kaushik @ 2026-03-30 21:32 UTC (permalink / raw)
To: Vincent Guittot, Peter Zijlstra
Cc: mingo, juri.lelli, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, kprateek.nayak, linux-kernel
Hello Vincent,
On Mon, 30 Mar 2026, Vincent Guittot wrote:
> On Mon, 30 Mar 2026 at 16:06, Peter Zijlstra <peterz@infradead.org> wrote:
>>
>> On Fri, Mar 27, 2026 at 07:02:19PM +0100, Vincent Guittot wrote:
>>> The delayed dequeue feature aims to reduce the negative lag of a dequeued
>>> task while it sleeps, but it can happen that newly enqueued tasks move
>>> the avg vruntime backward and increase its negative lag.
>>> When the delayed dequeued task wakes up, it has more negative lag than if
>>> it had been dequeued immediately, or than other tasks, delayed or not,
>>> that were dequeued just before these new enqueues.
>>>
>>> Ensure that the negative lag of a delayed dequeued task doesn't increase
>>> during its delayed dequeue phase while waiting for its negative lag to
>>> disappear. Similarly, remove any positive lag that the delayed dequeued
>>> task could have gained during this period.
>>
>> *groan*, indeed!
>>
>>
>>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
>>> ---
>>> kernel/sched/fair.c | 9 +++++++--
>>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 226509231e67..efa9dfa8c583 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -5595,6 +5595,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
>>> if (sched_feat(DELAY_DEQUEUE) && delay &&
>>> !entity_eligible(cfs_rq, se)) {
>>> update_load_avg(cfs_rq, se, 0);
>>> + update_entity_lag(cfs_rq, se);
>>> set_delayed(se);
>>> return false;
>>> }
>>> @@ -7089,12 +7090,16 @@ requeue_delayed_entity(struct sched_entity *se)
>>> WARN_ON_ONCE(!se->on_rq);
>>>
>>> if (sched_feat(DELAY_ZERO)) {
>>> + s64 vlag, prev_vlag = se->vlag;
>>> update_entity_lag(cfs_rq, se);
>>> - if (se->vlag > 0) {
>>> + /* prev_vlag < 0 otherwise se would not be delayed */
>>> + vlag = clamp(se->vlag, prev_vlag, 0);
>>> +
>>> + if (vlag != se->vlag) {
>>> cfs_rq->nr_queued--;
>>> if (se != cfs_rq->curr)
>>> __dequeue_entity(cfs_rq, se);
>>> - se->vlag = 0;
>>> + se->vlag = vlag;
>>> place_entity(cfs_rq, se, 0);
>>> if (se != cfs_rq->curr)
>>> __enqueue_entity(cfs_rq, se);
>>
>> So I think this should apply irrespective of DELAY_ZERO -- although
>> since we default to that, it is the most relevant.
>
> I can try to link it only to delayed dequeue
>
>>
>> Also, I'm thinking we might want a comment with this. We're bound to
>> forget the details at some point.
>
> yes
>
>>
>> Also, there is one other site that has DELAY_ZERO/clear_delayed(), does
>> that want updating too?
>
> My assumption was that the other site happens when se is picked and
> its lag is positive but we can sometimes force dequeue of delayed
> entities.
> Will add it
>
It seems finish_delayed_dequeue_entity() is another path where the delayed
state is cleared. Would it be worth applying the same lag bounding there
as well to keep the behavior consistent regardless of how the task exits
the delayed state?
Regards,
Shubhang Kaushik
* Re: [PATCH] sched: fair: Prevent negative lag increase during delayed dequeue
2026-03-30 21:32 ` Shubhang Kaushik
@ 2026-03-31 8:54 ` Vincent Guittot
0 siblings, 0 replies; 5+ messages in thread
From: Vincent Guittot @ 2026-03-31 8:54 UTC (permalink / raw)
To: Shubhang Kaushik
Cc: Peter Zijlstra, mingo, juri.lelli, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, kprateek.nayak, linux-kernel
On Mon, 30 Mar 2026 at 23:32, Shubhang Kaushik
<shubhang@os.amperecomputing.com> wrote:
>
> Hello Vincent,
>
> On Mon, 30 Mar 2026, Vincent Guittot wrote:
>
> > On Mon, 30 Mar 2026 at 16:06, Peter Zijlstra <peterz@infradead.org> wrote:
> >>
> >> On Fri, Mar 27, 2026 at 07:02:19PM +0100, Vincent Guittot wrote:
> >>> The delayed dequeue feature aims to reduce the negative lag of a dequeued
> >>> task while it sleeps, but it can happen that newly enqueued tasks move
> >>> the avg vruntime backward and increase its negative lag.
> >>> When the delayed dequeued task wakes up, it has more negative lag than if
> >>> it had been dequeued immediately, or than other tasks, delayed or not,
> >>> that were dequeued just before these new enqueues.
> >>>
> >>> Ensure that the negative lag of a delayed dequeued task doesn't increase
> >>> during its delayed dequeue phase while waiting for its negative lag to
> >>> disappear. Similarly, remove any positive lag that the delayed dequeued
> >>> task could have gained during this period.
> >>
> >> *groan*, indeed!
> >>
> >>
> >>> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> >>> ---
> >>> kernel/sched/fair.c | 9 +++++++--
> >>> 1 file changed, 7 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >>> index 226509231e67..efa9dfa8c583 100644
> >>> --- a/kernel/sched/fair.c
> >>> +++ b/kernel/sched/fair.c
> >>> @@ -5595,6 +5595,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
> >>> if (sched_feat(DELAY_DEQUEUE) && delay &&
> >>> !entity_eligible(cfs_rq, se)) {
> >>> update_load_avg(cfs_rq, se, 0);
> >>> + update_entity_lag(cfs_rq, se);
> >>> set_delayed(se);
> >>> return false;
> >>> }
> >>> @@ -7089,12 +7090,16 @@ requeue_delayed_entity(struct sched_entity *se)
> >>> WARN_ON_ONCE(!se->on_rq);
> >>>
> >>> if (sched_feat(DELAY_ZERO)) {
> >>> + s64 vlag, prev_vlag = se->vlag;
> >>> update_entity_lag(cfs_rq, se);
> >>> - if (se->vlag > 0) {
> >>> + /* prev_vlag < 0 otherwise se would not be delayed */
> >>> + vlag = clamp(se->vlag, prev_vlag, 0);
> >>> +
> >>> + if (vlag != se->vlag) {
> >>> cfs_rq->nr_queued--;
> >>> if (se != cfs_rq->curr)
> >>> __dequeue_entity(cfs_rq, se);
> >>> - se->vlag = 0;
> >>> + se->vlag = vlag;
> >>> place_entity(cfs_rq, se, 0);
> >>> if (se != cfs_rq->curr)
> >>> __enqueue_entity(cfs_rq, se);
> >>
> >> So I think this should apply irrespective of DELAY_ZERO -- although
> >> since we default to that, it is the most relevant.
> >
> > I can try to link it only to delayed dequeue
> >
> >>
> >> Also, I'm thinking we might want a comment with this. We're bound to
> >> forget the details at some point.
> >
> > yes
> >
> >>
> >> Also, there is one other site that has DELAY_ZERO/clear_delayed(), does
> >> that want updating too?
> >
> > My assumption was that the other site happens when se is picked and
> > its lag is positive but we can sometimes force dequeue of delayed
> > entities.
> > Will add it
> >
> It seems finish_delayed_dequeue_entity() is another path where the delayed
> state is cleared. Would it be worth applying the same lag bounding there
> as well to keep the behavior consistent regardless of how the task exits
> the delayed state?
Yes, that's the other site I have in mind
>
> Regards,
> Shubhang Kaushik