[PATCH 0/6 v3] sched/eevdf: Improve scheduling latency of short slice task


* [PATCH 0/6 v3] sched/eevdf: Improve scheduling latency of short slice task
@ 2026-06-24 15:12 Vincent Guittot
  2026-06-24 15:12 ` [PATCH 1/6 v3] sched/fair: Set next buddy for preempt short Vincent Guittot
                   ` (5 more replies)
  0 siblings, 6 replies; 20+ messages in thread
From: Vincent Guittot @ 2026-06-24 15:12 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, vschneid, kprateek.nayak, linux-kernel, qyousef
  Cc: Vincent Guittot

This series continues to improve the scheduling latency of tasks with
shorter slice duration by mainly canceling, updating or minimizing the
protection of the running tasks when appropriate.

Benchmarks such as hackbench show no noticeable performance difference
with this patchset. (The default 2.8ms slice was used when testing for
performance regressions.)

Several use cases have been used to test the scheduling latency of short
slice tasks:
- cyclictest with a 3777us period and an 8ms slice alone
- cyclictest with a 3777us period and an 8ms slice. 2xNR_CPUS rt-app
  tasks that run (8177us) and sleep (17777us) with a 16ms slice.
- cyclictest with a 3777us period and an 8ms slice. Hackbench with
  1 group using thread and pipe and a 16ms slice.

NB: periods and run duration have been chosen to minimize alignment
with tick or other periodic activities.

scheduling latency (us) for cyclictest
                   tip/sched/core| this patchset | tip/sched/core
slice                 8ms        |  8ms          |  2.8ms
90th Percentile               80 |    80 (  0 %) |    79 (+  1 %)
99th Percentile               92 |    93 (- 1 %) |    93 (-  1 %)
99.9th Percentile            238 |   180 (+24 %) |  1551 (-552 %)
Maximum                     3177 |  2261 (+29 %) |  4771 (+ 65 %)

scheduling latency (us) for cyclictest and rt-app 
                   tip/sched/core| this patchset | tip/sched/core
slice                 8ms / 16ms |  8ms  / 16 ms |  2.8ms / 2.8ms
90th Percentile               59 |    67 (-14 %) |   358 (-507 %)
99th Percentile            10414 |  2454 (+77 %) |  3222 (+ 69 %)
99.9th Percentile          16547 |  5542 (+67 %) |  5512 (+ 67 %)
Maximum                    24298 |  8526 (+65 %) |  8650 (+ 64 %)

scheduling latency (us) for cyclictest and hackbench 
                   tip/sched/core| this patchset | tip/sched/core
slice                 8ms / 16ms |  8ms  / 16 ms |  2.8ms / 2.8ms
90th Percentile               63 |    63 (  0 %) |  1331 (-2013 %)
99th Percentile               76 |    80 (- 5 %) |  4685 (-6064 %)
99.9th Percentile           3321 |  1152 (+65 %) |  8741 (- 163 %)
Maximum                    17199 | 10432 (+39 %) | 24309 (-  41 %)
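
As a reading aid for the tables above: the bracketed figures appear to be
the relative change against the first tip/sched/core column, with a
positive value meaning lower latency. The helper below is a hypothetical
sketch of that arithmetic, not something taken from the series; the name
toy_gain_percent() and the round-to-nearest choice are assumptions, and
not every table entry matches it exactly.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of the series): relative change of a
 * latency figure against the baseline column, rounded to the nearest
 * percent. Positive means lower (better) latency than the baseline.
 */
static long toy_gain_percent(long baseline_us, long value_us)
{
	long num, half;

	assert(baseline_us > 0);

	num = (baseline_us - value_us) * 100;
	half = baseline_us / 2;

	/* Round to nearest; C integer division truncates toward zero. */
	if (num >= 0)
		return (num + half) / baseline_us;
	return -((-num + half) / baseline_us);
}
```

For instance, toy_gain_percent(238, 180) gives 24 and
toy_gain_percent(238, 1551) gives -552, in line with the 99.9th
percentile row of the first table.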

Since v2:
- Remove avg_vruntime() in set_protect_slice()
- Make sure to update curr for sched idle and batch tasks
- Use a new eligible_vruntime() function instead of avg_vruntime() when a
  short slice task is enqueued but not the next one to run.

Since v1:
- Use the correct min_vruntime() instead of min()

Vincent Guittot (6):
  sched/fair: Set next buddy for preempt short
  sched/eevdf: Take into account current's lag when updating slice
    protection
  sched/eevdf: Update slice protection even when resched is already set
  sched/eevdf: Cancel slice protection if short slice task is eligible
  sched/eevdf: Always update slice protection
  sched/eevdf: Speedup short slice task scheduling

 kernel/sched/fair.c | 92 +++++++++++++++++++++++++++++++++++----------
 1 file changed, 72 insertions(+), 20 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/6 v3] sched/fair: Set next buddy for preempt short
  2026-06-24 15:12 [PATCH 0/6 v3] sched/eevdf: Improve scheduling latency of short slice task Vincent Guittot
@ 2026-06-24 15:12 ` Vincent Guittot
  2026-06-25  6:24   ` K Prateek Nayak
  2026-06-24 15:12 ` [PATCH 2/6 v3] sched/eevdf: Take into account current's lag when updating slice protection Vincent Guittot
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 20+ messages in thread
From: Vincent Guittot @ 2026-06-24 15:12 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, vschneid, kprateek.nayak, linux-kernel, qyousef
  Cc: Vincent Guittot

If a shorter slice task can preempt current at wakeup, we make sure that
the decision will not be overwritten in the meantime by setting the task
as the next buddy. This still requires the waking task to remain eligible
when the scheduler actually picks the next task to run.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d78467ec6ee1..83bce5a04f3d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9903,7 +9903,7 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
 preempt:
 	if (preempt_action == PREEMPT_WAKEUP_SHORT) {
 		cancel_protect_slice(se);
-		clear_buddies(cfs_rq, se);
+		set_next_buddy(&p->se);
 	}
 
 	resched_curr_lazy(rq);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/6 v3] sched/fair: Set next buddy for preempt short
  2026-06-24 15:12 ` [PATCH 1/6 v3] sched/fair: Set next buddy for preempt short Vincent Guittot
@ 2026-06-25  6:24   ` K Prateek Nayak
  2026-06-25 12:40     ` Vincent Guittot
  0 siblings, 1 reply; 20+ messages in thread
From: K Prateek Nayak @ 2026-06-25  6:24 UTC (permalink / raw)
  To: Vincent Guittot, mingo, peterz, juri.lelli, dietmar.eggemann,
	rostedt, bsegall, mgorman, vschneid, linux-kernel, qyousef

Hello Vincent,

On 6/24/2026 8:42 PM, Vincent Guittot wrote:
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d78467ec6ee1..83bce5a04f3d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9903,7 +9903,7 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
>  preempt:
>  	if (preempt_action == PREEMPT_WAKEUP_SHORT) {
>  		cancel_protect_slice(se);
> -		clear_buddies(cfs_rq, se);
> +		set_next_buddy(&p->se);
>  	}

On a tangential note, I just noticed set_preempt_buddy() has two unused
parameters. Seems to have been like that since it was introduced in
commit e837456fdca8 ("sched/fair: Reimplement NEXT_BUDDY to align with
EEVDF goals"). 

Perhaps this can be included in the series too as a cleanup:

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7c541f27a1ed..34b3888c4ccf 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9755,9 +9755,7 @@ enum preempt_wakeup_action {
 	PREEMPT_WAKEUP_RESCHED,	/* Force reschedule. */
 };
 
-static inline bool
-set_preempt_buddy(struct cfs_rq *cfs_rq, int wake_flags,
-		  struct sched_entity *pse, struct sched_entity *se)
+static inline bool set_preempt_buddy(struct cfs_rq *cfs_rq, struct sched_entity *pse)
 {
 	/*
 	 * Keep existing buddy if the deadline is sooner than pse.
@@ -9903,9 +9901,7 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
 		goto update;
 
 	/* Prefer picking wakee soon if appropriate. */
-	if (sched_feat(NEXT_BUDDY) &&
-	    set_preempt_buddy(cfs_rq, wake_flags, pse, se)) {
-
+	if (sched_feat(NEXT_BUDDY) && set_preempt_buddy(cfs_rq, pse)) {
 		/*
 		 * Decide whether to obey WF_SYNC hint for a new buddy. Old
 		 * buddies are ignored as they may not be relevant to the
-- 
Thanks and Regards,
Prateek


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/6 v3] sched/fair: Set next buddy for preempt short
  2026-06-25  6:24   ` K Prateek Nayak
@ 2026-06-25 12:40     ` Vincent Guittot
  2026-06-25 12:43       ` Peter Zijlstra
  0 siblings, 1 reply; 20+ messages in thread
From: Vincent Guittot @ 2026-06-25 12:40 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, vschneid, linux-kernel, qyousef

On Thu, 25 Jun 2026 at 08:24, K Prateek Nayak <kprateek.nayak@amd.com> wrote:
>
> Hello Vincent,
>
> On 6/24/2026 8:42 PM, Vincent Guittot wrote:
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index d78467ec6ee1..83bce5a04f3d 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -9903,7 +9903,7 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
> >  preempt:
> >       if (preempt_action == PREEMPT_WAKEUP_SHORT) {
> >               cancel_protect_slice(se);
> > -             clear_buddies(cfs_rq, se);
> > +             set_next_buddy(&p->se);
> >       }
>
> On a tangential note, I just noticed set_preempt_buddy() has two unused
> parameters. Seems to have been like that since it was introduced in
> commit e837456fdca8 ("sched/fair: Reimplement NEXT_BUDDY to align with
> EEVDF goals").
>
> Perhaps this can be included in the series too as a cleanup:

I would even go further and remove it. The NEXT_BUDDY feature is broken anyway


>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 7c541f27a1ed..34b3888c4ccf 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9755,9 +9755,7 @@ enum preempt_wakeup_action {
>         PREEMPT_WAKEUP_RESCHED, /* Force reschedule. */
>  };
>
> -static inline bool
> -set_preempt_buddy(struct cfs_rq *cfs_rq, int wake_flags,
> -                 struct sched_entity *pse, struct sched_entity *se)
> +static inline bool set_preempt_buddy(struct cfs_rq *cfs_rq, struct sched_entity *pse)
>  {
>         /*
>          * Keep existing buddy if the deadline is sooner than pse.
> @@ -9903,9 +9901,7 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
>                 goto update;
>
>         /* Prefer picking wakee soon if appropriate. */
> -       if (sched_feat(NEXT_BUDDY) &&
> -           set_preempt_buddy(cfs_rq, wake_flags, pse, se)) {
> -
> +       if (sched_feat(NEXT_BUDDY) && set_preempt_buddy(cfs_rq, pse)) {
>                 /*
>                  * Decide whether to obey WF_SYNC hint for a new buddy. Old
>                  * buddies are ignored as they may not be relevant to the
> --
> Thanks and Regards,
> Prateek
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/6 v3] sched/fair: Set next buddy for preempt short
  2026-06-25 12:40     ` Vincent Guittot
@ 2026-06-25 12:43       ` Peter Zijlstra
  0 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2026-06-25 12:43 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: K Prateek Nayak, mingo, juri.lelli, dietmar.eggemann, rostedt,
	bsegall, mgorman, vschneid, linux-kernel, qyousef

On Thu, Jun 25, 2026 at 02:40:34PM +0200, Vincent Guittot wrote:
> On Thu, 25 Jun 2026 at 08:24, K Prateek Nayak <kprateek.nayak@amd.com> wrote:
> >
> > Hello Vincent,
> >
> > On 6/24/2026 8:42 PM, Vincent Guittot wrote:
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index d78467ec6ee1..83bce5a04f3d 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -9903,7 +9903,7 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
> > >  preempt:
> > >       if (preempt_action == PREEMPT_WAKEUP_SHORT) {
> > >               cancel_protect_slice(se);
> > > -             clear_buddies(cfs_rq, se);
> > > +             set_next_buddy(&p->se);
> > >       }
> >
> > On a tangential note, I just noticed set_preempt_buddy() has two unused
> > parameters. Seems to have been like that since it was introduced in
> > commit e837456fdca8 ("sched/fair: Reimplement NEXT_BUDDY to align with
> > EEVDF goals").
> >
> > Perhaps this can be included in the series too as a cleanup:
> 
> I would even go further and remove it. The NEXT_BUDDY feature is broken anyway

I thought Mel wanted to try again, but he's been somewhat silent on
matters. Mel?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 2/6 v3] sched/eevdf: Take into account current's lag when updating slice protection
  2026-06-24 15:12 [PATCH 0/6 v3] sched/eevdf: Improve scheduling latency of short slice task Vincent Guittot
  2026-06-24 15:12 ` [PATCH 1/6 v3] sched/fair: Set next buddy for preempt short Vincent Guittot
@ 2026-06-24 15:12 ` Vincent Guittot
  2026-06-24 15:12 ` [PATCH 3/6 v3] sched/eevdf: Update slice protection even when resched is already set Vincent Guittot
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Vincent Guittot @ 2026-06-24 15:12 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, vschneid, kprateek.nayak, linux-kernel, qyousef
  Cc: Vincent Guittot

Take into account the lag of the current task when updating the slice
protection in order to ensure that the absolute value of lags will remain
in the range [0 : slice+tick].
A task that already has a negative lag will see its protection reduced
whereas a task with positive lag will keep a full slice protection.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
 kernel/sched/fair.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 83bce5a04f3d..8639086e5d9e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1099,8 +1099,9 @@ static inline void set_protect_slice(struct cfs_rq *cfs_rq, struct sched_entity
 static inline void update_protect_slice(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	u64 slice = cfs_rq_min_slice(cfs_rq);
+	u64 vruntime = min_vruntime(se->vruntime, avg_vruntime(cfs_rq));
 
-	se->vprot = min_vruntime(se->vprot, se->vruntime + calc_delta_fair(slice, se));
+	se->vprot = min_vruntime(se->vprot, vruntime + calc_delta_fair(slice, se));
 }
 
 static inline bool protect_slice(struct sched_entity *se)
-- 
2.43.0
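
The effect of the change above can be illustrated with a small user-space
sketch. This is only a toy model under stated assumptions: the toy_*
names are hypothetical, vslice stands in for the result of
calc_delta_fair(slice, se), and avg stands in for avg_vruntime(cfs_rq).
Only the wraparound-safe minimum follows the usual min_vruntime() idiom.

```c
#include <assert.h>
#include <stdint.h>

/* Wraparound-safe minimum of two vruntime values. */
static uint64_t toy_min_vruntime(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0 ? b : a;
}

/* Before the patch: protection is measured from current's own vruntime. */
static uint64_t toy_vprot_old(uint64_t vprot, uint64_t vruntime,
			      uint64_t vslice)
{
	assert(vslice != 0);	/* a zero slice gives no protection at all */
	return toy_min_vruntime(vprot, vruntime + vslice);
}

/*
 * After the patch: start from whichever of current's vruntime and the
 * average vruntime is earlier, so a task that is already ahead of the
 * average (negative lag) gets less than a full slice of protection.
 */
static uint64_t toy_vprot_new(uint64_t vprot, uint64_t vruntime,
			      uint64_t avg, uint64_t vslice)
{
	uint64_t start = toy_min_vruntime(vruntime, avg);

	assert(vslice != 0);
	return toy_min_vruntime(vprot, start + vslice);
}
```

With vprot = 5000, vruntime = 1000, avg = 900 and vslice = 300 (negative
lag), the old rule yields 1300 and the new one 1200. With vruntime = 800
(positive lag) both yield 1100, i.e. the full slice is kept.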


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/6 v3] sched/eevdf: Update slice protection even when resched is already set
  2026-06-24 15:12 [PATCH 0/6 v3] sched/eevdf: Improve scheduling latency of short slice task Vincent Guittot
  2026-06-24 15:12 ` [PATCH 1/6 v3] sched/fair: Set next buddy for preempt short Vincent Guittot
  2026-06-24 15:12 ` [PATCH 2/6 v3] sched/eevdf: Take into account current's lag when updating slice protection Vincent Guittot
@ 2026-06-24 15:12 ` Vincent Guittot
  2026-06-24 15:12 ` [PATCH 4/6 v3] sched/eevdf: Cancel slice protection if short slice task is eligible Vincent Guittot
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 20+ messages in thread
From: Vincent Guittot @ 2026-06-24 15:12 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, vschneid, kprateek.nayak, linux-kernel, qyousef
  Cc: Vincent Guittot

Even if resched is already set, we might want to update or even cancel
the slice protection and ensure that the newly waking task will be the
next one to run.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8639086e5d9e..854f3a9f1d80 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9804,7 +9804,7 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
 	 * prevents us from potentially nominating it as a false LAST_BUDDY
 	 * below.
 	 */
-	if (test_tsk_need_resched(rq->curr))
+	if (!sched_feat(PREEMPT_SHORT) && test_tsk_need_resched(rq->curr))
 		return;
 
 	if (!sched_feat(WAKEUP_PREEMPTION))
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/6 v3] sched/eevdf: Cancel slice protection if short slice task is eligible
  2026-06-24 15:12 [PATCH 0/6 v3] sched/eevdf: Improve scheduling latency of short slice task Vincent Guittot
                   ` (2 preceding siblings ...)
  2026-06-24 15:12 ` [PATCH 3/6 v3] sched/eevdf: Update slice protection even when resched is already set Vincent Guittot
@ 2026-06-24 15:12 ` Vincent Guittot
  2026-06-25  6:00   ` K Prateek Nayak
  2026-06-24 15:12 ` [PATCH 5/6 v3] sched/eevdf: Always update slice protection Vincent Guittot
  2026-06-24 15:12 ` [PATCH 6/6 v3] sched/eevdf: Speedup short slice task scheduling Vincent Guittot
  5 siblings, 1 reply; 20+ messages in thread
From: Vincent Guittot @ 2026-06-24 15:12 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, vschneid, kprateek.nayak, linux-kernel, qyousef
  Cc: Vincent Guittot

If a short slice task is eligible but will not be the next to be picked,
we cancel the slice protection to shorten the time until the short slice
task becomes the next to run.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
 kernel/sched/fair.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 854f3a9f1d80..719aa53851e4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9816,18 +9816,13 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
 	cse_is_idle = se_is_idle(se);
 	pse_is_idle = se_is_idle(pse);
 
+	nse = se;
 	/*
 	 * Preempt an idle entity in favor of a non-idle entity (and don't preempt
 	 * in the inverse case).
 	 */
-	if (cse_is_idle && !pse_is_idle) {
-		/*
-		 * When non-idle entity preempt an idle entity,
-		 * don't give idle entity slice protection.
-		 */
-		preempt_action = PREEMPT_WAKEUP_SHORT;
+	if (cse_is_idle && !pse_is_idle)
 		goto preempt;
-	}
 
 	if (cse_is_idle != pse_is_idle)
 		return;
@@ -9896,16 +9891,23 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
 	if (!nse && cfs_rq->nr_queued)
 		goto pick;
 
+	/*
+	 * If @p is eligible but not the next task to run then cancel protection
+	 * to prevent large scheduling latency
+	 */
+	if (preempt_action == PREEMPT_WAKEUP_SHORT && entity_eligible(cfs_rq, pse))
+		goto preempt;
+
 	if (sched_feat(RUN_TO_PARITY))
 		update_protect_slice(cfs_rq, se);
 
 	return;
 
 preempt:
-	if (preempt_action == PREEMPT_WAKEUP_SHORT) {
-		cancel_protect_slice(se);
+	cancel_protect_slice(se);
+
+	if (preempt_action == PREEMPT_WAKEUP_SHORT && nse == pse)
 		set_next_buddy(&p->se);
-	}
 
 	resched_curr_lazy(rq);
 }
-- 
2.43.0
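
The tail of wakeup_preempt_fair() after this patch can be summarised by
the toy decision function below. It is a sketch, not the kernel code: the
toy_* names and the boolean inputs are hypothetical, and it assumes
(from code outside the hunks shown) that picking the wakee as the next
entity already leads to the preempt label.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_outcome {
	bool cancel_protection;	/* cancel_protect_slice(se) */
	bool set_next_buddy;	/* set_next_buddy(&p->se) */
	bool resched;		/* resched_curr_lazy(rq) */
	bool update_protection;	/* update_protect_slice(cfs_rq, se) */
};

static struct toy_outcome toy_wakeup_tail(bool short_slice,
					  bool wakee_is_next,
					  bool wakee_eligible,
					  bool run_to_parity)
{
	struct toy_outcome out = { false, false, false, false };

	if (wakee_is_next || (short_slice && wakee_eligible)) {
		/* "preempt:" path, protection is now always cancelled. */
		out.cancel_protection = true;
		/* Buddy hint only if the short slice wakee is really next. */
		out.set_next_buddy = short_slice && wakee_is_next;
		out.resched = true;
	} else {
		/* No preemption: only refresh current's protection. */
		out.update_protection = run_to_parity;
	}

	/* The buddy hint is only ever set on the preempt path. */
	assert(!out.set_next_buddy || out.resched);
	return out;
}
```

The case added by this patch is short_slice && wakee_eligible with
wakee_is_next false: protection is cancelled and a reschedule is
requested, but no buddy is set.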


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/6 v3] sched/eevdf: Cancel slice protection if short slice task is eligible
  2026-06-24 15:12 ` [PATCH 4/6 v3] sched/eevdf: Cancel slice protection if short slice task is eligible Vincent Guittot
@ 2026-06-25  6:00   ` K Prateek Nayak
  2026-06-25 12:40     ` Vincent Guittot
  0 siblings, 1 reply; 20+ messages in thread
From: K Prateek Nayak @ 2026-06-25  6:00 UTC (permalink / raw)
  To: Vincent Guittot, mingo, peterz, juri.lelli, dietmar.eggemann,
	rostedt, bsegall, mgorman, vschneid, linux-kernel, qyousef

Hello Vincent,

On 6/24/2026 8:42 PM, Vincent Guittot wrote:
> @@ -9896,16 +9891,23 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
>  	if (!nse && cfs_rq->nr_queued)
>  		goto pick;
>  
> +	/*
> +	 * If @p is eligible but not the next task to run then cancel protection
> +	 * to prevent large scheduling latency
> +	 */
> +	if (preempt_action == PREEMPT_WAKEUP_SHORT && entity_eligible(cfs_rq, pse))
> +		goto preempt;

We handle "pse->slice < se->slice" case before "pse->sched_delayed" case
and jump to "pick", but pse can get dequeued as a part of
pick_next_entity() if it was delayed and picked.

I think we can reach here for PREEMPT_WAKEUP_SHORT after pse is
completely dequeued from cfs_rq. If p is a task on root cfs_rq, we could
have blocked the task entirely and ideally it shouldn't be referenced
here.

Since a wakeup of a delayed entity / on a delayed hierarchy will call
wakeup_preempt() anyway, I think we should return early, or jump
directly to update, if we see "pse->sched_delayed".

> +
>  	if (sched_feat(RUN_TO_PARITY))
>  		update_protect_slice(cfs_rq, se);
>  
>  	return;
>  
>  preempt:
> -	if (preempt_action == PREEMPT_WAKEUP_SHORT) {
> -		cancel_protect_slice(se);
> +	cancel_protect_slice(se);
> +
> +	if (preempt_action == PREEMPT_WAKEUP_SHORT && nse == pse)
>  		set_next_buddy(&p->se);
> -	}
>  
>  	resched_curr_lazy(rq);
>  }

-- 
Thanks and Regards,
Prateek


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/6 v3] sched/eevdf: Cancel slice protection if short slice task is eligible
  2026-06-25  6:00   ` K Prateek Nayak
@ 2026-06-25 12:40     ` Vincent Guittot
  0 siblings, 0 replies; 20+ messages in thread
From: Vincent Guittot @ 2026-06-25 12:40 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, vschneid, linux-kernel, qyousef

On Thu, 25 Jun 2026 at 08:00, K Prateek Nayak <kprateek.nayak@amd.com> wrote:
>
> Hello Vincent,
>
> On 6/24/2026 8:42 PM, Vincent Guittot wrote:
> > @@ -9896,16 +9891,23 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
> >       if (!nse && cfs_rq->nr_queued)
> >               goto pick;
> >
> > +     /*
> > +      * If @p is eligible but not the next task to run then cancel protection
> > +      * to prevent large scheduling latency
> > +      */
> > +     if (preempt_action == PREEMPT_WAKEUP_SHORT && entity_eligible(cfs_rq, pse))
> > +             goto preempt;
>
> We handle "pse->slice < se->slice" case before "pse->sched_delayed" case
> and jump to "pick", but pse can get dequeued as a part of
> pick_next_entity() if it was delayed and picked.

yes

>
> I think we can reach here for PREEMPT_WAKEUP_SHORT after pse is
> completely dequeued from cfs_rq. If p is a task on root cfs_rq, we could
> have blocked the task entirely and ideally it shouldn't be referenced
> here.

I tried to find which use case calls wakeup_preempt_fair() for a
sched_delayed task but can't find it

>
> Since a wakeup of a delayed entity / on a delayed hierarchy will call
> wakeup_preempt() anyway, I think we should return early, or jump
> directly to update, if we see "pse->sched_delayed".

Fair enough. But we still need to consider the FORK case

>
> > +
> >       if (sched_feat(RUN_TO_PARITY))
> >               update_protect_slice(cfs_rq, se);
> >
> >       return;
> >
> >  preempt:
> > -     if (preempt_action == PREEMPT_WAKEUP_SHORT) {
> > -             cancel_protect_slice(se);
> > +     cancel_protect_slice(se);
> > +
> > +     if (preempt_action == PREEMPT_WAKEUP_SHORT && nse == pse)
> >               set_next_buddy(&p->se);
> > -     }
> >
> >       resched_curr_lazy(rq);
> >  }
>
> --
> Thanks and Regards,
> Prateek
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 5/6 v3] sched/eevdf: Always update slice protection
  2026-06-24 15:12 [PATCH 0/6 v3] sched/eevdf: Improve scheduling latency of short slice task Vincent Guittot
                   ` (3 preceding siblings ...)
  2026-06-24 15:12 ` [PATCH 4/6 v3] sched/eevdf: Cancel slice protection if short slice task is eligible Vincent Guittot
@ 2026-06-24 15:12 ` Vincent Guittot
  2026-06-24 15:12 ` [PATCH 6/6 v3] sched/eevdf: Speedup short slice task scheduling Vincent Guittot
  5 siblings, 0 replies; 20+ messages in thread
From: Vincent Guittot @ 2026-06-24 15:12 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, vschneid, kprateek.nayak, linux-kernel, qyousef
  Cc: Vincent Guittot

Even if p will not preempt current, it modifies avg_vruntime and
possibly the min slice. Make sure to update the slice protection with the
updated figures. As an example, Batch and Sched Idle tasks can otherwise
get a larger lag than their slice and finally delay the scheduling of a
normal task whose deadline will be later.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
 kernel/sched/fair.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 719aa53851e4..f972987618e7 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9824,17 +9824,18 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
 	if (cse_is_idle && !pse_is_idle)
 		goto preempt;
 
+	cfs_rq = cfs_rq_of(se);
+	update_curr(cfs_rq);
+
 	if (cse_is_idle != pse_is_idle)
-		return;
+		goto update;
 
 	/*
 	 * BATCH and IDLE tasks do not preempt others.
 	 */
 	if (unlikely(!normal_policy(p->policy)))
-		return;
+		goto update;
 
-	cfs_rq = cfs_rq_of(se);
-	update_curr(cfs_rq);
 	/*
 	 * If @p has a shorter slice than current and @p is eligible, override
 	 * current's slice protection in order to allow preemption.
@@ -9851,7 +9852,7 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
 	 * EEVDF to forcibly queue an ineligible task.
 	 */
 	if ((wake_flags & WF_FORK) || pse->sched_delayed)
-		return;
+		goto update;
 
 	/* Prefer picking wakee soon if appropriate. */
 	if (sched_feat(NEXT_BUDDY) &&
@@ -9897,7 +9898,7 @@ static void wakeup_preempt_fair(struct rq *rq, struct task_struct *p, int wake_f
 	 */
 	if (preempt_action == PREEMPT_WAKEUP_SHORT && entity_eligible(cfs_rq, pse))
 		goto preempt;
-
+update:
 	if (sched_feat(RUN_TO_PARITY))
 		update_protect_slice(cfs_rq, se);
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 6/6 v3] sched/eevdf: Speedup short slice task scheduling
  2026-06-24 15:12 [PATCH 0/6 v3] sched/eevdf: Improve scheduling latency of short slice task Vincent Guittot
                   ` (4 preceding siblings ...)
  2026-06-24 15:12 ` [PATCH 5/6 v3] sched/eevdf: Always update slice protection Vincent Guittot
@ 2026-06-24 15:12 ` Vincent Guittot
  2026-06-25  7:37   ` K Prateek Nayak
  2026-06-25  8:33   ` Peter Zijlstra
  5 siblings, 2 replies; 20+ messages in thread
From: Vincent Guittot @ 2026-06-24 15:12 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, vschneid, kprateek.nayak, linux-kernel, qyousef
  Cc: Vincent Guittot

When a task with a shorter slice is enqueued, we protect the running
task, which has a longer slice, only until it becomes ineligible rather
than for a full slice. This speeds up the switch to other tasks until the
task with the shortest slice is scheduled, and helps that task avoid
waiting for too many full slices before running.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
 kernel/sched/fair.c | 52 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 50 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f972987618e7..7c541f27a1ed 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -813,6 +813,48 @@ u64 avg_vruntime(struct cfs_rq *cfs_rq)
 	return cfs_rq->zero_vruntime;
 }
 
+/*
+ * Compute the vruntime until which the entity remains eligible when it runs
+ * or is about to run on the CPU. We use this value to set vprot to the min
+ * value until which other entities would not be picked anyway.
+ *     \Sum (v_i - v0)*w_i
+ * V = ------------------- + v0
+ *          \Sum w_i
+ *
+ * We want V' for (v_se - v0) == 0. Previous entity has already been enqueued
+ * in the rb tree and next is already dequeued so
+ *
+ *      cfs_rq->sum_w_vruntime
+ * V' = ------------------------- + v0
+ *      cfs_rq->sum_weight + w_se
+
+ */
+static u64 eligible_vruntime(struct cfs_rq *cfs_rq, struct sched_entity *se)
+{
+	struct sched_entity *curr = cfs_rq->curr;
+	long weight = cfs_rq->sum_weight;
+	s64 delta = 0;
+
+	if (weight) {
+		s64 runtime = cfs_rq->sum_w_vruntime;
+
+		weight += avg_vruntime_weight(cfs_rq, se->load.weight);
+
+		/* sign flips effective floor / ceiling */
+		if (runtime < 0)
+			runtime -= (weight - 1);
+
+		delta = div64_long(runtime, weight);
+	} else {
+		/*
+		 * When there is but one element, it is the average.
+		 */
+		delta = 0;
+	}
+
+	return cfs_rq->zero_vruntime + delta + 1;
+}
+
 static inline u64 cfs_rq_max_slice(struct cfs_rq *cfs_rq);
 
 /*
@@ -1090,8 +1132,14 @@ static inline void set_protect_slice(struct cfs_rq *cfs_rq, struct sched_entity
 		slice = cfs_rq_min_slice(cfs_rq);
 
 	slice = min(slice, se->slice);
-	if (slice != se->slice)
-		vprot = min_vruntime(vprot, se->vruntime + calc_delta_fair(slice, se));
+
+	/* If there are shorter slices than se's one */
+	if (slice != se->slice) {
+		if (sched_feat(PREEMPT_SHORT))
+			vprot = min_vruntime(vprot, eligible_vruntime(cfs_rq, se));
+		else
+			vprot = min_vruntime(vprot, se->vruntime + calc_delta_fair(slice, se));
+	}
 
 	se->vprot = vprot;
 }
-- 
2.43.0
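
To make the arithmetic in eligible_vruntime() easier to follow, here is a
user-space sketch that mirrors the posted function on a toy runqueue. The
struct and function names are hypothetical, and the sketch assumes that
avg_vruntime_weight() returns the weight unchanged, which may not hold in
the kernel. It only reproduces the computation as posted and does not
settle the questions raised in review.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the cfs_rq fields used by the posted function. */
struct toy_cfs_rq {
	uint64_t zero_vruntime;	/* v0 */
	int64_t sum_w_vruntime;	/* \Sum (v_i - v0) * w_i, queued entities */
	long sum_weight;	/* \Sum w_i, queued entities */
};

/* V' = sum_w_vruntime / (sum_weight + w_se) + v0, rounded down, plus 1. */
static uint64_t toy_eligible_vruntime(const struct toy_cfs_rq *rq, long w_se)
{
	int64_t delta = 0;

	assert(w_se > 0);

	if (rq->sum_weight) {
		long weight = rq->sum_weight + w_se;
		int64_t runtime = rq->sum_w_vruntime;

		/* C division truncates; bias negatives to get a floor. */
		if (runtime < 0)
			runtime -= (weight - 1);

		delta = runtime / weight;
	}

	return rq->zero_vruntime + delta + 1;
}
```

With v0 = 1000, two queued entities of weight 1024 whose keys sum to 200
(sum_w_vruntime = 204800, sum_weight = 2048) and w_se = 1024, the result
is 1000 + 66 + 1 = 1067. Flipping the sign of sum_w_vruntime gives
1000 - 67 + 1 = 934, which shows the floor adjustment for negative values.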


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 6/6 v3] sched/eevdf: Speedup short slice task scheduling
  2026-06-24 15:12 ` [PATCH 6/6 v3] sched/eevdf: Speedup short slice task scheduling Vincent Guittot
@ 2026-06-25  7:37   ` K Prateek Nayak
  2026-06-25  8:37     ` Peter Zijlstra
  2026-06-25 12:51     ` Vincent Guittot
  2026-06-25  8:33   ` Peter Zijlstra
  1 sibling, 2 replies; 20+ messages in thread
From: K Prateek Nayak @ 2026-06-25  7:37 UTC (permalink / raw)
  To: Vincent Guittot, mingo, peterz, juri.lelli, dietmar.eggemann,
	rostedt, bsegall, mgorman, vschneid, linux-kernel, qyousef

Hello Vincent,

On 6/24/2026 8:42 PM, Vincent Guittot wrote:
> +/*
> + * Compute the vruntime until which the entity remains eligible when it runs
> + * or is about to run on the CPU. We use this value to set vprot to the min
> + * value until which other entities would not be picked anyway.
> + *     \Sum (v_i - v0)*w_i
> + * V = ------------------- + v0
> + *          \Sum w_i
> + *
> + * We want V' for (v_se - v0) == 0. Previous entity has already been enqueued
> + * in the rb tree and next is already dequeued so
> + *
> + *      cfs_rq->sum_w_vruntime
> + * V' = ------------------------- + v0
> + *      cfs_rq->sum_weight + w_se
> +

nit.

^ is that a stray line or a Missing * at the beginning of the comment
line?

> + */
> +static u64 eligible_vruntime(struct cfs_rq *cfs_rq, struct sched_entity *se)
> +{
> +	struct sched_entity *curr = cfs_rq->curr;

curr seems to be unused here and is NULL anyways when
set_protect_slice() is called ;-)

> +	long weight = cfs_rq->sum_weight;
> +	s64 delta = 0;
> +
> +	if (weight) {
> +		s64 runtime = cfs_rq->sum_w_vruntime;
> +
> +		weight += avg_vruntime_weight(cfs_rq, se->load.weight);
> +
> +		/* sign flips effective floor / ceiling */
> +		if (runtime < 0)
> +			runtime -= (weight - 1);
> +
> +		delta = div64_long(runtime, weight);
> +	} else {
> +		/*
> +		 * When there is but one element, it is the average.
> +		 */
> +		delta = 0;

Even with a single entity, the se->vruntime can still diverge from
cfs_rq->zero_vruntime

Last avg_vruntime() call for cfs_rq was at update_entity_lag() during
last dequeue while se->on_rq was still set for the dequeuing entity.

Should this be entity_key(cfs_rq, se) instead?

> +	}
> +
> +	return cfs_rq->zero_vruntime + delta + 1;
> +}
> +
>  static inline u64 cfs_rq_max_slice(struct cfs_rq *cfs_rq);
>  
>  /*
-- 
Thanks and Regards,
Prateek


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 6/6 v3] sched/eevdf: Speedup short slice task scheduling
  2026-06-25  7:37   ` K Prateek Nayak
@ 2026-06-25  8:37     ` Peter Zijlstra
  2026-06-25 10:09       ` Peter Zijlstra
  2026-06-25 14:55       ` Vincent Guittot
  2026-06-25 12:51     ` Vincent Guittot
  1 sibling, 2 replies; 20+ messages in thread
From: Peter Zijlstra @ 2026-06-25  8:37 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: Vincent Guittot, mingo, juri.lelli, dietmar.eggemann, rostedt,
	bsegall, mgorman, vschneid, linux-kernel, qyousef

On Thu, Jun 25, 2026 at 01:07:43PM +0530, K Prateek Nayak wrote:

> > +static u64 eligible_vruntime(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > +{
> > +	struct sched_entity *curr = cfs_rq->curr;
> 
> curr seems to be unused here and is NULL anyways when
> set_protect_slice() is called ;-)

Ah, but it is not with the flat patches on, which is why I was a little
confused ;-)

That said; I now see se == curr. So let me go have another look at all
that.


* Re: [PATCH 6/6 v3] sched/eevdf: Speedup short slice task scheduling
  2026-06-25  8:37     ` Peter Zijlstra
@ 2026-06-25 10:09       ` Peter Zijlstra
  2026-06-25 12:57         ` Vincent Guittot
  2026-06-25 14:55       ` Vincent Guittot
  1 sibling, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2026-06-25 10:09 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: Vincent Guittot, mingo, juri.lelli, dietmar.eggemann, rostedt,
	bsegall, mgorman, vschneid, linux-kernel, qyousef

On Thu, Jun 25, 2026 at 10:37:20AM +0200, Peter Zijlstra wrote:
> On Thu, Jun 25, 2026 at 01:07:43PM +0530, K Prateek Nayak wrote:
> 
> > > +static u64 eligible_vruntime(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > > +{
> > > +	struct sched_entity *curr = cfs_rq->curr;
> > 
> > curr seems to be unused here and is NULL anyways when
> > set_protect_slice() is called ;-)
> 
> Ah, but it is not with the flat patches on, which is why I was a little
> confused ;-)
> 
> That said; I now see se == curr. So let me go have another look at all
> that.

I might be slow -- it is definitely way too warm already -- but I'm not
seeing how you don't want avg_vruntime() here.




* Re: [PATCH 6/6 v3] sched/eevdf: Speedup short slice task scheduling
  2026-06-25 10:09       ` Peter Zijlstra
@ 2026-06-25 12:57         ` Vincent Guittot
  2026-06-25 12:59           ` Vincent Guittot
  0 siblings, 1 reply; 20+ messages in thread
From: Vincent Guittot @ 2026-06-25 12:57 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: K Prateek Nayak, mingo, juri.lelli, dietmar.eggemann, rostedt,
	bsegall, mgorman, vschneid, linux-kernel, qyousef

On Thu, 25 Jun 2026 at 12:10, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Jun 25, 2026 at 10:37:20AM +0200, Peter Zijlstra wrote:
> > On Thu, Jun 25, 2026 at 01:07:43PM +0530, K Prateek Nayak wrote:
> >
> > > > +static u64 eligible_vruntime(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > > > +{
> > > > + struct sched_entity *curr = cfs_rq->curr;
> > >
> > > curr seems to be unused here and is NULL anyways when
> > > set_protect_slice() is called ;-)
> >
> > Ah, but it is not with the flat patches on, which is why I was a little
> > confused ;-)
> >
> > That said; I now see se == curr. So let me go have another look at all
> > that.
>
> I might be slow -- it is definitely way too warm already -- but I'm not
> seeing how you don't want avg_vruntime() here.

It is related to avg_vruntime(), except that I don't want the current
avg_vruntime but the avg_vruntime at the point where entity_key(se)
becomes zero and se becomes ineligible.

If I use the current avg_vruntime(), then by the time se has run enough
to reach vruntime == (now old) avg_vruntime, the new avg_vruntime will
have moved forward and se will still be eligible.
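The effect described above can be checked with a small userspace sketch. This is not kernel code: `struct ent`, `weighted_avg()` and `avg_after_catch_up()` are hypothetical names made up for illustration, keys stand for (v_i - v0), and the sample uses only non-negative keys so plain C division is enough.

```c
#include <stddef.h>

/* Hypothetical stand-in for a runnable entity: key = v_i - v0, plus a weight. */
struct ent {
	long long key;
	long long weight;
};

/* Weighted average of the keys, i.e. V - v0 in the notation of the patch comment. */
static long long weighted_avg(const struct ent *e, size_t n)
{
	long long sum = 0, weight = 0;

	for (size_t i = 0; i < n; i++) {
		sum += e[i].key * e[i].weight;
		weight += e[i].weight;
	}
	return weight ? sum / weight : 0;
}

/*
 * Let e[0] (playing the role of se) run until it reaches the average
 * computed before it started, then return the average recomputed afterwards.
 */
static long long avg_after_catch_up(struct ent *e, size_t n)
{
	e[0].key = weighted_avg(e, n);
	return weighted_avg(e, n);
}
```

With three equal-weight entities at keys 0, 3000 and 6000, the average starts at 3000. Once the first entity has advanced to 3000, the recomputed average is 4000, so that entity is still at or below the average, which is the situation the reply describes.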


>
>


* Re: [PATCH 6/6 v3] sched/eevdf: Speedup short slice task scheduling
  2026-06-25 12:57         ` Vincent Guittot
@ 2026-06-25 12:59           ` Vincent Guittot
  0 siblings, 0 replies; 20+ messages in thread
From: Vincent Guittot @ 2026-06-25 12:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: K Prateek Nayak, mingo, juri.lelli, dietmar.eggemann, rostedt,
	bsegall, mgorman, vschneid, linux-kernel, qyousef

On Thu, 25 Jun 2026 at 14:57, Vincent Guittot
<vincent.guittot@linaro.org> wrote:
>
> On Thu, 25 Jun 2026 at 12:10, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, Jun 25, 2026 at 10:37:20AM +0200, Peter Zijlstra wrote:
> > > On Thu, Jun 25, 2026 at 01:07:43PM +0530, K Prateek Nayak wrote:
> > >
> > > > > +static u64 eligible_vruntime(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > > > > +{
> > > > > + struct sched_entity *curr = cfs_rq->curr;
> > > >
> > > > curr seems to be unused here and is NULL anyways when
> > > > set_protect_slice() is called ;-)
> > >
> > > Ah, but it is not with the flat patches on, which is why I was a little
> > > confused ;-)
> > >
> > > That said; I now see se == curr. So let me go have another look at all
> > > that.
> >
> > I might be slow -- it is definitely way too warm already -- but I'm not
> > seeing how you don't want avg_vruntime() here.
>
> It is related to avg_vruntime(), except that I don't want the current
> avg_vruntime but the avg_vruntime at the point where entity_key(se)
> becomes zero and se becomes ineligible.
>
> If I use the current avg_vruntime(), then by the time se has run enough
> to reach vruntime == (now old) avg_vruntime, the new avg_vruntime will
> have moved forward and se will still be eligible.

And I should name it ineligible_vruntime because of the +1

>
>
> >
> >


* Re: [PATCH 6/6 v3] sched/eevdf: Speedup short slice task scheduling
  2026-06-25  8:37     ` Peter Zijlstra
  2026-06-25 10:09       ` Peter Zijlstra
@ 2026-06-25 14:55       ` Vincent Guittot
  1 sibling, 0 replies; 20+ messages in thread
From: Vincent Guittot @ 2026-06-25 14:55 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: K Prateek Nayak, mingo, juri.lelli, dietmar.eggemann, rostedt,
	bsegall, mgorman, vschneid, linux-kernel, qyousef

On Thu, 25 Jun 2026 at 10:37, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Jun 25, 2026 at 01:07:43PM +0530, K Prateek Nayak wrote:
>
> > > +static u64 eligible_vruntime(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > > +{
> > > +   struct sched_entity *curr = cfs_rq->curr;
> >
> > curr seems to be unused here and is NULL anyways when
> > set_protect_slice() is called ;-)
>
> Ah, but it is not with the flat patches on, which is why I was a little
> confused ;-)

Yeah, flat hierarchy has another level of complexity in tracking
scheduling latency and that will be the next step

>
> That said; I now see se == curr. So let me go have another look at all
> that.


* Re: [PATCH 6/6 v3] sched/eevdf: Speedup short slice task scheduling
  2026-06-25  7:37   ` K Prateek Nayak
  2026-06-25  8:37     ` Peter Zijlstra
@ 2026-06-25 12:51     ` Vincent Guittot
  1 sibling, 0 replies; 20+ messages in thread
From: Vincent Guittot @ 2026-06-25 12:51 UTC (permalink / raw)
  To: K Prateek Nayak
  Cc: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, vschneid, linux-kernel, qyousef

On Thu, 25 Jun 2026 at 09:37, K Prateek Nayak <kprateek.nayak@amd.com> wrote:
>
> Hello Vincent,
>
> On 6/24/2026 8:42 PM, Vincent Guittot wrote:
> > +/*
> > + * Compute the vruntime until which the entity remains eligible when it runs
> > + * or is about to run on the CPU. We use this value to set vprot to the min
> > + * value until which other entities would not be picked anyway.
> > + *     \Sum (v_i - v0)*w_i
> > + * V = ------------------- + v0
> > + *          \Sum w_i
> > + *
> > + * We want V' for (v_se - v0) == 0. Previous entity has already been enqueued
> > + * in the rb tree and next is already dequeued so
> > + *
> > + *      cfs_rq->sum_w_vruntime
> > + * V' = ------------------------- + v0
> > + *      cfs_rq->sum_weight + w_se
> > +
>
> nit.
>
> ^ is that a stray line or a Missing * at the beginning of the comment
> line?

yes

>
> > + */
> > +static u64 eligible_vruntime(struct cfs_rq *cfs_rq, struct sched_entity *se)
> > +{
> > +     struct sched_entity *curr = cfs_rq->curr;
>
> curr seems to be unused here and is NULL anyways when
> set_protect_slice() is called ;-)

Yeah, I remember seeing the warning and forgot to remove it

>
> > +     long weight = cfs_rq->sum_weight;
> > +     s64 delta = 0;
> > +
> > +     if (weight) {
> > +             s64 runtime = cfs_rq->sum_w_vruntime;
> > +
> > +             weight += avg_vruntime_weight(cfs_rq, se->load.weight);
> > +
> > +             /* sign flips effective floor / ceiling */
> > +             if (runtime < 0)
> > +                     runtime -= (weight - 1);
> > +
> > +             delta = div64_long(runtime, weight);
> > +     } else {
> > +             /*
> > +              * When there is but one element, it is the average.
> > +              */
> > +             delta = 0;
>
> Even with a single entity, the se->vruntime can still diverge from
> cfs_rq->zero_vruntime
>
> Last avg_vruntime() call for cfs_rq was at update_entity_lag() during
> last dequeue while se->on_rq was still set for the dequeuing entity.
>
> Should this be entity_key(cfs_rq, se) instead?

Probably, although this case should never happen.

>
> > +     }
> > +
> > +     return cfs_rq->zero_vruntime + delta + 1;
> > +}
> > +
> >  static inline u64 cfs_rq_max_slice(struct cfs_rq *cfs_rq);
> >
> >  /*
> --
> Thanks and Regards,
> Prateek
>


* Re: [PATCH 6/6 v3] sched/eevdf: Speedup short slice task scheduling
  2026-06-24 15:12 ` [PATCH 6/6 v3] sched/eevdf: Speedup short slice task scheduling Vincent Guittot
  2026-06-25  7:37   ` K Prateek Nayak
@ 2026-06-25  8:33   ` Peter Zijlstra
  1 sibling, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2026-06-25  8:33 UTC (permalink / raw)
  To: Vincent Guittot
  Cc: mingo, juri.lelli, dietmar.eggemann, rostedt, bsegall, mgorman,
	vschneid, kprateek.nayak, linux-kernel, qyousef

On Wed, Jun 24, 2026 at 05:12:29PM +0200, Vincent Guittot wrote:
> When a task with a shorter slice is enqueued, we protect the running
> task, which has a longer slice, only until it becomes ineligible instead
> of for a full slice. This speeds up the switch to other tasks until the
> task with the shortest slice is scheduled, and helps that task avoid
> waiting for too many full slices before running.
> 
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
> ---
>  kernel/sched/fair.c | 52 +++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 50 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index f972987618e7..7c541f27a1ed 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -813,6 +813,48 @@ u64 avg_vruntime(struct cfs_rq *cfs_rq)
>  	return cfs_rq->zero_vruntime;
>  }
>  
> +/*
> + * Compute the vruntime until which the entity remains eligible when it runs
> + * or is about to run on the CPU. We use this value to set vprot to the min
> + * value until which other entities would not be picked anyway.
> + *     \Sum (v_i - v0)*w_i
> + * V = ------------------- + v0
> + *          \Sum w_i
> + *
> + * We want V' for (v_se - v0) == 0. Previous entity has already been enqueued
> + * in the rb tree and next is already dequeued so
> + *
> + *      cfs_rq->sum_w_vruntime
> + * V' = ------------------------- + v0
> + *      cfs_rq->sum_weight + w_se
> +
> + */
> +static u64 eligible_vruntime(struct cfs_rq *cfs_rq, struct sched_entity *se)
> +{
> +	struct sched_entity *curr = cfs_rq->curr;
> +	long weight = cfs_rq->sum_weight;
> +	s64 delta = 0;

'curr' goes unused in this function, did you want:

	if (curr && !curr->on_rq)
		curr = NULL;

> +
> +	if (weight) {
> +		s64 runtime = cfs_rq->sum_w_vruntime;


		if (curr) {
			unsigned long w = avg_vruntime_weight(cfs_rq, curr->load.weight);

			runtime += entity_key(cfs_rq, curr) * w;
			weight += w;
		}

?

> +
> +		weight += avg_vruntime_weight(cfs_rq, se->load.weight);
> +
> +		/* sign flips effective floor / ceiling */
> +		if (runtime < 0)
> +			runtime -= (weight - 1);
> +
> +		delta = div64_long(runtime, weight);
> +	} else {
> +		/*
> +		 * When there is but one element, it is the average.
> +		 */
> +		delta = 0;
> +	}
> +
> +	return cfs_rq->zero_vruntime + delta + 1;
> +}
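For readers following the arithmetic in the quoted function, here is a minimal userspace sketch of the computation as posted. It is an illustration only: `div_floor()` and `eligible_vruntime_sketch()` are hypothetical names, the parameters stand in for the `cfs_rq` fields, and `se_weight` is assumed to be the value already returned by `avg_vruntime_weight()`. It does not include the `curr` handling suggested in this reply.

```c
#include <stdint.h>

/*
 * Floor division for a positive divisor. C division truncates toward
 * zero, so a negative numerator is biased by (weight - 1) first to make
 * the result round down instead of up.
 */
static int64_t div_floor(int64_t runtime, int64_t weight)
{
	if (runtime < 0)
		runtime -= (weight - 1);
	return runtime / weight;
}

/*
 * Userspace rendering of the quoted computation:
 *   V' = sum_w_vruntime / (sum_weight + w_se) + v0, plus 1.
 * All parameters are stand-ins for the kernel fields of the same name.
 */
static uint64_t eligible_vruntime_sketch(uint64_t zero_vruntime,
					 int64_t sum_w_vruntime,
					 int64_t sum_weight,
					 int64_t se_weight)
{
	int64_t delta = 0;

	/* An empty tree contributes no offset from zero_vruntime. */
	if (sum_weight)
		delta = div_floor(sum_w_vruntime, sum_weight + se_weight);

	/* The signed delta wraps correctly when added to the unsigned base. */
	return zero_vruntime + delta + 1;
}
```

For example, `div_floor(-7, 2)` gives -4, whereas plain `-7 / 2` gives -3 in C.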




Thread overview: 20+ messages
2026-06-24 15:12 [PATCH 0/6 v3] sched/eevdf: Improve scheduling latency of short slice task Vincent Guittot
2026-06-24 15:12 ` [PATCH 1/6 v3] sched/fair: Set next buddy for preempt short Vincent Guittot
2026-06-25  6:24   ` K Prateek Nayak
2026-06-25 12:40     ` Vincent Guittot
2026-06-25 12:43       ` Peter Zijlstra
2026-06-24 15:12 ` [PATCH 2/6 v3] sched/eevdf: Take into account current's lag when updating slice protection Vincent Guittot
2026-06-24 15:12 ` [PATCH 3/6 v3] sched/eevdf: Update slice protection even when resched is already set Vincent Guittot
2026-06-24 15:12 ` [PATCH 4/6 v3] sched/eevdf: Cancel slice protection if short slice task is eligible Vincent Guittot
2026-06-25  6:00   ` K Prateek Nayak
2026-06-25 12:40     ` Vincent Guittot
2026-06-24 15:12 ` [PATCH 5/6 v3] sched/eevdf: Always update slice protection Vincent Guittot
2026-06-24 15:12 ` [PATCH 6/6 v3] sched/eevdf: Speedup short slice task scheduling Vincent Guittot
2026-06-25  7:37   ` K Prateek Nayak
2026-06-25  8:37     ` Peter Zijlstra
2026-06-25 10:09       ` Peter Zijlstra
2026-06-25 12:57         ` Vincent Guittot
2026-06-25 12:59           ` Vincent Guittot
2026-06-25 14:55       ` Vincent Guittot
2026-06-25 12:51     ` Vincent Guittot
2026-06-25  8:33   ` Peter Zijlstra
