public inbox for linux-kernel@vger.kernel.org
* [PATCH] sched/deadline: Fix stale dl_defer_running in dl_server else-branch
@ 2026-04-02 13:30 soolaugust
  2026-04-03  0:05 ` John Stultz
  0 siblings, 1 reply; 14+ messages in thread
From: soolaugust @ 2026-04-02 13:30 UTC (permalink / raw)
  To: jstultz, bristot; +Cc: peterz, mingo, linux-kernel, arighi, Zhidao Su

From: Zhidao Su <suzhidao@xiaomi.com>

Commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server") cleared
dl_defer_running in the if-branch of update_dl_entity() (deadline
expired/overflow). This ensures
replenish_dl_new_period() always arms the zero-laxity timer. However,
with PROXY_WAKING, re-activation hits the else-branch (same-period,
deadline not expired), where dl_defer_running from a prior starvation
episode can be stale.

During PROXY_WAKING CPU return-migration, proxy_force_return() migrates
the task to a new CPU via deactivate_task()+attach_one_task(). The
enqueue path on the new CPU triggers enqueue_task_fair() which calls
dl_server_start() for the fair_server. Crucially, this re-activation
does NOT call dl_server_stop() first, so dl_defer_running retains its
prior value. If a prior starvation episode left dl_defer_running=1,
and the server is re-activated within the same period:

  [4] D->A: dl_server_stop() clears flags but may be skipped when
            dl_server_active=0 (server was already stopped before
            return-migration triggered dl_server_start())
  [1] A->B: dl_server_start() -> enqueue_dl_entity(WAKEUP)
             -> update_dl_entity() enters else-branch
             -> 'if (!dl_defer_running)' guard fires, skips
                dl_defer_armed=1 / dl_throttled=1
             -> server enqueued into [D] state directly
             -> update_curr_dl_se() consumes runtime
             -> start_dl_timer() with dl_defer_armed=0 (slow path)
             -> boot time increases ~72%

Fix: in the else-branch, unconditionally clear dl_defer_running and always
set dl_defer_armed=1 / dl_throttled=1. This ensures every same-period
re-activation properly re-arms the zero-laxity timer, regardless of whether
a prior starvation episode had set dl_defer_running.

The if-branch (deadline expired) is left untouched:
replenish_dl_new_period() contains its own guard ('if (!dl_defer_running)')
that arms the zero-laxity timer only when dl_defer_running=0. With
PROXY_WAKING, dl_defer_running=1 in the deadline-expired path means a
genuine starvation episode is ongoing, so the server can skip the
zero-laxity wait and enter [D] directly. Clearing dl_defer_running here
(as Peter's fix did) forces every PROXY_WAKING deadline-expired
re-activation through the ~950ms zero-laxity wait.

Measured boot time to first ksched_football event (4 CPUs, 4G):
  This fix: ~15-20s
  Without fix (stale dl_defer_running): ~43-62s (+72-200%)

Note: Andrea Righi's v2 patch addresses the same symptom by clearing
dl_defer_running in dl_server_stop(). However, dl_server_stop() is not
called during PROXY_WAKING return-migration (proxy_force_return() calls
dl_server_start() directly without dl_server_stop()). This fix targets
the correct location: the else-branch of update_dl_entity().

Signed-off-by: Zhidao Su <suzhidao@xiaomi.com>
---
 kernel/sched/deadline.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 01754d699f0..b2bcd34f3ea 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1034,22 +1034,22 @@ static void update_dl_entity(struct sched_dl_entity *dl_se)
 			return;
 		}
 
-		/*
-		 * When [4] D->A is followed by [1] A->B, dl_defer_running
-		 * needs to be cleared, otherwise it will fail to properly
-		 * start the zero-laxity timer.
-		 */
-		dl_se->dl_defer_running = 0;
 		replenish_dl_new_period(dl_se, rq);
 	} else if (dl_server(dl_se) && dl_se->dl_defer) {
 		/*
-		 * The server can still use its previous deadline, so check if
-		 * it left the dl_defer_running state.
+		 * The server can still use its previous deadline. Clear
+		 * dl_defer_running unconditionally: a stale dl_defer_running=1
+		 * from a prior starvation episode (set in dl_server_timer() when
+		 * the zero-laxity timer fires) must not carry over to the next
+		 * activation. PROXY_WAKING return-migration (proxy_force_return)
+		 * re-activates the server via attach_one_task()->enqueue_task_fair()
+		 * without calling dl_server_stop() first, so the flag is not
+		 * cleared in the [4] D->A path for that case.
+		 * Always re-arm the zero-laxity timer on each re-activation.
 		 */
-		if (!dl_se->dl_defer_running) {
-			dl_se->dl_defer_armed = 1;
-			dl_se->dl_throttled = 1;
-		}
+		dl_se->dl_defer_running = 0;
+		dl_se->dl_defer_armed = 1;
+		dl_se->dl_throttled = 1;
 	}
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/deadline: Fix stale dl_defer_running in dl_server else-branch
  2026-04-02 13:30 [PATCH] sched/deadline: Fix stale dl_defer_running in dl_server else-branch soolaugust
@ 2026-04-03  0:05 ` John Stultz
  2026-04-03  1:30   ` John Stultz
  0 siblings, 1 reply; 14+ messages in thread
From: John Stultz @ 2026-04-03  0:05 UTC (permalink / raw)
  To: soolaugust; +Cc: bristot, peterz, mingo, linux-kernel, arighi, Zhidao Su

On Thu, Apr 2, 2026 at 6:30 AM <soolaugust@gmail.com> wrote:
>
> From: Zhidao Su <suzhidao@xiaomi.com>
>
> Peter's fix (115135422562) cleared dl_defer_running in the if-branch of
> update_dl_entity() (deadline expired/overflow). This ensures
> replenish_dl_new_period() always arms the zero-laxity timer. However,
> with PROXY_WAKING, re-activation hits the else-branch (same-period,
> deadline not expired), where dl_defer_running from a prior starvation
> episode can be stale.
>
> During PROXY_WAKING CPU return-migration, proxy_force_return() migrates
> the task to a new CPU via deactivate_task()+attach_one_task(). The
> enqueue path on the new CPU triggers enqueue_task_fair() which calls
> dl_server_start() for the fair_server. Crucially, this re-activation
> does NOT call dl_server_stop() first, so dl_defer_running retains its
> prior value. If a prior starvation episode left dl_defer_running=1,
> and the server is re-activated within the same period:
>
>   [4] D->A: dl_server_stop() clears flags but may be skipped when
>             dl_server_active=0 (server was already stopped before
>             return-migration triggered dl_server_start())
>   [1] A->B: dl_server_start() -> enqueue_dl_entity(WAKEUP)
>              -> update_dl_entity() enters else-branch
>              -> 'if (!dl_defer_running)' guard fires, skips
>                 dl_defer_armed=1 / dl_throttled=1
>              -> server enqueued into [D] state directly
>              -> update_curr_dl_se() consumes runtime
>              -> start_dl_timer() with dl_defer_armed=0 (slow path)
>              -> boot time increases ~72%
>
> Fix: in the else-branch, unconditionally clear dl_defer_running and always
> set dl_defer_armed=1 / dl_throttled=1. This ensures every same-period
> re-activation properly re-arms the zero-laxity timer, regardless of whether
> a prior starvation episode had set dl_defer_running.
>
> The if-branch (deadline expired) is left untouched:
> replenish_dl_new_period() contains its own guard ('if (!dl_defer_running)')
> that arms the zero-laxity timer only when dl_defer_running=0. With
> PROXY_WAKING, dl_defer_running=1 in the deadline-expired path means a
> genuine starvation episode is ongoing, so the server can skip the
> zero-laxity wait and enter [D] directly. Clearing dl_defer_running here
> (as Peter's fix did) forces every PROXY_WAKING deadline-expired
> re-activation through the ~950ms zero-laxity wait.
>
> Measured boot time to first ksched_football event (4 CPUs, 4G):
>   This fix: ~15-20s
>   Without fix (stale dl_defer_running): ~43-62s (+72-200%)
>
> Note: Andrea Righi's v2 patch addresses the same symptom by clearing
> dl_defer_running in dl_server_stop(). However, dl_server_stop() is not
> called during PROXY_WAKING return-migration (proxy_force_return() calls
> dl_server_start() directly without dl_server_stop()). This fix targets
> the correct location: the else-branch of update_dl_entity().
>
> Signed-off-by: Zhidao Su <suzhidao@xiaomi.com>

Oh, this is perfect! I've noticed the performance regression
previously and narrowed it down to commit 115135422562
("sched/deadline: Fix 'stuck' dl_server"), but I hadn't quite gotten
my head around the issue.  In testing, your patch seems to resolve the
regression as well as the revert I was doing previously.

I've included your patch in the series I'm hoping to send out soon here.

Thanks so much!
-john


* Re: [PATCH] sched/deadline: Fix stale dl_defer_running in dl_server else-branch
  2026-04-03  0:05 ` John Stultz
@ 2026-04-03  1:30   ` John Stultz
  2026-04-03  8:12     ` [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch soolaugust
  0 siblings, 1 reply; 14+ messages in thread
From: John Stultz @ 2026-04-03  1:30 UTC (permalink / raw)
  To: soolaugust; +Cc: peterz, mingo, linux-kernel, arighi, Zhidao Su

On Thu, Apr 2, 2026 at 5:05 PM John Stultz <jstultz@google.com> wrote:
> On Thu, Apr 2, 2026 at 6:30 AM <soolaugust@gmail.com> wrote:
> >
> > From: Zhidao Su <suzhidao@xiaomi.com>
> >
> > Peter's fix (115135422562) cleared dl_defer_running in the if-branch of
> > update_dl_entity() (deadline expired/overflow). This ensures
> > replenish_dl_new_period() always arms the zero-laxity timer. However,
> > with PROXY_WAKING, re-activation hits the else-branch (same-period,
> > deadline not expired), where dl_defer_running from a prior starvation
> > episode can be stale.
> >
> > During PROXY_WAKING CPU return-migration, proxy_force_return() migrates
> > the task to a new CPU via deactivate_task()+attach_one_task(). The
> > enqueue path on the new CPU triggers enqueue_task_fair() which calls
> > dl_server_start() for the fair_server. Crucially, this re-activation
> > does NOT call dl_server_stop() first, so dl_defer_running retains its
> > prior value. If a prior starvation episode left dl_defer_running=1,
> > and the server is re-activated within the same period:
> >
> >   [4] D->A: dl_server_stop() clears flags but may be skipped when
> >             dl_server_active=0 (server was already stopped before
> >             return-migration triggered dl_server_start())
> >   [1] A->B: dl_server_start() -> enqueue_dl_entity(WAKEUP)
> >              -> update_dl_entity() enters else-branch
> >              -> 'if (!dl_defer_running)' guard fires, skips
> >                 dl_defer_armed=1 / dl_throttled=1
> >              -> server enqueued into [D] state directly
> >              -> update_curr_dl_se() consumes runtime
> >              -> start_dl_timer() with dl_defer_armed=0 (slow path)
> >              -> boot time increases ~72%
> >
> > Fix: in the else-branch, unconditionally clear dl_defer_running and always
> > set dl_defer_armed=1 / dl_throttled=1. This ensures every same-period
> > re-activation properly re-arms the zero-laxity timer, regardless of whether
> > a prior starvation episode had set dl_defer_running.
> >
> > The if-branch (deadline expired) is left untouched:
> > replenish_dl_new_period() contains its own guard ('if (!dl_defer_running)')
> > that arms the zero-laxity timer only when dl_defer_running=0. With
> > PROXY_WAKING, dl_defer_running=1 in the deadline-expired path means a
> > genuine starvation episode is ongoing, so the server can skip the
> > zero-laxity wait and enter [D] directly. Clearing dl_defer_running here
> > (as Peter's fix did) forces every PROXY_WAKING deadline-expired
> > re-activation through the ~950ms zero-laxity wait.
> >
> > Measured boot time to first ksched_football event (4 CPUs, 4G):
> >   This fix: ~15-20s
> >   Without fix (stale dl_defer_running): ~43-62s (+72-200%)
> >
> > Note: Andrea Righi's v2 patch addresses the same symptom by clearing
> > dl_defer_running in dl_server_stop(). However, dl_server_stop() is not
> > called during PROXY_WAKING return-migration (proxy_force_return() calls
> > dl_server_start() directly without dl_server_stop()). This fix targets
> > the correct location: the else-branch of update_dl_entity().
> >
> > Signed-off-by: Zhidao Su <suzhidao@xiaomi.com>
>
> Oh, this is perfect! I've noticed the performance regression
> previously and narrowed it down to commit 115135422562
> ("sched/deadline: Fix 'stuck' dl_server"), but I hadn't quite gotten
> my head around the issue.  In testing, your patch seems to resolve the
> regression as well as the revert I was doing previously.

Oh drat, unfortunately I was testing without the ksched_football test
applied, and this change isn't resolving the issue (the ksched_football
test seems to stop making progress on boot, seemingly hanging the
system).

So it doesn't seem this is sufficient. I'll continue working to
understand the issue, and will use your hint about maybe calling
dl_server_stop() in the return-migration path.

thanks
-john


* [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch
  2026-04-03  1:30   ` John Stultz
@ 2026-04-03  8:12     ` soolaugust
  2026-04-03 13:42       ` Peter Zijlstra
  0 siblings, 1 reply; 14+ messages in thread
From: soolaugust @ 2026-04-03  8:12 UTC (permalink / raw)
  To: jstultz, juri.lelli; +Cc: peterz, mingo, linux-kernel, zhidao su

From: zhidao su <suzhidao@xiaomi.com>

commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server") added a
dl_defer_running = 0 reset in the if-branch of update_dl_entity() to
handle the case where [4] D->A is followed by [1] A->B (lapsed
deadline). The intent was to ensure the server re-enters the zero-laxity
wait when restarted after the deadline has passed.

With Proxy Execution (PE), RT tasks proxied through the scheduler appear
to trigger frequent dl_server_start() calls with expired deadlines. When
this happens with dl_defer_running=1 (from a prior starvation episode),
Peter's fix forces the fair_server back through the ~950ms zero-laxity
wait each time.

In our testing (virtme-ng, 4 CPUs, 4G RAM, ksched_football):
  With this fix:    ~1s for all players to check in
  Without this fix: ~28s for all players to check in

The issue appears to be that the clearing in update_dl_entity()'s
if-branch is too aggressive for the PE use case.
replenish_dl_new_period() already handles this via its internal guard:

  if (dl_se->dl_defer && !dl_se->dl_defer_running) {
      dl_se->dl_throttled = 1;
      dl_se->dl_defer_armed = 1;
  }

When dl_defer_running=1 (starvation previously confirmed by the
zero-laxity timer), replenish_dl_new_period() skips arming the
zero-laxity timer, allowing the server to run directly. This seems
correct: once starvation has been confirmed, subsequent start/stop
cycles triggered by PE should not re-introduce the deferral delay.

Note: this is the same change as the HACK revert in John's PE series
(679ede58445 "HACK: Revert 'sched/deadline: Fix stuck dl_server'"),
but with the rationale documented.

The state machine comment is updated to reflect the actual behavior of
replenish_dl_new_period() when dl_defer_running=1.

Signed-off-by: zhidao su <suzhidao@xiaomi.com>
---
 kernel/sched/deadline.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 01754d699f0..30b03021fce 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1034,12 +1034,6 @@ static void update_dl_entity(struct sched_dl_entity *dl_se)
 			return;
 		}
 
-		/*
-		 * When [4] D->A is followed by [1] A->B, dl_defer_running
-		 * needs to be cleared, otherwise it will fail to properly
-		 * start the zero-laxity timer.
-		 */
-		dl_se->dl_defer_running = 0;
 		replenish_dl_new_period(dl_se, rq);
 	} else if (dl_server(dl_se) && dl_se->dl_defer) {
 		/*
@@ -1662,11 +1656,11 @@ void dl_server_update(struct sched_dl_entity *dl_se, s64 delta_exec)
  *   enqueue_dl_entity()
  *     update_dl_entity(WAKEUP)
  *       if (dl_time_before() || dl_entity_overflow())
- *         dl_defer_running = 0;
  *         replenish_dl_new_period();
  *           // fwd period
- *           dl_throttled = 1;
- *           dl_defer_armed = 1;
+ *           if (!dl_defer_running)
+ *             dl_throttled = 1;
+ *             dl_defer_armed = 1;
  *       if (!dl_defer_running)
  *         dl_defer_armed = 1;
  *         dl_throttled = 1;
-- 
2.43.0



* Re: [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch
  2026-04-03  8:12     ` [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch soolaugust
@ 2026-04-03 13:42       ` Peter Zijlstra
  2026-04-03 13:58         ` Andrea Righi
  2026-04-03 19:31         ` John Stultz
  0 siblings, 2 replies; 14+ messages in thread
From: Peter Zijlstra @ 2026-04-03 13:42 UTC (permalink / raw)
  To: soolaugust
  Cc: jstultz, juri.lelli, mingo, linux-kernel, zhidao su, Andrea Righi

On Fri, Apr 03, 2026 at 04:12:15PM +0800, soolaugust@gmail.com wrote:
> From: zhidao su <suzhidao@xiaomi.com>
> 
> commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server") added a
> dl_defer_running = 0 reset in the if-branch of update_dl_entity() to
> handle the case where [4] D->A is followed by [1] A->B (lapsed
> deadline). The intent was to ensure the server re-enters the zero-laxity
> wait when restarted after the deadline has passed.
> 
> With Proxy Execution (PE), RT tasks proxied through the scheduler appear
> to trigger frequent dl_server_start() calls with expired deadlines. When
> this happens with dl_defer_running=1 (from a prior starvation episode),
> Peter's fix forces the fair_server back through the ~950ms zero-laxity
> wait each time.
> 
> In our testing (virtme-ng, 4 CPUs, 4G RAM, ksched_football):
>   With this fix:    ~1s for all players to check in
>   Without this fix: ~28s for all players to check in
> 
> The issue appears to be that the clearing in update_dl_entity()'s
> if-branch is too aggressive for the PE use case.
> replenish_dl_new_period() already handles this via its internal guard:
> 
>   if (dl_se->dl_defer && !dl_se->dl_defer_running) {
>       dl_se->dl_throttled = 1;
>       dl_se->dl_defer_armed = 1;
>   }
> 
> When dl_defer_running=1 (starvation previously confirmed by the
> zero-laxity timer), replenish_dl_new_period() skips arming the
> zero-laxity timer, allowing the server to run directly. This seems
> correct: once starvation has been confirmed, subsequent start/stop
> cycles triggered by PE should not re-introduce the deferral delay.
> 
> Note: this is the same change as the HACK revert in John's PE series
> (679ede58445 "HACK: Revert 'sched/deadline: Fix stuck dl_server'"),
> but with the rationale documented.
> 
> The state machine comment is updated to reflect the actual behavior of
> replenish_dl_new_period() when dl_defer_running=1.
> 
> Signed-off-by: zhidao su <suzhidao@xiaomi.com>
> ---
>  kernel/sched/deadline.c | 12 +++---------
>  1 file changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 01754d699f0..30b03021fce 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1034,12 +1034,6 @@ static void update_dl_entity(struct sched_dl_entity *dl_se)
>  			return;
>  		}
>  
> -		/*
> -		 * When [4] D->A is followed by [1] A->B, dl_defer_running
> -		 * needs to be cleared, otherwise it will fail to properly
> -		 * start the zero-laxity timer.
> -		 */
> -		dl_se->dl_defer_running = 0;
>  		replenish_dl_new_period(dl_se, rq);
>  	} else if (dl_server(dl_se) && dl_se->dl_defer) {
>  		/*

This cannot be right; it will insta break Andrea's test case again.

And I cannot make sense of your explanation; how does PE cause what to
happen? You mention PROXY_WAKING, this then means proxy_force_return().

I suspect whatever it is you're seeing will go away once we delete that
thing, see this discussion:

  https://lkml.kernel.org/r/20260402155055.GV3738010@noisy.programming.kicks-ass.net



* Re: [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch
  2026-04-03 13:42       ` Peter Zijlstra
@ 2026-04-03 13:58         ` Andrea Righi
  2026-04-03 19:31         ` John Stultz
  1 sibling, 0 replies; 14+ messages in thread
From: Andrea Righi @ 2026-04-03 13:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: soolaugust, jstultz, juri.lelli, mingo, linux-kernel, zhidao su

Hello,

On Fri, Apr 03, 2026 at 03:42:56PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 03, 2026 at 04:12:15PM +0800, soolaugust@gmail.com wrote:
> > From: zhidao su <suzhidao@xiaomi.com>
> > 
> > commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server") added a
> > dl_defer_running = 0 reset in the if-branch of update_dl_entity() to
> > handle the case where [4] D->A is followed by [1] A->B (lapsed
> > deadline). The intent was to ensure the server re-enters the zero-laxity
> > wait when restarted after the deadline has passed.
> > 
> > With Proxy Execution (PE), RT tasks proxied through the scheduler appear
> > to trigger frequent dl_server_start() calls with expired deadlines. When
> > this happens with dl_defer_running=1 (from a prior starvation episode),
> > Peter's fix forces the fair_server back through the ~950ms zero-laxity
> > wait each time.
> > 
> > In our testing (virtme-ng, 4 CPUs, 4G RAM, ksched_football):
> >   With this fix:    ~1s for all players to check in
> >   Without this fix: ~28s for all players to check in
> > 
> > The issue appears to be that the clearing in update_dl_entity()'s
> > if-branch is too aggressive for the PE use case.
> > replenish_dl_new_period() already handles this via its internal guard:
> > 
> >   if (dl_se->dl_defer && !dl_se->dl_defer_running) {
> >       dl_se->dl_throttled = 1;
> >       dl_se->dl_defer_armed = 1;
> >   }
> > 
> > When dl_defer_running=1 (starvation previously confirmed by the
> > zero-laxity timer), replenish_dl_new_period() skips arming the
> > zero-laxity timer, allowing the server to run directly. This seems
> > correct: once starvation has been confirmed, subsequent start/stop
> > cycles triggered by PE should not re-introduce the deferral delay.
> > 
> > Note: this is the same change as the HACK revert in John's PE series
> > (679ede58445 "HACK: Revert 'sched/deadline: Fix stuck dl_server'"),
> > but with the rationale documented.
> > 
> > The state machine comment is updated to reflect the actual behavior of
> > replenish_dl_new_period() when dl_defer_running=1.
> > 
> > Signed-off-by: zhidao su <suzhidao@xiaomi.com>
> > ---
> >  kernel/sched/deadline.c | 12 +++---------
> >  1 file changed, 3 insertions(+), 9 deletions(-)
> > 
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index 01754d699f0..30b03021fce 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -1034,12 +1034,6 @@ static void update_dl_entity(struct sched_dl_entity *dl_se)
> >  			return;
> >  		}
> >  
> > -		/*
> > -		 * When [4] D->A is followed by [1] A->B, dl_defer_running
> > -		 * needs to be cleared, otherwise it will fail to properly
> > -		 * start the zero-laxity timer.
> > -		 */
> > -		dl_se->dl_defer_running = 0;
> >  		replenish_dl_new_period(dl_se, rq);
> >  	} else if (dl_server(dl_se) && dl_se->dl_defer) {
> >  		/*
> 
> This cannot be right; it will insta break Andrea's test case again.

I confirm that with this applied the sched_ext rt_stall selftest starts failing:

$ sudo ./runner -t rt_stall
...
# Runtime of EXT task (PID 2260) is 0.010000 seconds
# Runtime of RT task (PID 2261) is 5.000000 seconds
# EXT task got 0.20% of total runtime
not ok 4 FAIL: EXT task got less than 4.00% of runtime
[  218.923834] sched_ext: BPF scheduler "rt_stall" disabled (unregistered from user space)
# Planned tests != run tests (1 != 4)

> 
> And I cannot make sense of your explanation; how does PE cause what to
> happen? You mention PROXY_WAKING, this then means proxy_force_return().
> 
> I suspect whatever it is you're seeing will go away once we delete that
> thing, see this discussion:
> 
>   https://lkml.kernel.org/r/20260402155055.GV3738010@noisy.programming.kicks-ass.net
> 

Thanks,
-Andrea


* Re: [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch
  2026-04-03 13:42       ` Peter Zijlstra
  2026-04-03 13:58         ` Andrea Righi
@ 2026-04-03 19:31         ` John Stultz
  2026-04-03 22:46           ` Peter Zijlstra
  1 sibling, 1 reply; 14+ messages in thread
From: John Stultz @ 2026-04-03 19:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: soolaugust, juri.lelli, mingo, linux-kernel, zhidao su,
	Andrea Righi

On Fri, Apr 3, 2026 at 6:43 AM Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, Apr 03, 2026 at 04:12:15PM +0800, soolaugust@gmail.com wrote:
> > From: zhidao su <suzhidao@xiaomi.com>
> >
> > commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server") added a
> > dl_defer_running = 0 reset in the if-branch of update_dl_entity() to
> > handle the case where [4] D->A is followed by [1] A->B (lapsed
> > deadline). The intent was to ensure the server re-enters the zero-laxity
> > wait when restarted after the deadline has passed.
> >
> > With Proxy Execution (PE), RT tasks proxied through the scheduler appear
> > to trigger frequent dl_server_start() calls with expired deadlines. When
> > this happens with dl_defer_running=1 (from a prior starvation episode),
> > Peter's fix forces the fair_server back through the ~950ms zero-laxity
> > wait each time.
> >
> > In our testing (virtme-ng, 4 CPUs, 4G RAM, ksched_football):
> >   With this fix:    ~1s for all players to check in
> >   Without this fix: ~28s for all players to check in
> >
> > The issue appears to be that the clearing in update_dl_entity()'s
> > if-branch is too aggressive for the PE use case.
> > replenish_dl_new_period() already handles this via its internal guard:
> >
> >   if (dl_se->dl_defer && !dl_se->dl_defer_running) {
> >       dl_se->dl_throttled = 1;
> >       dl_se->dl_defer_armed = 1;
> >   }
> >
> > When dl_defer_running=1 (starvation previously confirmed by the
> > zero-laxity timer), replenish_dl_new_period() skips arming the
> > zero-laxity timer, allowing the server to run directly. This seems
> > correct: once starvation has been confirmed, subsequent start/stop
> > cycles triggered by PE should not re-introduce the deferral delay.
> >
> > Note: this is the same change as the HACK revert in John's PE series
> > (679ede58445 "HACK: Revert 'sched/deadline: Fix stuck dl_server'"),
> > but with the rationale documented.
> >
> > The state machine comment is updated to reflect the actual behavior of
> > replenish_dl_new_period() when dl_defer_running=1.
> >
> > Signed-off-by: zhidao su <suzhidao@xiaomi.com>
> > ---
> >  kernel/sched/deadline.c | 12 +++---------
> >  1 file changed, 3 insertions(+), 9 deletions(-)
> >
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index 01754d699f0..30b03021fce 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -1034,12 +1034,6 @@ static void update_dl_entity(struct sched_dl_entity *dl_se)
> >                       return;
> >               }
> >
> > -             /*
> > -              * When [4] D->A is followed by [1] A->B, dl_defer_running
> > -              * needs to be cleared, otherwise it will fail to properly
> > -              * start the zero-laxity timer.
> > -              */
> > -             dl_se->dl_defer_running = 0;
> >               replenish_dl_new_period(dl_se, rq);
> >       } else if (dl_server(dl_se) && dl_se->dl_defer) {
> >               /*
>
> This cannot be right; it will insta break Andrea's test case again.
>
> And I cannot make sense of your explanation; how does PE cause what to
> happen? You mention PROXY_WAKING, this then means proxy_force_return().
>
> I suspect whatever it is you're seeing will go away once we delete that
> thing, see this discussion:
>
>   https://lkml.kernel.org/r/20260402155055.GV3738010@noisy.programming.kicks-ass.net
>

So unfortunately, this doesn't seem to be proxy-exec related at all.

It's almost identical to the issue I had a while back with the
dl_server, when spawning RT spinner threads (as kthreadd doesn't run
as RT).
https://lore.kernel.org/lkml/CANDhNCqK3VBAxxWMsDez8xkX0vcTStWjRMR95pksUM6Q26Ctyw@mail.gmail.com/

Now, this is with my out-of-tree ksched_football test, which is a bit
quirky. I've updated a branch with my test here against 7.0-rc5:
  https://github.com/johnstultz-work/linux-dev/commits/ksched-football-dl_server-issue/

It runs at boot, but you can also re-run it via "echo 10 >
/sys/kernel/ksched_football/start_game"

The idea in the test is there is a high priority "Ref" thread that
spawns players from low priority to high that just spin on the cpu.
The issue is once NR_CPU low-prio players start, they starve
additional higher-prio players from starting (despite the highest
priority Ref spawning them) because kthreadd is not RT. So the test
effectively relies on the dl_server to kick in and let the rest of the
players spawn. This isn't actually what the test is testing, but just
how it gets ready to run the test.

Using a 8 cpu VM with CONFIG_SCHED_PROXY_EXEC disabled:

With commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server")
reverted, I see the (expected, maybe) behavior where the starvation
lasts ~1second, then dl_server allows all the threads to spawn right
away, and then the test runs for 10 seconds.

See perfetto chart:
  https://ui.perfetto.dev/#!/?s=a729fd2dd4b224d6335c5b2e727dc1a1c302c11a
(click the Kernel-threads track and scroll down to see the test
threads named referee/defense/offense/crazy-fan)

With commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server")
applied, it seems the dl_server boosting the kthreadd spawning is much
more staggered. Again we spin up NR_CPU low priority threads, and
there's ~1 second of starvation, then we spawn one of the mid threads,
then another one-second delay, then a two-second delay before we
get the third running, then we get a small burst of 5 threads at once,
then it falls back to 1 second or more per thread as it spawns off the
rest. All in all it takes ~44 seconds just to spawn the threads before
running the test.

Perfetto chart:
  https://ui.perfetto.dev/#!/?s=ab8e487375d0c82ceea478ee4534a7189269c0d4

With higher cpu counts (64), the test effectively prevents the system
from booting (trips the hung task watchdog).

I haven't really diagnosed the issue, but it feels a little like the
dl_server is boosting until the fair rq is empty but then giving up
the rest of its time, so if a fair task runs repeatedly but for a very
short period of time, it won't get to run again until the next
dl_server period? Causing this rate-limiting one-task-per-second
effect for thread spawning? I still need to stare at the dl_server
logic some more.

thanks
-john


* Re: [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch
  2026-04-03 19:31         ` John Stultz
@ 2026-04-03 22:46           ` Peter Zijlstra
  2026-04-03 22:51             ` John Stultz
  2026-04-04 10:22             ` Peter Zijlstra
  0 siblings, 2 replies; 14+ messages in thread
From: Peter Zijlstra @ 2026-04-03 22:46 UTC (permalink / raw)
  To: John Stultz
  Cc: soolaugust, juri.lelli, mingo, linux-kernel, zhidao su,
	Andrea Righi

On Fri, Apr 03, 2026 at 12:31:19PM -0700, John Stultz wrote:

> Using an 8 CPU VM with CONFIG_SCHED_PROXY_EXEC disabled:
> 
> With commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server")
> reverted, I see the (expected, maybe) behavior where the starvation
> lasts ~1 second, then dl_server allows all the threads to spawn right
> away, and then the test runs for 10 seconds.
> 
> See perfetto chart:
>   https://ui.perfetto.dev/#!/?s=a729fd2dd4b224d6335c5b2e727dc1a1c302c11a
> (click the Kernel-threads track and scroll down to see the test
> threads named referee/defense/offense/crazy-fan)
> 
> With commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server")
> applied, it seems the dl_server boosting the kthreadd spawning is much
> more staggered. Again we spin up NR_CPU low priority threads, and
> there's ~1 second of starvation, then we spawn one of the mid threads,
> and another one-second delay, then there's a two-second delay before we
> get the third running, then we get a small burst of 5 threads at once,
> then it falls back to 1 second or more per thread as it spawns off the
> rest. All in all it takes ~44 seconds just to spawn the threads before
> running the test.
> 
> Perfetto chart:
>   https://ui.perfetto.dev/#!/?s=ab8e487375d0c82ceea478ee4534a7189269c0d4
> 
> With higher cpu counts (64), the test effectively prevents the system
> from booting (trips the hung task watchdog).
> 
> I haven't really diagnosed the issue, but it feels a little like the
> dl_server is boosting until the fair rq is empty but then giving up
> the rest of its time, so if a fair task runs repeatedly but for a very
> short period of time, it won't get to run again until the next
> dl_server period? Causing this rate-limiting one-task-per-second
> effect for thread spawning? I still need to stare at the dl_server
> logic some more.

I'm getting a sense of deja-vu here. Didn't we cure this once before?

I'll go stare at this somewhere next week I suppose -- we have a long
weekend here.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch
  2026-04-03 22:46           ` Peter Zijlstra
@ 2026-04-03 22:51             ` John Stultz
  2026-04-03 22:54               ` John Stultz
  2026-04-04 10:22             ` Peter Zijlstra
  1 sibling, 1 reply; 14+ messages in thread
From: John Stultz @ 2026-04-03 22:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: soolaugust, juri.lelli, mingo, linux-kernel, zhidao su,
	Andrea Righi

On Fri, Apr 3, 2026 at 3:46 PM Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, Apr 03, 2026 at 12:31:19PM -0700, John Stultz wrote:
> > I haven't really diagnosed the issue, but it feels a little like the
> > dl_server is boosting until the fair rq is empty but then giving up
> > the rest of its time, so if a fair task runs repeatedly but for a very
> > short period of time, it won't get to run again until the next
> > dl_server period? Causing this rate-limiting one-task-per-second
> > effect for thread spawning? I still need to stare at the dl_server
> > logic some more.
>
> I'm getting a sense of deja-vu here. Didn't we cure this once before?

Oh yeah:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4ae8d9aa9f9dc7137ea5e564d79c5aa5af1bc45c

thanks
-john

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch
  2026-04-03 22:51             ` John Stultz
@ 2026-04-03 22:54               ` John Stultz
  0 siblings, 0 replies; 14+ messages in thread
From: John Stultz @ 2026-04-03 22:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: soolaugust, juri.lelli, mingo, linux-kernel, zhidao su,
	Andrea Righi

On Fri, Apr 3, 2026 at 3:51 PM John Stultz <jstultz@google.com> wrote:
>
> On Fri, Apr 3, 2026 at 3:46 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > On Fri, Apr 03, 2026 at 12:31:19PM -0700, John Stultz wrote:
> > > I haven't really diagnosed the issue, but it feels a little like the
> > > dl_server is boosting until the fair rq is empty but then giving up
> > > the rest of its time, so if a fair task runs repeatedly but for a very
> > > short period of time, it won't get to run again until the next
> > > dl_server period? Causing this rate-limiting one-task-per-second
> > > effect for thread spawning? I still need to stare at the dl_server
> > > logic some more.
> >
> > I'm getting a sense of deja-vu here. Didn't we cure this once before?
>
> Oh yeah:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4ae8d9aa9f9dc7137ea5e564d79c5aa5af1bc45c

Oops, sorry, that was the other one. This was the previous fix for a
similar issue with the same test:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a3a70caf7906708bf9bbc80018752a6b36543808

thanks
-john

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch
  2026-04-03 22:46           ` Peter Zijlstra
  2026-04-03 22:51             ` John Stultz
@ 2026-04-04 10:22             ` Peter Zijlstra
  2026-04-05  8:37               ` zhidao su
  2026-04-06 20:01               ` John Stultz
  1 sibling, 2 replies; 14+ messages in thread
From: Peter Zijlstra @ 2026-04-04 10:22 UTC (permalink / raw)
  To: John Stultz
  Cc: soolaugust, juri.lelli, mingo, linux-kernel, zhidao su,
	Andrea Righi

On Sat, Apr 04, 2026 at 12:46:10AM +0200, Peter Zijlstra wrote:
> On Fri, Apr 03, 2026 at 12:31:19PM -0700, John Stultz wrote:
> 
> > Using an 8 CPU VM with CONFIG_SCHED_PROXY_EXEC disabled:
> > 
> > With commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server")
> > reverted, I see the (expected, maybe) behavior where the starvation
> > lasts ~1 second, then dl_server allows all the threads to spawn right
> > away, and then the test runs for 10 seconds.
> > 
> > See perfetto chart:
> >   https://ui.perfetto.dev/#!/?s=a729fd2dd4b224d6335c5b2e727dc1a1c302c11a
> > (click the Kernel-threads track and scroll down to see the test
> > threads named referee/defense/offense/crazy-fan)
> > 
> > With commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server")
> > applied, it seems the dl_server boosting the kthreadd spawning is much
> > more staggered. Again we spin up NR_CPU low priority threads, and
> > there's ~1 second of starvation, then we spawn one of the mid threads,
> > and another one-second delay, then there's a two-second delay before we
> > get the third running, then we get a small burst of 5 threads at once,
> > then it falls back to 1 second or more per thread as it spawns off the
> > rest. All in all it takes ~44 seconds just to spawn the threads before
> > running the test.
> > 
> > Perfetto chart:
> >   https://ui.perfetto.dev/#!/?s=ab8e487375d0c82ceea478ee4534a7189269c0d4
> > 
> > With higher cpu counts (64), the test effectively prevents the system
> > from booting (trips the hung task watchdog).
> > 
> > I haven't really diagnosed the issue, but it feels a little like the
> > dl_server is boosting until the fair rq is empty but then giving up
> > the rest of its time, so if a fair task runs repeatedly but for a very
> > short period of time, it won't get to run again until the next
> > dl_server period? Causing this rate-limiting one-task-per-second
> > effect for thread spawning? I still need to stare at the dl_server
> > logic some more.
> 
> I'm getting a sense of deja-vu here. Didn't we cure this once before?
> 
> I'll go stare at this somewhere next week I suppose -- we have a long
> weekend here.

Random brain wave...

Since the dl_server is LLF (deferred), it will pretty much always trip
the dl_entity_overflow() when interrupted, right? Does it make sense to
use the revised wake-up rule for it, when appropriate?

---
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index d08b00429323..674de6a48551 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1027,7 +1027,7 @@ static void update_dl_entity(struct sched_dl_entity *dl_se)
 	if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
 	    dl_entity_overflow(dl_se, rq_clock(rq))) {
 
-		if (unlikely(!dl_is_implicit(dl_se) &&
+		if (unlikely((!dl_is_implicit(dl_se) || dl_se->dl_defer) &&
 			     !dl_time_before(dl_se->deadline, rq_clock(rq)) &&
 			     !is_dl_boosted(dl_se))) {
 			update_dl_revised_wakeup(dl_se, rq);

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch
  2026-04-04 10:22             ` Peter Zijlstra
@ 2026-04-05  8:37               ` zhidao su
  2026-04-06 20:01               ` John Stultz
  1 sibling, 0 replies; 14+ messages in thread
From: zhidao su @ 2026-04-05  8:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: soolaugust, jstultz, juri.lelli, mingo, linux-kernel,
	Andrea Righi

On Sat, Apr 04, 2026 at 12:22:44PM +0200, Peter Zijlstra wrote:
> Random brain wave...
>
> Since the dl_server is LLF (deferred), it will pretty much always trip
> the dl_entity_overflow() when interrupted, right? Does it make sense to
> use the revised wake-up rule for it, when appropriate?

Thanks for the brain wave!

Tested your diff — locktorture boot time drops to ~13s (vs ~37-52s with
the hack revert) and ksched_football ball_pos stays at 0.

I traced update_dl_entity() and found the else-branch hits all show
dl_defer_running=1 with dl_throttled=0 and dl_defer_armed=0 — that's
the [D:running] state, so the guard there is correct. The actual stale
case is in the if-branch (overflow=1, deadline not past, dl_defer_running=1),
which your diff handles via revised wakeup.

That also means our original else-branch fix was wrong — unconditionally
clearing dl_defer_running in [D:running] would corrupt a legitimately
running server's state.

Is your revised wakeup diff the intended replacement for 115135422562?
If so, happy to test further or help draft it into a proper patch.

Signed-off-by: zhidao su <suzhidao@xiaomi.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch
  2026-04-04 10:22             ` Peter Zijlstra
  2026-04-05  8:37               ` zhidao su
@ 2026-04-06 20:01               ` John Stultz
  2026-04-06 20:03                 ` John Stultz
  1 sibling, 1 reply; 14+ messages in thread
From: John Stultz @ 2026-04-06 20:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: soolaugust, juri.lelli, mingo, linux-kernel, zhidao su,
	Andrea Righi

On Sat, Apr 4, 2026 at 3:22 AM Peter Zijlstra <peterz@infradead.org> wrote:
> On Sat, Apr 04, 2026 at 12:46:10AM +0200, Peter Zijlstra wrote:
> > On Fri, Apr 03, 2026 at 12:31:19PM -0700, John Stultz wrote:
> >
> > > Using an 8 CPU VM with CONFIG_SCHED_PROXY_EXEC disabled:
> > >
> > > With commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server")
> > > reverted, I see the (expected, maybe) behavior where the starvation
> > > lasts ~1 second, then dl_server allows all the threads to spawn right
> > > away, and then the test runs for 10 seconds.
> > >
> > > See perfetto chart:
> > >   https://ui.perfetto.dev/#!/?s=a729fd2dd4b224d6335c5b2e727dc1a1c302c11a
> > > (click the Kernel-threads track and scroll down to see the test
> > > threads named referee/defense/offense/crazy-fan)
> > >
> > > With commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server")
> > > applied, it seems the dl_server boosting the kthreadd spawning is much
> > > more staggered. Again we spin up NR_CPU low priority threads, and
> > > there's ~1 second of starvation, then we spawn one of the mid threads,
> > > and another one-second delay, then there's a two-second delay before we
> > > get the third running, then we get a small burst of 5 threads at once,
> > > then it falls back to 1 second or more per thread as it spawns off the
> > > rest. All in all it takes ~44 seconds just to spawn the threads before
> > > running the test.
> > >
> > > Perfetto chart:
> > >   https://ui.perfetto.dev/#!/?s=ab8e487375d0c82ceea478ee4534a7189269c0d4
> > >
> > > With higher cpu counts (64), the test effectively prevents the system
> > > from booting (trips the hung task watchdog).
> > >
> > > I haven't really diagnosed the issue, but it feels a little like the
> > > dl_server is boosting until the fair rq is empty but then giving up
> > > the rest of its time, so if a fair task runs repeatedly but for a very
> > > short period of time, it won't get to run again until the next
> > > dl_server period? Causing this rate-limiting one-task-per-second
> > > effect for thread spawning? I still need to stare at the dl_server
> > > logic some more.
> >
> > I'm getting a sense of deja-vu here. Didn't we cure this once before?
> >
> > I'll go stare at this somewhere next week I suppose -- we have a long
> > weekend here.
>
> Random brain wave...
>
> Since the dl_server is LLF (deferred), it will pretty much always trip
> the dl_entity_overflow() when interrupted, right? Does it make sense to
> use the revised wake-up rule for it, when appropriate?
>
> ---
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index d08b00429323..674de6a48551 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1027,7 +1027,7 @@ static void update_dl_entity(struct sched_dl_entity *dl_se)
>         if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
>             dl_entity_overflow(dl_se, rq_clock(rq))) {
>
> -               if (unlikely(!dl_is_implicit(dl_se) &&
> +               if (unlikely((!dl_is_implicit(dl_se) || dl_se->dl_defer) &&
>                              !dl_time_before(dl_se->deadline, rq_clock(rq)) &&
>                              !is_dl_boosted(dl_se))) {
>                         update_dl_revised_wakeup(dl_se, rq);

Hey Peter!
  So yeah, this does seem to resolve the main issue with the test.
After ~1 second of the initial low-priority RT tasks starving the CPU,
all the other threads spawn in quick succession, and it doesn't delay
us getting to run the test.

The only detail I might mention, is that looking at perfetto charts,
comparing this fix vs reverting 115135422562 ("sched/deadline: Fix
'stuck' dl_server"), is that during the ksched_football test (where we
have a lot of RT spinners running), other very short-running non-RT
kworker threads seem to have more one-second delays where they are
runnable with this solution:
  https://ui.perfetto.dev/#!/?s=fbc54ab8b823fc3d906eb16b9bcfb5b1fcbadf09

With 115135422562 reverted, they seem to get to run fairly quickly
despite the RT spinners.
  https://ui.perfetto.dev/#!/?s=35aa5b8e395ee5cf6fe22cc7f8c7e0cd8f4fcec5

This may in fact be the issue being fixed by 115135422562 (I still
find the details opaque), and I don't see any delays much larger than
a second. So it probably isn't an issue, but I just wanted to highlight
it.

thanks
-john

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch
  2026-04-06 20:01               ` John Stultz
@ 2026-04-06 20:03                 ` John Stultz
  0 siblings, 0 replies; 14+ messages in thread
From: John Stultz @ 2026-04-06 20:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: soolaugust, juri.lelli, mingo, linux-kernel, zhidao su,
	Andrea Righi

On Mon, Apr 6, 2026 at 1:01 PM John Stultz <jstultz@google.com> wrote:
>
> On Sat, Apr 4, 2026 at 3:22 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > On Sat, Apr 04, 2026 at 12:46:10AM +0200, Peter Zijlstra wrote:
> > > On Fri, Apr 03, 2026 at 12:31:19PM -0700, John Stultz wrote:
> > >
> > > > Using an 8 CPU VM with CONFIG_SCHED_PROXY_EXEC disabled:
> > > >
> > > > With commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server")
> > > > reverted, I see the (expected, maybe) behavior where the starvation
> > > > lasts ~1 second, then dl_server allows all the threads to spawn right
> > > > away, and then the test runs for 10 seconds.
> > > >
> > > > See perfetto chart:
> > > >   https://ui.perfetto.dev/#!/?s=a729fd2dd4b224d6335c5b2e727dc1a1c302c11a
> > > > (click the Kernel-threads track and scroll down to see the test
> > > > threads named referee/defense/offense/crazy-fan)
> > > >
> > > > With commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server")
> > > > applied, it seems the dl_server boosting the kthreadd spawning is much
> > > > more staggered. Again we spin up NR_CPU low priority threads, and
> > > > there's ~1 second of starvation, then we spawn one of the mid threads,
> > > > and another one-second delay, then there's a two-second delay before we
> > > > get the third running, then we get a small burst of 5 threads at once,
> > > > then it falls back to 1 second or more per thread as it spawns off the
> > > > rest. All in all it takes ~44 seconds just to spawn the threads before
> > > > running the test.
> > > >
> > > > Perfetto chart:
> > > >   https://ui.perfetto.dev/#!/?s=ab8e487375d0c82ceea478ee4534a7189269c0d4
> > > >
> > > > With higher cpu counts (64), the test effectively prevents the system
> > > > from booting (trips the hung task watchdog).
> > > >
> > > > I haven't really diagnosed the issue, but it feels a little like the
> > > > dl_server is boosting until the fair rq is empty but then giving up
> > > > the rest of its time, so if a fair task runs repeatedly but for a very
> > > > short period of time, it won't get to run again until the next
> > > > dl_server period? Causing this rate-limiting one-task-per-second
> > > > effect for thread spawning? I still need to stare at the dl_server
> > > > logic some more.
> > >
> > > I'm getting a sense of deja-vu here. Didn't we cure this once before?
> > >
> > > I'll go stare at this somewhere next week I suppose -- we have a long
> > > weekend here.
> >
> > Random brain wave...
> >
> > Since the dl_server is LLF (deferred), it will pretty much always trip
> > the dl_entity_overflow() when interrupted, right? Does it make sense to
> > use the revised wake-up rule for it, when appropriate?
> >
> > ---
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index d08b00429323..674de6a48551 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -1027,7 +1027,7 @@ static void update_dl_entity(struct sched_dl_entity *dl_se)
> >         if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
> >             dl_entity_overflow(dl_se, rq_clock(rq))) {
> >
> > -               if (unlikely(!dl_is_implicit(dl_se) &&
> > +               if (unlikely((!dl_is_implicit(dl_se) || dl_se->dl_defer) &&
> >                              !dl_time_before(dl_se->deadline, rq_clock(rq)) &&
> >                              !is_dl_boosted(dl_se))) {
> >                         update_dl_revised_wakeup(dl_se, rq);
>
> Hey Peter!
>   So yeah, this does seem to resolve the main issue with the test.
> After ~1 second of the initial low-priority RT tasks starving the CPU,
> all the other threads spawn in quick succession, and it doesn't delay
> us getting to run the test.

Forgot to add:
Tested-by: John Stultz <jstultz@google.com>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2026-04-06 20:03 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-04-02 13:30 [PATCH] sched/deadline: Fix stale dl_defer_running in dl_server else-branch soolaugust
2026-04-03  0:05 ` John Stultz
2026-04-03  1:30   ` John Stultz
2026-04-03  8:12     ` [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch soolaugust
2026-04-03 13:42       ` Peter Zijlstra
2026-04-03 13:58         ` Andrea Righi
2026-04-03 19:31         ` John Stultz
2026-04-03 22:46           ` Peter Zijlstra
2026-04-03 22:51             ` John Stultz
2026-04-03 22:54               ` John Stultz
2026-04-04 10:22             ` Peter Zijlstra
2026-04-05  8:37               ` zhidao su
2026-04-06 20:01               ` John Stultz
2026-04-06 20:03                 ` John Stultz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox