From: Andrea Righi <arighi@nvidia.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: soolaugust@gmail.com, jstultz@google.com, juri.lelli@redhat.com,
	mingo@redhat.com, linux-kernel@vger.kernel.org,
	zhidao su <suzhidao@xiaomi.com>
Subject: Re: [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch
Date: Fri, 3 Apr 2026 15:58:53 +0200
Message-ID: <ac_Hnd4VahbwCRWI@gpd4>
In-Reply-To: <20260403134256.GH3558198@noisy.programming.kicks-ass.net>

Hello,

On Fri, Apr 03, 2026 at 03:42:56PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 03, 2026 at 04:12:15PM +0800, soolaugust@gmail.com wrote:
> > From: zhidao su <suzhidao@xiaomi.com>
> >
> > commit 115135422562 ("sched/deadline: Fix 'stuck' dl_server") added a
> > dl_defer_running = 0 reset in the if-branch of update_dl_entity() to
> > handle the case where [4] D->A is followed by [1] A->B (lapsed
> > deadline). The intent was to ensure the server re-enters the zero-laxity
> > wait when restarted after the deadline has passed.
> >
> > With Proxy Execution (PE), RT tasks proxied through the scheduler appear
> > to trigger frequent dl_server_start() calls with expired deadlines. When
> > this happens with dl_defer_running=1 (from a prior starvation episode),
> > Peter's fix forces the fair_server back through the ~950ms zero-laxity
> > wait each time.
> >
> > In our testing (virtme-ng, 4 CPUs, 4G RAM, ksched_football):
> > With this fix: ~1s for all players to check in
> > Without this fix: ~28s for all players to check in
> >
> > The issue appears to be that the clearing in update_dl_entity()'s
> > if-branch is too aggressive for the PE use case.
> > replenish_dl_new_period() already handles this via its internal guard:
> >
> > 	if (dl_se->dl_defer && !dl_se->dl_defer_running) {
> > 		dl_se->dl_throttled = 1;
> > 		dl_se->dl_defer_armed = 1;
> > 	}
> >
> > When dl_defer_running=1 (starvation previously confirmed by the
> > zero-laxity timer), replenish_dl_new_period() skips arming the
> > zero-laxity timer, allowing the server to run directly. This seems
> > correct: once starvation has been confirmed, subsequent start/stop
> > cycles triggered by PE should not re-introduce the deferral delay.
> >
> > Note: this is the same change as the HACK revert in John's PE series
> > (679ede58445 "HACK: Revert 'sched/deadline: Fix stuck dl_server'"),
> > but with the rationale documented.
> >
> > The state machine comment is updated to reflect the actual behavior of
> > replenish_dl_new_period() when dl_defer_running=1.
> >
> > Signed-off-by: zhidao su <suzhidao@xiaomi.com>
> > ---
> > kernel/sched/deadline.c | 12 +++---------
> > 1 file changed, 3 insertions(+), 9 deletions(-)
> >
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index 01754d699f0..30b03021fce 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -1034,12 +1034,6 @@ static void update_dl_entity(struct sched_dl_entity *dl_se)
> >  			return;
> >  		}
> >
> > -		/*
> > -		 * When [4] D->A is followed by [1] A->B, dl_defer_running
> > -		 * needs to be cleared, otherwise it will fail to properly
> > -		 * start the zero-laxity timer.
> > -		 */
> > -		dl_se->dl_defer_running = 0;
> >  		replenish_dl_new_period(dl_se, rq);
> >  	} else if (dl_server(dl_se) && dl_se->dl_defer) {
> >  		/*
>
> This cannot be right; it will insta break Andrea's test case again.

I confirm that with this applied the sched_ext rt_stall selftest starts failing:

 $ sudo ./runner -t rt_stall
 ...
 # Runtime of EXT task (PID 2260) is 0.010000 seconds
 # Runtime of RT task (PID 2261) is 5.000000 seconds
 # EXT task got 0.20% of total runtime
 not ok 4 FAIL: EXT task got less than 4.00% of runtime
 [ 218.923834] sched_ext: BPF scheduler "rt_stall" disabled (unregistered from user space)
 # Planned tests != run tests (1 != 4)

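The 0.20% figure in the output above is consistent with dividing the EXT runtime by the sum of both runtimes. A minimal C sketch of that arithmetic, assuming this is how the share is computed; the helper names below are hypothetical and not taken from the selftest source:

```c
/*
 * Hypothetical helper: share of the combined runtime obtained by the
 * EXT task, in percent. Assumes share = ext / (ext + rt).
 */
static double ext_share_percent(double ext_runtime, double rt_runtime)
{
	double total = ext_runtime + rt_runtime;

	/* Avoid dividing by zero when neither task ran. */
	if (total <= 0.0)
		return 0.0;

	return ext_runtime / total * 100.0;
}

/*
 * Hypothetical helper: returns 1 when the EXT task reached at least
 * min_percent of the combined runtime, 0 otherwise.
 */
static int ext_share_ok(double ext_runtime, double rt_runtime,
			double min_percent)
{
	return ext_share_percent(ext_runtime, rt_runtime) >= min_percent;
}
```

With the logged values (0.01 s for the EXT task, 5.0 s for the RT task) this gives roughly 0.2%, far below the 4.00% the test requires, so `ext_share_ok(0.01, 5.0, 4.0)` returns 0.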
>
> And I cannot make sense of your explanation; how does PE cause what to
> happen? You mention PROXY_WAKING, this then means proxy_force_return().
>
> I suspect whatever it is you're seeing will go away once we delete that
> thing, see this discussion:
>
> https://lkml.kernel.org/r/20260402155055.GV3738010@noisy.programming.kicks-ass.net
>
Thanks,
-Andrea
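
For reference, the effect of dropping the dl_defer_running reset can be illustrated with a reduced user-space model of the guard quoted in the patch description. This is only a sketch under stated assumptions: struct dl_model and both functions are hypothetical, they model just the four flags named in the thread, and they are not the kernel's struct sched_dl_entity or its real code paths.

```c
#include <stdbool.h>

/* Hypothetical, reduced model of the flags discussed in the thread. */
struct dl_model {
	bool dl_defer;
	bool dl_defer_running;
	bool dl_throttled;
	bool dl_defer_armed;
};

/* Models only the guard quoted from replenish_dl_new_period(). */
static void model_replenish_guard(struct dl_model *se)
{
	if (se->dl_defer && !se->dl_defer_running) {
		se->dl_throttled = true;
		se->dl_defer_armed = true;
	}
}

/*
 * Models a restart through the if-branch of update_dl_entity().
 * clear_running == true corresponds to the existing code (reset kept),
 * clear_running == false to the proposed patch (reset removed).
 * Returns true when the server is deferred again.
 */
static bool model_restart(struct dl_model *se, bool clear_running)
{
	if (clear_running)
		se->dl_defer_running = false;

	model_replenish_guard(se);
	return se->dl_defer_armed;
}
```

Starting from dl_defer = 1 and dl_defer_running = 1, the model defers again when the reset is kept and skips the deferral when it is removed. It says nothing about which behaviour is correct for a given workload; it only shows the flag transitions being argued about.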

Thread overview: 17+ messages
2026-04-02 13:30 [PATCH] sched/deadline: Fix stale dl_defer_running in dl_server else-branch soolaugust
2026-04-03 0:05 ` John Stultz
2026-04-03 1:30 ` John Stultz
2026-04-03 8:12 ` [PATCH] sched/deadline: Fix stale dl_defer_running in update_dl_entity() if-branch soolaugust
2026-04-03 13:42 ` Peter Zijlstra
2026-04-03 13:58 ` Andrea Righi [this message]
2026-04-03 19:31 ` John Stultz
2026-04-03 22:46 ` Peter Zijlstra
2026-04-03 22:51 ` John Stultz
2026-04-03 22:54 ` John Stultz
2026-04-04 10:22 ` Peter Zijlstra
2026-04-05 8:37 ` zhidao su
2026-04-06 20:01 ` John Stultz
2026-04-06 20:03 ` John Stultz
2026-04-07 12:22 ` Juri Lelli
2026-04-07 15:00 ` Peter Zijlstra
2026-04-08 11:20 ` [tip: sched/urgent] sched/deadline: Use revised wakeup rule for dl_server tip-bot2 for Peter Zijlstra