From: soolaugust@gmail.com
To: jstultz@google.com, bristot@redhat.com
Cc: peterz@infradead.org, mingo@redhat.com,
linux-kernel@vger.kernel.org, arighi@nvidia.com,
Zhidao Su <suzhidao@xiaomi.com>
Subject: [PATCH] sched/deadline: Fix stale dl_defer_running in dl_server else-branch
Date: Thu, 2 Apr 2026 21:30:50 +0800
Message-ID: <20260402133050.607777-1-soolaugust@gmail.com>
From: Zhidao Su <suzhidao@xiaomi.com>

Peter's fix (115135422562) cleared dl_defer_running in the if-branch of
update_dl_entity() (deadline expired/overflow), so that
replenish_dl_new_period() always arms the zero-laxity timer. With
PROXY_WAKING, however, re-activation can hit the else-branch (same
period, deadline not expired), where a dl_defer_running left over from a
prior starvation episode can be stale.

During PROXY_WAKING CPU return-migration, proxy_force_return() migrates
the task to a new CPU via deactivate_task() + attach_one_task(). The
enqueue path on the new CPU runs enqueue_task_fair(), which calls
dl_server_start() for the fair_server. Crucially, this re-activation
does NOT call dl_server_stop() first, so dl_defer_running retains its
prior value. If a prior starvation episode left dl_defer_running=1 and
the server is re-activated within the same period:

  [4] D->A: dl_server_stop() clears flags but may be skipped when
            dl_server_active=0 (server was already stopped before
            return-migration triggered dl_server_start())
  [1] A->B: dl_server_start() -> enqueue_dl_entity(WAKEUP)
            -> update_dl_entity() enters else-branch
            -> 'if (!dl_defer_running)' guard is false, so
               dl_defer_armed=1 / dl_throttled=1 are skipped
            -> server enqueued into [D] state directly
            -> update_curr_dl_se() consumes runtime
            -> start_dl_timer() with dl_defer_armed=0 (slow path)
            -> boot time increases ~72%

Fix: in the else-branch, unconditionally clear dl_defer_running and always
set dl_defer_armed=1 / dl_throttled=1. This ensures every same-period
re-activation properly re-arms the zero-laxity timer, regardless of whether
a prior starvation episode had set dl_defer_running.
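
The else-branch change can be illustrated with a minimal user-space
sketch. This is not kernel code: struct dl_flags_model and both helper
functions are hypothetical names introduced only for illustration, and
only the three flag names are taken from the patch. It models the
else-branch before and after this change.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the three sched_dl_entity flags discussed here. */
struct dl_flags_model {
	bool dl_defer_running;
	bool dl_defer_armed;
	bool dl_throttled;
};

/* Else-branch before this patch: re-arm only if dl_defer_running is clear. */
static void else_branch_before(struct dl_flags_model *se)
{
	if (!se->dl_defer_running) {
		se->dl_defer_armed = true;
		se->dl_throttled = true;
	}
}

/* Else-branch after this patch: drop any stale flag and always re-arm. */
static void else_branch_after(struct dl_flags_model *se)
{
	se->dl_defer_running = false;
	se->dl_defer_armed = true;
	se->dl_throttled = true;
}
```

In this model, entering with a stale dl_defer_running=1 leaves
dl_defer_armed=0 in else_branch_before(), whereas else_branch_after()
always ends with dl_defer_running=0 and dl_defer_armed=dl_throttled=1.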

In the if-branch (deadline expired), the unconditional clear added by
Peter's fix is dropped. replenish_dl_new_period() contains its own guard
('if (!dl_defer_running)') that arms the zero-laxity timer only when
dl_defer_running=0. With PROXY_WAKING, dl_defer_running=1 in the
deadline-expired path means a genuine starvation episode is ongoing, so
the server can skip the zero-laxity wait and enter [D] directly.
Clearing dl_defer_running there (as Peter's fix did) forces every
PROXY_WAKING deadline-expired re-activation through the ~950ms
zero-laxity wait.
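
The if-branch reasoning can be sketched as a simplified user-space
model under stated assumptions. struct dl_server_model,
replenish_model() and the two if_branch_*() helpers are hypothetical
names. replenish_model() reproduces only the guard described above, not
the deadline/runtime replenishment, and it assumes a deferred server.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant sched_dl_entity flags. */
struct dl_server_model {
	bool dl_defer_running;
	bool dl_defer_armed;
	bool dl_throttled;
};

/* Models only the guard: defer unless a starvation episode is ongoing. */
static void replenish_model(struct dl_server_model *se)
{
	if (!se->dl_defer_running) {
		se->dl_throttled = true;
		se->dl_defer_armed = true;
	}
}

/* If-branch with the unconditional clear: the guard always arms the timer. */
static void if_branch_with_clear(struct dl_server_model *se)
{
	se->dl_defer_running = false;
	replenish_model(se);
}

/* If-branch without the clear: the guard decides based on the flag. */
static void if_branch_without_clear(struct dl_server_model *se)
{
	replenish_model(se);
}
```

In this model, a server entering with dl_defer_running=1 is deferred
(dl_defer_armed=1) by if_branch_with_clear() but stays undeferred in
if_branch_without_clear(), which corresponds to skipping the
zero-laxity wait.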

Measured boot time to first ksched_football event (4 CPUs, 4G):

  This fix:                              ~15-20s
  Without fix (stale dl_defer_running):  ~43-62s (+72-200%)

Note: Andrea Righi's v2 patch addresses the same symptom by clearing
dl_defer_running in dl_server_stop(). However, dl_server_stop() is not
called during PROXY_WAKING return-migration: proxy_force_return()
reaches dl_server_start() through the enqueue path without a preceding
dl_server_stop(). This fix therefore targets the else-branch of
update_dl_entity() instead.

Signed-off-by: Zhidao Su <suzhidao@xiaomi.com>
---
 kernel/sched/deadline.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 01754d699f0..b2bcd34f3ea 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1034,22 +1034,22 @@ static void update_dl_entity(struct sched_dl_entity *dl_se)
 			return;
 		}
 
-		/*
-		 * When [4] D->A is followed by [1] A->B, dl_defer_running
-		 * needs to be cleared, otherwise it will fail to properly
-		 * start the zero-laxity timer.
-		 */
-		dl_se->dl_defer_running = 0;
 		replenish_dl_new_period(dl_se, rq);
 	} else if (dl_server(dl_se) && dl_se->dl_defer) {
 		/*
-		 * The server can still use its previous deadline, so check if
-		 * it left the dl_defer_running state.
+		 * The server can still use its previous deadline. Clear
+		 * dl_defer_running unconditionally: a stale dl_defer_running=1
+		 * from a prior starvation episode (set in dl_server_timer() when
+		 * the zero-laxity timer fires) must not carry over to the next
+		 * activation. PROXY_WAKING return-migration (proxy_force_return)
+		 * re-activates the server via attach_one_task()->enqueue_task_fair()
+		 * without calling dl_server_stop() first, so the flag is not
+		 * cleared in the [4] D->A path for that case.
+		 * Always re-arm the zero-laxity timer on each re-activation.
 		 */
-		if (!dl_se->dl_defer_running) {
-			dl_se->dl_defer_armed = 1;
-			dl_se->dl_throttled = 1;
-		}
+		dl_se->dl_defer_running = 0;
+		dl_se->dl_defer_armed = 1;
+		dl_se->dl_throttled = 1;
 	}
 }
 
--
2.43.0