From: Juri Lelli <juri.lelli@redhat.com>
To: "Furkan Çalışkan" <frn1furkan10@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
K Prateek Nayak <kprateek.nayak@amd.com>,
Andrea Righi <arighi@nvidia.com>,
Frederic Weisbecker <frederic@kernel.org>,
linux-kernel@vger.kernel.org,
David Haufe <dhaufe@simplextrading.com>,
Cao Ruichuang <create0818@163.com>
Subject: Re: [PATCH] sched/deadline: Make dl-server nohz full aware
Date: Tue, 12 May 2026 14:27:57 +0200
Message-ID: <agMczRJrRiur_ffg@jlelli-thinkpadt14gen4.remote.csb>
In-Reply-To: <cfbbec12-45e4-432d-99dc-383dc217f91d@gmail.com>
Hi Furkan,
On 12/05/26 13:06, Furkan Çalışkan wrote:
> Hi Juri,
>
> On 5/12/26 12:02, Juri Lelli wrote:
> > The dl_server_timer() causes spurious IPIs on nohz_full cores, breaking
> > isolation guarantees. The timer executes on a housekeeping core and
> > eventually calls tick_nohz_dep_set_cpu(), sending IPIs to isolated cores
> > even when only a single task is running.
> >
> > The problem is that dl-servers are not coordinated with nohz_full tick
> > state. Timers can fire and send IPIs to otherwise undisturbed cores.
> >
> > Fix by managing servers in sched_can_stop_tick():
> >
> > - When RT tasks run with CFS/SCX tasks, start the appropriate server
> > and keep the tick running
> > - When only RT tasks remain, stop all servers and allow tick to stop
> > (except for >1 RR tasks which need the tick for round-robin)
> > - When only CFS/SCX tasks remain, stop all servers before stopping tick
> >
> > Introduce dl_servers_stop_all() to reduce duplication and abstract
> > server management from core.c. Unify RT handling into one block that
> > handles both RR and FIFO cases.
> >
> > Fixes: 557a6bfc662c ("sched/fair: Add trivial fair server")
> > Reported-by: David Haufe <dhaufe@simplextrading.com>
> > Closes: https://lore.kernel.org/lkml/CAKJHwtOw_G67edzuHVtL1xC5Vyt6StcZzihtDd0yaKudW=rwVw@mail.gmail.com
> > Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
> > ---
> > I had to modify my first original attempt at fixing this (please take a
> > look at the linked report/discussion) to also take SCX into
> > consideration.
> >
> > FYI, I temporarily pushed the script I'm using to repro and verify the
> > fix here
> >
> > https://github.com/jlelli/sched-deadline-tests/blob/master/test-dlserver-nohz.sh
> > ---
> > kernel/sched/core.c | 43 +++++++++++++++++++++++--------------------
> > kernel/sched/deadline.c | 14 ++++++++++++++
> > kernel/sched/sched.h | 1 +
> > 3 files changed, 38 insertions(+), 20 deletions(-)
> >
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index b905805bbcbe4..98759255c306b 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -1414,30 +1414,35 @@ static inline bool __need_bw_check(struct rq *rq, struct task_struct *p)
> >
> >  bool sched_can_stop_tick(struct rq *rq)
> >  {
> > -	int fifo_nr_running;
> > -
> >  	/* Deadline tasks, even if single, need the tick */
> >  	if (rq->dl.dl_nr_running)
> >  		return false;
> >
> >  	/*
> > -	 * If there are more than one RR tasks, we need the tick to affect the
> > -	 * actual RR behaviour.
> > +	 * If there are RT tasks, we may need the tick (for >1 RR tasks),
> > +	 * but we must also service lower-priority CFS/SCX tasks via dl-servers.
> >  	 */
> > -	if (rq->rt.rr_nr_running) {
> > -		if (rq->rt.rr_nr_running == 1)
> > -			return true;
> > -		else
> > +	if (rq->rt.rt_nr_running) {
> > +		if (rq->cfs.h_nr_queued) {
> > +			dl_server_start(&rq->fair_server);
> > +			return false;
> > +		}
> > +#ifdef CONFIG_SCHED_CLASS_EXT
> > +		if (rq->scx.nr_running) {
> > +			dl_server_start(&rq->ext_server);
> > +			return false;
> > +		}
> > +#endif
>
> In the above block, the CFS and SCX server start paths are mutually exclusive.
> If both cfs.h_nr_queued and scx.nr_running are non-zero at the same time, only
> fair_server gets started and ext_server remains stopped. Could that leave SCX
> tasks without server coverage in a mixed CFS+SCX+RT scenario?
Indeed, there is the partial switch mode to consider, where CFS and SCX
tasks can be queued on the same rq. I can fix this in the next version.
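
For illustration only, here is a userspace sketch of the case raised above
and one possible shape of a fix. It is not the actual next version of the
patch. The mock_* types and helpers and the can_stop_tick_v1()/_v2() names
are hypothetical stand-ins. The quoted hunk elides the rest of
sched_can_stop_tick(), so the mock simply returns true where the real
function would continue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a dl-server; only tracks whether it was started. */
struct mock_server {
	bool active;
};

/* Hypothetical flattened runqueue; fields mirror the counters used in the hunk. */
struct mock_rq {
	unsigned int rt_nr_running;	/* models rq->rt.rt_nr_running */
	unsigned int cfs_h_nr_queued;	/* models rq->cfs.h_nr_queued */
	unsigned int scx_nr_running;	/* models rq->scx.nr_running */
	struct mock_server fair_server;
	struct mock_server ext_server;
};

/* Stand-in for dl_server_start(): just marks the server as running. */
static void mock_server_start(struct mock_server *s)
{
	assert(s);
	s->active = true;
}

/* Shape of the posted hunk: the CFS branch returns before SCX is checked. */
static bool can_stop_tick_v1(struct mock_rq *rq)
{
	if (rq->rt_nr_running) {
		if (rq->cfs_h_nr_queued) {
			mock_server_start(&rq->fair_server);
			return false;
		}
		if (rq->scx_nr_running) {
			mock_server_start(&rq->ext_server);
			return false;
		}
	}
	return true;	/* remainder of the real function is not modelled */
}

/* One possible fix (an assumption): start every needed server, then decide. */
static bool can_stop_tick_v2(struct mock_rq *rq)
{
	bool need_tick = false;

	if (rq->rt_nr_running) {
		if (rq->cfs_h_nr_queued) {
			mock_server_start(&rq->fair_server);
			need_tick = true;
		}
		if (rq->scx_nr_running) {
			mock_server_start(&rq->ext_server);
			need_tick = true;
		}
	}
	return !need_tick;
}
```

With RT, CFS and SCX tasks all queued, the first variant starts only
fair_server, while the second starts both servers and still keeps the tick.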
Thanks,
Juri
Thread overview (10+ messages):
2026-05-12 9:02 [PATCH] sched/deadline: Make dl-server nohz full aware Juri Lelli
2026-05-12 10:06 ` Furkan Çalışkan
2026-05-12 12:27 ` Juri Lelli [this message]
2026-05-12 14:03 ` Frederic Weisbecker
2026-05-12 15:31 ` Juri Lelli
2026-05-12 14:55 ` Andrea Righi
2026-05-12 15:34 ` Juri Lelli
2026-05-13 6:16 ` Juri Lelli
2026-05-13 6:38 ` Andrea Righi
2026-05-13 12:09 ` Frederic Weisbecker