From: Andrea Righi <arighi@nvidia.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
Tejun Heo <tj@kernel.org>, Joel Fernandes <joelagnelf@nvidia.com>,
David Vernet <void@manifault.com>,
Changwoo Min <changwoo@igalia.com>,
Daniel Hodges <hodgesd@meta.com>,
Christian Loehle <christian.loehle@arm.com>,
Emil Tsalapatis <emil@etsalapatis.com>,
sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/7] sched/debug: Stop and start server based on if it was active
Date: Mon, 2 Feb 2026 23:37:31 +0100 [thread overview]
Message-ID: <aYEnK57vUwtgGXwH@gpd4> (raw)
In-Reply-To: <20260202211723.GF1395416@noisy.programming.kicks-ass.net>
On Mon, Feb 02, 2026 at 10:17:23PM +0100, Peter Zijlstra wrote:
> On Mon, Feb 02, 2026 at 10:13:26PM +0100, Peter Zijlstra wrote:
> > On Mon, Jan 26, 2026 at 10:59:01AM +0100, Andrea Righi wrote:
> > > From: Joel Fernandes <joelagnelf@nvidia.com>
> > >
> > > Currently the DL server interface for applying parameters checks
> > > CFS-internals to identify if the server is active. This is error-prone
> > > and makes it difficult when adding new servers in the future.
> > >
> > > Fix it, by using dl_server_active() which is also used by the DL server
> > > code to determine if the DL server was started.
> > >
> > > Tested-by: Christian Loehle <christian.loehle@arm.com>
> > > Acked-by: Tejun Heo <tj@kernel.org>
> > > Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
> > > Reviewed-by: Andrea Righi <arighi@nvidia.com>
> > > Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
> > > ---
> > > kernel/sched/debug.c | 11 ++++++++---
> > > 1 file changed, 8 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
> > > index 93f009e1076d8..dd793f8f3858a 100644
> > > --- a/kernel/sched/debug.c
> > > +++ b/kernel/sched/debug.c
> > > @@ -354,6 +354,8 @@ static ssize_t sched_fair_server_write(struct file *filp, const char __user *ubu
> > > return err;
> > >
> > > scoped_guard (rq_lock_irqsave, rq) {
> > > + bool is_active;
> > > +
> > > runtime = rq->fair_server.dl_runtime;
> > > period = rq->fair_server.dl_period;
> > >
> > > @@ -376,8 +378,11 @@ static ssize_t sched_fair_server_write(struct file *filp, const char __user *ubu
> > > return -EINVAL;
> > > }
> > >
> > > - update_rq_clock(rq);
> > > - dl_server_stop(&rq->fair_server);
> > > + is_active = dl_server_active(&rq->fair_server);
> > > + if (is_active) {
> > > + update_rq_clock(rq);
> > > + dl_server_stop(&rq->fair_server);
> > > + }
> > >
> > > retval = dl_server_apply_params(&rq->fair_server, runtime, period, 0);
> > >
> > > @@ -385,7 +390,7 @@ static ssize_t sched_fair_server_write(struct file *filp, const char __user *ubu
> > > printk_deferred("Fair server disabled in CPU %d, system may crash due to starvation.\n",
> > > cpu_of(rq));
> > >
> > > - if (rq->cfs.h_nr_queued)
> > > + if (is_active && runtime)
> > > dl_server_start(&rq->fair_server);
> > >
> > > if (retval < 0)
> >
> > Suppose runtime was 0, and gets incremented while there are already
> > tasks enqueued, then the above isn't going to DTRT.
>
Right that's a bug, if the user sets runtime=0, tasks can be enqueued while
the server is disabled, when the user later sets runtime>0 to re-enable the
server, the current code doesn't start it.
Also, as discussed with Juri at LPC, we likely need a better interface than
debugfs for configuring these parameters (possibly a topic for OSPM).
> Something like so perhaps?
>
> ---
> diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
> index 59e650f9d436..884bdf7a292f 100644
> --- a/kernel/sched/debug.c
> +++ b/kernel/sched/debug.c
> @@ -340,7 +340,7 @@ static ssize_t sched_server_write_common(struct file *filp, const char __user *u
> long cpu = (long) ((struct seq_file *) filp->private_data)->private;
> struct rq *rq = cpu_rq(cpu);
> struct sched_dl_entity *dl_se = (struct sched_dl_entity *)server;
> - u64 runtime, period;
> + u64 old_runtime, runtime, period;
> int retval = 0;
> size_t err;
> u64 value;
> @@ -352,7 +352,7 @@ static ssize_t sched_server_write_common(struct file *filp, const char __user *u
> scoped_guard (rq_lock_irqsave, rq) {
> bool is_active;
>
> - runtime = dl_se->dl_runtime;
> + old_runtime = runtime = dl_se->dl_runtime;
> period = dl_se->dl_period;
>
> switch (param) {
> @@ -382,17 +382,20 @@ static ssize_t sched_server_write_common(struct file *filp, const char __user *u
>
> retval = dl_server_apply_params(dl_se, runtime, period, 0);
>
> - if (!runtime)
> - printk_deferred("%s server disabled in CPU %d, system may crash due to starvation.\n",
> - server == &rq->fair_server ? "Fair" : "Ext", cpu_of(rq));
> -
> - if (is_active && runtime)
> + if (runtime)
> dl_server_start(dl_se);
>
> if (retval < 0)
> return retval;
> }
>
> + if (!!old_runtime ^ !!runtime) {
> + pr_info("%s server %sabled in CPU %d, system may crash due to starvation.\n",
> + server == &rq->fair_server ? "Fair" : "Ext",
> + runtime ? "en" : "dis",
> + cpu_of(rq));
Or:
pr_info("%s server %sabled in CPU %d%s\n",
server == &rq->fair_server ? "Fair" : "Ext",
runtime ? "en" : "dis",
cpu_of(rq),
runtime ? "" : ", system may crash due to starvation");
> + }
> +
> *ppos += cnt;
> return cnt;
> }
I like that, it should fix the issue.
Thanks,
-Andrea
next prev parent reply other threads:[~2026-02-02 22:37 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-26 9:58 [PATCHSET v12 sched_ext/for-6.20] Add a deadline server for sched_ext tasks Andrea Righi
2026-01-26 9:58 ` [PATCH 1/7] sched/deadline: Clear the defer params Andrea Righi
2026-02-03 11:18 ` [tip: sched/core] " tip-bot2 for Joel Fernandes
2026-01-26 9:59 ` [PATCH 2/7] sched/debug: Fix updating of ppos on server write ops Andrea Righi
2026-02-03 11:18 ` [tip: sched/core] " tip-bot2 for Joel Fernandes
2026-01-26 9:59 ` [PATCH 3/7] sched/debug: Stop and start server based on if it was active Andrea Righi
2026-02-02 21:13 ` Peter Zijlstra
2026-02-02 21:14 ` Peter Zijlstra
2026-02-02 21:17 ` Peter Zijlstra
2026-02-02 22:37 ` Andrea Righi [this message]
2026-02-03 10:34 ` Peter Zijlstra
2026-02-03 11:18 ` [tip: sched/core] sched/debug: Fix dl_server (re)start conditions tip-bot2 for Peter Zijlstra
2026-02-03 13:50 ` [PATCH 3/7] sched/debug: Stop and start server based on if it was active Andrea Righi
2026-02-03 10:11 ` Andrea Righi
2026-02-03 11:18 ` [tip: sched/core] " tip-bot2 for Joel Fernandes
2026-01-26 9:59 ` [PATCH 4/7] sched_ext: Add a DL server for sched_ext tasks Andrea Righi
2026-02-02 19:50 ` Peter Zijlstra
2026-02-02 20:32 ` Andrea Righi
2026-02-02 21:10 ` Peter Zijlstra
2026-02-02 22:18 ` Andrea Righi
2026-02-03 11:18 ` [tip: sched/core] " tip-bot2 for Andrea Righi
2026-01-26 9:59 ` [PATCH 5/7] sched/debug: Add support to change sched_ext server params Andrea Righi
2026-02-03 11:18 ` [tip: sched/core] " tip-bot2 for Joel Fernandes
2026-01-26 9:59 ` [PATCH 6/7] selftests/sched_ext: Add test for sched_ext dl_server Andrea Righi
2026-02-03 11:18 ` [tip: sched/core] " tip-bot2 for Andrea Righi
2026-01-26 9:59 ` [PATCH 7/7] selftests/sched_ext: Add test for DL server total_bw consistency Andrea Righi
2026-02-03 11:18 ` [tip: sched/core] " tip-bot2 for Joel Fernandes
2026-02-02 16:45 ` [PATCHSET v12 sched_ext/for-6.20] Add a deadline server for sched_ext tasks Tejun Heo
2026-02-02 19:56 ` Peter Zijlstra
2026-02-02 20:20 ` Tejun Heo
-- strict thread matches above, loose matches on Subject: below --
2026-01-20 21:50 [PATCHSET RESEND v11 " Andrea Righi
2026-01-20 21:50 ` [PATCH 3/7] sched/debug: Stop and start server based on if it was active Andrea Righi
2025-12-17 9:35 [PATCHSET v11 sched_ext/for-6.20] Add a deadline server for sched_ext tasks Andrea Righi
2025-12-17 9:35 ` [PATCH 3/7] sched/debug: Stop and start server based on if it was active Andrea Righi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aYEnK57vUwtgGXwH@gpd4 \
--to=arighi@nvidia.com \
--cc=bsegall@google.com \
--cc=changwoo@igalia.com \
--cc=christian.loehle@arm.com \
--cc=dietmar.eggemann@arm.com \
--cc=emil@etsalapatis.com \
--cc=hodgesd@meta.com \
--cc=joelagnelf@nvidia.com \
--cc=juri.lelli@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=sched-ext@lists.linux.dev \
--cc=tj@kernel.org \
--cc=vincent.guittot@linaro.org \
--cc=void@manifault.com \
--cc=vschneid@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.