public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Andrea Righi <arighi@nvidia.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	Tejun Heo <tj@kernel.org>, Joel Fernandes <joelagnelf@nvidia.com>,
	David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	Daniel Hodges <hodgesd@meta.com>,
	Christian Loehle <christian.loehle@arm.com>,
	Emil Tsalapatis <emil@etsalapatis.com>,
	sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/7] sched/debug: Stop and start server based on if it was active
Date: Mon, 2 Feb 2026 23:37:31 +0100	[thread overview]
Message-ID: <aYEnK57vUwtgGXwH@gpd4> (raw)
In-Reply-To: <20260202211723.GF1395416@noisy.programming.kicks-ass.net>

On Mon, Feb 02, 2026 at 10:17:23PM +0100, Peter Zijlstra wrote:
> On Mon, Feb 02, 2026 at 10:13:26PM +0100, Peter Zijlstra wrote:
> > On Mon, Jan 26, 2026 at 10:59:01AM +0100, Andrea Righi wrote:
> > > From: Joel Fernandes <joelagnelf@nvidia.com>
> > > 
> > > Currently the DL server interface for applying parameters checks
> > > CFS-internals to identify if the server is active. This is error-prone
> > > and makes it difficult when adding new servers in the future.
> > > 
> > > Fix it, by using dl_server_active() which is also used by the DL server
> > > code to determine if the DL server was started.
> > > 
> > > Tested-by: Christian Loehle <christian.loehle@arm.com>
> > > Acked-by: Tejun Heo <tj@kernel.org>
> > > Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
> > > Reviewed-by: Andrea Righi <arighi@nvidia.com>
> > > Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
> > > ---
> > >  kernel/sched/debug.c | 11 ++++++++---
> > >  1 file changed, 8 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
> > > index 93f009e1076d8..dd793f8f3858a 100644
> > > --- a/kernel/sched/debug.c
> > > +++ b/kernel/sched/debug.c
> > > @@ -354,6 +354,8 @@ static ssize_t sched_fair_server_write(struct file *filp, const char __user *ubu
> > >  		return err;
> > >  
> > >  	scoped_guard (rq_lock_irqsave, rq) {
> > > +		bool is_active;
> > > +
> > >  		runtime  = rq->fair_server.dl_runtime;
> > >  		period = rq->fair_server.dl_period;
> > >  
> > > @@ -376,8 +378,11 @@ static ssize_t sched_fair_server_write(struct file *filp, const char __user *ubu
> > >  			return  -EINVAL;
> > >  		}
> > >  
> > > -		update_rq_clock(rq);
> > > -		dl_server_stop(&rq->fair_server);
> > > +		is_active = dl_server_active(&rq->fair_server);
> > > +		if (is_active) {
> > > +			update_rq_clock(rq);
> > > +			dl_server_stop(&rq->fair_server);
> > > +		}
> > >  
> > >  		retval = dl_server_apply_params(&rq->fair_server, runtime, period, 0);
> > >  
> > > @@ -385,7 +390,7 @@ static ssize_t sched_fair_server_write(struct file *filp, const char __user *ubu
> > >  			printk_deferred("Fair server disabled in CPU %d, system may crash due to starvation.\n",
> > >  					cpu_of(rq));
> > >  
> > > -		if (rq->cfs.h_nr_queued)
> > > +		if (is_active && runtime)
> > >  			dl_server_start(&rq->fair_server);
> > >  
> > >  		if (retval < 0)
> > 
> > Suppose runtime was 0, and gets incremented while there are already
> > tasks enqueued, then the above isn't going to DTRT.
> 

Right that's a bug, if the user sets runtime=0, tasks can be enqueued while
the server is disabled, when the user later sets runtime>0 to re-enable the
server, the current code doesn't start it.

Also, as discussed with Juri at LPC, we likely need a better interface than
debugfs for configuring these parameters (possibly a topic for OSPM).

> Something like so perhaps?
> 
> ---
> diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
> index 59e650f9d436..884bdf7a292f 100644
> --- a/kernel/sched/debug.c
> +++ b/kernel/sched/debug.c
> @@ -340,7 +340,7 @@ static ssize_t sched_server_write_common(struct file *filp, const char __user *u
>  	long cpu = (long) ((struct seq_file *) filp->private_data)->private;
>  	struct rq *rq = cpu_rq(cpu);
>  	struct sched_dl_entity *dl_se = (struct sched_dl_entity *)server;
> -	u64 runtime, period;
> +	u64 old_runtime, runtime, period;
>  	int retval = 0;
>  	size_t err;
>  	u64 value;
> @@ -352,7 +352,7 @@ static ssize_t sched_server_write_common(struct file *filp, const char __user *u
>  	scoped_guard (rq_lock_irqsave, rq) {
>  		bool is_active;
>  
> -		runtime = dl_se->dl_runtime;
> +		old_runtime = runtime = dl_se->dl_runtime;
>  		period = dl_se->dl_period;
>  
>  		switch (param) {
> @@ -382,17 +382,20 @@ static ssize_t sched_server_write_common(struct file *filp, const char __user *u
>  
>  		retval = dl_server_apply_params(dl_se, runtime, period, 0);
>  
> -		if (!runtime)
> -			printk_deferred("%s server disabled in CPU %d, system may crash due to starvation.\n",
> -					server == &rq->fair_server ? "Fair" : "Ext", cpu_of(rq));
> -
> -		if (is_active && runtime)
> +		if (runtime)
>  			dl_server_start(dl_se);
>  
>  		if (retval < 0)
>  			return retval;
>  	}
>  
> +	if (!!old_runtime ^ !!runtime) {
> +		pr_info("%s server %sabled in CPU %d, system may crash due to starvation.\n",
> +			server == &rq->fair_server ? "Fair" : "Ext",
> +			runtime ? "en" : "dis",
> +			cpu_of(rq));

Or:

    pr_info("%s server %sabled in CPU %d%s\n",
              server == &rq->fair_server ? "Fair" : "Ext",
              runtime ? "en" : "dis",
              cpu_of(rq),
              runtime ? "" : ", system may crash due to starvation");

> +	}
> +
>  	*ppos += cnt;
>  	return cnt;
>  }

I like that, it should fix the issue.

Thanks,
-Andrea

  reply	other threads:[~2026-02-02 22:37 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-26  9:58 [PATCHSET v12 sched_ext/for-6.20] Add a deadline server for sched_ext tasks Andrea Righi
2026-01-26  9:58 ` [PATCH 1/7] sched/deadline: Clear the defer params Andrea Righi
2026-02-03 11:18   ` [tip: sched/core] " tip-bot2 for Joel Fernandes
2026-01-26  9:59 ` [PATCH 2/7] sched/debug: Fix updating of ppos on server write ops Andrea Righi
2026-02-03 11:18   ` [tip: sched/core] " tip-bot2 for Joel Fernandes
2026-01-26  9:59 ` [PATCH 3/7] sched/debug: Stop and start server based on if it was active Andrea Righi
2026-02-02 21:13   ` Peter Zijlstra
2026-02-02 21:14     ` Peter Zijlstra
2026-02-02 21:17     ` Peter Zijlstra
2026-02-02 22:37       ` Andrea Righi [this message]
2026-02-03 10:34         ` Peter Zijlstra
2026-02-03 11:18           ` [tip: sched/core] sched/debug: Fix dl_server (re)start conditions tip-bot2 for Peter Zijlstra
2026-02-03 13:50           ` [PATCH 3/7] sched/debug: Stop and start server based on if it was active Andrea Righi
2026-02-03 10:11       ` Andrea Righi
2026-02-03 11:18   ` [tip: sched/core] " tip-bot2 for Joel Fernandes
2026-01-26  9:59 ` [PATCH 4/7] sched_ext: Add a DL server for sched_ext tasks Andrea Righi
2026-02-02 19:50   ` Peter Zijlstra
2026-02-02 20:32     ` Andrea Righi
2026-02-02 21:10       ` Peter Zijlstra
2026-02-02 22:18         ` Andrea Righi
2026-02-03 11:18   ` [tip: sched/core] " tip-bot2 for Andrea Righi
2026-01-26  9:59 ` [PATCH 5/7] sched/debug: Add support to change sched_ext server params Andrea Righi
2026-02-03 11:18   ` [tip: sched/core] " tip-bot2 for Joel Fernandes
2026-01-26  9:59 ` [PATCH 6/7] selftests/sched_ext: Add test for sched_ext dl_server Andrea Righi
2026-02-03 11:18   ` [tip: sched/core] " tip-bot2 for Andrea Righi
2026-01-26  9:59 ` [PATCH 7/7] selftests/sched_ext: Add test for DL server total_bw consistency Andrea Righi
2026-02-03 11:18   ` [tip: sched/core] " tip-bot2 for Joel Fernandes
2026-02-02 16:45 ` [PATCHSET v12 sched_ext/for-6.20] Add a deadline server for sched_ext tasks Tejun Heo
2026-02-02 19:56   ` Peter Zijlstra
2026-02-02 20:20     ` Tejun Heo
  -- strict thread matches above, loose matches on Subject: below --
2026-01-20 21:50 [PATCHSET RESEND v11 " Andrea Righi
2026-01-20 21:50 ` [PATCH 3/7] sched/debug: Stop and start server based on if it was active Andrea Righi
2025-12-17  9:35 [PATCHSET v11 sched_ext/for-6.20] Add a deadline server for sched_ext tasks Andrea Righi
2025-12-17  9:35 ` [PATCH 3/7] sched/debug: Stop and start server based on if it was active Andrea Righi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aYEnK57vUwtgGXwH@gpd4 \
    --to=arighi@nvidia.com \
    --cc=bsegall@google.com \
    --cc=changwoo@igalia.com \
    --cc=christian.loehle@arm.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=emil@etsalapatis.com \
    --cc=hodgesd@meta.com \
    --cc=joelagnelf@nvidia.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=sched-ext@lists.linux.dev \
    --cc=tj@kernel.org \
    --cc=vincent.guittot@linaro.org \
    --cc=void@manifault.com \
    --cc=vschneid@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox