All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Righi <arighi@nvidia.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	Tejun Heo <tj@kernel.org>, Joel Fernandes <joelagnelf@nvidia.com>,
	David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	Daniel Hodges <hodgesd@meta.com>,
	Christian Loehle <christian.loehle@arm.com>,
	Emil Tsalapatis <emil@etsalapatis.com>,
	sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/7] sched/debug: Stop and start server based on if it was active
Date: Tue, 3 Feb 2026 11:11:06 +0100	[thread overview]
Message-ID: <aYHJuvVaa8xNbuHR@gpd4> (raw)
In-Reply-To: <20260202211723.GF1395416@noisy.programming.kicks-ass.net>

Hi Peter,

On Mon, Feb 02, 2026 at 10:17:23PM +0100, Peter Zijlstra wrote:
> On Mon, Feb 02, 2026 at 10:13:26PM +0100, Peter Zijlstra wrote:
> > On Mon, Jan 26, 2026 at 10:59:01AM +0100, Andrea Righi wrote:
> > > From: Joel Fernandes <joelagnelf@nvidia.com>
> > > 
> > > Currently the DL server interface for applying parameters checks
> > > CFS-internals to identify if the server is active. This is error-prone
> > > and makes it difficult when adding new servers in the future.
> > > 
> > > Fix it, by using dl_server_active() which is also used by the DL server
> > > code to determine if the DL server was started.
> > > 
> > > Tested-by: Christian Loehle <christian.loehle@arm.com>
> > > Acked-by: Tejun Heo <tj@kernel.org>
> > > Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
> > > Reviewed-by: Andrea Righi <arighi@nvidia.com>
> > > Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
> > > ---
> > >  kernel/sched/debug.c | 11 ++++++++---
> > >  1 file changed, 8 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
> > > index 93f009e1076d8..dd793f8f3858a 100644
> > > --- a/kernel/sched/debug.c
> > > +++ b/kernel/sched/debug.c
> > > @@ -354,6 +354,8 @@ static ssize_t sched_fair_server_write(struct file *filp, const char __user *ubu
> > >  		return err;
> > >  
> > >  	scoped_guard (rq_lock_irqsave, rq) {
> > > +		bool is_active;
> > > +
> > >  		runtime  = rq->fair_server.dl_runtime;
> > >  		period = rq->fair_server.dl_period;
> > >  
> > > @@ -376,8 +378,11 @@ static ssize_t sched_fair_server_write(struct file *filp, const char __user *ubu
> > >  			return  -EINVAL;
> > >  		}
> > >  
> > > -		update_rq_clock(rq);
> > > -		dl_server_stop(&rq->fair_server);
> > > +		is_active = dl_server_active(&rq->fair_server);
> > > +		if (is_active) {
> > > +			update_rq_clock(rq);
> > > +			dl_server_stop(&rq->fair_server);
> > > +		}
> > >  
> > >  		retval = dl_server_apply_params(&rq->fair_server, runtime, period, 0);
> > >  
> > > @@ -385,7 +390,7 @@ static ssize_t sched_fair_server_write(struct file *filp, const char __user *ubu
> > >  			printk_deferred("Fair server disabled in CPU %d, system may crash due to starvation.\n",
> > >  					cpu_of(rq));
> > >  
> > > -		if (rq->cfs.h_nr_queued)
> > > +		if (is_active && runtime)
> > >  			dl_server_start(&rq->fair_server);
> > >  
> > >  		if (retval < 0)
> > 
> > Suppose runtime was 0, and gets incremented while there are already
> > tasks enqueued, then the above isn't going to DTRT.
> 
> Something like so perhaps?
> 
> ---
> diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
> index 59e650f9d436..884bdf7a292f 100644
> --- a/kernel/sched/debug.c
> +++ b/kernel/sched/debug.c
> @@ -340,7 +340,7 @@ static ssize_t sched_server_write_common(struct file *filp, const char __user *u
>  	long cpu = (long) ((struct seq_file *) filp->private_data)->private;
>  	struct rq *rq = cpu_rq(cpu);
>  	struct sched_dl_entity *dl_se = (struct sched_dl_entity *)server;
> -	u64 runtime, period;
> +	u64 old_runtime, runtime, period;
>  	int retval = 0;
>  	size_t err;
>  	u64 value;
> @@ -352,7 +352,7 @@ static ssize_t sched_server_write_common(struct file *filp, const char __user *u
>  	scoped_guard (rq_lock_irqsave, rq) {
>  		bool is_active;
>  
> -		runtime = dl_se->dl_runtime;
> +		old_runtime = runtime = dl_se->dl_runtime;
>  		period = dl_se->dl_period;
>  
>  		switch (param) {
> @@ -382,17 +382,20 @@ static ssize_t sched_server_write_common(struct file *filp, const char __user *u
>  
>  		retval = dl_server_apply_params(dl_se, runtime, period, 0);
>  
> -		if (!runtime)
> -			printk_deferred("%s server disabled in CPU %d, system may crash due to starvation.\n",
> -					server == &rq->fair_server ? "Fair" : "Ext", cpu_of(rq));
> -
> -		if (is_active && runtime)
> +		if (runtime)
>  			dl_server_start(dl_se);
>  
>  		if (retval < 0)
>  			return retval;
>  	}
>  
> +	if (!!old_runtime ^ !!runtime) {
> +		pr_info("%s server %sabled in CPU %d, system may crash due to starvation.\n",
> +			server == &rq->fair_server ? "Fair" : "Ext",
> +			runtime ? "en" : "dis",
> +			cpu_of(rq));
> +	}
> +
>  	*ppos += cnt;
>  	return cnt;
>  }

I slightly changed your patch (see below), adding a missing
update_rq_clock(rq) before starting the DL server and updated the pr_info
message, as mentioned in my previous email.

I ran some tests, and with this change the DL server starts correctly when
runtime is changed from 0 to a value > 0, so this fixes the issue.

Would you prefer that I send an updated patch set, or should we apply this
fix on top?

Thanks,
-Andrea

---
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 2e9896668c6fd..dbd5e67a16c67 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -346,7 +346,7 @@ static ssize_t sched_server_write_common(struct file *filp, const char __user *u
 	long cpu = (long) ((struct seq_file *) filp->private_data)->private;
 	struct rq *rq = cpu_rq(cpu);
 	struct sched_dl_entity *dl_se = (struct sched_dl_entity *)server;
-	u64 runtime, period;
+	u64 old_runtime, runtime, period;
 	int retval = 0;
 	size_t err;
 	u64 value;
@@ -358,7 +358,7 @@ static ssize_t sched_server_write_common(struct file *filp, const char __user *u
 	scoped_guard (rq_lock_irqsave, rq) {
 		bool is_active;
 
-		runtime = dl_se->dl_runtime;
+		old_runtime = runtime = dl_se->dl_runtime;
 		period = dl_se->dl_period;
 
 		switch (param) {
@@ -388,17 +388,23 @@ static ssize_t sched_server_write_common(struct file *filp, const char __user *u
 
 		retval = dl_server_apply_params(dl_se, runtime, period, 0);
 
-		if (!runtime)
-			printk_deferred("%s server disabled in CPU %d, system may crash due to starvation.\n",
-					server == &rq->fair_server ? "Fair" : "Ext", cpu_of(rq));
-
-		if (is_active && runtime)
+		if (runtime) {
+			update_rq_clock(rq);
 			dl_server_start(dl_se);
+		}
 
 		if (retval < 0)
 			return retval;
 	}
 
+	if (!!old_runtime ^ !!runtime) {
+		pr_info("%s server %sabled in CPU %d%s\n",
+			server == &rq->fair_server ? "Fair" : "Ext",
+			runtime ? "en" : "dis",
+			cpu_of(rq),
+			runtime ? "" : ", system may crash due to starvation");
+	}
+
 	*ppos += cnt;
 	return cnt;
 }
-- 
2.52.0


  parent reply	other threads:[~2026-02-03 10:11 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-26  9:58 [PATCHSET v12 sched_ext/for-6.20] Add a deadline server for sched_ext tasks Andrea Righi
2026-01-26  9:58 ` [PATCH 1/7] sched/deadline: Clear the defer params Andrea Righi
2026-02-03 11:18   ` [tip: sched/core] " tip-bot2 for Joel Fernandes
2026-01-26  9:59 ` [PATCH 2/7] sched/debug: Fix updating of ppos on server write ops Andrea Righi
2026-02-03 11:18   ` [tip: sched/core] " tip-bot2 for Joel Fernandes
2026-01-26  9:59 ` [PATCH 3/7] sched/debug: Stop and start server based on if it was active Andrea Righi
2026-02-02 21:13   ` Peter Zijlstra
2026-02-02 21:14     ` Peter Zijlstra
2026-02-02 21:17     ` Peter Zijlstra
2026-02-02 22:37       ` Andrea Righi
2026-02-03 10:34         ` Peter Zijlstra
2026-02-03 11:18           ` [tip: sched/core] sched/debug: Fix dl_server (re)start conditions tip-bot2 for Peter Zijlstra
2026-02-03 13:50           ` [PATCH 3/7] sched/debug: Stop and start server based on if it was active Andrea Righi
2026-02-03 10:11       ` Andrea Righi [this message]
2026-02-03 11:18   ` [tip: sched/core] " tip-bot2 for Joel Fernandes
2026-01-26  9:59 ` [PATCH 4/7] sched_ext: Add a DL server for sched_ext tasks Andrea Righi
2026-02-02 19:50   ` Peter Zijlstra
2026-02-02 20:32     ` Andrea Righi
2026-02-02 21:10       ` Peter Zijlstra
2026-02-02 22:18         ` Andrea Righi
2026-02-03 11:18   ` [tip: sched/core] " tip-bot2 for Andrea Righi
2026-01-26  9:59 ` [PATCH 5/7] sched/debug: Add support to change sched_ext server params Andrea Righi
2026-02-03 11:18   ` [tip: sched/core] " tip-bot2 for Joel Fernandes
2026-01-26  9:59 ` [PATCH 6/7] selftests/sched_ext: Add test for sched_ext dl_server Andrea Righi
2026-02-03 11:18   ` [tip: sched/core] " tip-bot2 for Andrea Righi
2026-01-26  9:59 ` [PATCH 7/7] selftests/sched_ext: Add test for DL server total_bw consistency Andrea Righi
2026-02-03 11:18   ` [tip: sched/core] " tip-bot2 for Joel Fernandes
2026-02-02 16:45 ` [PATCHSET v12 sched_ext/for-6.20] Add a deadline server for sched_ext tasks Tejun Heo
2026-02-02 19:56   ` Peter Zijlstra
2026-02-02 20:20     ` Tejun Heo
  -- strict thread matches above, loose matches on Subject: below --
2026-01-20 21:50 [PATCHSET RESEND v11 " Andrea Righi
2026-01-20 21:50 ` [PATCH 3/7] sched/debug: Stop and start server based on if it was active Andrea Righi
2025-12-17  9:35 [PATCHSET v11 sched_ext/for-6.20] Add a deadline server for sched_ext tasks Andrea Righi
2025-12-17  9:35 ` [PATCH 3/7] sched/debug: Stop and start server based on if it was active Andrea Righi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aYHJuvVaa8xNbuHR@gpd4 \
    --to=arighi@nvidia.com \
    --cc=bsegall@google.com \
    --cc=changwoo@igalia.com \
    --cc=christian.loehle@arm.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=emil@etsalapatis.com \
    --cc=hodgesd@meta.com \
    --cc=joelagnelf@nvidia.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=sched-ext@lists.linux.dev \
    --cc=tj@kernel.org \
    --cc=vincent.guittot@linaro.org \
    --cc=void@manifault.com \
    --cc=vschneid@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.