All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Qais Yousef <qyousef@layalina.io>
Cc: Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jens Axboe <axboe@kernel.dk>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-trace-kernel@vger.kernel.org, linux-mm@kvack.org,
	Phil Auld <pauld@redhat.com>
Subject: Re: [PATCH v2] sched/rt: Clean up usage of rt_task()
Date: Tue, 21 May 2024 13:00:35 +0200	[thread overview]
Message-ID: <20240521110035.KRIwllGe@linutronix.de> (raw)
In-Reply-To: <20240515220536.823145-1-qyousef@layalina.io>

On 2024-05-15 23:05:36 [+0100], Qais Yousef wrote:
> rt_task() checks if a task has RT priority. But depends on your
> dictionary, this could mean it belongs to RT class, or is a 'realtime'
> task, which includes RT and DL classes.
> 
> Since this has caused some confusion already on discussion [1], it
> seemed a clean up is due.
> 
> I define the usage of rt_task() to be tasks that belong to RT class.
> Make sure that it returns true only for RT class and audit the users and
> replace the ones required the old behavior with the new realtime_task()
> which returns true for RT and DL classes. Introduce similar
> realtime_prio() to create similar distinction to rt_prio() and update
> the users that required the old behavior to use the new function.
> 
> Move MAX_DL_PRIO to prio.h so it can be used in the new definitions.
> 
> Document the functions to make it more obvious what is the difference
> between them. PI-boosted tasks is a factor that must be taken into
> account when choosing which function to use.
> 
> Rename task_is_realtime() to realtime_task_policy() as the old name is
> confusing against the new realtime_task().

I *think* everyone using rt_task() means to include DL tasks. And
everyone means !SCHED-people since they know when the difference matters.

> No functional changes were intended.
> 
> [1] https://lore.kernel.org/lkml/20240506100509.GL40213@noisy.programming.kicks-ass.net/
> 
> Reviewed-by: Phil Auld <pauld@redhat.com>
> Signed-off-by: Qais Yousef <qyousef@layalina.io>
> ---
> 
> Changes since v1:
> 
> 	* Use realtime_task_policy() instead task_has_realtime_policy() (Peter)
> 	* Improve commit message readability about replace some rt_task()
> 	  users.
> 
> v1 discussion: https://lore.kernel.org/lkml/20240514234112.792989-1-qyousef@layalina.io/
> 
>  fs/select.c                       |  2 +-

fs/bcachefs/six.c
six_owner_running() has rt_task(). But imho should have realtime_task()
to consider DL. But I think it is way worse that it has its own locking
rather than using what everyone else but then again it wouldn't be the
new hot thing…

>  include/linux/ioprio.h            |  2 +-
>  include/linux/sched/deadline.h    |  6 ++++--
>  include/linux/sched/prio.h        |  1 +
>  include/linux/sched/rt.h          | 27 ++++++++++++++++++++++++++-
>  kernel/locking/rtmutex.c          |  4 ++--
>  kernel/locking/rwsem.c            |  4 ++--
>  kernel/locking/ww_mutex.h         |  2 +-
>  kernel/sched/core.c               |  6 +++---
>  kernel/time/hrtimer.c             |  6 +++---
>  kernel/trace/trace_sched_wakeup.c |  2 +-
>  mm/page-writeback.c               |  4 ++--
>  mm/page_alloc.c                   |  2 +-
>  13 files changed, 48 insertions(+), 20 deletions(-)
> diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
> index 70625dff62ce..08b95e0a41ab 100644
> --- a/kernel/time/hrtimer.c
> +++ b/kernel/time/hrtimer.c
> @@ -1996,7 +1996,7 @@ static void __hrtimer_init_sleeper(struct hrtimer_sleeper *sl,
>  	 * expiry.
>  	 */
>  	if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> -		if (task_is_realtime(current) && !(mode & HRTIMER_MODE_SOFT))
> +		if (realtime_task_policy(current) && !(mode & HRTIMER_MODE_SOFT))
>  			mode |= HRTIMER_MODE_HARD;
>  	}
>  
> @@ -2096,7 +2096,7 @@ long hrtimer_nanosleep(ktime_t rqtp, const enum hrtimer_mode mode,
>  	u64 slack;
>  
>  	slack = current->timer_slack_ns;
> -	if (rt_task(current))
> +	if (realtime_task(current))
>  		slack = 0;
>  
>  	hrtimer_init_sleeper_on_stack(&t, clockid, mode);
> @@ -2301,7 +2301,7 @@ schedule_hrtimeout_range_clock(ktime_t *expires, u64 delta,
>  	 * Override any slack passed by the user if under
>  	 * rt contraints.
>  	 */
> -	if (rt_task(current))
> +	if (realtime_task(current))
>  		delta = 0;

I know this is just converting what is already here but…
__hrtimer_init_sleeper() looks at the policy to figure out if the task
is realtime do decide if should expire in HARD-IRQ context. This is
correct, a boosted task should not sleep.

hrtimer_nanosleep() + schedule_hrtimeout_range_clock() is looking at
priority to decide if slack should be removed. This should also look at
policy since a boosted task shouldn't sleep.

In order to be PI-boosted you need to acquire a lock and the only lock
you can sleep while acquired without generating a warning is a mutex_t
(or equivalent sleeping lock) on PREEMPT_RT. 

>  	hrtimer_init_sleeper_on_stack(&t, clock_id, mode);
> diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c
> index 0469a04a355f..19d737742e29 100644
> --- a/kernel/trace/trace_sched_wakeup.c
> +++ b/kernel/trace/trace_sched_wakeup.c
> @@ -545,7 +545,7 @@ probe_wakeup(void *ignore, struct task_struct *p)
>  	 *  - wakeup_dl handles tasks belonging to sched_dl class only.
>  	 */
>  	if (tracing_dl || (wakeup_dl && !dl_task(p)) ||
> -	    (wakeup_rt && !dl_task(p) && !rt_task(p)) ||
> +	    (wakeup_rt && !realtime_task(p)) ||
>  	    (!dl_task(p) && (p->prio >= wakeup_prio || p->prio >= current->prio)))
>  		return;
>  

Sebastian

  reply	other threads:[~2024-05-21 11:00 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-15 22:05 [PATCH v2] sched/rt: Clean up usage of rt_task() Qais Yousef
2024-05-21 11:00 ` Sebastian Andrzej Siewior [this message]
2024-05-27 17:26   ` Qais Yousef
2024-05-29  8:29     ` Sebastian Andrzej Siewior
2024-05-29 10:34       ` Qais Yousef
2024-05-29 10:55         ` Sebastian Andrzej Siewior
2024-05-30 11:10           ` Qais Yousef
2024-05-31  6:30             ` Sebastian Andrzej Siewior
2024-06-01 22:31               ` Qais Yousef
2024-05-23 15:45 ` Steven Rostedt
2024-05-27 17:37   ` Qais Yousef

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240521110035.KRIwllGe@linutronix.de \
    --to=bigeasy@linutronix.de \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=brauner@kernel.org \
    --cc=bristot@redhat.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=pauld@redhat.com \
    --cc=peterz@infradead.org \
    --cc=qyousef@layalina.io \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=vincent.guittot@linaro.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.