linux-arch.vger.kernel.org archive mirror
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Boqun Feng <boqun.feng@gmail.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Prakash Sangappa <prakash.sangappa@oracle.com>,
	Madadi Vineeth Reddy <vineethr@linux.ibm.com>,
	K Prateek Nayak <kprateek.nayak@amd.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Arnd Bergmann <arnd@arndb.de>,
	linux-arch@vger.kernel.org
Subject: Re: [patch V2 00/12] rseq: Implement time slice extension mechanism
Date: Mon, 27 Oct 2025 18:30:37 +0100
Message-ID: <20251027173037.Cj4b_alm@linutronix.de>
In-Reply-To: <20251022110646.839870156@linutronix.de>

On 2025-10-22 14:57:28 [+0200], Thomas Gleixner wrote:
> Time slice extensions are an attempt to provide opportunistic priority
> ceiling without the overhead of an actual priority ceiling protocol, but
> also without the guarantees such a protocol provides.
> 
> The intent is to avoid situations where a user space thread is interrupted
> in a critical section and scheduled out, while holding a resource on which
> the preempting thread or other threads in the system might block. That
> obviously prevents those threads from making progress in the worst case for
> at least a full time slice. Especially in the context of user space
> spinlocks, which are a patently bad idea to begin with, but that's also
> true for other mechanisms.

I've been playing with it a bit with RT enabled and started to debug
this:

|       slice_test-2903    [001] d.h..  2313.285439: local_timer_entry: vector=236
|       slice_test-2903    [001] d.h1.  2313.285440: hrtimer_cancel: hrtimer=000000000507e6d5
|       slice_test-2903    [001] d.h..  2313.285440: hrtimer_expire_entry: hrtimer=000000000507e6d5 function=tick_nohz_handler now=2313208001152
|       slice_test-2903    [001] d.h1.  2313.285449: sched_stat_runtime: comm=slice_test pid=2903 runtime=3982905 [ns]
|       slice_test-2903    [001] dlh..  2313.285452: softirq_raise: vec=7 [action=SCHED]
|       slice_test-2903    [001] dlh..  2313.285452: hrtimer_expire_exit: hrtimer=000000000507e6d5
|       slice_test-2903    [001] dlh1.  2313.285452: hrtimer_start: hrtimer=000000000507e6d5 function=tick_nohz_handler expires=2313212000000 softexpires=2313212000000 mode=ABS
|       slice_test-2903    [001] dlh..  2313.285453: local_timer_exit: vector=236
|       slice_test-2903    [001] dl.2.  2313.285453: sched_waking: comm=ksoftirqd/1 pid=32 prio=120 target_cpu=001
|       slice_test-2903    [001] dl.3.  2313.285456: sched_wakeup: comm=ksoftirqd/1 pid=32 prio=120 target_cpu=001
|       slice_test-2903    [001] d....  2313.285457: irqentry_exit: rseq_grant_slice_extension(216)

This grants the extension and removes the lazy wakeup. We are still on the
return-from-IRQ path, but the 'h' flag has already been removed…
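
For readers following the flags column (for example "dlh.." versus "d...."),
here is a minimal Python sketch that splits the five characters into fields.
The column order (irqs-off, need-resched, hardirq/softirq, preempt-depth,
migrate-disable) and the letter meanings are assumptions based on the ftrace
latency-format documentation; the letter set differs between kernel versions,
and only letters that show up in this trace are mapped.

```python
from dataclasses import dataclass

# Assumed letter meanings; anything not listed is reported as unknown.
IRQS = {".": "enabled", "d": "disabled"}
RESCHED = {".": "none", "l": "lazy", "N": "need-resched"}
CONTEXT = {".": "task", "h": "hardirq", "s": "softirq"}


@dataclass
class Flags:
    irqs: str
    resched: str
    context: str
    preempt_depth: int
    migrate_disable: int


def _depth(ch):
    # '.' means zero; otherwise the column holds a single hex digit.
    return 0 if ch == "." else int(ch, 16)


def decode_flags(col):
    """Decode a five-character flags column such as 'dlh..'."""
    if len(col) != 5:
        raise ValueError(f"expected 5 characters, got {col!r}")
    irq, resched, ctx, preempt, migrate = col
    return Flags(
        irqs=IRQS.get(irq, f"unknown ({irq})"),
        resched=RESCHED.get(resched, f"unknown ({resched})"),
        context=CONTEXT.get(ctx, f"unknown ({ctx})"),
        preempt_depth=_depth(preempt),
        migrate_disable=_depth(migrate),
    )


if __name__ == "__main__":
    for col in ("d.h1.", "dlh..", "d....", "dN.2.", "..s.1"):
        print(col, decode_flags(col))
```

Under these assumptions, "dlh.." decodes to hardirq context with a lazy
reschedule pending, and "d...." on the grant line shows neither.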

|       slice_test-2903    [001] d..1.  2313.285458: hrtimer_start: hrtimer=0000000030a688cc function=rseq_slice_expired expires=2313208047790 softexpires=2313208047790 mode=ABS|PINNED|HARD
|       slice_test-2903    [001] d....  2313.285458: __rseq_arm_slice_extension_timer: timer
|       slice_test-2903    [001] d..2.  2313.285484: hrtimer_cancel: hrtimer=0000000030a688cc
Extension granted, timer started, then revoked and need-resched set.

|       slice_test-2903    [001] dN.2.  2313.285487: sched_stat_runtime: comm=slice_test pid=2903 runtime=36886 [ns]
This is already coming from schedule(). It took me a while to see that since
I was hunting for a missing clear of need-resched.

|       slice_test-2903    [001] d..2.  2313.285489: sched_switch: prev_comm=slice_test prev_pid=2903 prev_prio=120 prev_state=R+ ==> next_comm=ksoftirqd/1 next_pid=32 next_prio=120
|      ksoftirqd/1-32      [001] ..s.1  2313.285490: softirq_entry: vec=7 [action=SCHED]
|      ksoftirqd/1-32      [001] ..s.1  2313.285501: softirq_exit: vec=7 [action=SCHED]
|      ksoftirqd/1-32      [001] d..2.  2313.285502: sched_stat_runtime: comm=ksoftirqd/1 pid=32 runtime=16438 [ns]
|      ksoftirqd/1-32      [001] d..2.  2313.285503: sched_switch: prev_comm=ksoftirqd/1 prev_pid=32 prev_prio=120 prev_state=S ==> next_comm=slice_test next_pid=2904 next_prio=120
|       slice_test-2904    [001] .....  2313.285507: sys_enter: NR 230 (1, 0, 7f4692c7baa0, 0, 0, 0)
|       slice_test-2904    [001] .....  2313.285507: hrtimer_setup: hrtimer=00000000f2d53899 clockid=CLOCK_MONOTONIC mode=REL
|       slice_test-2904    [001] d..1.  2313.285507: hrtimer_start: hrtimer=00000000f2d53899 function=hrtimer_wakeup expires=2313208168792 softexpires=2313208118792 mode=REL
|       slice_test-2904    [001] d..2.  2313.285508: sched_stat_runtime: comm=slice_test pid=2904 runtime=6149 [ns]
|       slice_test-2904    [001] d..2.  2313.285510: sched_switch: prev_comm=slice_test prev_pid=2904 prev_prio=120 prev_state=S ==> next_comm=slice_test next_pid=2903 next_prio=120
|       slice_test-2903    [001] .....  2313.285510: sys_enter: NR 470 (7fffc04f1ff0, c350, 11a0e0, 0, 7f4692e99000, 0)

slice_test-2903 enters rseq_slice_yield() only _now_, so it must have been in
userland during the suppressed wakeup at 2313.285457.
But a few iterations later it turned out that this trace event is recorded
_after_ the rseq magic happens at sys_enter time. We entered
rseq_slice_yield() a few cycles after the extension was granted. Buh.
So it seems to work as intended, but it is not obvious to tell from tracing
why it appears not to work.
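
To put numbers on the gaps discussed above, here is a minimal Python sketch
that parses a few of the quoted trace lines and reports the microseconds
between two events of one PID. The regular expression and the helper names
(parse, delta_us) are illustrative assumptions, not part of any tracing tool.
Since the sys_enter event is recorded after the rseq handling at syscall
entry, the grant-to-sys_enter delta is only an upper bound.

```python
import re

# Matches one ftrace line as quoted in this mail, with optional '|' prefix.
LINE = re.compile(
    r"^\|?\s*(?P<comm>\S+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+(?P<flags>\S+)\s+"
    r"(?P<sec>\d+)\.(?P<usec>\d{6}):\s+(?P<event>\w+):\s*(?P<rest>.*)$"
)


def parse(line):
    """Return pid, timestamp in microseconds and event text, or None."""
    m = LINE.match(line)
    if m is None:
        return None
    return {
        "pid": int(m["pid"]),
        # Integer arithmetic avoids floating-point rounding of timestamps.
        "usec": int(m["sec"]) * 1_000_000 + int(m["usec"]),
        "text": m["event"] + ": " + m["rest"],
    }


def delta_us(lines, pid, first, second):
    """Microseconds from the first line of `pid` containing `first`
    to the next later line of `pid` containing `second`."""
    start = None
    for line in lines:
        ev = parse(line)
        if ev is None or ev["pid"] != pid:
            continue
        if start is None:
            if first in ev["text"]:
                start = ev["usec"]
        elif second in ev["text"]:
            return ev["usec"] - start
    return None


TRACE = [
    "|       slice_test-2903    [001] d....  2313.285457: irqentry_exit: rseq_grant_slice_extension(216)",
    "|       slice_test-2903    [001] d..1.  2313.285458: hrtimer_start: hrtimer=0000000030a688cc function=rseq_slice_expired expires=2313208047790 softexpires=2313208047790 mode=ABS|PINNED|HARD",
    "|       slice_test-2903    [001] d..2.  2313.285484: hrtimer_cancel: hrtimer=0000000030a688cc",
    "|       slice_test-2903    [001] .....  2313.285510: sys_enter: NR 470 (7fffc04f1ff0, c350, 11a0e0, 0, 7f4692e99000, 0)",
]

if __name__ == "__main__":
    print(delta_us(TRACE, 2903, "rseq_grant_slice_extension", "sys_enter: NR 470"))
    print(delta_us(TRACE, 2903, "rseq_slice_expired", "hrtimer_cancel"))
```

On the lines quoted here this reports 53 us from the grant to the recorded
sys_enter, and 26 us from arming the slice timer to cancelling it.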

Sebastian
