From: Thomas Gleixner <tglx@linutronix.de>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Jeanson <mjeanson@efficios.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Boqun Feng <boqun.feng@gmail.com>, Wei Liu <wei.liu@kernel.org>,
	Jens Axboe <axboe@kernel.dk>
Subject: [patch 06/11] rseq: Optimize exit to user space further
Date: Wed, 13 Aug 2025 18:29:27 +0200 (CEST)
Message-ID: <20250813162824.164609663@linutronix.de>
In-Reply-To: 20250813155941.014821755@linutronix.de

Now that the event pending bit management is consistent, the invocation of
__rseq_handle_notify_resume() can be avoided if the event pending bit is
not set.

This is correct because of the following order:

  1)  if (TIF_NOTIFY_RESUME)
  2)    clear(TIF_NOTIFY_RESUME);
        smp_mb__after_atomic();
  3)    if (event_pending)
  4)      __rseq_handle_notify_resume()
  5)        guard()
  6)          work = check_and_clear_pending();
Any new event which hits between #1 and #2 will be visible in #3.

Any new event which hits after #2 will either be visible in #3 and
therefore consumed in #6, or be missed in #3. The latter is not a problem
because the new event also re-raises TIF_NOTIFY_RESUME, which causes the
calling exit loop to take another round.

The quick check #3 optimizes for the common case, where event_pending is
false. Ignore the quick check when CONFIG_DEBUG_RSEQ is enabled to widen
the test coverage.
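
The ordering argument above can be replayed in a small user-space model.
This is only a sketch under stated assumptions, not the kernel
implementation: struct task_model, model_notify_event(), model_slow_path()
and model_exit_loop() are hypothetical names, C11 seq_cst atomics stand in
for the flag clear plus smp_mb__after_atomic(), and the "concurrent" event
is injected deterministically at a chosen point instead of coming from
another CPU. The debug argument models CONFIG_DEBUG_RSEQ bypassing the
quick check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the relevant task state. */
struct task_model {
	atomic_bool notify_resume;	/* models TIF_NOTIFY_RESUME */
	atomic_bool event_pending;	/* models rseq_event_pending */
	int slow_calls;			/* entries into the slow path (#4) */
	int handled;			/* events consumed in #6 */
};

/* Points at which the simulation injects exactly one new event. */
enum inject {
	INJECT_NONE,
	INJECT_BETWEEN_1_AND_2,
	INJECT_BETWEEN_2_AND_3,
	INJECT_AFTER_3,
};

void model_init(struct task_model *t)
{
	/* Assumption: some unrelated work already raised the flag. */
	atomic_init(&t->notify_resume, true);
	atomic_init(&t->event_pending, false);
	t->slow_calls = 0;
	t->handled = 0;
}

/* Producer: publish the event first, then raise the notification flag. */
static void model_notify_event(struct task_model *t)
{
	atomic_store(&t->event_pending, true);
	atomic_store(&t->notify_resume, true);
}

/* #4 - #6: the slow path consumes the pending event, if there is one. */
static void model_slow_path(struct task_model *t)
{
	t->slow_calls++;
	if (atomic_exchange(&t->event_pending, false))
		t->handled++;
}

/* Inject the event once, when the loop reaches the requested point. */
static void maybe_inject(struct task_model *t, enum inject *when,
			 enum inject now)
{
	if (*when == now) {
		model_notify_event(t);
		*when = INJECT_NONE;
	}
}

/* Runs the modelled exit loop and returns the number of rounds taken. */
int model_exit_loop(struct task_model *t, enum inject when, bool debug)
{
	int rounds = 0;

	while (atomic_load(&t->notify_resume)) {		/* #1 */
		rounds++;
		maybe_inject(t, &when, INJECT_BETWEEN_1_AND_2);
		atomic_store(&t->notify_resume, false);		/* #2 */
		maybe_inject(t, &when, INJECT_BETWEEN_2_AND_3);
		if (debug || atomic_load(&t->event_pending))	/* #3 */
			model_slow_path(t);
		maybe_inject(t, &when, INJECT_AFTER_3);
	}

	/* No event may be left behind when returning to user space. */
	assert(!atomic_load(&t->event_pending));
	return rounds;
}
```

In this model an event injected between #1 and #2 is consumed in the
first round. An event injected after #2 is either consumed in the same
round, with one extra round that skips the slow path, or it is missed by
#3 and consumed in the second round. With debug set and no event at all,
the slow path is still entered once but finds nothing to consume.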

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
---
 include/linux/rseq.h |    8 +++++---
 kernel/rseq.c        |   17 +++++++++++++----
 2 files changed, 18 insertions(+), 7 deletions(-)

--- a/include/linux/rseq.h
+++ b/include/linux/rseq.h
@@ -17,7 +17,7 @@ void __rseq_handle_notify_resume(struct
 
 static inline void rseq_handle_notify_resume(struct pt_regs *regs)
 {
-	if (current->rseq)
+	if (IS_ENABLED(CONFIG_DEBUG_RSEQ) || READ_ONCE(current->rseq_event_pending))
 		__rseq_handle_notify_resume(NULL, regs);
 }
 
@@ -30,8 +30,10 @@ static inline void rseq_signal_deliver(s
 
 static inline void rseq_notify_event(struct task_struct *t)
 {
+	lockdep_assert_irqs_disabled();
+
 	if (t->rseq) {
-		t->rseq_event_pending = true;
+		WRITE_ONCE(t->rseq_event_pending, true);
 		set_tsk_thread_flag(t, TIF_NOTIFY_RESUME);
 	}
 }
@@ -59,7 +61,7 @@ static inline void rseq_fork(struct task
 		t->rseq = current->rseq;
 		t->rseq_len = current->rseq_len;
 		t->rseq_sig = current->rseq_sig;
-		t->rseq_event_pending = current->rseq_event_pending;
+		t->rseq_event_pending = READ_ONCE(current->rseq_event_pending);
 	}
 }
 
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -524,9 +524,17 @@ SYSCALL_DEFINE4(rseq, struct rseq __user
 		ret = rseq_reset_rseq_cpu_node_id(current);
 		if (ret)
 			return ret;
-		current->rseq = NULL;
-		current->rseq_sig = 0;
-		current->rseq_len = 0;
+
+		/*
+		 * Ensure consistency of tsk::rseq and tsk::rseq_event_pending
+		 * vs. the scheduler and the RSEQ IPIs.
+		 */
+		scoped_guard(RSEQ_EVENT_GUARD) {
+			current->rseq = NULL;
+			current->rseq_sig = 0;
+			current->rseq_len = 0;
+			current->rseq_event_pending = false;
+		}
 		return 0;
 	}
 
@@ -601,7 +609,8 @@ SYSCALL_DEFINE4(rseq, struct rseq __user
 	 * registered, ensure the cpu_id_start and cpu_id fields
 	 * are updated before returning to user-space.
 	 */
-	rseq_set_notify_resume(current);
+	scoped_guard(irq)
+		rseq_notify_event(current);
 
 	return 0;
 }


