From: Thomas Gleixner <tglx@linutronix.de>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Jeanson <mjeanson@efficios.com>,
Jens Axboe <axboe@kernel.dk>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Peter Zijlstra <peterz@infradead.org>,
"Paul E. McKenney" <paulmck@kernel.org>,
x86@kernel.org, Sean Christopherson <seanjc@google.com>,
Wei Liu <wei.liu@kernel.org>
Subject: [patch V6 28/31] rseq: Switch to fast path processing on exit to user
Date: Mon, 27 Oct 2025 09:45:19 +0100 (CET)
Message-ID: <20251027084307.701201365@linutronix.de>
In-Reply-To: 20251027084220.785525188@linutronix.de
Now that all bits and pieces are in place, hook the RSEQ handling fast path
function into exit_to_user_mode_prepare() after the TIF work bits have been
handled. In case of fast path failure, TIF_NOTIFY_RESUME has been raised
and the caller needs to take another turn through the TIF handling slow
path.
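To illustrate the control flow described above, here is a minimal user-space
model of the retry loop. It is a sketch only: struct demo_task,
DEMO_TIF_NOTIFY_RESUME and all demo_*() functions are hypothetical stand-ins
and not kernel interfaces, and a fast path failure is simulated with a plain
counter.

```c
#include <stdbool.h>

/* Hypothetical model flag, standing in for TIF_NOTIFY_RESUME */
#define DEMO_TIF_NOTIFY_RESUME	0x1UL

/* Hypothetical per-task model state; not a kernel structure */
struct demo_task {
	unsigned long	ti_work;		/* pending work bits */
	int		fastpath_failures;	/* simulated failures left */
	int		slowpath_runs;		/* slow path invocations */
};

/* Models the TIF work loop: handle pending bits until none are left */
static unsigned long demo_work_loop(struct demo_task *t, unsigned long ti_work)
{
	while (ti_work & DEMO_TIF_NOTIFY_RESUME) {
		t->ti_work &= ~DEMO_TIF_NOTIFY_RESUME;
		t->slowpath_runs++;
		ti_work = t->ti_work;
	}
	return ti_work;
}

/*
 * Models the fast path hook: returns true when the fast path failed.
 * In that case the notify bit has been raised and the caller has to
 * take another turn through the work loop.
 */
static bool demo_fastpath_restart(struct demo_task *t)
{
	if (!t->fastpath_failures)
		return false;
	t->fastpath_failures--;
	t->ti_work |= DEMO_TIF_NOTIFY_RESUME;
	return true;
}

/* Models the outer loop: fast path runs after the work bits are handled */
static unsigned long demo_exit_to_user_loop(struct demo_task *t,
					    unsigned long ti_work)
{
	for (;;) {
		ti_work = demo_work_loop(t, ti_work);

		if (!demo_fastpath_restart(t))
			return ti_work;
		ti_work = t->ti_work;
	}
}
```

With fastpath_failures set to 1 the model goes through the slow path exactly
once before returning. With 0 it returns after the first pass without ever
touching the slow path.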
This only works for architectures which use the generic entry code.
Architectures which still have their own incomplete hacks are not supported
and won't be.
This results in the following improvements:
  Kernel build        Before               After          Reduction

  exit to user:     80692981            80514451
  signal checks:       32581                 121                99%
  slowpath runs:     1201408   1.49%         198   0.00%       100%
  fastpath runs:                          675941   0.84%        N/A
  id updates:        1233989   1.53%       50541   0.06%        96%
  cs checks:         1125366   1.39%           0   0.00%       100%
    cs cleared:      1125366    100%           0               100%
    cs fixup:              0      0%           0

  RSEQ selftests      Before               After          Reduction

  exit to user:    386281778           387373750
  signal checks:    35661203                   0               100%
  slowpath runs:   140542396  36.38%         100   0.00%       100%
  fastpath runs:                         9509789   2.51%        N/A
  id updates:      176203599  45.62%     9087994   2.35%        95%
  cs checks:       175587856  45.46%     4728394   1.22%        98%
    cs cleared:    172359544  98.16%     1319307  27.90%        99%
    cs fixup:        3228312   1.84%     3409087  72.10%

The 'cs cleared' and 'cs fixup' percentages are not relative to the exit to
user invocations, they are relative to the actual 'cs check' invocations.
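As a quick cross-check of the two different bases, the following sketch
recomputes three of the 'After' percentages of the RSEQ selftests run from the
raw counts quoted above. demo_pct() and demo_print_pct() are hypothetical
helpers written for this illustration only.

```c
#include <stdio.h>

/* Percentage of @part relative to @whole. Returns 0 when @whole is 0. */
static double demo_pct(unsigned long long part, unsigned long long whole)
{
	return whole ? 100.0 * (double)part / (double)whole : 0.0;
}

/* Print one labelled percentage with two decimals */
static void demo_print_pct(const char *label, unsigned long long part,
			   unsigned long long whole)
{
	printf("%-12s %6.2f%%\n", label, demo_pct(part, whole));
}
```

Calling demo_pct(4728394, 387373750) relates 'cs checks' to the exit to user
count, while demo_pct(1319307, 4728394) and demo_pct(3409087, 4728394) relate
'cs cleared' and 'cs fixup' to the 'cs checks' count. Rounded to two decimals
these give the 1.22%, 27.90% and 72.10% shown in the table.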
While some of this could have been avoided in the original code, like the
obvious clearing of CS when it's already clear, the main problem of going
through TIF_NOTIFY_RESUME cannot be solved. In some workloads the RSEQ
notify handler is invoked more than once before going out to user
space. Doing this once when everything has stabilized is the only solution
to avoid this.
The initial attempt to completely decouple it from the TIF work turned out
to be suboptimal for workloads which do a lot of quick and short system
calls. Even if the fast path decision is only 4 instructions (including a
conditional branch), this adds up quickly and becomes measurable when the
rate for actually having to handle rseq is in the low single digit
percentage range of user/kernel transitions.
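As a rough illustration of how small such a per-exit decision is, the sketch
below combines two flag bytes with a bitwise '&', the trick which the comment
in rseq_handle_slowpath() in the diff refers to. Both operands are evaluated
without short-circuiting, so a compiler can typically get away with a single
conditional branch. struct demo_event and demo_needs_slowpath() are
hypothetical names, and the real layout of rseq.event is not part of this
patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-task event flags; one byte each */
struct demo_event {
	unsigned char	sched_switch;	/* 1 if the task was scheduled out */
	unsigned char	has_rseq;	/* 1 if the task registered rseq */
};

/*
 * '&' instead of '&&' is deliberate: there is no short-circuit
 * evaluation, so the result can be tested with one branch. This
 * relies on both flags being strictly 0 or 1.
 */
static bool demo_needs_slowpath(const struct demo_event *ev)
{
	return ev->sched_switch & ev->has_rseq;
}
```

The check only yields true when both flags are set, so a task which never
registered rseq falls straight through.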
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V4: Move the rseq handling into a separate loop to avoid gotos later on
---
include/linux/irq-entry-common.h | 7 ++-----
include/linux/resume_user_mode.h | 2 +-
include/linux/rseq.h | 18 ++++++++++++------
init/Kconfig | 2 +-
kernel/entry/common.c | 26 +++++++++++++++++++-------
kernel/rseq.c | 8 ++++++--
6 files changed, 41 insertions(+), 22 deletions(-)
--- a/include/linux/irq-entry-common.h
+++ b/include/linux/irq-entry-common.h
@@ -197,11 +197,8 @@ static __always_inline void arch_exit_to
*/
void arch_do_signal_or_restart(struct pt_regs *regs);
-/**
- * exit_to_user_mode_loop - do any pending work before leaving to user space
- */
-unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
- unsigned long ti_work);
+/* Handle pending TIF work */
+unsigned long exit_to_user_mode_loop(struct pt_regs *regs, unsigned long ti_work);
/**
* exit_to_user_mode_prepare - call exit_to_user_mode_loop() if required
--- a/include/linux/resume_user_mode.h
+++ b/include/linux/resume_user_mode.h
@@ -59,7 +59,7 @@ static inline void resume_user_mode_work
mem_cgroup_handle_over_high(GFP_KERNEL);
blkcg_maybe_throttle_current();
- rseq_handle_notify_resume(regs);
+ rseq_handle_slowpath(regs);
}
#endif /* LINUX_RESUME_USER_MODE_H */
--- a/include/linux/rseq.h
+++ b/include/linux/rseq.h
@@ -7,13 +7,19 @@
#include <uapi/linux/rseq.h>
-void __rseq_handle_notify_resume(struct pt_regs *regs);
+void __rseq_handle_slowpath(struct pt_regs *regs);
-static inline void rseq_handle_notify_resume(struct pt_regs *regs)
+/* Invoked from resume_user_mode_work() */
+static inline void rseq_handle_slowpath(struct pt_regs *regs)
{
-	/* '&' is intentional to spare one conditional branch */
-	if (current->rseq.event.sched_switch & current->rseq.event.has_rseq)
-		__rseq_handle_notify_resume(regs);
+	if (IS_ENABLED(CONFIG_GENERIC_ENTRY)) {
+		if (current->rseq.event.slowpath)
+			__rseq_handle_slowpath(regs);
+	} else {
+		/* '&' is intentional to spare one conditional branch */
+		if (current->rseq.event.sched_switch & current->rseq.event.has_rseq)
+			__rseq_handle_slowpath(regs);
+	}
}
void __rseq_signal_deliver(int sig, struct pt_regs *regs);
@@ -152,7 +158,7 @@ static inline void rseq_fork(struct task
}
#else /* CONFIG_RSEQ */
-static inline void rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs) { }
+static inline void rseq_handle_slowpath(struct pt_regs *regs) { }
static inline void rseq_signal_deliver(struct ksignal *ksig, struct pt_regs *regs) { }
static inline void rseq_sched_switch_event(struct task_struct *t) { }
static inline void rseq_sched_set_task_cpu(struct task_struct *t, unsigned int cpu) { }
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1941,7 +1941,7 @@ config RSEQ_DEBUG_DEFAULT_ENABLE
config DEBUG_RSEQ
default n
bool "Enable debugging of rseq() system call" if EXPERT
- depends on RSEQ && DEBUG_KERNEL
+ depends on RSEQ && DEBUG_KERNEL && !GENERIC_ENTRY
select RSEQ_DEBUG_DEFAULT_ENABLE
help
Enable extra debugging checks for the rseq system call.
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -11,13 +11,8 @@
/* Workaround to allow gradual conversion of architecture code */
void __weak arch_do_signal_or_restart(struct pt_regs *regs) { }
-/**
- * exit_to_user_mode_loop - do any pending work before leaving to user space
- * @regs: Pointer to pt_regs on entry stack
- * @ti_work: TIF work flags as read by the caller
- */
-__always_inline unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
- unsigned long ti_work)
+static __always_inline unsigned long __exit_to_user_mode_loop(struct pt_regs *regs,
+ unsigned long ti_work)
{
/*
* Before returning to user space ensure that all pending work
@@ -62,6 +57,23 @@ void __weak arch_do_signal_or_restart(st
return ti_work;
}
+/**
+ * exit_to_user_mode_loop - do any pending work before leaving to user space
+ * @regs: Pointer to pt_regs on entry stack
+ * @ti_work: TIF work flags as read by the caller
+ */
+__always_inline unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
+						     unsigned long ti_work)
+{
+	for (;;) {
+		ti_work = __exit_to_user_mode_loop(regs, ti_work);
+
+		if (likely(!rseq_exit_to_user_mode_restart(regs)))
+			return ti_work;
+		ti_work = read_thread_flags();
+	}
+}
+
noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
{
irqentry_state_t ret = {
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -237,7 +237,11 @@ static bool rseq_handle_cs(struct task_s
static void rseq_slowpath_update_usr(struct pt_regs *regs)
{
- /* Preserve rseq state and user_irq state for exit to user */
+ /*
+ * Preserve rseq state and user_irq state. The generic entry code
+ * clears user_irq on the way out, the non-generic entry
+ * architectures are not having user_irq.
+ */
const struct rseq_event evt_mask = { .has_rseq = true, .user_irq = true, };
struct task_struct *t = current;
struct rseq_ids ids;
@@ -289,7 +293,7 @@ static void rseq_slowpath_update_usr(str
}
}
-void __rseq_handle_notify_resume(struct pt_regs *regs)
+void __rseq_handle_slowpath(struct pt_regs *regs)
{
/*
* If invoked from hypervisors before entering the guest via