All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>
Cc: Michael Jeanson <mjeanson@efficios.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Boqun Feng <boqun.feng@gmail.com>, Wei Liu <wei.liu@kernel.org>
Subject: Re: [patch 00/11] rseq: Optimize exit to user space
Date: Wed, 13 Aug 2025 11:45:09 -0600	[thread overview]
Message-ID: <12342355-b3fb-4e78-ad5b-dcfff1366ccf@kernel.dk> (raw)
In-Reply-To: <20250813155941.014821755@linutronix.de>

On 8/13/25 10:29 AM, Thomas Gleixner wrote:
> With the more wide spread usage of rseq in glibc, rseq is not longer a
> niche use case for special applications.
> 
> While working on a sane implementation of a rseq based time slice extension
> mechanism, I noticed several shortcomings of the current rseq code:
> 
>   1) task::rseq_event_mask is a pointless bitfield despite the fact that
>      the ABI flags it was meant to support have been deprecated and
>      functionally disabled three years ago.
> 
>   2) task::rseq_event_mask is accumulating bits unless there is a critical
>      section discovered in the user space rseq memory. This results in
>      pointless invocations of the rseq user space exit handler even if
>      there had nothing changed. As a matter of correctness these bits have
>      to be clear when exiting to user space and therefore pristine when
>      coming back into the kernel. Aside of correctness, this also avoids
>      pointless evaluation of the user space memory, which is a performance
>      benefit.
> 
>   3) The evaluation of critical sections does not differentiate between
>      syscall and interrupt/exception exits. The current implementation
>      silently fixes up critical sections which invoked a syscall unless
>      CONFIG_DEBUG_RSEQ is enabled.
> 
>      That's just wrong. If user space does that on a production kernel it
>      can keep the pieces. The kernel is not there to proliferate mindless
>      user space programming and letting everyone pay the performance
>      penalty.
> 
> This series addresses these issues and on top converts parts of the user
> space access over to the new masked access model, which lowers the overhead
> of Spectre-V1 mitigations significantly on architectures which support it
> (x86 as of today). This is especially noticable in the access to the
> rseq_cs field in struct rseq, which is the first quick check to figure out
> whether a critical section is installed or not.
> 
> It survives the kernels rseq selftests, but I did not any performance tests
> vs. rseq because I have no idea how to use the gazillion of undocumented
> command line parameters of the benchmark. I leave that to people who are so
> familiar with them, that they assume everyone else is too :)
> 
> The performance gain on regular workloads is clearly measurable and the
> consistent event flag state allows now to build the time slice extension
> mechanism on top. The first POC I implemented:
> 
>    https://lore.kernel.org/lkml/87o6smb3a0.ffs@tglx/
> 
> suffered badly from the stale eventmask bits and the cleaned up version
> brought a whopping 25+% performance gain.

Thanks for doing this work, it's been on my list to take a look at rseq
as it's quite the pig currently and enabled by default (with what I
assume is from a newer libc).

-- 
Jens Axboe


  parent reply	other threads:[~2025-08-13 17:45 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-13 16:29 [patch 00/11] rseq: Optimize exit to user space Thomas Gleixner
2025-08-13 16:29 ` [patch 01/11] rseq: Avoid pointless evaluation in __rseq_notify_resume() Thomas Gleixner
2025-08-20 14:23   ` Mathieu Desnoyers
2025-08-13 16:29 ` [patch 02/11] rseq: Condense the inline stubs Thomas Gleixner
2025-08-20 14:24   ` Mathieu Desnoyers
2025-08-13 16:29 ` [patch 03/11] rseq: Rename rseq_syscall() to rseq_debug_syscall_exit() Thomas Gleixner
2025-08-20 14:25   ` Mathieu Desnoyers
2025-08-13 16:29 ` [patch 04/11] rseq: Replace the pointless event mask bit fiddling Thomas Gleixner
2025-08-13 16:29 ` [patch 05/11] rseq: Optimize the signal delivery path Thomas Gleixner
2025-08-13 16:29 ` [patch 06/11] rseq: Optimize exit to user space further Thomas Gleixner
2025-08-13 16:29 ` [patch 07/11] entry: Cleanup header Thomas Gleixner
2025-08-13 17:09   ` Giorgi Tchankvetadze
2025-08-13 21:30     ` Thomas Gleixner
2025-08-13 16:29 ` [patch 08/11] entry: Distinguish between syscall and interrupt exit Thomas Gleixner
2025-08-13 16:29 ` [patch 09/11] entry: Provide exit_to_user_notify_resume() Thomas Gleixner
2025-08-13 16:29 ` [patch 10/11] rseq: Skip fixup when returning from a syscall Thomas Gleixner
2025-08-14  8:54   ` Peter Zijlstra
2025-08-14 13:24     ` Thomas Gleixner
2025-08-13 16:29 ` [patch 11/11] rseq: Convert to masked user access where applicable Thomas Gleixner
2025-08-13 17:45 ` Jens Axboe [this message]
2025-08-13 21:32   ` [patch 00/11] rseq: Optimize exit to user space Thomas Gleixner
2025-08-13 21:36     ` Jens Axboe
2025-08-13 22:08       ` Thomas Gleixner
2025-08-17 21:23         ` Thomas Gleixner
2025-08-18 14:00           ` BUG: rseq selftests and librseq vs. glibc fail Thomas Gleixner
2025-08-18 14:15             ` Florian Weimer
2025-08-18 17:13               ` Thomas Gleixner
2025-08-18 19:33                 ` Florian Weimer
2025-08-18 19:46                   ` Sean Christopherson
2025-08-18 19:55                     ` Florian Weimer
2025-08-18 20:27                       ` Sean Christopherson
2025-08-18 23:54                         ` Thomas Gleixner
2025-08-19  0:28                           ` Sean Christopherson
2025-08-19  6:18                             ` Florian Weimer
2025-08-29 18:44                 ` Prakash Sangappa
2025-08-29 18:50                   ` Mathieu Desnoyers
2025-09-01 19:30                     ` Prakash Sangappa
2025-08-18 17:38           ` [patch 00/11] rseq: Optimize exit to user space Michael Jeanson
2025-08-18 20:21             ` Thomas Gleixner
2025-08-18 21:29               ` Michael Jeanson
2025-08-18 23:43                 ` Thomas Gleixner
2025-08-20 14:27           ` Mathieu Desnoyers
2025-08-20 14:10 ` Mathieu Desnoyers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=12342355-b3fb-4e78-ad5b-dcfff1366ccf@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=boqun.feng@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mjeanson@efficios.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=wei.liu@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.