public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Michael Jeanson <mjeanson@efficios.com>,
	Jens Axboe <axboe@kernel.dk>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Boqun Feng <boqun.feng@gmail.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Wei Liu <wei.liu@kernel.org>, Dexuan Cui <decui@microsoft.com>,
	x86@kernel.org, Arnd Bergmann <arnd@arndb.de>,
	Heiko Carstens <hca@linux.ibm.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Sven Schnelle <svens@linux.ibm.com>,
	Huacai Chen <chenhuacai@kernel.org>,
	Paul Walmsley <paul.walmsley@sifive.com>,
	Palmer Dabbelt <palmer@dabbelt.com>
Subject: [patch V4 03/36] rseq: Move algorithm comment to top
Date: Mon,  8 Sep 2025 23:31:30 +0200 (CEST)	[thread overview]
Message-ID: <20250908212925.453479676@linutronix.de> (raw)
In-Reply-To: 20250908212737.353775467@linutronix.de

Move the comment which documents the RSEQ algorithm to the top of the file,
so it does not create horrible diffs later when the actual implementation
is fed into the mincer.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

---
 kernel/rseq.c |  119 ++++++++++++++++++++++++++++------------------------------
 1 file changed, 59 insertions(+), 60 deletions(-)

--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -8,6 +8,65 @@
  * Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
  */
 
+/*
+ * Restartable sequences are a lightweight interface that allows
+ * user-level code to be executed atomically relative to scheduler
+ * preemption and signal delivery. Typically used for implementing
+ * per-cpu operations.
+ *
+ * It allows user-space to perform update operations on per-cpu data
+ * without requiring heavy-weight atomic operations.
+ *
+ * Detailed algorithm of rseq user-space assembly sequences:
+ *
+ *                     init(rseq_cs)
+ *                     cpu = TLS->rseq::cpu_id_start
+ *   [1]               TLS->rseq::rseq_cs = rseq_cs
+ *   [start_ip]        ----------------------------
+ *   [2]               if (cpu != TLS->rseq::cpu_id)
+ *                             goto abort_ip;
+ *   [3]               <last_instruction_in_cs>
+ *   [post_commit_ip]  ----------------------------
+ *
+ *   The address of jump target abort_ip must be outside the critical
+ *   region, i.e.:
+ *
+ *     [abort_ip] < [start_ip]  || [abort_ip] >= [post_commit_ip]
+ *
+ *   Steps [2]-[3] (inclusive) need to be a sequence of instructions in
+ *   userspace that can handle being interrupted between any of those
+ *   instructions, and then resumed to the abort_ip.
+ *
+ *   1.  Userspace stores the address of the struct rseq_cs assembly
+ *       block descriptor into the rseq_cs field of the registered
+ *       struct rseq TLS area. This update is performed through a single
+ *       store within the inline assembly instruction sequence.
+ *       [start_ip]
+ *
+ *   2.  Userspace tests to check whether the current cpu_id field match
+ *       the cpu number loaded before start_ip, branching to abort_ip
+ *       in case of a mismatch.
+ *
+ *       If the sequence is preempted or interrupted by a signal
+ *       at or after start_ip and before post_commit_ip, then the kernel
+ *       clears TLS->__rseq_abi::rseq_cs, and sets the user-space return
+ *       ip to abort_ip before returning to user-space, so the preempted
+ *       execution resumes at abort_ip.
+ *
+ *   3.  Userspace critical section final instruction before
+ *       post_commit_ip is the commit. The critical section is
+ *       self-terminating.
+ *       [post_commit_ip]
+ *
+ *   4.  <success>
+ *
+ *   On failure at [2], or if interrupted by preempt or signal delivery
+ *   between [1] and [3]:
+ *
+ *       [abort_ip]
+ *   F1. <failure>
+ */
+
 #include <linux/sched.h>
 #include <linux/uaccess.h>
 #include <linux/syscalls.h>
@@ -98,66 +157,6 @@ static int rseq_validate_ro_fields(struc
 	unsafe_put_user(value, &t->rseq->field, error_label)
 #endif
 
-/*
- *
- * Restartable sequences are a lightweight interface that allows
- * user-level code to be executed atomically relative to scheduler
- * preemption and signal delivery. Typically used for implementing
- * per-cpu operations.
- *
- * It allows user-space to perform update operations on per-cpu data
- * without requiring heavy-weight atomic operations.
- *
- * Detailed algorithm of rseq user-space assembly sequences:
- *
- *                     init(rseq_cs)
- *                     cpu = TLS->rseq::cpu_id_start
- *   [1]               TLS->rseq::rseq_cs = rseq_cs
- *   [start_ip]        ----------------------------
- *   [2]               if (cpu != TLS->rseq::cpu_id)
- *                             goto abort_ip;
- *   [3]               <last_instruction_in_cs>
- *   [post_commit_ip]  ----------------------------
- *
- *   The address of jump target abort_ip must be outside the critical
- *   region, i.e.:
- *
- *     [abort_ip] < [start_ip]  || [abort_ip] >= [post_commit_ip]
- *
- *   Steps [2]-[3] (inclusive) need to be a sequence of instructions in
- *   userspace that can handle being interrupted between any of those
- *   instructions, and then resumed to the abort_ip.
- *
- *   1.  Userspace stores the address of the struct rseq_cs assembly
- *       block descriptor into the rseq_cs field of the registered
- *       struct rseq TLS area. This update is performed through a single
- *       store within the inline assembly instruction sequence.
- *       [start_ip]
- *
- *   2.  Userspace tests to check whether the current cpu_id field match
- *       the cpu number loaded before start_ip, branching to abort_ip
- *       in case of a mismatch.
- *
- *       If the sequence is preempted or interrupted by a signal
- *       at or after start_ip and before post_commit_ip, then the kernel
- *       clears TLS->__rseq_abi::rseq_cs, and sets the user-space return
- *       ip to abort_ip before returning to user-space, so the preempted
- *       execution resumes at abort_ip.
- *
- *   3.  Userspace critical section final instruction before
- *       post_commit_ip is the commit. The critical section is
- *       self-terminating.
- *       [post_commit_ip]
- *
- *   4.  <success>
- *
- *   On failure at [2], or if interrupted by preempt or signal delivery
- *   between [1] and [3]:
- *
- *       [abort_ip]
- *   F1. <failure>
- */
-
 static int rseq_update_cpu_node_id(struct task_struct *t)
 {
 	struct rseq __user *rseq = t->rseq;




  parent reply	other threads:[~2025-09-08 21:31 UTC|newest]

Thread overview: 83+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-08 21:31 [patch V4 00/36] rseq: Optimize exit to user space Thomas Gleixner
2025-09-08 21:31 ` [patch V4 01/36] rseq: Avoid pointless evaluation in __rseq_notify_resume() Thomas Gleixner
2025-09-08 21:31 ` [patch V4 02/36] rseq: Condense the inline stubs Thomas Gleixner
2025-09-08 21:31 ` Thomas Gleixner [this message]
2025-09-08 21:31 ` [patch V4 04/36] rseq: Remove the ksig argument from rseq_handle_notify_resume() Thomas Gleixner
2025-09-08 21:31 ` [patch V4 05/36] rseq: Simplify registration Thomas Gleixner
2025-09-08 21:31 ` [patch V4 06/36] rseq: Simplify the event notification Thomas Gleixner
2025-09-09 13:18   ` Mathieu Desnoyers
2025-09-08 21:31 ` [patch V4 07/36] rseq, virt: Retrigger RSEQ after vcpu_run() Thomas Gleixner
2025-09-09  0:00   ` Sean Christopherson
2025-09-09 12:10     ` Thomas Gleixner
2025-09-09 13:21   ` Mathieu Desnoyers
2025-09-08 21:31 ` [patch V4 08/36] rseq: Avoid CPU/MM CID updates when no event pending Thomas Gleixner
2025-09-09 13:25   ` Mathieu Desnoyers
2025-09-08 21:31 ` [patch V4 09/36] rseq: Introduce struct rseq_data Thomas Gleixner
2025-09-09 13:30   ` Mathieu Desnoyers
2025-09-12 20:44     ` Thomas Gleixner
2025-09-12 21:33       ` Mathieu Desnoyers
2025-09-08 21:31 ` [patch V4 10/36] entry: Cleanup header Thomas Gleixner
2025-09-08 21:31 ` [patch V4 11/36] entry: Remove syscall_enter_from_user_mode_prepare() Thomas Gleixner
2025-09-09 13:33   ` Mathieu Desnoyers
2025-09-08 21:31 ` [patch V4 12/36] entry: Inline irqentry_enter/exit_from/to_user_mode() Thomas Gleixner
2025-09-09 13:38   ` Mathieu Desnoyers
2025-09-09 14:10     ` Thomas Gleixner
2025-09-09 14:59       ` Mathieu Desnoyers
2025-09-08 21:31 ` [patch V4 13/36] sched: Move MM CID related functions to sched.h Thomas Gleixner
2025-09-08 21:31 ` [patch V4 14/36] rseq: Cache CPU ID and MM CID values Thomas Gleixner
2025-09-09 13:43   ` Mathieu Desnoyers
2025-09-09 14:13     ` Thomas Gleixner
2025-09-09 15:01       ` Mathieu Desnoyers
2025-09-08 21:31 ` [patch V4 15/36] rseq: Record interrupt from user space Thomas Gleixner
2025-09-09 13:53   ` Mathieu Desnoyers
2025-09-09 14:17     ` Thomas Gleixner
2025-09-09 15:05       ` Mathieu Desnoyers
2025-09-08 21:31 ` [patch V4 16/36] rseq: Provide tracepoint wrappers for inline code Thomas Gleixner
2025-09-08 21:31 ` [patch V4 17/36] rseq: Expose lightweight statistics in debugfs Thomas Gleixner
2025-09-08 21:32 ` [patch V4 18/36] rseq: Provide static branch for runtime debugging Thomas Gleixner
2025-09-08 21:32 ` [patch V4 19/36] rseq: Provide and use rseq_update_user_cs() Thomas Gleixner
2025-09-09 15:11   ` Mathieu Desnoyers
2025-09-08 21:32 ` [patch V4 20/36] rseq: Replace the original debug implementation Thomas Gleixner
2025-09-08 21:32 ` [patch V4 21/36] rseq: Make exit debugging static branch based Thomas Gleixner
2025-09-08 21:32 ` [patch V4 22/36] rseq: Use static branch for syscall exit debug when GENERIC_IRQ_ENTRY=y Thomas Gleixner
2025-09-08 21:32 ` [patch V4 23/36] rseq: Provide and use rseq_set_ids() Thomas Gleixner
2025-09-11 13:40   ` Mathieu Desnoyers
2025-09-11 16:02     ` Thomas Gleixner
2025-09-11 17:13       ` Mathieu Desnoyers
2025-09-08 21:32 ` [patch V4 24/36] rseq: Separate the signal delivery path Thomas Gleixner
2025-09-08 21:32 ` [patch V4 25/36] rseq: Rework the TIF_NOTIFY handler Thomas Gleixner
2025-09-08 21:32 ` [patch V4 26/36] rseq: Optimize event setting Thomas Gleixner
2025-09-11 14:03   ` Mathieu Desnoyers
2025-09-11 16:06     ` Thomas Gleixner
2025-09-11 17:15       ` Mathieu Desnoyers
2025-09-12  6:58         ` Thomas Gleixner
2025-09-08 21:32 ` [patch V4 27/36] rseq: Implement fast path for exit to user Thomas Gleixner
2025-09-11 14:27   ` Mathieu Desnoyers
2025-09-11 16:08     ` Thomas Gleixner
2025-09-08 21:32 ` [patch V4 28/36] rseq: Switch to fast path processing on " Thomas Gleixner
2025-09-11 14:44   ` Mathieu Desnoyers
2025-09-11 14:45     ` Mathieu Desnoyers
2025-09-11 16:50       ` Thomas Gleixner
2025-09-11 16:47     ` Thomas Gleixner
2025-09-11 20:00       ` Mathieu Desnoyers
2025-09-12 14:22         ` Thomas Gleixner
2025-09-12 15:44           ` Mathieu Desnoyers
2025-09-08 21:32 ` [patch V4 29/36] entry: Split up exit_to_user_mode_prepare() Thomas Gleixner
2025-09-08 21:32 ` [patch V4 30/36] rseq: Split up rseq_exit_to_user_mode() Thomas Gleixner
2025-09-08 21:32 ` [patch V4 31/36] asm-generic: Provide generic TIF infrastructure Thomas Gleixner
2025-09-17  6:16   ` [tip: core/core] " tip-bot2 for Thomas Gleixner
2025-09-08 21:32 ` [patch V4 32/36] x86: Use generic TIF bits Thomas Gleixner
2025-09-17  6:16   ` [tip: core/core] " tip-bot2 for Thomas Gleixner
2025-09-08 21:32 ` [patch V4 33/36] s390: " Thomas Gleixner
2025-09-11  9:11   ` Sven Schnelle
2025-09-11 11:03   ` Heiko Carstens
2025-09-17  6:16   ` [tip: core/core] " tip-bot2 for Thomas Gleixner
2025-09-08 21:32 ` [patch V4 34/36] loongarch: " Thomas Gleixner
2025-09-17  6:16   ` [tip: core/core] " tip-bot2 for Thomas Gleixner
2025-09-08 21:32 ` [patch V4 35/36] riscv: " Thomas Gleixner
2025-09-17  6:16   ` [tip: core/core] " tip-bot2 for Thomas Gleixner
2025-09-08 21:32 ` [patch V4 36/36] rseq: Switch to TIF_RSEQ if supported Thomas Gleixner
2025-09-10 13:55 ` [patch V4 00/36] rseq: Optimize exit to user space Jens Axboe
2025-09-10 14:45   ` Michael Jeanson
2025-09-10 15:34     ` Jens Axboe
2025-09-10 14:54   ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250908212925.453479676@linutronix.de \
    --to=tglx@linutronix.de \
    --cc=arnd@arndb.de \
    --cc=axboe@kernel.dk \
    --cc=boqun.feng@gmail.com \
    --cc=borntraeger@linux.ibm.com \
    --cc=chenhuacai@kernel.org \
    --cc=decui@microsoft.com \
    --cc=hca@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mjeanson@efficios.com \
    --cc=palmer@dabbelt.com \
    --cc=paul.walmsley@sifive.com \
    --cc=paulmck@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=seanjc@google.com \
    --cc=svens@linux.ibm.com \
    --cc=wei.liu@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox