Re: [RFC] in-kernel rseq - Heiko Carstens

From: Heiko Carstens <hca@linux.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	mathieu.desnoyers@efficios.com,
	Mark Rutland <mark.rutland@arm.com>,
	cmarinas@kernel.org, maddy@linux.ibm.com, ryan.roberts@arm.com
Subject: Re: [RFC] in-kernel rseq
Date: Tue, 24 Feb 2026 12:16:46 +0100
Message-ID: <20260224111646.20006Ddc-hca@linux.ibm.com>
In-Reply-To: <20260223163843.GR1282955@noisy.programming.kicks-ass.net>

On Mon, Feb 23, 2026 at 05:38:43PM +0100, Peter Zijlstra wrote:
> This means, it needs to be woven into the asm... and I'm not that handy
> with arm64 asm.
> 
> The pseudo code would be something like:
> 
> 	current->sched_seq = &_R;
> 	...
> 
> _start:  compute per cpu-addr
> 	 load addr
> 	 $OP
> _commit: store addr
> 
> 	...
> 	current->sched_rseq = NULL;
> 
> 
> Then when preemption happens (from interrupt), the instruction pointer
> is 'simply' reset to _start and it tries again.

I guess current->sched_rseq would also need to be saved on entry to, and
restored on exit from, every interrupt, exception, and NMI, since those
contexts can make use of this_cpu ops as well.
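
As a rough user-space model of that save/restore, something like the
following could work. Nothing here is taken from the proposal: the struct
layout, the post_commit field, and all model_* names are hypothetical, and
clearing the pointer on entry is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor of one restartable sequence: [start, post_commit). */
struct model_rseq_region {
	unsigned long start;		/* address of _start */
	unsigned long post_commit;	/* first address after the _commit instruction */
};

/* Stand-in for current->sched_rseq; a plain global in this model. */
const struct model_rseq_region *model_sched_rseq;

/* Entry to interrupt/exception/NMI: hand back the interrupted value. */
const struct model_rseq_region *model_ctx_enter(void)
{
	const struct model_rseq_region *saved = model_sched_rseq;

	/* Assumption: the new context starts without a registered region. */
	model_sched_rseq = NULL;
	return saved;
}

/* Exit: put back whatever the interrupted context had registered. */
void model_ctx_exit(const struct model_rseq_region *saved)
{
	model_sched_rseq = saved;
}

/* On preemption: restart at _start if ip is inside the region, else keep ip. */
unsigned long model_fixup_ip(const struct model_rseq_region *r, unsigned long ip)
{
	assert(!r || r->start < r->post_commit);
	if (r && ip >= r->start && ip < r->post_commit)
		return r->start;
	return ip;
}
```

The half-open range means an instruction pointer at or before the commit
instruction restarts the sequence, while one past it is left alone.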

> Anyway, this was aimed at arm64, which chose to use atomics for
> this_cpu. But if we move sched_rseq() from schedule-tail into interrupt
> entry, then this would also work for things like Power.

Let's assume s390 would be the target, which also uses atomics for
this_cpu ops. A very simple function like:

static DEFINE_PER_CPU(long, bar);

long foo(long val)
{
	return this_cpu_add_return(bar, val); 
}

would turn into the below with PREEMPT_NONE:

0000000000000000 <foo>:
   0:   c0 04 00 00 00 00       jgnop   0 <foo>
   6:   c0 10 00 00 00 00       larl    %r1,6 <foo+0x6> <- r1 contains address of "bar"
                        8: R_390_PC32DBL        .data..percpu+0x2
   c:   a7 39 00 00             lghi    %r3,0
  10:   e3 10 33 b8 00 08       ag      %r1,952(%r3)    <- add per-cpu offset
  16:   eb 02 10 00 00 e8       laag    %r0,%r2,0(%r1)  <- atomic op
  1c:   b9 08 00 20             agr     %r2,%r0
  20:   07 fe                   br      %r14

With PREEMPT_LAZY this turns into:

0000000000000000 <foo>:
   0:   c0 04 00 00 00 00       jgnop   0 <foo>
   6:   eb af f0 68 00 24       stmg    %r10,%r15,104(%r15)
   c:   b9 04 00 ef             lgr     %r14,%r15
  10:   b9 04 00 b2             lgr     %r11,%r2
  14:   e3 f0 ff c8 ff 71       lay     %r15,-56(%r15)
  1a:   e3 e0 f0 98 00 24       stg     %r14,152(%r15) <- up to here: create stack frame
  20:   eb 01 03 a8 00 6a       asi     936,1          <- preempt_inc()
  26:   c0 10 00 00 00 00       larl    %r1,26 <foo+0x26>
                        28: R_390_PC32DBL       .data..percpu+0x2
  2c:   a7 29 00 00             lghi    %r2,0
  30:   e3 10 23 b8 00 08       ag      %r1,952(%r2)
  36:   eb ab 10 00 00 e8       laag    %r10,%r11,0(%r1)
  3c:   eb ff 03 a8 00 6e       alsi    936,-1         <- preempt_dec_and_test()
  42:   a7 54 00 05             jnhe    4c <foo+0x4c>
  46:   c0 e5 00 00 00 00       brasl   %r14,46 <foo+0x46>
                        48: R_390_PLT32DBL      preempt_schedule_notrace+0x2
  4c:   b9 e8 b0 2a             agrk    %r2,%r10,%r11
  50:   eb af f0 a0 00 04       lmg     %r10,%r15,160(%r15)
  56:   07 fe                   br      %r14

With your proposal I guess this would turn into something like the below.
Note that the below is hand-edited, so offsets etc. do not make any sense;
it is just the instruction sequence I guess we _could_ end up with:

0000000000000000 <foo>:
   0:   c0 04 00 00 00 00       jgnop   0 <foo>
                                larl    %r1,#this_seq <- &_R
                                stg     %r1,944       <- lowcore->sched_seq = &_R;
   c:   c0 10 00 00 00 00       larl    %r1,c <foo+0xc>
                        e: R_390_PC32DBL        .data..percpu+0x2
  16:   e3 10 33 b8 00 08       ag      %r1,952
  1c:   eb 02 10 00 00 e8       laag    %r0,%r2,0(%r1)
                                mvghi   944,0         <- lowcore->sched_seq = NULL;
  2c:   b9 08 00 20             agr     %r2,%r0
  30:   07 fe                   br      %r14

This uses the s390 specific "lowcore" instead of current for sched_seq, since
it is an architecture per-cpu area mapped at address zero.
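
Expressed in C, the hand-edited sequence above would roughly correspond to
the model below. This is only a sketch of the step order: model_sched_seq
stands in for the lowcore field, model_region for _R, and the GCC/Clang
__atomic_fetch_add builtin for laag; all of these names are hypothetical.
As noted in the quoted text, the real thing would have to be woven into the
asm so that the compiler cannot move the stores around the operation.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for _R, the descriptor of the restartable region (contents omitted). */
const char model_region = 0;

/* Stand-in for lowcore->sched_seq; volatile so the two stores are not elided. */
const void *volatile model_sched_seq;

/*
 * pcp is assumed to be the already resolved per-cpu address, i.e. the
 * result of the larl + ag pair in the listing.
 */
long model_this_cpu_add_return(long *pcp, long val)
{
	long old;

	assert(pcp != NULL);
	model_sched_seq = &model_region;			/* larl + stg */
	old = __atomic_fetch_add(pcp, val, __ATOMIC_RELAXED);	/* laag: yields old value */
	model_sched_seq = NULL;					/* mvghi 944,0 */
	return old + val;					/* agr %r2,%r0 */
}
```

The final addition mirrors the listing: laag leaves the old value in %r0,
so the new value has to be recomputed for the return.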

Let me give it a try to verify whether the generated code would really look
like the above, but it might take a few days.
