From: Heiko Carstens <hca@linux.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org,
Thomas Gleixner <tglx@linutronix.de>,
mathieu.desnoyers@efficios.com,
Mark Rutland <mark.rutland@arm.com>,
cmarinas@kernel.org, maddy@linux.ibm.com, ryan.roberts@arm.com
Subject: Re: [RFC] in-kernel rseq
Date: Tue, 24 Feb 2026 12:16:46 +0100
Message-ID: <20260224111646.20006Ddc-hca@linux.ibm.com>
In-Reply-To: <20260223163843.GR1282955@noisy.programming.kicks-ass.net>
On Mon, Feb 23, 2026 at 05:38:43PM +0100, Peter Zijlstra wrote:
> This means, it needs to be woven into the asm... and I'm not that handy
> with arm64 asm.
>
> The pseudo code would be something like:
>
> current->sched_seq = &_R;
> ...
>
> _start: compute per cpu-addr
> load addr
> $OP
> _commit: store addr
>
> ...
> current->sched_rseq = NULL;
>
>
> Then when preemption happens (from interrupt), the instruction pointer
> is 'simply' reset to _start and it tries again.
I guess current->sched_rseq also needs to be saved on entry and restored on
exit for every interrupt, exception, and NMI, since other contexts can make
use of this_cpu ops as well.
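A minimal userspace sketch of that save/restore idea, not kernel code. The names model_sched_rseq, model_irq_enter(), model_irq_exit() and struct model_krseq are hypothetical and only illustrate why the pointer has to survive a nested context that runs its own sequence:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor of one restartable sequence (_start/_commit). */
struct model_krseq {
	unsigned long start;
	unsigned long commit;
};

/* Stand-in for current->sched_rseq; a plain global in this model. */
static const struct model_krseq *model_sched_rseq;

/* Hypothetical entry hook: remember what the interrupted context had set. */
const struct model_krseq *model_irq_enter(void)
{
	return model_sched_rseq;
}

/* Hypothetical exit hook: put the interrupted context's pointer back. */
void model_irq_exit(const struct model_krseq *saved)
{
	model_sched_rseq = saved;
}

/* A handler that itself performs a this_cpu-style sequence. */
void model_handler(const struct model_krseq *inner)
{
	model_sched_rseq = inner;
	/* ... load, $OP, store would happen here ... */
	model_sched_rseq = NULL;
}

/*
 * Simulate an interrupt hitting a context that has 'outer' installed.
 * Returns the pointer the interrupted context sees after the handler ran.
 */
const struct model_krseq *model_interrupt(const struct model_krseq *outer,
					   const struct model_krseq *inner)
{
	const struct model_krseq *saved;

	model_sched_rseq = outer;
	saved = model_irq_enter();
	model_handler(inner);	/* leaves NULL behind */
	model_irq_exit(saved);	/* without this, 'outer' would be lost */
	return model_sched_rseq;
}
```

Without the restore in model_irq_exit(), the handler's final NULL store would silently cancel the interrupted context's sequence.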
> Anyway, this was aimed at arm64, which chose to use atomics for
> this_cpu. But if we move sched_rseq() from schedule-tail into interrupt
> entry, then this would also work for things like Power.
Let's assume s390 would be the target, which also uses atomics for
this_cpu ops. A very simple function like:
static DEFINE_PER_CPU(long, bar);
long foo(long val)
{
return this_cpu_add_return(bar, val);
}
would turn into the below with PREEMPT_NONE:
0000000000000000 <foo>:
0: c0 04 00 00 00 00 jgnop 0 <foo>
6: c0 10 00 00 00 00 larl %r1,6 <foo+0x6> <- r1 contains address of "bar"
8: R_390_PC32DBL .data..percpu+0x2
c: a7 39 00 00 lghi %r3,0
10: e3 10 33 b8 00 08 ag %r1,952(%r3) <- add per-cpu offset
16: eb 02 10 00 00 e8 laag %r0,%r2,0(%r1) <- atomic op
1c: b9 08 00 20 agr %r2,%r0
20: 07 fe br %r14
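For readers less familiar with s390, the instruction sequence above can be modelled in portable C roughly as follows. This is only a userspace sketch: the array model_bar, the explicit cpu argument and MODEL_NR_CPUS are assumptions standing in for the per-cpu offset that the real code reads from lowcore.

```c
#include <assert.h>
#include <stdatomic.h>

#define MODEL_NR_CPUS 4	/* arbitrary size for the model */

/* One slot per CPU; stands in for the per-cpu variable "bar". */
static _Atomic long model_bar[MODEL_NR_CPUS];

long model_this_cpu_add_return(int cpu, long val)
{
	_Atomic long *addr;
	long old;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);

	/* larl + ag: base address of "bar" plus this CPU's offset */
	addr = &model_bar[cpu];

	/* laag: atomically add val and return the old value */
	old = atomic_fetch_add(addr, val);

	/* agr: the caller wants the new value, so add val once more */
	return old + val;
}
```

Because the add itself is atomic, it does not matter for correctness if the task migrates between the address computation and the laag; the update merely lands on the previous CPU's slot.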
With PREEMPT_LAZY this turns into:
0000000000000000 <foo>:
0: c0 04 00 00 00 00 jgnop 0 <foo>
6: eb af f0 68 00 24 stmg %r10,%r15,104(%r15)
c: b9 04 00 ef lgr %r14,%r15
10: b9 04 00 b2 lgr %r11,%r2
14: e3 f0 ff c8 ff 71 lay %r15,-56(%r15)
1a: e3 e0 f0 98 00 24 stg %r14,152(%r15) <- up to here: create stack frame
20: eb 01 03 a8 00 6a asi 936,1 <- preempt_inc()
26: c0 10 00 00 00 00 larl %r1,26 <foo+0x26>
28: R_390_PC32DBL .data..percpu+0x2
2c: a7 29 00 00 lghi %r2,0
30: e3 10 23 b8 00 08 ag %r1,952(%r2)
36: eb ab 10 00 00 e8 laag %r10,%r11,0(%r1)
3c: eb ff 03 a8 00 6e alsi 936,-1 <- preempt_dec_and_test()
42: a7 54 00 05 jnhe 4c <foo+0x4c>
46: c0 e5 00 00 00 00 brasl %r14,46 <foo+0x46>
48: R_390_PLT32DBL preempt_schedule_notrace+0x2
4c: b9 e8 b0 2a agrk %r2,%r10,%r11
50: eb af f0 a0 00 04 lmg %r10,%r15,160(%r15)
56: 07 fe br %r14
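The extra work in the preemptible variant can be sketched in plain C like this. It is a userspace model only: model_preempt_count, model_need_resched and model_preempt_schedule() are hypothetical stand-ins, and the model keeps the need-resched flag separate from the count for readability, which is an assumption about presentation and not a claim about how the s390 lowcore word is laid out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static int model_preempt_count;		/* stands in for the word at lowcore offset 936 */
static bool model_need_resched;		/* assumption: kept as a separate flag here */
static int model_resched_calls;		/* counts calls into the scheduler stub */
static _Atomic long model_bar;		/* stands in for this CPU's "bar" */

/* Stub for preempt_schedule_notrace(): only records that it was called. */
void model_preempt_schedule(void)
{
	model_resched_calls++;
	model_need_resched = false;
}

long model_add_return_preempt(long val)
{
	long old;

	model_preempt_count++;				/* asi 936,1 */
	old = atomic_fetch_add(&model_bar, val);	/* laag */

	/* alsi 936,-1 + jnhe: decrement and test, skip the call if nothing to do */
	if (--model_preempt_count == 0 && model_need_resched)
		model_preempt_schedule();		/* brasl preempt_schedule_notrace */

	return old + val;				/* agrk */
}
```

The stack frame, the two read-modify-write accesses to the preempt count and the conditional call are the overhead that the restartable sequence is meant to avoid.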
With your proposal I guess this would turn into something like below. Note
that the below is hand-edited, so offsets etc. do not make any sense; it is
just the instruction sequence I guess we _could_ end up with:
0000000000000000 <foo>:
0: c0 04 00 00 00 00 jgnop 0 <foo>
larl %r1,#this_seq <- &_R
stg %r1,944 <- lowcore->sched_seq = &_R;
c: c0 10 00 00 00 00 larl %r1,c <foo+0xc>
e: R_390_PC32DBL .data..percpu+0x2
16: e3 10 33 b8 00 08 ag %r1,952
1c: eb 02 10 00 00 e8 laag %r0,%r2,0(%r1)
mvghi 944,0 <- lowcore->sched_seq = NULL;
2c: b9 08 00 20 agr %r2,%r0
30: 07 fe br %r14
This uses the s390 specific "lowcore" instead of current for sched_seq, since
it is an architecture per-cpu area mapped at address zero.
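To make the restart rule from the quoted proposal concrete, here is a small userspace sketch of the check an interrupt entry path could perform. Everything in it is an assumption for illustration: struct model_rseq, model_rseq_fixup(), and in particular the choice to treat an instruction pointer sitting exactly on the committing store as "not yet committed".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor, i.e. what lowcore->sched_seq would point to. */
struct model_rseq {
	unsigned long start;	/* address of _start */
	unsigned long commit;	/* address of the committing store */
};

/*
 * Return the instruction pointer to resume at.
 *
 * If no sequence is registered, or the interrupted IP lies outside of
 * [start, commit], execution resumes where it was interrupted. Otherwise
 * the commit has not happened yet and the sequence is restarted.
 */
unsigned long model_rseq_fixup(const struct model_rseq *seq, unsigned long ip)
{
	if (seq && ip >= seq->start && ip <= seq->commit)
		return seq->start;
	return ip;
}
```

With a descriptor of { 0x100, 0x110 }, an interrupt at 0x108 would resume at 0x100, while an interrupt at 0x116 (after the store) would resume at 0x116 unchanged.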
Let me give it a try to verify whether the generated code would really look
like the above, but it might take a few days.