From: David Laight <david.laight.linux@gmail.com>
To: Heiko Carstens <hca@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>,
Sven Schnelle <svens@linux.ibm.com>,
Vasily Gorbik <gor@linux.ibm.com>,
Christian Borntraeger <borntraeger@linux.ibm.com>,
Juergen Christ <jchrist@linux.ibm.com>,
Peter Zijlstra <peterz@infradead.org>,
Yang Shi <yang@os.amperecomputing.com>,
Shrikanth Hegde <sshegde@linux.ibm.com>,
linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org
Subject: Re: [PATCH v4 0/8] s390: Improve this_cpu operations
Date: Fri, 22 May 2026 19:33:04 +0100 [thread overview]
Message-ID: <20260522193304.72f226ab@pumpkin> (raw)
In-Reply-To: <20260522141257.303617-1-hca@linux.ibm.com>
On Fri, 22 May 2026 16:12:49 +0200
Heiko Carstens <hca@linux.ibm.com> wrote:
> v4:
>
> - Drop alternatives approach and extract percpu base register number for
> mviy at compile time [David Laight]
Definitely looks better.
I'm sure I managed to understand it once :-)
-- David
>
> - Fix logic for percpu code section detection, as well as
> interruption/exception/nmi path [Sashiko [5]]
>
> [5] https://sashiko.dev/#/patchset/20260520092243.264847-1-hca%40linux.ibm.com
>
> v3:
>
> - Fix various typos [Juergen Christ]
>
> - Add missing kprobe detection / handling [Sashiko [3]]
> [FWIW, this made me also aware of that the current general s390 kprobes
> code seems to be racy against concurrent removal of a kprobe while a
> probe hit on a different CPU. But that is a different story.]
>
> - Fix various minor findings [Sashiko [3]]
>
> - All of this might be dropped / exchanged in future in favor of the percpu
> page table approach proposed by Yang Shi [4].
>
> [3] https://sashiko.dev/#/patchset/20260319120503.4046659-1-hca@linux.ibm.com
> [4] https://lore.kernel.org/all/20260429170758.3018959-1-yang@os.amperecomputing.com/
>
> v2:
>
> - Add proper PERCPU_PTR cast to most patches to avoid tons of sparse
> warnings
>
> - Add missing __packed attribute to insn structure [Sashiko [2]]
>
> - Fix inverted if condition [Sashiko [2]]
>
> - Add missing user_mode() check [Sashiko [2]]
>
> - Move percpu_entry() call in front of irqentry_enter() call in all
> entry paths to avoid that potential this_cpu() operations overwrite
> the not-yet saved percpu code section indicator [Sashiko [2]]
>
> [2] https://sashiko.dev/#/patchset/20260317195436.2276810-1-hca%40linux.ibm.com
>
> v1:
>
> This is a follow-up to Peter Zijlstra's in-kernel rseq RFC [1].
>
> With the intended removal of PREEMPT_NONE this_cpu operations based on
> atomic instructions, guarded with preempt_disable()/preempt_enable() pairs,
> become more expensive: the preempt_disable() / preempt_enable() pairs are
> not optimized away anymore during compile time.
>
> In particular the conditional call to preempt_schedule_notrace() after
> preempt_enable() adds additional code and register pressure.
>
> To avoid this Peter suggested an in-kernel rseq approach. While this would
> certainly work, this series tries to come up with a solution which uses
> less instructions and doesn't require to repeat instruction sequences.
>
> The idea is that this_cpu operations based on atomic instructions are
> guarded with mvyi instructions:
>
> - The first mvyi instruction writes the register number, which contains
> the percpu address variable to lowcore. This also indicates that a
> percpu code section is executed.
>
> - The first instruction following the mvyi instruction must be the ag
> instruction which adds the percpu offset to the percpu address register.
>
> - Afterwards the atomic percpu operation follows.
>
> - Then a second mvyi instruction writes a zero to lowcore, which indicates
> the end of the percpu code section.
>
> - In case of an interrupt/exception/nmi the register number which was
> written to lowcore is copied to the exception frame (pt_regs), and a zero
> is written to lowcore.
>
> - On return to the previous context it is checked if a percpu code section
> was executed (saved register number not zero), and if the process was
> migrated to a different cpu. If the percpu offset was already added to
> the percpu address register (instruction address does _not_ point to the
> ag instruction) the content of the percpu address register is adjusted so
> it points to percpu variable of the new cpu.
>
> All of this seems to work, but of course it could still be broken since I
> missed some detail.
>
> In total this series results in a kernel text size reduction of ~106kb. The
> number of preempt_schedule_notrace() call sites is reduced from 7089 to
> 1577.
>
> Note: this comes without any huge performance analysis, however all
> microbenchmarks confirmed that the new code is at least as fast as the
> old code, like expected.
>
> [1] 20260223163843.GR1282955@noisy.programming.kicks-ass.net
>
> Heiko Carstens (8):
> s390/percpu: Infrastructure for more efficient this_cpu operations
> s390/percpu: Add missing do { } while (0) constructs
> s390/percpu: Use new percpu code section for arch_this_cpu_add()
> s390/percpu: Use new percpu code section for arch_this_cpu_add_return()
> s390/percpu: Use new percpu code section for arch_this_cpu_[and|or]()
> s390/percpu: Provide arch_this_cpu_read() implementation
> s390/percpu: Provide arch_this_cpu_write() implementation
> s390/percpu: Remove one and two byte this_cpu operation implementation
>
> arch/s390/include/asm/entry-percpu.h | 78 ++++++++
> arch/s390/include/asm/lowcore.h | 3 +-
> arch/s390/include/asm/percpu.h | 257 +++++++++++++++++++++------
> arch/s390/include/asm/ptrace.h | 2 +
> arch/s390/kernel/irq.c | 24 ++-
> arch/s390/kernel/nmi.c | 5 +
> arch/s390/kernel/traps.c | 5 +
> 7 files changed, 315 insertions(+), 59 deletions(-)
> create mode 100644 arch/s390/include/asm/entry-percpu.h
>
prev parent reply other threads:[~2026-05-22 18:33 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-22 14:12 [PATCH v4 0/8] s390: Improve this_cpu operations Heiko Carstens
2026-05-22 14:12 ` [PATCH v4 1/8] s390/percpu: Infrastructure for more efficient " Heiko Carstens
2026-05-22 14:12 ` [PATCH v4 2/8] s390/percpu: Add missing do { } while (0) constructs Heiko Carstens
2026-05-22 14:12 ` [PATCH v4 3/8] s390/percpu: Use new percpu code section for arch_this_cpu_add() Heiko Carstens
2026-05-22 14:12 ` [PATCH v4 4/8] s390/percpu: Use new percpu code section for arch_this_cpu_add_return() Heiko Carstens
2026-05-22 14:12 ` [PATCH v4 5/8] s390/percpu: Use new percpu code section for arch_this_cpu_[and|or]() Heiko Carstens
2026-05-22 14:12 ` [PATCH v4 6/8] s390/percpu: Provide arch_this_cpu_read() implementation Heiko Carstens
2026-05-22 14:12 ` [PATCH v4 7/8] s390/percpu: Provide arch_this_cpu_write() implementation Heiko Carstens
2026-05-22 14:12 ` [PATCH v4 8/8] s390/percpu: Remove one and two byte this_cpu operation implementation Heiko Carstens
2026-05-22 18:33 ` David Laight [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260522193304.72f226ab@pumpkin \
--to=david.laight.linux@gmail.com \
--cc=agordeev@linux.ibm.com \
--cc=borntraeger@linux.ibm.com \
--cc=gor@linux.ibm.com \
--cc=hca@linux.ibm.com \
--cc=jchrist@linux.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-s390@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=sshegde@linux.ibm.com \
--cc=svens@linux.ibm.com \
--cc=yang@os.amperecomputing.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox