From: Sven Schnelle <svens@linux.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "Thomas Gleixner" <tglx@kernel.org>,
"H. Peter Anvin" <hpa@zytor.com>,
"Michal Suchánek" <msuchanek@suse.de>,
"Jonathan Corbet" <corbet@lwn.net>,
"Shuah Khan" <skhan@linuxfoundation.org>,
"Huacai Chen" <chenhuacai@kernel.org>,
"WANG Xuerui" <kernel@xen0n.name>,
"Madhavan Srinivasan" <maddy@linux.ibm.com>,
"Michael Ellerman" <mpe@ellerman.id.au>,
"Nicholas Piggin" <npiggin@gmail.com>,
"Christophe Leroy (CS GROUP)" <chleroy@kernel.org>,
"Paul Walmsley" <pjw@kernel.org>,
"Palmer Dabbelt" <palmer@dabbelt.com>,
"Albert Ou" <aou@eecs.berkeley.edu>,
"Alexandre Ghiti" <alex@ghiti.fr>,
"Heiko Carstens" <hca@linux.ibm.com>,
"Vasily Gorbik" <gor@linux.ibm.com>,
"Alexander Gordeev" <agordeev@linux.ibm.com>,
"Christian Borntraeger" <borntraeger@linux.ibm.com>,
"Andy Lutomirski" <luto@kernel.org>,
"Ingo Molnar" <mingo@redhat.com>,
"Borislav Petkov" <bp@alien8.de>,
"Dave Hansen" <dave.hansen@linux.intel.com>,
x86@kernel.org,
"Andrew Donnellan" <andrew+kernel@donnellan.id.au>,
"Mark Rutland" <mark.rutland@arm.com>,
"Arnd Bergmann" <arnd@arndb.de>,
"Jiaxun Yang" <jiaxun.yang@flygoat.com>,
"Ryan Roberts" <ryan.roberts@arm.com>,
"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
"Mukesh Kumar Chaurasiya" <mkchauras@linux.ibm.com>,
"Shrikanth Hegde" <sshegde@linux.ibm.com>,
"Zong Li" <zong.li@sifive.com>, "Nam Cao" <namcao@linutronix.de>,
"Deepak Gupta" <debug@rivosinc.com>,
"Lukas Gerlach" <lukas.gerlach@cispa.de>,
"Rui Qi" <qirui.001@bytedance.com>, "Kees Cook" <kees@kernel.org>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
loongarch@lists.linux.dev, linuxppc-dev@lists.ozlabs.org,
linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org
Subject: Re: [RFC] entry: Untangle the return value of syscall_enter_from_user_mode from syscall NR
Date: Fri, 03 Jul 2026 13:17:17 +0200
Message-ID: <yt9dy0fs491e.fsf@linux.ibm.com>
In-Reply-To: <20260703105718.GO751831@noisy.programming.kicks-ass.net>
Peter Zijlstra <peterz@infradead.org> writes:
> On Fri, Jul 03, 2026 at 11:59:07AM +0200, Sven Schnelle wrote:
>> Thomas Gleixner <tglx@kernel.org> writes:
>>
>> > On Fri, Jul 03 2026 at 08:26, Sven Schnelle wrote:
>> >> Thomas Gleixner <tglx@kernel.org> writes:
>> >>> It's less than obvious and I have no objections to clean that up and
>> >>> make it more intuitive, but I still fail to see what Michal is actually
>> >>> trying to solve and what the magic flag is for. If s390 requires it,
>> >>> then that's an s390 problem, but definitely x86 does not.
>> >>
>> >> The difference between x86 and s390 is that on s390, regs->gprs[2] is
>> >> used for both the syscall number and the syscall return value.
>> >> That was a design mistake made early on, about 25 years ago, but
>> >> it's ABI now, so it cannot be changed.
>> >
>> > Cute.
>> >
>> >> When seccomp decides to skip a syscall, it writes a return value into
>> >> regs->gprs[2]. When syscall_enter_from_user_mode_work() returns, it
>> >> returns this number. If it's negative all is good - the 'if (likely(nr <
>> >> NR_syscalls))' condition would just catch it and skip the syscall.
>> >>
>> >> But if it's a positive number, the code cannot distinguish whether
>> >> that's a return value or a syscall number.
>> >>
>> >> So I introduced PIF_SYSCALL_RET_SET when converting s390 to generic
>> >> entry. This flag tells the syscall code that a return value was set in
>> >> ptregs and the syscall should be skipped.
>> >
>> > You also could have added a 'syscall_ret' member to pt_regs, operate
>> > on that for the return values (seccomp, syscall...) and swap it into
>> > gprs[2] right before returning to user space.
>>
>> That would likely also work, but I found it easier to read and
>> understand to have an additional flag with a descriptive name than having
>> yet another 'somehow-related-to-gpr2' member in ptregs.
>
> I find this very odd; I would think that having both syscall-nr and
> syscall-ret in separate (virtual) registers for most of the normal cycle
> would be most obvious and less surprising -- given that this is what all
> other architectures do.
>
> Entry either grabs a copy of gpr2 and preserves it in orig_gpr2 as the
> syscall nr, or as Thomas suggests, you keep syscall_ret and copy that
> into gpr2 on return to userspace (and ptrace and signal and whatever
> other surface bits are affected).
>
> Either way around you then have separate values for the entire range of
> at least the C part of the kernel syscall handling -- just like every
> other arch. How is munging things in a single value and a flag easier?
Looks like we have different opinions on that - I find the flag way
easier, and with it we neither need additional space for a long in
pt_regs nor have to copy things around.
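
For illustration, here is a minimal user-space sketch of the ambiguity
described above and of the flag-based check. This is not the actual
arch/s390 code: struct fake_regs, FLAG_SYSCALL_RET_SET (a stand-in for
PIF_SYSCALL_RET_SET), the NR_SYSCALLS value and all helper names are
made up for this example.

```c
#include <stdbool.h>

/* Hypothetical table size, chosen only for this example. */
#define NR_SYSCALLS 512UL
/* Hypothetical stand-in for PIF_SYSCALL_RET_SET. */
#define FLAG_SYSCALL_RET_SET 0x1UL

/*
 * Simplified model of the s390 situation: a single register slot holds
 * the syscall number on entry and the return value on exit.
 */
struct fake_regs {
	long gpr2;
	unsigned long flags;
};

/* Model of seccomp skipping the syscall and forcing a return value. */
static void fake_seccomp_skip(struct fake_regs *regs, long ret)
{
	regs->gpr2 = ret;
	regs->flags |= FLAG_SYSCALL_RET_SET;
}

/*
 * Range check only. A negative forced return value fails the unsigned
 * comparison, but a small positive one looks like a valid syscall number.
 */
static bool should_dispatch_naive(const struct fake_regs *regs)
{
	return (unsigned long)regs->gpr2 < NR_SYSCALLS;
}

/* Flag check first: a forced return value is never dispatched. */
static bool should_dispatch_flagged(const struct fake_regs *regs)
{
	if (regs->flags & FLAG_SYSCALL_RET_SET)
		return false;
	return (unsigned long)regs->gpr2 < NR_SYSCALLS;
}
```

With a forced return value of 42, the range check alone would treat 42
as a syscall number, while the flagged variant skips the syscall. With
-1, both variants skip it.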