Linux-RISC-V Archive on lore.kernel.org
From: "Michal Suchánek" <msuchanek@suse.de>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Sven Schnelle <svens@linux.ibm.com>,
	Thomas Gleixner <tglx@kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Huacai Chen <chenhuacai@kernel.org>,
	WANG Xuerui <kernel@xen0n.name>,
	Madhavan Srinivasan <maddy@linux.ibm.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Nicholas Piggin <npiggin@gmail.com>,
	"Christophe Leroy (CS GROUP)" <chleroy@kernel.org>,
	Paul Walmsley <pjw@kernel.org>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Albert Ou <aou@eecs.berkeley.edu>,
	Alexandre Ghiti <alex@ghiti.fr>,
	Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Andy Lutomirski <luto@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86@kernel.org, Andrew Donnellan <andrew+kernel@donnellan.id.au>,
	Mark Rutland <mark.rutland@arm.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Jiaxun Yang <jiaxun.yang@flygoat.com>,
	Ryan Roberts <ryan.roberts@arm.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Mukesh Kumar Chaurasiya <mkchauras@linux.ibm.com>,
	Shrikanth Hegde <sshegde@linux.ibm.com>,
	Zong Li <zong.li@sifive.com>, Nam Cao <namcao@linutronix.de>,
	Deepak Gupta <debug@rivosinc.com>,
	Lukas Gerlach <lukas.gerlach@cispa.de>,
	Rui Qi <qirui.001@bytedance.com>, Kees Cook <kees@kernel.org>,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	loongarch@lists.linux.dev, linuxppc-dev@lists.ozlabs.org,
	linux-riscv@lists.infradead.org, linux-s390@vger.kernel.org
Subject: Re: [RFC] entry: Untangle the return value of syscall_enter_from_user_mode from syscall NR
Date: Fri, 3 Jul 2026 13:25:25 +0200
Message-ID: <akecJWAJP-e5CYP_@kunlun.suse.cz>
In-Reply-To: <20260703105718.GO751831@noisy.programming.kicks-ass.net>

On Fri, Jul 03, 2026 at 12:57:18PM +0200, Peter Zijlstra wrote:
> On Fri, Jul 03, 2026 at 11:59:07AM +0200, Sven Schnelle wrote:
> > Thomas Gleixner <tglx@kernel.org> writes:
> > 
> > > On Fri, Jul 03 2026 at 08:26, Sven Schnelle wrote:
> > >> Thomas Gleixner <tglx@kernel.org> writes:
> > >>> It's less than obvious and I have no objections to clean that up and
> > >>> make it more intuitive, but I still fail to see what Michal is actually
> > >>> trying to solve and what the magic flag is for. If s390 requires it,
> > >>> then that's an s390 problem, but definitely x86 does not.
> > >>
> > >> The difference between x86 and s390 is that on s390, regs->gprs[2] is
> > >> used for both the syscall number and the syscall return value.
> > >> That was a design mistake made early on, about 25 years ago, but
> > >> it's ABI now, so it cannot be changed.
> > >
> > > Cute.
> > >
> > >> When seccomp decides to skip a syscall, it writes a return value into
> > >> regs->gprs[2]. When syscall_enter_from_user_mode_work() returns, it
> > >> returns this number. If it's negative all is good - the 'if (likely(nr <
> > >> NR_syscalls))' condition would just catch it and skip the syscall.
> > >>
> > >> But if it's a positive number, the code cannot distinguish whether
> > >> that's a return value or a syscall number.
> > >>
> > >> So I introduced PIF_SYSCALL_RET_SET when converting s390 to generic
> > >> entry. This flag tells the syscall code that a return value was set in
> > >> ptregs and the syscall should be skipped.
> > >
> > > You also could have added a 'syscall_ret' member to pt_regs, operate
> > > on that for the return values (seccomp, syscall...) and swap it into
> > > gprs[2] right before returning to user space.
> > 
> > That would likely also work, but I found it easier to read and
> > understand to have an additional flag with a descriptive name than having
> > yet another 'somehow-related-to-gpr2' member in ptregs.
> 
> I find this very odd; I would think that having both syscall-nr and
> syscall-ret in separate (virtual) registers for most of the normal cycle
> would be most obvious and less surprising -- given that this is what all
> other architectures do.
> 
> Entry either grabs a copy of gpr2 and preserves it in orig_gpr2 as the
> syscall nr, or as Thomas suggests, you keep syscall_ret and copy that
> into gpr2 on return to userspace (and ptrace and signal and whatever
> other surface bits are affected).
> 
> Either way around you then have separate values for the entire range of
> at least the C part of the kernel syscall handling -- just like every
> other arch. How is munging things in a single value and a flag easier?

The same could be asked of syscall_enter_from_user_mode. I find it very
odd. Why does it conflate the syscall number with its return value?

It never uses the syscall number passed in except when returning it
unchanged. When it pokes the registers it reads the syscall number from
them.

If the caller of syscall_enter_from_user_mode read the syscall number
from the registers only after syscall_enter_from_user_mode returns and
indicates that the syscall should still be executed, this whole
shenanigan would be avoided.
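
To illustrate the point, here is a minimal user-space sketch in C. It is
a toy model, not kernel code: struct toy_regs, the toy_filter_* knobs,
NR_SYSCALLS_TOY and all function names are hypothetical and only mimic
the situation described above, where one register holds both the syscall
number and the return value. The first caller trusts a returned number
and cannot tell a positive return value from a syscall number; the
second gets a plain run/skip verdict and reads the number from the
registers only when told to run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SYSCALLS_TOY 512	/* hypothetical syscall table size */

/* Toy register file: gpr2 is the syscall number on entry and the
 * return value on exit, as on s390. */
struct toy_regs {
	long gpr2;
};

/* Hypothetical stand-in for a filter (e.g. seccomp) that may skip the
 * syscall and store a return value for user space. */
static bool toy_filter_skips;
static long toy_filter_retval;

/* Style 1: the entry work returns "a number". When the filter skips,
 * that number is really the return value it wrote into gpr2. */
static long entry_work_returns_nr(struct toy_regs *regs, long nr)
{
	assert(regs != NULL);
	if (toy_filter_skips) {
		regs->gpr2 = toy_filter_retval;
		return regs->gpr2;	/* ambiguous if positive */
	}
	return nr;
}

/* Style 2: the entry work returns only a verdict, true meaning
 * "execute the syscall". */
static bool entry_work_returns_verdict(struct toy_regs *regs)
{
	assert(regs != NULL);
	if (toy_filter_skips) {
		regs->gpr2 = toy_filter_retval;
		return false;
	}
	return true;
}

/* Caller for style 1: a range check is all it has to go on. */
static bool old_caller_runs_syscall(struct toy_regs *regs)
{
	long nr = entry_work_returns_nr(regs, regs->gpr2);

	return nr >= 0 && nr < NR_SYSCALLS_TOY;
}

/* Caller for style 2: the number is read from the registers only
 * after the verdict says the syscall should run. */
static bool new_caller_runs_syscall(struct toy_regs *regs)
{
	long nr;

	if (!entry_work_returns_verdict(regs))
		return false;

	nr = regs->gpr2;
	return nr >= 0 && nr < NR_SYSCALLS_TOY;
}
```

With the filter set to skip and a stored return value of 7, the first
caller would go on to dispatch syscall 7, while the second skips and
leaves 7 in gpr2 for user space. No extra flag is needed in the second
variant because the verdict and the number travel separately.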

Thanks

Michal

