The Linux Kernel Mailing List
* [PATCH V2] powerpc/syscall: Fix seccomp errno handling with GENERIC_ENTRY
@ 2026-06-29 18:29 Mukesh Kumar Chaurasiya (IBM)
  2026-06-30 17:19 ` Michal Suchánek
  2026-06-30 20:11 ` Shrikanth Hegde
  0 siblings, 2 replies; 10+ messages in thread
From: Mukesh Kumar Chaurasiya (IBM) @ 2026-06-29 18:29 UTC
  To: maddy, mpe, npiggin, chleroy, mkchauras, sshegde, ryan.roberts,
	ruanjinjie, mkchauras, linuxppc-dev, linux-kernel
  Cc: Michal Suchánek

After enabling GENERIC_ENTRY on PowerPC, seccomp filters using
SCMP_ACT_ERRNO without an explicit errnoRet value return ENOSYS
(Function not implemented) instead of the expected EPERM (Operation
not permitted).

The issue occurs in system_call_exception() when syscall_enter_from_user_mode()
returns -1 to indicate the syscall should be skipped (e.g., blocked by seccomp).
The current code treats this -1 as a syscall number and compares it against
NR_syscalls. Since r0 is an unsigned long, -1 compares as greater than
NR_syscalls, so the code incorrectly returns -ENOSYS, overwriting the errno
that seccomp already set via syscall_set_return_value().

The generic entry code in syscall_trace_enter() calls __secure_computing(),
which sets the appropriate errno in regs->gpr[3] and returns -1 to signal
that the syscall should be skipped. However, the PowerPC syscall handler
was not checking for this -1 return value before validating the syscall
number.

Fix this by explicitly checking whether syscall_enter_from_user_mode()
returns -1 and, if so, returning the value already set in regs->gpr[3]
(the errno from seccomp) instead of -ENOSYS.

Also move the syscall_enter_from_user_mode() call and the seccomp/ptrace
skip check to after the NR_syscalls bounds check.

When syscall -1 was passed, the r0 == -1L check would trigger before
the NR_syscalls check, causing syscall_get_error() to return 0 instead
of -ENOSYS. This resulted in a silent success (ret=0, errno=0) instead
of the expected ENOSYS error.

By moving syscall_enter_from_user_mode() after the bounds check, an
initial syscall number of -1 is correctly rejected with -ENOSYS first.
The seccomp/ptrace skip path still works correctly for valid syscall
numbers that get overridden to -1 by seccomp or ptrace.

This aligns PowerPC's behavior with other architectures using GENERIC_ENTRY
and restores correct seccomp errno handling.

Fixes: bee25f97ad24 ("powerpc: Enable GENERIC_ENTRY feature")
Reported-by: Michal Suchánek <msuchanek@suse.de>
Closes: https://lore.kernel.org/all/ajpp-_XnbF3UTM_E@kunlun.suse.cz/
Signed-off-by: Mukesh Kumar Chaurasiya (IBM) <mkchauras@gmail.com>
---

v1 -> v2:
 - Fix issues in the previous fix (Michal)
v1: https://lore.kernel.org/all/20260624171520.772408-1-mkchauras@gmail.com

 arch/powerpc/kernel/syscall.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
index a9da2af6efa8..36d73933a311 100644
--- a/arch/powerpc/kernel/syscall.c
+++ b/arch/powerpc/kernel/syscall.c
@@ -20,7 +20,6 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
 	syscall_fn f;
 
 	add_random_kstack_offset();
-	r0 = syscall_enter_from_user_mode(regs, r0);
 
 	if (unlikely(r0 >= NR_syscalls)) {
 		if (unlikely(trap_is_unsupported_scv(regs))) {
@@ -31,6 +30,12 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
 		return -ENOSYS;
 	}
 
+	r0 = syscall_enter_from_user_mode(regs, r0);
+
+	/* Seccomp or ptrace may have set return value, skip syscall */
+	if (unlikely(r0 == -1L))
+		return syscall_get_error(current, regs);
+
 	/* May be faster to do array_index_nospec? */
 	barrier_nospec();
 
-- 
2.54.0
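
The control-flow change in the patch above can be modelled in user space.
The following is only a sketch under stated assumptions, not the kernel code:
MODEL_NR_SYSCALLS, struct model_regs, model_enter(), dispatch_before() and
dispatch_after() are hypothetical names, and the model returns the stored
value directly where the patch calls syscall_get_error().

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the kernel's NR_syscalls. */
#define MODEL_NR_SYSCALLS 512UL

/* Minimal stand-in for pt_regs: only the return-value slot is modelled. */
struct model_regs {
	long ret;	/* plays the role of regs->gpr[3] */
};

/*
 * Models syscall_enter_from_user_mode(): when the (hypothetical) filter
 * blocks the call, store -err in the return slot and return -1 to signal
 * "skip"; otherwise pass the syscall number through unchanged.
 */
static unsigned long model_enter(struct model_regs *regs, unsigned long nr,
				 int blocked, int err)
{
	if (blocked) {
		regs->ret = -err;
		return (unsigned long)-1L;
	}
	return nr;
}

/* Ordering before the patch: enter first, then only the bounds check. */
static long dispatch_before(struct model_regs *regs, unsigned long r0,
			    int blocked, int err)
{
	r0 = model_enter(regs, r0, blocked, err);
	if (r0 >= MODEL_NR_SYSCALLS)	/* (unsigned long)-1 lands here */
		return -ENOSYS;
	return 0;			/* the real handler would run here */
}

/* Ordering after the patch: bounds check, enter, then the skip check. */
static long dispatch_after(struct model_regs *regs, unsigned long r0,
			   int blocked, int err)
{
	if (r0 >= MODEL_NR_SYSCALLS)
		return -ENOSYS;
	r0 = model_enter(regs, r0, blocked, err);
	if (r0 == (unsigned long)-1L)	/* skip requested: keep stored errno */
		return regs->ret;
	return 0;			/* the real handler would run here */
}

/* Usage: a valid syscall number that the filter blocks with EPERM. */
static void model_check(void)
{
	struct model_regs regs = { 0 };

	assert(dispatch_before(&regs, 1, 1, EPERM) == -ENOSYS);
	assert(dispatch_after(&regs, 1, 1, EPERM) == -EPERM);
}
```

In this model the old ordering loses the filter's errno because the skip
marker is compared as an unsigned value, while the new ordering keeps it and
still rejects an initial syscall number of -1 with -ENOSYS.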



* Re: [PATCH V2] powerpc/syscall: Fix seccomp errno handling with GENERIC_ENTRY
  2026-06-29 18:29 [PATCH V2] powerpc/syscall: Fix seccomp errno handling with GENERIC_ENTRY Mukesh Kumar Chaurasiya (IBM)
@ 2026-06-30 17:19 ` Michal Suchánek
  2026-06-30 20:11 ` Shrikanth Hegde
  1 sibling, 0 replies; 10+ messages in thread
From: Michal Suchánek @ 2026-06-30 17:19 UTC
  To: Mukesh Kumar Chaurasiya (IBM)
  Cc: maddy, mpe, npiggin, chleroy, mkchauras, sshegde, ryan.roberts,
	ruanjinjie, linuxppc-dev, linux-kernel

On Mon, Jun 29, 2026 at 11:59:46PM +0530, Mukesh Kumar Chaurasiya (IBM) wrote:
> Fixes: bee25f97ad24 ("powerpc: Enable GENERIC_ENTRY feature")
> Reported-by: Michal Suchánek <msuchanek@suse.de>

Tested-by: Michal Suchánek <msuchanek@suse.de>

Reviewed-by: Michal Suchánek <msuchanek@suse.de>

Thanks

Michal



* Re: [PATCH V2] powerpc/syscall: Fix seccomp errno handling with GENERIC_ENTRY
  2026-06-29 18:29 [PATCH V2] powerpc/syscall: Fix seccomp errno handling with GENERIC_ENTRY Mukesh Kumar Chaurasiya (IBM)
  2026-06-30 17:19 ` Michal Suchánek
@ 2026-06-30 20:11 ` Shrikanth Hegde
  2026-07-01  6:27   ` Mukesh Kumar Chaurasiya
  1 sibling, 1 reply; 10+ messages in thread
From: Shrikanth Hegde @ 2026-06-30 20:11 UTC
  To: Mukesh Kumar Chaurasiya (IBM), maddy, mpe, npiggin, chleroy,
	mkchauras, ryan.roberts, ruanjinjie, linuxppc-dev, linux-kernel
  Cc: Michal Suchánek

Hi Mukesh.

On 6/29/26 11:59 PM, Mukesh Kumar Chaurasiya (IBM) wrote:
> diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
> index a9da2af6efa8..36d73933a311 100644
> --- a/arch/powerpc/kernel/syscall.c
> +++ b/arch/powerpc/kernel/syscall.c
> @@ -20,7 +20,6 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
>   	syscall_fn f;
>   
>   	add_random_kstack_offset();
> -	r0 = syscall_enter_from_user_mode(regs, r0);
>   
>   	if (unlikely(r0 >= NR_syscalls)) {
>   		if (unlikely(trap_is_unsupported_scv(regs))) {
> @@ -31,6 +30,12 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
>   		return -ENOSYS;
>   	}
>   
> +	r0 = syscall_enter_from_user_mode(regs, r0);
> +

I see many architectures first call syscall_enter_from_user_mode() and then
check the return value. Take x86 for example:

__visible noinstr bool do_syscall_64(struct pt_regs *regs, int nr)
{
         nr = syscall_enter_from_user_mode(regs, nr);

         if (!do_syscall_x64(regs, nr) && !do_syscall_x32(regs, nr) && nr != -1) {
                 /* Invalid system call, but still a system call. */
                 regs->ax = __x64_sys_ni_syscall(regs);
         }

}

So does seccomp fail silently there if the initial nr was -1?



> +	/* Seccomp or ptrace may have set return value, skip syscall */
> +	if (unlikely(r0 == -1L))
> +		return syscall_get_error(current, regs);
> +
>   	/* May be faster to do array_index_nospec? */
>   	barrier_nospec();
>   

The code per se looks okay to me.


* Re: [PATCH V2] powerpc/syscall: Fix seccomp errno handling with GENERIC_ENTRY
  2026-06-30 20:11 ` Shrikanth Hegde
@ 2026-07-01  6:27   ` Mukesh Kumar Chaurasiya
  2026-07-01  7:41     ` Michal Suchánek
  0 siblings, 1 reply; 10+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2026-07-01  6:27 UTC
  To: Shrikanth Hegde
  Cc: maddy, mpe, npiggin, chleroy, mkchauras, ryan.roberts, ruanjinjie,
	linuxppc-dev, linux-kernel, Michal Suchánek

On Wed, Jul 01, 2026 at 01:41:09AM +0530, Shrikanth Hegde wrote:
> Hi Mukesh.
> 
> On 6/29/26 11:59 PM, Mukesh Kumar Chaurasiya (IBM) wrote:
> > diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
> > index a9da2af6efa8..36d73933a311 100644
> > --- a/arch/powerpc/kernel/syscall.c
> > +++ b/arch/powerpc/kernel/syscall.c
> > @@ -20,7 +20,6 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
> >   	syscall_fn f;
> >   	add_random_kstack_offset();
> > -	r0 = syscall_enter_from_user_mode(regs, r0);
> >   	if (unlikely(r0 >= NR_syscalls)) {
> >   		if (unlikely(trap_is_unsupported_scv(regs))) {
> > @@ -31,6 +30,12 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
> >   		return -ENOSYS;
> >   	}
> > +	r0 = syscall_enter_from_user_mode(regs, r0);
> > +
> 
> I see many arch first do syscall_enter_from_user_mode and then check for return value.
> take x86 for example,
> 
> __visible noinstr bool do_syscall_64(struct pt_regs *regs, int nr)
> {
>         nr = syscall_enter_from_user_mode(regs, nr);
> 
>         if (!do_syscall_x64(regs, nr) && !do_syscall_x32(regs, nr) && nr != -1) {
>                 /* Invalid system call, but still a system call. */
>                 regs->ax = __x64_sys_ni_syscall(regs);
>         }
> 
> }
> 
> So seccomp fails silently there if initial nr was -1?
> 
Hey,

No, the -1 syscall ignores the error silently and returns 0.

As the x86 snippet above shows, our behaviour will remain the same.
I have given the reasoning for that here:
https://lore.kernel.org/all/akKnSqGUeFIGOfHb@li-1a3e774c-28e4-11b2-a85c-acc9f2883e29.ibm.com/

Regards,
Mukesh
> 
> 
> > +	/* Seccomp or ptrace may have set return value, skip syscall */
> > +	if (unlikely(r0 == -1L))
> > +		return syscall_get_error(current, regs);
> > +
> >   	/* May be faster to do array_index_nospec? */
> >   	barrier_nospec();
> 
> Code per se, looks okay to me.


* Re: [PATCH V2] powerpc/syscall: Fix seccomp errno handling with GENERIC_ENTRY
  2026-07-01  6:27   ` Mukesh Kumar Chaurasiya
@ 2026-07-01  7:41     ` Michal Suchánek
  2026-07-01  8:01       ` Michal Suchánek
  0 siblings, 1 reply; 10+ messages in thread
From: Michal Suchánek @ 2026-07-01  7:41 UTC
  To: Mukesh Kumar Chaurasiya
  Cc: Shrikanth Hegde, maddy, mpe, npiggin, chleroy, mkchauras,
	ryan.roberts, ruanjinjie, linuxppc-dev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 5028 bytes --]

On Wed, Jul 01, 2026 at 11:57:00AM +0530, Mukesh Kumar Chaurasiya wrote:
> On Wed, Jul 01, 2026 at 01:41:09AM +0530, Shrikanth Hegde wrote:
> > Hi Mukesh.
> > 
> > On 6/29/26 11:59 PM, Mukesh Kumar Chaurasiya (IBM) wrote:
> > > diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
> > > index a9da2af6efa8..36d73933a311 100644
> > > --- a/arch/powerpc/kernel/syscall.c
> > > +++ b/arch/powerpc/kernel/syscall.c
> > > @@ -20,7 +20,6 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
> > >   	syscall_fn f;
> > >   	add_random_kstack_offset();
> > > -	r0 = syscall_enter_from_user_mode(regs, r0);
> > >   	if (unlikely(r0 >= NR_syscalls)) {
> > >   		if (unlikely(trap_is_unsupported_scv(regs))) {
> > > @@ -31,6 +30,12 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
> > >   		return -ENOSYS;
> > >   	}
> > > +	r0 = syscall_enter_from_user_mode(regs, r0);
> > > +
> > 
> > I see many arch first do syscall_enter_from_user_mode and then check for return value.
> > take x86 for example,
> > 
> > __visible noinstr bool do_syscall_64(struct pt_regs *regs, int nr)
> > {
> >         nr = syscall_enter_from_user_mode(regs, nr);
> > 
> >         if (!do_syscall_x64(regs, nr) && !do_syscall_x32(regs, nr) && nr != -1) {
> >                 /* Invalid system call, but still a system call. */
> >                 regs->ax = __x64_sys_ni_syscall(regs);
> >         }
> > 
> > }
> > 
> > So seccomp fails silently there if initial nr was -1?
> > 
> Hey,
> 
> No the -1 syscall ignores the error silently and returns 0.
> 

There seems to be some inconsistency with the invalid syscalls.

Adapting the example from the seccomp man page to ignore the architecture,
I get the following on x86_64 (which has presumably used GENERIC_ENTRY for
a long time):

./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
ret=-1 errno=55 (No anode)

but on ppc64le (with GENERIC_ENTRY):

./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
ret=-1 errno=38 (Function not implemented)

That said, the behavior of seccomp on invalid syscalls is not particularly
concerning. The tools that people typically use for constructing those
filters require a valid syscall number.

It would be nice to align, though.

Thanks

Michal

[-- Attachment #2: seccomp.c --]
[-- Type: text/x-c, Size: 1757 bytes --]

#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

static int
install_filter(int syscall_nr, int f_errno)
{
	struct sock_filter filter[] = {
		/* [0] Load architecture from 'seccomp_data' buffer into
		   accumulator.  */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
				(offsetof(struct seccomp_data, arch))),

		/* [1] Load system call number from 'seccomp_data' buffer into
		   accumulator.  */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
				(offsetof(struct seccomp_data, nr))),

		/* [2] Jump forward 1 instruction if system call number
		   does not match 'syscall_nr'.  */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, syscall_nr, 0, 1),

		/* [3] Matching system call: don't execute
		   the system call, and return 'f_errno' in 'errno'.  */
		BPF_STMT(BPF_RET | BPF_K,
				SECCOMP_RET_ERRNO | (f_errno & SECCOMP_RET_DATA)),

		/* [4] Destination of system call number mismatch: allow other
		   system calls.  */
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};

	struct sock_fprog prog = {
		.len = sizeof(filter) / sizeof(*filter),
		.filter = filter,
	};

	if (syscall(SYS_seccomp, SECCOMP_SET_MODE_FILTER, 0, &prog)) {
		perror("seccomp");
		return 1;
	}

	return 0;
}

int
main(int argc, char *argv[])
{
	if (argc < 4) {
		fprintf(stderr, "Usage: "
				"%s <syscall_nr> <errno> <prog> [<args>]\n"
				"\n", argv[0]);
		exit(EXIT_FAILURE);
	}

	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
		perror("prctl");
		exit(EXIT_FAILURE);
	}

	if (install_filter(strtol(argv[1], NULL, 0),
				strtol(argv[2], NULL, 0)))
		exit(EXIT_FAILURE);

	execv(argv[3], &argv[3]);
	perror("execv");
	exit(EXIT_FAILURE);
}
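
The difference between the x86_64 and ppc64le outputs reported above comes
down to whether the filter or the range check sees the syscall number first.
The following is a rough user-space sketch of the two orderings, not kernel
code: every name in it (SKETCH_NR_SYSCALLS, struct sketch_filter,
sketch_filter_first(), sketch_bounds_first()) is hypothetical, and the
"filter first" path is a simplification of what x86 does.

```c
#include <assert.h>
#include <errno.h>

#define SKETCH_NR_SYSCALLS 512UL	/* hypothetical syscall table size */
#define SKETCH_FILTER_ERRNO 55		/* errno passed to ./a.out above */

/* A hypothetical filter that rejects exactly one syscall number. */
struct sketch_filter {
	long blocked_nr;
	int err;
};

/* Returns 1 and stores -err in *ret when the filter matches nr. */
static int sketch_filter_hit(const struct sketch_filter *f, long nr, long *ret)
{
	if (nr != f->blocked_nr)
		return 0;
	*ret = -f->err;
	return 1;
}

/* Filter runs before any range check (simplified x86-like order). */
static long sketch_filter_first(const struct sketch_filter *f, long nr)
{
	long ret = 0;

	if (sketch_filter_hit(f, nr, &ret))
		return ret;
	if ((unsigned long)nr >= SKETCH_NR_SYSCALLS)
		return -ENOSYS;
	return 0;	/* a real handler would run here */
}

/* Range check runs before the filter (the order used by this patch). */
static long sketch_bounds_first(const struct sketch_filter *f, long nr)
{
	long ret = 0;

	if ((unsigned long)nr >= SKETCH_NR_SYSCALLS)
		return -ENOSYS;
	if (sketch_filter_hit(f, nr, &ret))
		return ret;
	return 0;	/* a real handler would run here */
}

/* Usage: filter on syscall number -2, as in the test run above. */
static void sketch_check(void)
{
	struct sketch_filter f = { -2, SKETCH_FILTER_ERRNO };

	assert(sketch_filter_first(&f, -2) == -SKETCH_FILTER_ERRNO);
	assert(sketch_bounds_first(&f, -2) == -ENOSYS);
}
```

With the range check first, an out-of-range number never reaches the filter,
which matches the ENOSYS result seen on ppc64le.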



* Re: [PATCH V2] powerpc/syscall: Fix seccomp errno handling with GENERIC_ENTRY
  2026-07-01  7:41     ` Michal Suchánek
@ 2026-07-01  8:01       ` Michal Suchánek
  2026-07-01  8:29         ` Michal Suchánek
  0 siblings, 1 reply; 10+ messages in thread
From: Michal Suchánek @ 2026-07-01  8:01 UTC
  To: Mukesh Kumar Chaurasiya
  Cc: Shrikanth Hegde, maddy, mpe, npiggin, chleroy, mkchauras,
	ryan.roberts, ruanjinjie, linuxppc-dev, linux-kernel

On Wed, Jul 01, 2026 at 09:41:57AM +0200, Michal Suchánek wrote:
> On Wed, Jul 01, 2026 at 11:57:00AM +0530, Mukesh Kumar Chaurasiya wrote:
> > On Wed, Jul 01, 2026 at 01:41:09AM +0530, Shrikanth Hegde wrote:
> > > Hi Mukesh.
> > > 
> > > On 6/29/26 11:59 PM, Mukesh Kumar Chaurasiya (IBM) wrote:
> > > > diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
> > > > index a9da2af6efa8..36d73933a311 100644
> > > > --- a/arch/powerpc/kernel/syscall.c
> > > > +++ b/arch/powerpc/kernel/syscall.c
> > > > @@ -20,7 +20,6 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
> > > >   	syscall_fn f;
> > > >   	add_random_kstack_offset();
> > > > -	r0 = syscall_enter_from_user_mode(regs, r0);
> > > >   	if (unlikely(r0 >= NR_syscalls)) {
> > > >   		if (unlikely(trap_is_unsupported_scv(regs))) {
> > > > @@ -31,6 +30,12 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
> > > >   		return -ENOSYS;
> > > >   	}
> > > > +	r0 = syscall_enter_from_user_mode(regs, r0);
> > > > +
> > > 
> > > I see many arch first do syscall_enter_from_user_mode and then check for return value.
> > > take x86 for example,
> > > 
> > > __visible noinstr bool do_syscall_64(struct pt_regs *regs, int nr)
> > > {
> > >         nr = syscall_enter_from_user_mode(regs, nr);
> > > 
> > >         if (!do_syscall_x64(regs, nr) && !do_syscall_x32(regs, nr) && nr != -1) {
> > >                 /* Invalid system call, but still a system call. */
> > >                 regs->ax = __x64_sys_ni_syscall(regs);
> > >         }
> > > 
> > > }
> > > 
> > > So seccomp fails silently there if initial nr was -1?
> > > 
> > Hey,
> > 
> > No the -1 syscall ignores the error silently and returns 0.
> > 
> 
> There seems to be some inconsistency with the invalid syscalls.
> 
> Adapting the example from seccomp man page to ignore architecture I get
> on x86_64 (presumably with GENERIC_ENTRY since long ago):
> 
> ./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> ret=-1 errno=55 (No anode)
> 
> but on ppc64le (with GENERIC_ENTRY):
> 
> ./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> ret=-1 errno=38 (Function not implemented)
> 
> That said, behavior of seccomp on invalid syscalls is not particularly
> concerning. The tools that people typically use for constructing those
> filters typically require a valid syscall number.
> 
> It would be nice to align, though.

It is more concerning for SECCOMP_SET_MODE_STRICT or similar. So it
should be resolved to correctly execute seccomp even on invalid
syscalls. The syscall_enter_from_user_mode API is not particularly
well-suited for that, though.

Thanks

Michal


* Re: [PATCH V2] powerpc/syscall: Fix seccomp errno handling with GENERIC_ENTRY
  2026-07-01  8:01       ` Michal Suchánek
@ 2026-07-01  8:29         ` Michal Suchánek
  2026-07-02  5:50           ` Mukesh Kumar Chaurasiya
  0 siblings, 1 reply; 10+ messages in thread
From: Michal Suchánek @ 2026-07-01  8:29 UTC
  To: Mukesh Kumar Chaurasiya
  Cc: Shrikanth Hegde, maddy, mpe, npiggin, chleroy, mkchauras,
	ryan.roberts, ruanjinjie, linuxppc-dev, linux-kernel

On Wed, Jul 01, 2026 at 10:01:49AM +0200, Michal Suchánek wrote:
> On Wed, Jul 01, 2026 at 09:41:57AM +0200, Michal Suchánek wrote:
> > On Wed, Jul 01, 2026 at 11:57:00AM +0530, Mukesh Kumar Chaurasiya wrote:
> > > On Wed, Jul 01, 2026 at 01:41:09AM +0530, Shrikanth Hegde wrote:
> > > > Hi Mukesh.
> > > > 
> > > > On 6/29/26 11:59 PM, Mukesh Kumar Chaurasiya (IBM) wrote:

> > > > > diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
> > > > > index a9da2af6efa8..36d73933a311 100644
> > > > > --- a/arch/powerpc/kernel/syscall.c
> > > > > +++ b/arch/powerpc/kernel/syscall.c
> > > > > @@ -20,7 +20,6 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
> > > > >   	syscall_fn f;
> > > > >   	add_random_kstack_offset();
> > > > > -	r0 = syscall_enter_from_user_mode(regs, r0);
> > > > >   	if (unlikely(r0 >= NR_syscalls)) {
> > > > >   		if (unlikely(trap_is_unsupported_scv(regs))) {
> > > > > @@ -31,6 +30,12 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
> > > > >   		return -ENOSYS;
> > > > >   	}
> > > > > +	r0 = syscall_enter_from_user_mode(regs, r0);
> > > > > +
> > > > 
> > > > I see many arch first do syscall_enter_from_user_mode and then check for return value.
> > > > take x86 for example,
> > > > 
> > > > __visible noinstr bool do_syscall_64(struct pt_regs *regs, int nr)
> > > > {
> > > >         nr = syscall_enter_from_user_mode(regs, nr);
> > > > 
> > > >         if (!do_syscall_x64(regs, nr) && !do_syscall_x32(regs, nr) && nr != -1) {
> > > >                 /* Invalid system call, but still a system call. */
> > > >                 regs->ax = __x64_sys_ni_syscall(regs);
> > > >         }
> > > > 
> > > > }
> > > > 
> > > > So seccomp fails silently there if initial nr was -1?
> > > > 
> > > Hey,
> > > 
> > > No the -1 syscall ignores the error silently and returns 0.
> > > 
> > 
> > There seems to be some inconsistency with the invalid syscalls.
> > 
> > Adapting the example from seccomp man page to ignore architecture I get
> > on x86_64 (presumably with GENERIC_ENTRY since long ago):
> > 
> > ./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> > ret=-1 errno=55 (No anode)
> > 
> > but on ppc64le (with GENERIC_ENTRY):
> > 
> > ./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> > ret=-1 errno=38 (Function not implemented)
> > 
> > That said, behavior of seccomp on invalid syscalls is not particularly
> > concerning. The tools that people typically use for constructing those
> > filters typically require a valid syscall number.
> > 
> > It would be nice to align, though.
> 
> It is more concerning for SECCOMP_SET_MODE_STRICT or similar. So it
> should be resolved to correctly execute seccomp even on invalid
> syscalls. The syscall_enter_from_user_mode API is not particularly
> well-suited for that, though.

In particular the fixup per
https://lore.kernel.org/linuxppc-dev/akJzuEJRLniHk4Fi@kunlun.suse.cz/

handles some cases

./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
ret=-1 errno=55 (No anode)

but not -1

./a.out -1 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-1, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
ret=-1 errno=38 (Function not implemented)

which is a direct result of the ambiguous return value of
syscall_enter_from_user_mode().
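
The ambiguity can be shown with a small user-space model. This is a
sketch under stated assumptions, not the kernel code: model_regs,
model_filter, model_enter and MODEL_NR_SYSCALLS are hypothetical
stand-ins, and the filter is reduced to a single "reject this number
with this errno" rule. The one property taken from powerpc is that r3
holds the first argument on entry and the return value on exit.

```c
#include <errno.h>

/* Hypothetical table size for the model; not the real NR_syscalls. */
#define MODEL_NR_SYSCALLS 512L

/* r3 carries the first argument on entry and the return value on exit. */
struct model_regs {
	long r3;
};

/* Hypothetical one-rule filter: reject syscall nr with errno err. */
struct model_filter {
	long nr;
	int err;
};

/*
 * Stand-in for syscall_enter_from_user_mode(): returns the syscall
 * number to run, or -1 after the filter stored -err in r3.
 */
long model_enter(struct model_regs *regs, long nr,
		 const struct model_filter *f)
{
	if (nr == f->nr) {
		regs->r3 = -f->err;
		return -1;
	}
	return nr;
}

/* Ordering A: bounds check first, filter second. */
long model_bounds_first(long nr, long arg0, const struct model_filter *f)
{
	struct model_regs regs = { .r3 = arg0 };

	if (nr < 0 || nr >= MODEL_NR_SYSCALLS)
		return -ENOSYS;		/* the filter never ran */
	if (model_enter(&regs, nr, f) == -1)
		return regs.r3;
	return 0;			/* pretend the syscall succeeded */
}

/* Ordering B: filter first, treat -1 as "skip, r3 is already set". */
long model_enter_first(long nr, long arg0, const struct model_filter *f)
{
	struct model_regs regs = { .r3 = arg0 };
	long r = model_enter(&regs, nr, f);

	if (r == -1)
		return regs.r3;		/* filter errno, or stale arg0 */
	if (r < 0 || r >= MODEL_NR_SYSCALLS)
		return -ENOSYS;
	return 0;
}
```

With a filter rejecting -2 with errno 55, ordering A returns -ENOSYS and
ordering B returns -55. With an unfiltered syscall(-1, 0), ordering B
returns 0 because -1 is read as "skip" and r3 still holds the argument.
Special-casing an original number of -1 to return -ENOSYS would fix
that case in the model, but would also discard a filter errno set for
-1.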

Thanks

Michal

* Re: [PATCH V2] powerpc/syscall: Fix seccomp errno handling with GENERIC_ENTRY
  2026-07-01  8:29         ` Michal Suchánek
@ 2026-07-02  5:50           ` Mukesh Kumar Chaurasiya
  2026-07-02  9:34             ` Michal Suchánek
  0 siblings, 1 reply; 10+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2026-07-02  5:50 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Shrikanth Hegde, maddy, mpe, npiggin, chleroy, mkchauras,
	ryan.roberts, ruanjinjie, linuxppc-dev, linux-kernel

On Wed, Jul 01, 2026 at 10:29:49AM +0200, Michal Suchánek wrote:
> On Wed, Jul 01, 2026 at 10:01:49AM +0200, Michal Suchánek wrote:
> > On Wed, Jul 01, 2026 at 09:41:57AM +0200, Michal Suchánek wrote:
> > > On Wed, Jul 01, 2026 at 11:57:00AM +0530, Mukesh Kumar Chaurasiya wrote:
> > > > On Wed, Jul 01, 2026 at 01:41:09AM +0530, Shrikanth Hegde wrote:
> > > > > Hi Mukesh.
> > > > > 
> > > > > On 6/29/26 11:59 PM, Mukesh Kumar Chaurasiya (IBM) wrote:
> 
> > > > > > diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
> > > > > > index a9da2af6efa8..36d73933a311 100644
> > > > > > --- a/arch/powerpc/kernel/syscall.c
> > > > > > +++ b/arch/powerpc/kernel/syscall.c
> > > > > > @@ -20,7 +20,6 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
> > > > > >   	syscall_fn f;
> > > > > >   	add_random_kstack_offset();
> > > > > > -	r0 = syscall_enter_from_user_mode(regs, r0);
> > > > > >   	if (unlikely(r0 >= NR_syscalls)) {
> > > > > >   		if (unlikely(trap_is_unsupported_scv(regs))) {
> > > > > > @@ -31,6 +30,12 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
> > > > > >   		return -ENOSYS;
> > > > > >   	}
> > > > > > +	r0 = syscall_enter_from_user_mode(regs, r0);
> > > > > > +
> > > > > 
> > > > > I see many arch first do syscall_enter_from_user_mode and then check for return value.
> > > > > take x86 for example,
> > > > > 
> > > > > __visible noinstr bool do_syscall_64(struct pt_regs *regs, int nr)
> > > > > {
> > > > >         nr = syscall_enter_from_user_mode(regs, nr);
> > > > > 
> > > > >         if (!do_syscall_x64(regs, nr) && !do_syscall_x32(regs, nr) && nr != -1) {
> > > > >                 /* Invalid system call, but still a system call. */
> > > > >                 regs->ax = __x64_sys_ni_syscall(regs);
> > > > >         }
> > > > > 
> > > > > }
> > > > > 
> > > > > So seccomp fails silently there if initial nr was -1?
> > > > > 
> > > > Hey,
> > > > 
> > > > No the -1 syscall ignores the error silently and returns 0.
> > > > 
> > > 
> > > There seems to be some inconsistency with the invalid syscalls.
> > > 
> > > Adapting the example from seccomp man page to ignore architecture I get
> > > on x86_64 (presumably with GENERIC_ENTRY since long ago):
> > > 
> > > ./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> > > ret=-1 errno=55 (No anode)
> > > 
> > > but on ppc64le (with GENERIC_ENTRY):
> > > 
> > > ./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> > > ret=-1 errno=38 (Function not implemented)
> > > 
> > > That said, behavior of seccomp on invalid syscalls is not particularly
> > > concerning. The tools that people typically use for constructing those
> > > filters typically require a valid syscall number.
> > > 
> > > It would be nice to align, though.
> > 
> > It is more concerning for SECCOMP_SET_MODE_STRICT or similar. So it
> > should be resolved to correctly execute seccomp even on invalid
> > syscalls. The syscall_enter_from_user_mode API is not particularly
> > well-suited for that, though.
> 
> In particular the fixup per
> https://lore.kernel.org/linuxppc-dev/akJzuEJRLniHk4Fi@kunlun.suse.cz/
> 
> handles some cases
> 
> ./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> ret=-1 errno=55 (No anode)
> 
> but not -1
> 
> ./a.out -1 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-1, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> ret=-1 errno=38 (Function not implemented)
> 
> which is the direct result of the ambiguous return value of
> syscall_enter_from_user_mode
> 
> Thanks
> 
> Michal

Hey Michal,

Yeah, this seems to be more complex than anticipated.
Going by the conversation on your other patch at
https://lore.kernel.org/all/BA7CD91D-C0E5-47A1-B49C-BC6AF6604182@zytor.com/
this patch seems to be redundant at this point.

Regards,
Mukesh


* Re: [PATCH V2] powerpc/syscall: Fix seccomp errno handling with GENERIC_ENTRY
  2026-07-02  5:50           ` Mukesh Kumar Chaurasiya
@ 2026-07-02  9:34             ` Michal Suchánek
  2026-07-02  9:39               ` Mukesh Kumar Chaurasiya
  0 siblings, 1 reply; 10+ messages in thread
From: Michal Suchánek @ 2026-07-02  9:34 UTC (permalink / raw)
  To: Mukesh Kumar Chaurasiya
  Cc: Shrikanth Hegde, maddy, mpe, npiggin, chleroy, mkchauras,
	ryan.roberts, ruanjinjie, linuxppc-dev, linux-kernel

On Thu, Jul 02, 2026 at 11:20:03AM +0530, Mukesh Kumar Chaurasiya wrote:
> On Wed, Jul 01, 2026 at 10:29:49AM +0200, Michal Suchánek wrote:
> > On Wed, Jul 01, 2026 at 10:01:49AM +0200, Michal Suchánek wrote:
> > > On Wed, Jul 01, 2026 at 09:41:57AM +0200, Michal Suchánek wrote:
> > > > On Wed, Jul 01, 2026 at 11:57:00AM +0530, Mukesh Kumar Chaurasiya wrote:
> > > > > On Wed, Jul 01, 2026 at 01:41:09AM +0530, Shrikanth Hegde wrote:
> > > > > > Hi Mukesh.
> > > > > > 
> > > > > > On 6/29/26 11:59 PM, Mukesh Kumar Chaurasiya (IBM) wrote:
> > 
> > > > > > > diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
> > > > > > > index a9da2af6efa8..36d73933a311 100644
> > > > > > > --- a/arch/powerpc/kernel/syscall.c
> > > > > > > +++ b/arch/powerpc/kernel/syscall.c
> > > > > > > @@ -20,7 +20,6 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
> > > > > > >   	syscall_fn f;
> > > > > > >   	add_random_kstack_offset();
> > > > > > > -	r0 = syscall_enter_from_user_mode(regs, r0);
> > > > > > >   	if (unlikely(r0 >= NR_syscalls)) {
> > > > > > >   		if (unlikely(trap_is_unsupported_scv(regs))) {
> > > > > > > @@ -31,6 +30,12 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
> > > > > > >   		return -ENOSYS;
> > > > > > >   	}
> > > > > > > +	r0 = syscall_enter_from_user_mode(regs, r0);
> > > > > > > +
> > > > > > 
> > > > > > I see many arch first do syscall_enter_from_user_mode and then check for return value.
> > > > > > take x86 for example,
> > > > > > 
> > > > > > __visible noinstr bool do_syscall_64(struct pt_regs *regs, int nr)
> > > > > > {
> > > > > >         nr = syscall_enter_from_user_mode(regs, nr);
> > > > > > 
> > > > > >         if (!do_syscall_x64(regs, nr) && !do_syscall_x32(regs, nr) && nr != -1) {
> > > > > >                 /* Invalid system call, but still a system call. */
> > > > > >                 regs->ax = __x64_sys_ni_syscall(regs);
> > > > > >         }
> > > > > > 
> > > > > > }
> > > > > > 
> > > > > > So seccomp fails silently there if initial nr was -1?
> > > > > > 
> > > > > Hey,
> > > > > 
> > > > > No the -1 syscall ignores the error silently and returns 0.
> > > > > 
> > > > 
> > > > There seems to be some inconsistency with the invalid syscalls.
> > > > 
> > > > Adapting the example from seccomp man page to ignore architecture I get
> > > > on x86_64 (presumably with GENERIC_ENTRY since long ago):
> > > > 
> > > > ./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> > > > ret=-1 errno=55 (No anode)
> > > > 
> > > > but on ppc64le (with GENERIC_ENTRY):
> > > > 
> > > > ./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> > > > ret=-1 errno=38 (Function not implemented)
> > > > 
> > > > That said, behavior of seccomp on invalid syscalls is not particularly
> > > > concerning. The tools that people typically use for constructing those
> > > > filters typically require a valid syscall number.
> > > > 
> > > > It would be nice to align, though.
> > > 
> > > It is more concerning for SECCOMP_SET_MODE_STRICT or similar. So it
> > > should be resolved to correctly execute seccomp even on invalid
> > > syscalls. The syscall_enter_from_user_mode API is not particularly
> > > well-suited for that, though.
> > 
> > In particular the fixup per
> > https://lore.kernel.org/linuxppc-dev/akJzuEJRLniHk4Fi@kunlun.suse.cz/
> > 
> > handles some cases
> > 
> > ./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> > ret=-1 errno=55 (No anode)
> > 
> > but not -1
> > 
> > ./a.out -1 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-1, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> > ret=-1 errno=38 (Function not implemented)
> > 
> > which is the direct result of the ambiguous return value of
> > syscall_enter_from_user_mode
> > 
> > Thanks
> > 
> > Michal
> Hey Michal,
> 
> Yeah, this seems to be more complex than anticipated.
> Going by the conversation on your other patch at
> https://lore.kernel.org/all/BA7CD91D-C0E5-47A1-B49C-BC6AF6604182@zytor.com/
> this patch seems to be redundant at this point.

Hello,

While improving the entry API is a fine goal, it will take time, if
anything can be agreed on at all.

In the meantime we should provide a fix using the current API.

Inability to run container workloads is a significant regression.

Thanks

Michal

* Re: [PATCH V2] powerpc/syscall: Fix seccomp errno handling with GENERIC_ENTRY
  2026-07-02  9:34             ` Michal Suchánek
@ 2026-07-02  9:39               ` Mukesh Kumar Chaurasiya
  0 siblings, 0 replies; 10+ messages in thread
From: Mukesh Kumar Chaurasiya @ 2026-07-02  9:39 UTC (permalink / raw)
  To: Michal Suchánek
  Cc: Shrikanth Hegde, maddy, mpe, npiggin, chleroy, mkchauras,
	ryan.roberts, ruanjinjie, linuxppc-dev, linux-kernel

On Thu, Jul 02, 2026 at 11:34:55AM +0200, Michal Suchánek wrote:
> On Thu, Jul 02, 2026 at 11:20:03AM +0530, Mukesh Kumar Chaurasiya wrote:
> > On Wed, Jul 01, 2026 at 10:29:49AM +0200, Michal Suchánek wrote:
> > > On Wed, Jul 01, 2026 at 10:01:49AM +0200, Michal Suchánek wrote:
> > > > On Wed, Jul 01, 2026 at 09:41:57AM +0200, Michal Suchánek wrote:
> > > > > On Wed, Jul 01, 2026 at 11:57:00AM +0530, Mukesh Kumar Chaurasiya wrote:
> > > > > > On Wed, Jul 01, 2026 at 01:41:09AM +0530, Shrikanth Hegde wrote:
> > > > > > > Hi Mukesh.
> > > > > > > 
> > > > > > > On 6/29/26 11:59 PM, Mukesh Kumar Chaurasiya (IBM) wrote:
> > > 
> > > > > > > > diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
> > > > > > > > index a9da2af6efa8..36d73933a311 100644
> > > > > > > > --- a/arch/powerpc/kernel/syscall.c
> > > > > > > > +++ b/arch/powerpc/kernel/syscall.c
> > > > > > > > @@ -20,7 +20,6 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
> > > > > > > >   	syscall_fn f;
> > > > > > > >   	add_random_kstack_offset();
> > > > > > > > -	r0 = syscall_enter_from_user_mode(regs, r0);
> > > > > > > >   	if (unlikely(r0 >= NR_syscalls)) {
> > > > > > > >   		if (unlikely(trap_is_unsupported_scv(regs))) {
> > > > > > > > @@ -31,6 +30,12 @@ notrace long system_call_exception(struct pt_regs *regs, unsigned long r0)
> > > > > > > >   		return -ENOSYS;
> > > > > > > >   	}
> > > > > > > > +	r0 = syscall_enter_from_user_mode(regs, r0);
> > > > > > > > +
> > > > > > > 
> > > > > > > I see many arch first do syscall_enter_from_user_mode and then check for return value.
> > > > > > > take x86 for example,
> > > > > > > 
> > > > > > > __visible noinstr bool do_syscall_64(struct pt_regs *regs, int nr)
> > > > > > > {
> > > > > > >         nr = syscall_enter_from_user_mode(regs, nr);
> > > > > > > 
> > > > > > >         if (!do_syscall_x64(regs, nr) && !do_syscall_x32(regs, nr) && nr != -1) {
> > > > > > >                 /* Invalid system call, but still a system call. */
> > > > > > >                 regs->ax = __x64_sys_ni_syscall(regs);
> > > > > > >         }
> > > > > > > 
> > > > > > > }
> > > > > > > 
> > > > > > > So seccomp fails silently there if initial nr was -1?
> > > > > > > 
> > > > > > Hey,
> > > > > > 
> > > > > > No the -1 syscall ignores the error silently and returns 0.
> > > > > > 
> > > > > 
> > > > > There seems to be some inconsistency with the invalid syscalls.
> > > > > 
> > > > > Adapting the example from seccomp man page to ignore architecture I get
> > > > > on x86_64 (presumably with GENERIC_ENTRY since long ago):
> > > > > 
> > > > > ./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> > > > > ret=-1 errno=55 (No anode)
> > > > > 
> > > > > but on ppc64le (with GENERIC_ENTRY):
> > > > > 
> > > > > ./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> > > > > ret=-1 errno=38 (Function not implemented)
> > > > > 
> > > > > That said, behavior of seccomp on invalid syscalls is not particularly
> > > > > concerning. The tools that people typically use for constructing those
> > > > > filters typically require a valid syscall number.
> > > > > 
> > > > > It would be nice to align, though.
> > > > 
> > > > It is more concerning for SECCOMP_SET_MODE_STRICT or similar. So it
> > > > should be resolved to correctly execute seccomp even on invalid
> > > > syscalls. The syscall_enter_from_user_mode API is not particularly
> > > > well-suited for that, though.
> > > 
> > > In particular the fixup per
> > > https://lore.kernel.org/linuxppc-dev/akJzuEJRLniHk4Fi@kunlun.suse.cz/
> > > 
> > > handles some cases
> > > 
> > > ./a.out -2 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-2, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> > > ret=-1 errno=55 (No anode)
> > > 
> > > but not -1
> > > 
> > > ./a.out -1 55 /usr/bin/perl -MPOSIX -e '$!=0; my $r = syscall(-1, 0); print "ret=$r errno=".($!+0)." ($!)\n"'
> > > ret=-1 errno=38 (Function not implemented)
> > > 
> > > which is the direct result of the ambiguous return value of
> > > syscall_enter_from_user_mode
> > > 
> > > Thanks
> > > 
> > > Michal
> > Hey Michal,
> > 
> > Yeah, this seems to be more complex than anticipated.
> > Going by the conversation on your other patch at
> > https://lore.kernel.org/all/BA7CD91D-C0E5-47A1-B49C-BC6AF6604182@zytor.com/
> > this patch seems to be redundant at this point.
> 
> Hello,
> 
> While improving the entry API is a fine goal, it will take time, if
> anything can be agreed on at all.
> 
> In the meantime we should provide a fix using the current API.
> 
> Inability to run container workloads is a significant regression.
> 
> Thanks
> 
> Michal

Yeah, I am working on a fix for now and will post a new version.

Regards,
Mukesh

Thread overview: 10+ messages
2026-06-29 18:29 [PATCH V2] powerpc/syscall: Fix seccomp errno handling with GENERIC_ENTRY Mukesh Kumar Chaurasiya (IBM)
2026-06-30 17:19 ` Michal Suchánek
2026-06-30 20:11 ` Shrikanth Hegde
2026-07-01  6:27   ` Mukesh Kumar Chaurasiya
2026-07-01  7:41     ` Michal Suchánek
2026-07-01  8:01       ` Michal Suchánek
2026-07-01  8:29         ` Michal Suchánek
2026-07-02  5:50           ` Mukesh Kumar Chaurasiya
2026-07-02  9:34             ` Michal Suchánek
2026-07-02  9:39               ` Mukesh Kumar Chaurasiya
