linux-kernel.vger.kernel.org archive mirror
* [PATCH v3] riscv: entry: set a0 = -ENOSYS only when syscall != -1
@ 2023-07-18 20:57 Celeste Liu
  2023-07-18 21:20 ` Björn Töpel
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Celeste Liu @ 2023-07-18 20:57 UTC (permalink / raw)
  To: Palmer Dabbelt, Paul Walmsley, Albert Ou, Guo Ren,
	Björn Töpel, Conor Dooley, linux-riscv
  Cc: linux-kernel, Celeste Liu, Felix Yan, Ruizhe Pan, Shiqi Zhang

While testing seccomp on a 6.4 kernel, we found that errno has the wrong value.
If we deny NETLINK_AUDIT with EAFNOSUPPORT, we get ENOSYS instead since
f0bddf50586d. We get the same result with commit 9c2598d43510 ("riscv: entry:
Save a0 prior syscall_enter_from_user_mode()") applied.

After analysing the code, we think that regs->a0 = -ENOSYS should only be
executed when syscall != -1. In __seccomp_filter, when seccomp rejects a
syscall with a specified errno, it sets a0 to the return value as required by
the syscall ABI and then returns -1. That -1 ultimately becomes the return
value of syscall_enter_from_user_mode, and is then compared with NR_syscalls
after being converted to ulong (so it becomes ULONG_MAX). In this case the
condition syscall < NR_syscalls is always false, so regs->a0 = -ENOSYS is
always executed. This overwrites the a0 set by seccomp, so we always get
ENOSYS when a seccomp RET_ERRNO rule matches.

Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry")
Reported-by: Felix Yan <felixonmars@archlinux.org>
Co-developed-by: Ruizhe Pan <c141028@gmail.com>
Signed-off-by: Ruizhe Pan <c141028@gmail.com>
Co-developed-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
Signed-off-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
Tested-by: Felix Yan <felixonmars@archlinux.org>
---

v2 -> v3: use an if-statement instead of setting a default value,
          clarify the type of syscall
v1 -> v2: added an explanation of why we always got ENOSYS

 arch/riscv/kernel/traps.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index f910dfccbf5d2..5cef728745420 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -297,6 +297,10 @@ asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
 asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
 {
 	if (user_mode(regs)) {
+		/*
+		 * Convert negative numbers to very high and thus out of range
+		 * numbers for comparisons.
+		 */
 		ulong syscall = regs->a7;
 
 		regs->epc += 4;
@@ -308,7 +312,7 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
 
 		if (syscall < NR_syscalls)
 			syscall_handler(regs, syscall);
-		else
+		else if ((long)syscall != -1L)
 			regs->a0 = -ENOSYS;
 
 		syscall_exit_to_user_mode(regs);
-- 
2.41.0
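
For readers following the thread, the behaviour described in the commit message can be reproduced with a small user-space model. This is only a sketch under stated assumptions: `demo_regs`, `demo_enter`, `demo_handler`, `demo_dispatch`, and `DEMO_NR_SYSCALLS` are hypothetical stand-ins, not kernel code, and the `fixed` flag merely switches between the old `else` branch and the v3 check.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical table size for the demo; not the kernel's NR_syscalls. */
#define DEMO_NR_SYSCALLS 451UL

/* Reduced stand-in for struct pt_regs: only the two registers that matter. */
struct demo_regs {
	long a0;	/* return value slot */
	long a7;	/* syscall number */
};

/*
 * Stand-in for syscall_enter_from_user_mode(). A non-zero deny_errno models
 * a seccomp errno rule: a0 receives the negative errno and -1 is returned
 * so that the syscall is skipped.
 */
static long demo_enter(struct demo_regs *regs, int deny_errno)
{
	if (deny_errno) {
		regs->a0 = -deny_errno;
		return -1;
	}
	return regs->a7;
}

/* Stand-in for syscall_handler(): every valid syscall "succeeds" with 0. */
static void demo_handler(struct demo_regs *regs, unsigned long nr)
{
	assert(nr < DEMO_NR_SYSCALLS);
	regs->a0 = 0;
}

/* fixed == 0 models the old "else" branch, fixed != 0 the v3 check. */
static long demo_dispatch(struct demo_regs *regs, int deny_errno, int fixed)
{
	unsigned long syscall = (unsigned long)demo_enter(regs, deny_errno);

	if (syscall < DEMO_NR_SYSCALLS)
		demo_handler(regs, syscall);
	else if (!fixed || (long)syscall != -1L)
		regs->a0 = -ENOSYS;

	return regs->a0;
}
```

With `fixed` set to 0, a denied call reports -ENOSYS regardless of the errno chosen by the filter; with `fixed` set to 1, the errno stored in a0 survives, while a genuinely out-of-range syscall number still yields -ENOSYS.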



* Re: [PATCH v3] riscv: entry: set a0 = -ENOSYS only when syscall != -1
  2023-07-18 20:57 [PATCH v3] riscv: entry: set a0 = -ENOSYS only when syscall != -1 Celeste Liu
@ 2023-07-18 21:20 ` Björn Töpel
  2023-07-18 23:40 ` Guo Ren
  2023-07-19  7:21 ` Andreas Schwab
  2 siblings, 0 replies; 9+ messages in thread
From: Björn Töpel @ 2023-07-18 21:20 UTC (permalink / raw)
  To: Celeste Liu, Palmer Dabbelt, Paul Walmsley, Albert Ou, Guo Ren,
	Björn Töpel, Conor Dooley, linux-riscv
  Cc: linux-kernel, Celeste Liu, Felix Yan, Ruizhe Pan, Shiqi Zhang

Celeste Liu <coelacanthushex@gmail.com> writes:

> While testing seccomp on a 6.4 kernel, we found that errno has the wrong value.
> If we deny NETLINK_AUDIT with EAFNOSUPPORT, we get ENOSYS instead since
> f0bddf50586d. We get the same result with commit 9c2598d43510 ("riscv: entry:
> Save a0 prior syscall_enter_from_user_mode()") applied.
>
> After analysing the code, we think that regs->a0 = -ENOSYS should only be
> executed when syscall != -1. In __seccomp_filter, when seccomp rejects a
> syscall with a specified errno, it sets a0 to the return value as required by
> the syscall ABI and then returns -1. That -1 ultimately becomes the return
> value of syscall_enter_from_user_mode, and is then compared with NR_syscalls
> after being converted to ulong (so it becomes ULONG_MAX). In this case the
> condition syscall < NR_syscalls is always false, so regs->a0 = -ENOSYS is
> always executed. This overwrites the a0 set by seccomp, so we always get
> ENOSYS when a seccomp RET_ERRNO rule matches.
>
> Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry")
> Reported-by: Felix Yan <felixonmars@archlinux.org>
> Co-developed-by: Ruizhe Pan <c141028@gmail.com>
> Signed-off-by: Ruizhe Pan <c141028@gmail.com>
> Co-developed-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
> Signed-off-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
> Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
> Tested-by: Felix Yan <felixonmars@archlinux.org>

Reviewed-by: Björn Töpel <bjorn@rivosinc.com>


* Re: [PATCH v3] riscv: entry: set a0 = -ENOSYS only when syscall != -1
  2023-07-18 20:57 [PATCH v3] riscv: entry: set a0 = -ENOSYS only when syscall != -1 Celeste Liu
  2023-07-18 21:20 ` Björn Töpel
@ 2023-07-18 23:40 ` Guo Ren
  2023-07-19 13:13   ` David Laight
  2023-07-19  7:21 ` Andreas Schwab
  2 siblings, 1 reply; 9+ messages in thread
From: Guo Ren @ 2023-07-18 23:40 UTC (permalink / raw)
  To: Celeste Liu, Huacai Chen
  Cc: Palmer Dabbelt, Paul Walmsley, Albert Ou, Björn Töpel,
	Conor Dooley, linux-riscv, linux-kernel, Felix Yan, Ruizhe Pan,
	Shiqi Zhang

On Wed, Jul 19, 2023 at 5:01 AM Celeste Liu <coelacanthushex@gmail.com> wrote:
>
> While testing seccomp on a 6.4 kernel, we found that errno has the wrong value.
> If we deny NETLINK_AUDIT with EAFNOSUPPORT, we get ENOSYS instead since
> f0bddf50586d. We get the same result with commit 9c2598d43510 ("riscv: entry:
> Save a0 prior syscall_enter_from_user_mode()") applied.
>
> After analysing the code, we think that regs->a0 = -ENOSYS should only be
> executed when syscall != -1. In __seccomp_filter, when seccomp rejects a
> syscall with a specified errno, it sets a0 to the return value as required by
> the syscall ABI and then returns -1. That -1 ultimately becomes the return
> value of syscall_enter_from_user_mode, and is then compared with NR_syscalls
> after being converted to ulong (so it becomes ULONG_MAX). In this case the
> condition syscall < NR_syscalls is always false, so regs->a0 = -ENOSYS is
> always executed. This overwrites the a0 set by seccomp, so we always get
> ENOSYS when a seccomp RET_ERRNO rule matches.
>
> Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry")
> Reported-by: Felix Yan <felixonmars@archlinux.org>
> Co-developed-by: Ruizhe Pan <c141028@gmail.com>
> Signed-off-by: Ruizhe Pan <c141028@gmail.com>
> Co-developed-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
> Signed-off-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
> Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
> Tested-by: Felix Yan <felixonmars@archlinux.org>
> ---
>
> v2 -> v3: use an if-statement instead of setting a default value,
>           clarify the type of syscall
> v1 -> v2: added an explanation of why we always got ENOSYS
>
>  arch/riscv/kernel/traps.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> index f910dfccbf5d2..5cef728745420 100644
> --- a/arch/riscv/kernel/traps.c
> +++ b/arch/riscv/kernel/traps.c
> @@ -297,6 +297,10 @@ asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
>  asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
>  {
>         if (user_mode(regs)) {
> +               /*
> +                * Convert negative numbers to very high and thus out of range
> +                * numbers for comparisons.
> +                */
>                 ulong syscall = regs->a7;
>
>                 regs->epc += 4;
> @@ -308,7 +312,7 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
>
>                 if (syscall < NR_syscalls)
>                         syscall_handler(regs, syscall);
> -               else
> +               else if ((long)syscall != -1L)
Maybe we should define an explicit macro for this ERRNO in
__seccomp_filter, and this style obeys the coding convention.

For this patch:
Reviewed-by: Guo Ren <guoren@kernel.org>

Cc: loongarch guy, please check loongarch's code. :)

>                         regs->a0 = -ENOSYS;
>
>                 syscall_exit_to_user_mode(regs);
> --
> 2.41.0
>


-- 
Best Regards
 Guo Ren


* Re: [PATCH v3] riscv: entry: set a0 = -ENOSYS only when syscall != -1
  2023-07-18 20:57 [PATCH v3] riscv: entry: set a0 = -ENOSYS only when syscall != -1 Celeste Liu
  2023-07-18 21:20 ` Björn Töpel
  2023-07-18 23:40 ` Guo Ren
@ 2023-07-19  7:21 ` Andreas Schwab
  2023-07-19 16:28   ` Björn Töpel
  2 siblings, 1 reply; 9+ messages in thread
From: Andreas Schwab @ 2023-07-19  7:21 UTC (permalink / raw)
  To: Celeste Liu
  Cc: Palmer Dabbelt, Paul Walmsley, Albert Ou, Guo Ren,
	Björn Töpel, Conor Dooley, linux-riscv, linux-kernel,
	Felix Yan, Ruizhe Pan, Shiqi Zhang

On Jul 19 2023, Celeste Liu wrote:

> @@ -308,7 +312,7 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
>  
>  		if (syscall < NR_syscalls)
>  			syscall_handler(regs, syscall);
> -		else
> +		else if ((long)syscall != -1L)

You can also use syscall != -1UL or even syscall != -1.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


* RE: [PATCH v3] riscv: entry: set a0 = -ENOSYS only when syscall != -1
  2023-07-18 23:40 ` Guo Ren
@ 2023-07-19 13:13   ` David Laight
  0 siblings, 0 replies; 9+ messages in thread
From: David Laight @ 2023-07-19 13:13 UTC (permalink / raw)
  To: 'Guo Ren', Celeste Liu, Huacai Chen
  Cc: Palmer Dabbelt, Paul Walmsley, Albert Ou, Björn Töpel,
	Conor Dooley, linux-riscv@lists.infradead.org,
	linux-kernel@vger.kernel.org, Felix Yan, Ruizhe Pan, Shiqi Zhang

...
> > +               /*
> > +                * Convert negative numbers to very high and thus out of range
> > +                * numbers for comparisons.
> > +                */
> >                 ulong syscall = regs->a7;
> >
> >                 regs->epc += 4;
> > @@ -308,7 +312,7 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
> >
> >                 if (syscall < NR_syscalls)

If you leave 'syscall' signed and write:
	if (syscall >= 0 && syscall < NR_syscalls)
the compiler will use a single unsigned compare.
There is no need to 'optimise' it yourself.

	David
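
The equivalence behind this suggestion can be checked in user space. The sketch below assumes a hypothetical table size `DEMO_NR_SYSCALLS` and hypothetical helper names; it only shows that the signed two-sided range check and the single unsigned compare accept the same values. Whether a given compiler actually emits one compare for the signed form is not something this code demonstrates.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical table size for the demo; not the kernel's NR_syscalls. */
#define DEMO_NR_SYSCALLS 451L

/* Signed form: explicit lower and upper bound. */
static bool in_range_signed(long nr)
{
	return nr >= 0 && nr < DEMO_NR_SYSCALLS;
}

/* Unsigned form: negative values wrap to huge numbers and fail the test. */
static bool in_range_unsigned(long nr)
{
	return (unsigned long)nr < (unsigned long)DEMO_NR_SYSCALLS;
}

/* Compare both forms on boundary values of the signed range. */
static void check_agreement(void)
{
	const long samples[] = {
		LONG_MIN, -2, -1, 0, 1,
		DEMO_NR_SYSCALLS - 1, DEMO_NR_SYSCALLS, LONG_MAX,
	};

	for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		assert(in_range_signed(samples[i]) == in_range_unsigned(samples[i]));
}
```

Calling `check_agreement()` exercises both forms on the interesting boundaries; the signed form states the intent directly and leaves the choice of instruction sequence to the compiler.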



* Re: [PATCH v3] riscv: entry: set a0 = -ENOSYS only when syscall != -1
  2023-07-19  7:21 ` Andreas Schwab
@ 2023-07-19 16:28   ` Björn Töpel
  2023-07-20  6:46     ` Celeste Liu
  0 siblings, 1 reply; 9+ messages in thread
From: Björn Töpel @ 2023-07-19 16:28 UTC (permalink / raw)
  To: Andreas Schwab, Celeste Liu
  Cc: Palmer Dabbelt, Paul Walmsley, Albert Ou, Guo Ren,
	Björn Töpel, Conor Dooley, linux-riscv, linux-kernel,
	Felix Yan, Ruizhe Pan, Shiqi Zhang

Andreas Schwab <schwab@suse.de> writes:

> On Jul 19 2023, Celeste Liu wrote:
>
>> @@ -308,7 +312,7 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
>>  
>>  		if (syscall < NR_syscalls)
>>  			syscall_handler(regs, syscall);
>> -		else
>> +		else if ((long)syscall != -1L)
>
> You can also use syscall != -1UL or even syscall != -1.

The former is indeed better for the eyes! :-) The latter will get a
-Wsign-compare warning, no?


Björn


* Re: [PATCH v3] riscv: entry: set a0 = -ENOSYS only when syscall != -1
  2023-07-19 16:28   ` Björn Töpel
@ 2023-07-20  6:46     ` Celeste Liu
  2023-07-20  9:08       ` Björn Töpel
  0 siblings, 1 reply; 9+ messages in thread
From: Celeste Liu @ 2023-07-20  6:46 UTC (permalink / raw)
  To: Björn Töpel, Andreas Schwab
  Cc: Palmer Dabbelt, Paul Walmsley, Albert Ou, Guo Ren,
	Björn Töpel, Conor Dooley, linux-riscv, linux-kernel,
	Felix Yan, Ruizhe Pan, Shiqi Zhang

On July 20, 2023 12:28:47 AM GMT+08:00, "Björn Töpel" <bjorn@kernel.org> wrote:
>Andreas Schwab <schwab@suse.de> writes:
>
>> On Jul 19 2023, Celeste Liu wrote:
>>
>>> @@ -308,7 +312,7 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
>>>  
>>>  		if (syscall < NR_syscalls)
>>>  			syscall_handler(regs, syscall);
>>> -		else
>>> +		else if ((long)syscall != -1L)
>>
>> You can also use syscall != -1UL or even syscall != -1.
>
>The former is indeed better for the eyes! :-) The latter will get a
>-Wsign-compare warning, no?
>
>
>Björn

Well, that's true. And I just found out that by C standards, converting
ulong to long is implementation-defined behavior, unlike long to ulong
which is well-defined. So it is really better than (long)syscall != -1L.
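
To illustrate the well-defined direction: converting -1 from long to unsigned long is reduced modulo ULONG_MAX + 1, so it always yields ULONG_MAX, and -1UL is the same value. The following is a minimal sketch; `demo_is_skipped` is a hypothetical helper name, not something from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Unsigned negation wraps, so -1UL is ULONG_MAX on every implementation. */
static_assert(-1UL == ULONG_MAX, "unsigned negation wraps to ULONG_MAX");

/* long -> unsigned long is modular and therefore also fully defined. */
static_assert((unsigned long)-1L == ULONG_MAX, "long -> ulong is modular");

/* Comparison done entirely in unsigned arithmetic; no cast back to long. */
static bool demo_is_skipped(unsigned long syscall)
{
	return syscall == -1UL;
}
```

Because the comparison never converts an out-of-range unsigned value back to a signed type, it does not depend on the implementation-defined direction at all.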


* Re: [PATCH v3] riscv: entry: set a0 = -ENOSYS only when syscall != -1
  2023-07-20  6:46     ` Celeste Liu
@ 2023-07-20  9:08       ` Björn Töpel
  2023-07-20 19:24         ` Celeste Liu
  0 siblings, 1 reply; 9+ messages in thread
From: Björn Töpel @ 2023-07-20  9:08 UTC (permalink / raw)
  To: Celeste Liu, Andreas Schwab
  Cc: Palmer Dabbelt, Paul Walmsley, Albert Ou, Guo Ren,
	Björn Töpel, Conor Dooley, linux-riscv, linux-kernel,
	Felix Yan, Ruizhe Pan, Shiqi Zhang

Celeste Liu <coelacanthushex@gmail.com> writes:

> On July 20, 2023 12:28:47 AM GMT+08:00, "Björn Töpel" <bjorn@kernel.org> wrote:
>>Andreas Schwab <schwab@suse.de> writes:
>>
>>> On Jul 19 2023, Celeste Liu wrote:
>>>
>>>> @@ -308,7 +312,7 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
>>>>  
>>>>  		if (syscall < NR_syscalls)
>>>>  			syscall_handler(regs, syscall);
>>>> -		else
>>>> +		else if ((long)syscall != -1L)
>>>
>>> You can also use syscall != -1UL or even syscall != -1.
>>
>>The former is indeed better for the eyes! :-) The latter will get a
>>-Wsign-compare warning, no?
>>
>>
>>Björn
>
> Well, that's true. And I just found out that by C standards, converting
> ulong to long is implementation-defined behavior, unlike long to ulong
> which is well-defined. So it is really better than (long)syscall != -1L.

If you're respinning, I suggest you use David's suggestion:
 * Remove the comment I suggested you add
 * Use (signed) long
 * Add syscall >= 0 &&
 * else if (syscall != -1)

That gives the least amount of surprises IMO.
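
Taken together, the four points might look roughly like the user-space sketch below. It is an illustration of the suggested shape only, not the v4 patch: `demo_handler`, `demo_dispatch_signed`, and `DEMO_NR_SYSCALLS` are hypothetical, and a0 is passed in and returned by value instead of living in struct pt_regs.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical table size for the demo; not the kernel's NR_syscalls. */
#define DEMO_NR_SYSCALLS 451L

/* Stand-in for syscall_handler(): every valid syscall "succeeds" with 0. */
static long demo_handler(long nr)
{
	assert(nr >= 0 && nr < DEMO_NR_SYSCALLS);
	return 0;
}

/*
 * syscall is a signed long, so -1 (syscall skipped, a0 already set by the
 * entry code) can be tested for directly without casts or suffixes.
 */
static long demo_dispatch_signed(long syscall, long a0)
{
	if (syscall >= 0 && syscall < DEMO_NR_SYSCALLS)
		a0 = demo_handler(syscall);
	else if (syscall != -1)
		a0 = -ENOSYS;

	return a0;
}
```

For example, `demo_dispatch_signed(-1, -EAFNOSUPPORT)` leaves the errno untouched, while any other out-of-range number such as -2 or 451 results in -ENOSYS.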


* Re: [PATCH v3] riscv: entry: set a0 = -ENOSYS only when syscall != -1
  2023-07-20  9:08       ` Björn Töpel
@ 2023-07-20 19:24         ` Celeste Liu
  0 siblings, 0 replies; 9+ messages in thread
From: Celeste Liu @ 2023-07-20 19:24 UTC (permalink / raw)
  To: Björn Töpel, Andreas Schwab
  Cc: Palmer Dabbelt, Paul Walmsley, Albert Ou, Guo Ren,
	Björn Töpel, Conor Dooley, linux-riscv, linux-kernel,
	Felix Yan, Ruizhe Pan, Shiqi Zhang

On July 20, 2023 5:08:37 PM GMT+08:00, "Björn Töpel" <bjorn@kernel.org> wrote:
>Celeste Liu <coelacanthushex@gmail.com> writes:
>
>> On July 20, 2023 12:28:47 AM GMT+08:00, "Björn Töpel" <bjorn@kernel.org> wrote:
>>>Andreas Schwab <schwab@suse.de> writes:
>>>
>>>> On Jul 19 2023, Celeste Liu wrote:
>>>>
>>>>> @@ -308,7 +312,7 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
>>>>>  
>>>>>  		if (syscall < NR_syscalls)
>>>>>  			syscall_handler(regs, syscall);
>>>>> -		else
>>>>> +		else if ((long)syscall != -1L)
>>>>
>>>> You can also use syscall != -1UL or even syscall != -1.
>>>
>>>The former is indeed better for the eyes! :-) The latter will get a
>>>-Wsign-compare warning, no?
>>>
>>>
>>>Björn
>>
>> Well, that's true. And I just found out that by C standards, converting
>> ulong to long is implementation-defined behavior, unlike long to ulong
>> which is well-defined. So it is really better than (long)syscall != -1L.
>
>If you're respinning, I suggest you use David's suggestion:
> * Remove the comment I suggested you add
> * Use (signed) long
> * Add syscall >= 0 &&
> * else if (syscall != -1)
>
>That gives the least amount of surprises IMO.

v4 has been sent.
