From: "Dmitry V. Levin" <ldv@strace.io>
To: "Björn Töpel" <bjorn@rivosinc.com>
Cc: Celeste Liu <coelacanthushex@gmail.com>,
	Palmer Dabbelt <palmer@rivosinc.com>,
	Paul Walmsley <paul.walmsley@sifive.com>,
	Albert Ou <aou@eecs.berkeley.edu>, Guo Ren <guoren@kernel.org>,
	Conor Dooley <conor.dooley@microchip.com>,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
	Andreas Schwab <schwab@suse.de>,
	David Laight <David.Laight@aculab.com>,
	Felix Yan <felixonmars@archlinux.org>,
	Ruizhe Pan <c141028@gmail.com>,
	Shiqi Zhang <shiqi@isrc.iscas.ac.cn>,
	Emil Renner Berthing <emil.renner.berthing@canonical.com>,
	"Ivan A. Melnikov" <iv@altlinux.org>
Subject: Re: [PATCH v5] riscv: entry: set a0 = -ENOSYS only when syscall != -1
Date: Thu, 27 Jun 2024 12:52:58 +0300
Message-ID: <20240627095258.GA2977@altlinux.org>
In-Reply-To: <CA+FstbVf7TJx==WsY5fBoFrdeY8php5ETn8kMq5s6YScy-2O=A@mail.gmail.com>

On Thu, Jun 27, 2024 at 11:43:03AM +0200, Björn Töpel wrote:
> On Thu, Jun 27, 2024 at 9:47 AM Celeste Liu <coelacanthushex@gmail.com> wrote:
> > On 2024-06-27 15:14, Dmitry V. Levin wrote:
> >
> > > Hi,
> > >
> > > On Tue, Aug 01, 2023 at 10:15:16PM +0800, Celeste Liu wrote:
> > >> When testing seccomp on a 6.4 kernel, we found that errno has the wrong
> > >> value. If we deny NETLINK_AUDIT with EAFNOSUPPORT, after f0bddf50586d we
> > >> get ENOSYS instead. We got the same result with commit 9c2598d43510 ("riscv:
> > >> entry: Save a0 prior syscall_enter_from_user_mode()").
> > >>
> > >> After analysing the code, we think that regs->a0 = -ENOSYS should only be
> > >> executed when syscall != -1. In __seccomp_filter, when seccomp rejects
> > >> the syscall with a specified errno, it sets a0 to the return value
> > >> required by the syscall ABI and then returns -1. This -1 is eventually
> > >> passed back as the return value of syscall_enter_from_user_mode, and is
> > >> then compared with NR_syscalls after being converted to ulong (so it
> > >> becomes ULONG_MAX). The condition syscall < NR_syscalls is therefore
> > >> always false, so regs->a0 = -ENOSYS is always executed. It overwrites the
> > >> a0 set by seccomp, so we always get ENOSYS when a seccomp RET_ERRNO rule
> > >> matches.
> > >>
> > >> Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry")
> > >> Reported-by: Felix Yan <felixonmars@archlinux.org>
> > >> Co-developed-by: Ruizhe Pan <c141028@gmail.com>
> > >> Signed-off-by: Ruizhe Pan <c141028@gmail.com>
> > >> Co-developed-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
> > >> Signed-off-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
> > >> Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
> > >> Tested-by: Felix Yan <felixonmars@archlinux.org>
> > >> Tested-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
> > >> Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
> > >> Reviewed-by: Guo Ren <guoren@kernel.org>
> > >> ---
> > >>
> > >> v4 -> v5: add Tested-by Emil Renner Berthing <emil.renner.berthing@canonical.com>
> > >> v3 -> v4: use long instead of ulong to reduce type casts and avoid
> > >>           implementation-defined behavior, and make the check for an
> > >>           invalid syscall more explicit
> > >> v2 -> v3: use an if-statement instead of setting a default value,
> > >>           clarify the type of syscall
> > >> v1 -> v2: added an explanation of why we always got ENOSYS
> > >>
> > >>  arch/riscv/kernel/traps.c | 6 +++---
> > >>  1 file changed, 3 insertions(+), 3 deletions(-)
> > >>
> > >> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> > >> index f910dfccbf5d2..729f79c97e2bf 100644
> > >> --- a/arch/riscv/kernel/traps.c
> > >> +++ b/arch/riscv/kernel/traps.c
> > >> @@ -297,7 +297,7 @@ asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
> > >>  asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
> > >>  {
> > >>      if (user_mode(regs)) {
> > >> -            ulong syscall = regs->a7;
> > >> +            long syscall = regs->a7;
> > >>
> > >>              regs->epc += 4;
> > >>              regs->orig_a0 = regs->a0;
> > >> @@ -306,9 +306,9 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
> > >>
> > >>              syscall = syscall_enter_from_user_mode(regs, syscall);
> > >>
> > >> -            if (syscall < NR_syscalls)
> > >> +            if (syscall >= 0 && syscall < NR_syscalls)
> > >>                      syscall_handler(regs, syscall);
> > >> -            else
> > >> +            else if (syscall != -1)
> > >>                      regs->a0 = -ENOSYS;
> > >>
> > >>              syscall_exit_to_user_mode(regs);
> > >
> > > Unfortunately, this change introduced a regression: it broke strace
> > > syscall tampering on riscv.  When the tracer changes syscall number to -1,
> > > the kernel fails to initialize a0 with -ENOSYS and subsequently fails to
> > > return the error code of the failed syscall to userspace.
> >
> > Patch v2 actually did the right thing. But following Björn Töpel's
> > suggestion, and because we found that casting long to ulong is
> > implementation-defined behavior in C, we changed it to the current form.
> > So reverting this patch and applying patch v2 should fix this issue.
> > Patch v2 takes the same approach as other architectures.
> >
> > [1]: https://lore.kernel.org/all/20230718162940.226118-1-CoelacanthusHex@gmail.com/
> 
> Not reverting, but a fix to make sure that a0 is initialized to -ENOSYS, e.g.:
> 
> --8<--
> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> index 05a16b1f0aee..51ebfd23e007 100644
> --- a/arch/riscv/kernel/traps.c
> +++ b/arch/riscv/kernel/traps.c
> @@ -319,6 +319,7 @@ void do_trap_ecall_u(struct pt_regs *regs)
> 
>   regs->epc += 4;
>   regs->orig_a0 = regs->a0;
> + regs->a0 = -ENOSYS;

Given that struct user_regs_struct doesn't have orig_a0, wouldn't this
clobber a0 too early so that the tracer will get -ENOSYS in place of the
first syscall argument?
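
To make the trade-off concrete, here is a minimal user-space sketch of the
three entry-path variants discussed in this thread. It is a model only, not
kernel code: struct regs_model, NR_SYSCALLS_MODEL, the hook functions and
simulate_ecall() are all hypothetical names made up for illustration, the
hook stands in for syscall_enter_from_user_mode(), and a "successful"
syscall handler is modeled as simply storing 0 in a0.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for NR_syscalls; the value is arbitrary for this model. */
#define NR_SYSCALLS_MODEL 450L

/* Hypothetical, cut-down model of the registers involved. */
struct regs_model {
	long a0;      /* first argument on entry, return value on exit */
	long orig_a0; /* kernel-internal copy, assumed invisible to the tracer */
	long a7;      /* syscall number */
};

enum entry_variant {
	ENTRY_PRE_V5,   /* unsigned compare, unconditional "else" */
	ENTRY_V5,       /* signed compare, "else if (syscall != -1)" */
	ENTRY_EARLY_A0, /* v5 plus a0 = -ENOSYS before the entry hook runs */
};

/* Models syscall_enter_from_user_mode(): may rewrite a0 and the number. */
typedef long (*entry_hook)(struct regs_model *regs, long nr);

/* What a seccomp filter or tracer would observe in a0 at syscall entry. */
long a0_seen_by_hook;

/* No seccomp rule and no tracer: the syscall number passes through. */
long hook_none(struct regs_model *regs, long nr)
{
	a0_seen_by_hook = regs->a0;
	return nr;
}

/* seccomp RET_ERRNO: store the errno in a0, then skip the syscall. */
long hook_seccomp_errno(struct regs_model *regs, long nr)
{
	(void)nr;
	a0_seen_by_hook = regs->a0;
	regs->a0 = -EAFNOSUPPORT;
	return -1;
}

/* Tracer changes the syscall number to -1 and leaves a0 alone. */
long hook_tracer_skip(struct regs_model *regs, long nr)
{
	(void)nr;
	a0_seen_by_hook = regs->a0;
	return -1;
}

/* Returns the a0 value that userspace would see after the trap. */
long simulate_ecall(enum entry_variant v, entry_hook hook,
		    long first_arg, long nr)
{
	struct regs_model regs = { .a0 = first_arg, .orig_a0 = 0, .a7 = nr };
	long syscall = regs.a7;

	assert(hook != 0);

	regs.orig_a0 = regs.a0;
	if (v == ENTRY_EARLY_A0)
		regs.a0 = -ENOSYS; /* the one-line fix quoted above */

	syscall = hook(&regs, syscall);

	if (v == ENTRY_PRE_V5) {
		/* -1 converted to unsigned long is ULONG_MAX: never in range */
		if ((unsigned long)syscall < (unsigned long)NR_SYSCALLS_MODEL)
			regs.a0 = 0;
		else
			regs.a0 = -ENOSYS;
	} else {
		if (syscall >= 0 && syscall < NR_SYSCALLS_MODEL)
			regs.a0 = 0;
		else if (syscall != -1)
			regs.a0 = -ENOSYS;
	}

	return regs.a0;
}
```

In this model, calling simulate_ecall() with ENTRY_V5 and hook_tracer_skip
returns the untouched first argument instead of -ENOSYS, which is the
regression reported above. With ENTRY_EARLY_A0 the return value becomes
-ENOSYS, but a0_seen_by_hook is then -ENOSYS as well, so the hook no longer
sees the first syscall argument unless it can read orig_a0.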


-- 
ldv
