public inbox for linux-kernel@vger.kernel.org
From: "Dmitry V. Levin" <ldv@strace.io>
To: "Björn Töpel" <bjorn@rivosinc.com>
Cc: Celeste Liu <coelacanthushex@gmail.com>,
	Palmer Dabbelt <palmer@rivosinc.com>,
	Paul Walmsley <paul.walmsley@sifive.com>,
	Albert Ou <aou@eecs.berkeley.edu>, Guo Ren <guoren@kernel.org>,
	Conor Dooley <conor.dooley@microchip.com>,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
	Andreas Schwab <schwab@suse.de>,
	David Laight <David.Laight@aculab.com>,
	Felix Yan <felixonmars@archlinux.org>,
	Ruizhe Pan <c141028@gmail.com>,
	Shiqi Zhang <shiqi@isrc.iscas.ac.cn>,
	Emil Renner Berthing <emil.renner.berthing@canonical.com>,
	"Ivan A. Melnikov" <iv@altlinux.org>
Subject: Re: [PATCH v5] riscv: entry: set a0 = -ENOSYS only when syscall != -1
Date: Thu, 27 Jun 2024 12:52:58 +0300
Message-ID: <20240627095258.GA2977@altlinux.org>
In-Reply-To: <CA+FstbVf7TJx==WsY5fBoFrdeY8php5ETn8kMq5s6YScy-2O=A@mail.gmail.com>

On Thu, Jun 27, 2024 at 11:43:03AM +0200, Björn Töpel wrote:
> On Thu, Jun 27, 2024 at 9:47 AM Celeste Liu <coelacanthushex@gmail.com> wrote:
> > On 2024-06-27 15:14, Dmitry V. Levin wrote:
> >
> > > Hi,
> > >
> > > On Tue, Aug 01, 2023 at 10:15:16PM +0800, Celeste Liu wrote:
> > >> When testing seccomp on a 6.4 kernel, we found that errno has the wrong
> > >> value. If we deny NETLINK_AUDIT with EAFNOSUPPORT, after f0bddf50586d we
> > >> get ENOSYS instead. We got the same result with commit 9c2598d43510 ("riscv:
> > >> entry: Save a0 prior syscall_enter_from_user_mode()").
> > >>
> > >> After analysing the code, we think that regs->a0 = -ENOSYS should only be
> > >> executed when syscall != -1. In __seccomp_filter, when seccomp rejects
> > >> the syscall with a specified errno, it sets a0 to the return value
> > >> required by the syscall ABI and then returns -1. That -1 is finally passed
> > >> back as the return value of syscall_enter_from_user_mode, and is then
> > >> compared with NR_syscalls after being converted to ulong (so it becomes
> > >> ULONG_MAX). The condition syscall < NR_syscalls is therefore always false,
> > >> so regs->a0 = -ENOSYS is always executed. This overwrites the a0 set by
> > >> seccomp, so we always get ENOSYS when a seccomp RET_ERRNO rule matches.
> > >>
> > >> Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry")
> > >> Reported-by: Felix Yan <felixonmars@archlinux.org>
> > >> Co-developed-by: Ruizhe Pan <c141028@gmail.com>
> > >> Signed-off-by: Ruizhe Pan <c141028@gmail.com>
> > >> Co-developed-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
> > >> Signed-off-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
> > >> Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
> > >> Tested-by: Felix Yan <felixonmars@archlinux.org>
> > >> Tested-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
> > >> Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
> > >> Reviewed-by: Guo Ren <guoren@kernel.org>
> > >> ---
> > >>
> > >> v4 -> v5: add Tested-by Emil Renner Berthing <emil.renner.berthing@canonical.com>
> > >> v3 -> v4: use long instead of ulong to reduce type casts and avoid
> > >>           implementation-defined behavior, and make the check for an
> > >>           invalid syscall more explicit
> > >> v2 -> v3: use an if-statement instead of setting a default value,
> > >>           clarify the type of syscall
> > >> v1 -> v2: add an explanation of why we always got ENOSYS
> > >>
> > >>  arch/riscv/kernel/traps.c | 6 +++---
> > >>  1 file changed, 3 insertions(+), 3 deletions(-)
> > >>
> > >> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> > >> index f910dfccbf5d2..729f79c97e2bf 100644
> > >> --- a/arch/riscv/kernel/traps.c
> > >> +++ b/arch/riscv/kernel/traps.c
> > >> @@ -297,7 +297,7 @@ asmlinkage __visible __trap_section void do_trap_break(struct pt_regs *regs)
> > >>  asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
> > >>  {
> > >>      if (user_mode(regs)) {
> > >> -            ulong syscall = regs->a7;
> > >> +            long syscall = regs->a7;
> > >>
> > >>              regs->epc += 4;
> > >>              regs->orig_a0 = regs->a0;
> > >> @@ -306,9 +306,9 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
> > >>
> > >>              syscall = syscall_enter_from_user_mode(regs, syscall);
> > >>
> > >> -            if (syscall < NR_syscalls)
> > >> +            if (syscall >= 0 && syscall < NR_syscalls)
> > >>                      syscall_handler(regs, syscall);
> > >> -            else
> > >> +            else if (syscall != -1)
> > >>                      regs->a0 = -ENOSYS;
> > >>
> > >>              syscall_exit_to_user_mode(regs);
> > >
> > > Unfortunately, this change introduced a regression: it broke strace
> > > syscall tampering on riscv.  When the tracer changes syscall number to -1,
> > > the kernel fails to initialize a0 with -ENOSYS and subsequently fails to
> > > return the error code of the failed syscall to userspace.
> >
> > In patch v2 [1], we actually did the right thing. But following Björn
> > Töpel's suggestion, and because we found that casting long to ulong is
> > implementation-defined behavior in C, we changed it to the current form.
> > So reverting this patch and applying patch v2 should fix this issue.
> > Patch v2 uses the same approach as other architectures.
> >
> > [1]: https://lore.kernel.org/all/20230718162940.226118-1-CoelacanthusHex@gmail.com/
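
For reference, the comparison under discussion can be reproduced in user
space. This is only a sketch: NR_SYSCALLS = 450 and the three helper names
are hypothetical, not kernel symbols. As far as I read the C standard, the
long-to-unsigned-long direction is defined as modulo arithmetic (so -1
becomes ULONG_MAX), and it is the unsigned-to-signed direction that is
implementation-defined when the value does not fit.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define NR_SYSCALLS 450 /* hypothetical table size, for illustration only */

static_assert(NR_SYSCALLS > 0, "table size must be positive");

/* Pre-v5 style check: the return value is held in an unsigned long. */
static bool in_range_unsigned(long ret)
{
	unsigned long nr = (unsigned long)ret; /* -1 becomes ULONG_MAX */

	return nr < (unsigned long)NR_SYSCALLS;
}

/* v5 style check: signed value with an explicit lower bound. */
static bool in_range_signed(long ret)
{
	return ret >= 0 && ret < NR_SYSCALLS;
}

/* v5 "else if" branch: -ENOSYS is written only when ret is not -1. */
static bool needs_enosys_v5(long ret)
{
	return !in_range_signed(ret) && ret != -1;
}
```

Both range checks reject -1; the difference is only in the fallback: the
old code wrote -ENOSYS for every rejected value, while needs_enosys_v5()
skips the write for exactly -1.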
> 
> Not reverting, but a fix to make sure that a0 is initialized to -ENOSYS, e.g.:
> 
> --8<--
> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> index 05a16b1f0aee..51ebfd23e007 100644
> --- a/arch/riscv/kernel/traps.c
> +++ b/arch/riscv/kernel/traps.c
> @@ -319,6 +319,7 @@ void do_trap_ecall_u(struct pt_regs *regs)
> 
>   regs->epc += 4;
>   regs->orig_a0 = regs->a0;
> + regs->a0 = -ENOSYS;

Given that struct user_regs_struct doesn't have orig_a0, wouldn't this
clobber a0 too early so that the tracer will get -ENOSYS in place of the
first syscall argument?
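
To make the concern concrete, here is a minimal user-space sketch of the
two orderings. It is not kernel code: struct fake_regs, fake_ecall(), the
two hook functions, seen_a0 and NR_SYSCALLS = 450 are hypothetical
stand-ins, and the EARLY_A0_INIT variant assumes the rest of the function
stays as in v5, since only the first hunk is quoted above. The hook plays
the role of syscall_enter_from_user_mode() and records what a tracer
reading a0 at syscall entry would see.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_SYSCALLS 450 /* hypothetical table size */

struct fake_regs { long a0, a7, orig_a0; }; /* stand-in for pt_regs */

enum variant { V5_PATCH, EARLY_A0_INIT };

typedef long (*enter_hook)(struct fake_regs *regs, long nr);

static long seen_a0; /* a0 as observed at syscall-entry stop */

/* Tracer that cancels the syscall by changing its number to -1. */
static long tracer_sets_minus_one(struct fake_regs *regs, long nr)
{
	(void)nr;
	seen_a0 = regs->a0;
	return -1;
}

/* seccomp RET_ERRNO: store the errno in a0, then return -1. */
static long seccomp_errno(struct fake_regs *regs, long nr)
{
	(void)nr;
	seen_a0 = regs->a0;
	regs->a0 = -EAFNOSUPPORT;
	return -1;
}

/* Returns the a0 value that would be handed back to user space. */
static long fake_ecall(enum variant v, enter_hook hook, long nr, long arg0)
{
	struct fake_regs regs = { .a0 = arg0, .a7 = nr };
	long syscall = regs.a7;

	assert(hook != NULL);
	regs.orig_a0 = regs.a0;
	if (v == EARLY_A0_INIT)
		regs.a0 = -ENOSYS; /* the proposed one-line addition */

	syscall = hook(&regs, syscall);

	if (syscall >= 0 && syscall < NR_SYSCALLS)
		regs.a0 = 0; /* pretend the handler succeeded */
	else if (syscall != -1)
		regs.a0 = -ENOSYS;

	return regs.a0;
}
```

With arg0 = 1, fake_ecall(V5_PATCH, tracer_sets_minus_one, 64, 1) returns
1, i.e. the stale first argument rather than an error, which is the
regression. fake_ecall(EARLY_A0_INIT, tracer_sets_minus_one, 64, 1)
returns -ENOSYS as desired, but seen_a0 is then -ENOSYS too, so the tracer
no longer sees the first argument. The seccomp_errno hook yields
-EAFNOSUPPORT in both variants.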


-- 
ldv

Thread overview: 11+ messages
2023-08-01 14:15 [PATCH v5] riscv: entry: set a0 = -ENOSYS only when syscall != -1 Celeste Liu
2023-08-17 15:20 ` patchwork-bot+linux-riscv
2024-06-27  7:14 ` Dmitry V. Levin
2024-06-27  7:47   ` Celeste Liu
2024-06-27  8:10     ` Andreas Schwab
2024-06-27  9:35       ` Celeste Liu
2024-06-27  9:43     ` Björn Töpel
2024-06-27  9:52       ` Dmitry V. Levin [this message]
2024-06-27 10:23         ` Björn Töpel
2024-06-27 10:11       ` Celeste Liu
2024-06-27 10:38       ` Celeste Liu
