* Re: [PATCH] LoongArch: Store syscall nr in thread_info
From: Huacai Chen @ 2023-11-22 7:58 UTC
To: Hengqi Chen, Arnd Bergmann, linux-arch; +Cc: loongarch, kernel

Hi, Hengqi,

On Wed, Nov 22, 2023 at 3:34 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
>
> Hi, Huacai,
>
> On Wed, Nov 22, 2023 at 2:32 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > Hi, Hengqi,
> >
> > On Wed, Nov 22, 2023 at 1:14 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> > >
> > > Currently, we store the syscall number in pt_regs::regs[11], and it may be
> > > changed during syscall execution. Take `execve` as an example:
> > >
> > > sys_execve
> > >   -> do_execve
> > >     -> do_execveat_common
> > >       -> bprm_execve
> > >         -> exec_binprm
> > >           -> search_binary_handler
> > >             -> load_elf_binary
> > >               -> ELF_PLAT_INIT
> > >
> > > ELF_PLAT_INIT resets regs[11] to 0, so later in syscall_exit_to_user_mode
> > > we get a wrong syscall nr.
> > >
> > > Known affected syscalls include execve/execveat/rt_sigreturn. Tools
> > > like execsnoop do not work properly because the sys_exit_* tracepoints
> > > do not trigger at all.
> > >
> > > Let's store the syscall nr in thread_info instead.
> > Can we just modify ELF_PLAT_INIT and not clear regs[11]?
> >
>
> I am uncertain about the side effects of changing ELF_PLAT_INIT.
> From a completeness perspective, changing ELF_PLAT_INIT is suboptimal:
> rt_sigreturn is affected in another code path, and there may be other
> syscalls that I am unaware of.
Saving the syscall number in thread_info has more side effects: ptrace
allows the number to be changed during a syscall, so we would then have
to keep the saved syscall field and regs[11] consistent.

And about ELF_PLAT_INIT, maybe Arnd can give us some more information.

Hi, Arnd,

I found that some newer architectures, such as ARM64 and RISC-V, do
almost nothing in ELF_PLAT_INIT, while some older architectures, such as
x86 and MIPS, clear most of the registers. Do you know why?

Huacai

>
> > Huacai
> >
> > >
> > > Fixes: be769645a2aef ("LoongArch: Add system call support")
> > > Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
> > > ---
> > >  arch/loongarch/include/asm/syscall.h | 2 +-
> > >  arch/loongarch/kernel/syscall.c      | 1 +
> > >  2 files changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/loongarch/include/asm/syscall.h b/arch/loongarch/include/asm/syscall.h
> > > index e286dc58476e..2317d674b92a 100644
> > > --- a/arch/loongarch/include/asm/syscall.h
> > > +++ b/arch/loongarch/include/asm/syscall.h
> > > @@ -23,7 +23,7 @@ extern void *sys_call_table[];
> > >  static inline long syscall_get_nr(struct task_struct *task,
> > >  				  struct pt_regs *regs)
> > >  {
> > > -	return regs->regs[11];
> > > +	return task_thread_info(task)->syscall;
> > >  }
> > >
> > >  static inline void syscall_rollback(struct task_struct *task,
> > > diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c
> > > index b4c5acd7aa3b..2783e33cf276 100644
> > > --- a/arch/loongarch/kernel/syscall.c
> > > +++ b/arch/loongarch/kernel/syscall.c
> > > @@ -52,6 +52,7 @@ void noinstr do_syscall(struct pt_regs *regs)
> > >  	regs->orig_a0 = regs->regs[4];
> > >  	regs->regs[4] = -ENOSYS;
> > >
> > > +	task_thread_info(current)->syscall = nr;
> > >  	nr = syscall_enter_from_user_mode(regs, nr);
> > >
> > >  	if (nr < NR_syscalls) {
> > > --
> > > 2.42.0
> > >
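
[Editorial sketch, not part of the thread.] The two concerns raised so far can be modelled in user space. The following is a minimal sketch under stated assumptions: `struct pt_regs_model`, `struct thread_info_model` and every function name are hypothetical stand-ins, not kernel code, and the numbers passed in are only illustrative. Case 1 mirrors the reported bug (the exit path reads regs[11] after it has been reset). Case 2 mirrors the objection to the first patch (the number is saved before a tracer gets the chance to change it).

```c
#include <string.h>

/* Hypothetical, simplified stand-ins for struct pt_regs and struct thread_info. */
struct pt_regs_model {
	unsigned long regs[32];
};

struct thread_info_model {
	long syscall;
};

/* Pre-patch lookup: the number lives only in regs[11] (register a7). */
static long get_nr_from_regs(const struct pt_regs_model *regs)
{
	return (long)regs->regs[11];
}

/* Lookup as in the first patch: the number lives only in thread_info. */
static long get_nr_from_thread_info(const struct thread_info_model *ti)
{
	return ti->syscall;
}

/* Case 1: an execve-style body resets regs[11]; the exit path reads regs[11]. */
static long exit_nr_regs_only(long entry_nr)
{
	struct pt_regs_model regs;

	memset(&regs, 0, sizeof(regs));
	regs.regs[11] = (unsigned long)entry_nr;	/* syscall entry */
	regs.regs[11] = 0;				/* models ELF_PLAT_INIT */
	return get_nr_from_regs(&regs);			/* syscall exit */
}

/* Case 2: the number is saved first, then a tracer rewrites regs[11]. */
static long exit_nr_saved_before_tracer(long entry_nr, long tracer_nr)
{
	struct pt_regs_model regs;
	struct thread_info_model ti;

	memset(&regs, 0, sizeof(regs));
	regs.regs[11] = (unsigned long)entry_nr;
	ti.syscall = entry_nr;				/* saved before the tracer runs */
	regs.regs[11] = (unsigned long)tracer_nr;	/* tracer selects another syscall */
	return get_nr_from_thread_info(&ti);		/* still the entry number */
}
```

In this model, `exit_nr_regs_only(221)` yields 0, which matches the missing sys_exit_* events described in the patch, and `exit_nr_saved_before_tracer(221, 139)` yields 221 even though regs[11] holds 139, which is the inconsistency Huacai points out.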
* Re: [PATCH] LoongArch: Store syscall nr in thread_info
From: Hengqi Chen @ 2023-11-23 5:49 UTC
To: Huacai Chen; +Cc: Arnd Bergmann, linux-arch, loongarch, kernel

On Wed, Nov 22, 2023 at 3:58 PM Huacai Chen <chenhuacai@kernel.org> wrote:
>
> Hi, Hengqi,
>
> On Wed, Nov 22, 2023 at 3:34 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> >
> > Hi, Huacai,
> >
> > On Wed, Nov 22, 2023 at 2:32 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > Hi, Hengqi,
> > >
> > > On Wed, Nov 22, 2023 at 1:14 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> > > >
> > > > Currently, we store the syscall number in pt_regs::regs[11], and it may be
> > > > changed during syscall execution. Take `execve` as an example:
> > > >
> > > > sys_execve
> > > >   -> do_execve
> > > >     -> do_execveat_common
> > > >       -> bprm_execve
> > > >         -> exec_binprm
> > > >           -> search_binary_handler
> > > >             -> load_elf_binary
> > > >               -> ELF_PLAT_INIT
> > > >
> > > > ELF_PLAT_INIT resets regs[11] to 0, so later in syscall_exit_to_user_mode
> > > > we get a wrong syscall nr.
> > > >
> > > > Known affected syscalls include execve/execveat/rt_sigreturn. Tools
> > > > like execsnoop do not work properly because the sys_exit_* tracepoints
> > > > do not trigger at all.
> > > >
> > > > Let's store the syscall nr in thread_info instead.
> > > Can we just modify ELF_PLAT_INIT and not clear regs[11]?
> > >
> >
> > I am uncertain about the side effects of changing ELF_PLAT_INIT.
> > From a completeness perspective, changing ELF_PLAT_INIT is suboptimal:
> > rt_sigreturn is affected in another code path, and there may be other
> > syscalls that I am unaware of.
> Saving the syscall number in thread_info has more side effects: ptrace
> allows the number to be changed during a syscall, so we would then have
> to keep the saved syscall field and regs[11] consistent.
>

How about the change below:

diff --git a/arch/loongarch/include/asm/syscall.h b/arch/loongarch/include/asm/syscall.h
index e286dc58476e..954ba53bcc9a 100644
--- a/arch/loongarch/include/asm/syscall.h
+++ b/arch/loongarch/include/asm/syscall.h
@@ -23,7 +23,9 @@ extern void *sys_call_table[];
 static inline long syscall_get_nr(struct task_struct *task,
 				  struct pt_regs *regs)
 {
-	return regs->regs[11];
+	long nr = task_thread_info(task)->syscall;
+
+	return nr ? : regs->regs[11];
 }

 static inline void syscall_rollback(struct task_struct *task,
diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c
index b4c5acd7aa3b..553ab0d624cb 100644
--- a/arch/loongarch/kernel/syscall.c
+++ b/arch/loongarch/kernel/syscall.c
@@ -53,6 +53,7 @@ void noinstr do_syscall(struct pt_regs *regs)
 	regs->regs[4] = -ENOSYS;

 	nr = syscall_enter_from_user_mode(regs, nr);
+	current_thread_info()->syscall = nr;

 	if (nr < NR_syscalls) {
 		syscall_fn = sys_call_table[nr];
@@ -61,4 +62,5 @@ void noinstr do_syscall(struct pt_regs *regs)
 	}

 	syscall_exit_to_user_mode(regs);
+	current_thread_info()->syscall = 0;
 }

* ptrace can still change the syscall nr
* sys_exit_* will also see the right syscall nr
* this works even if rt_sigreturn clobbers all pt_regs::regs

> And about ELF_PLAT_INIT, maybe Arnd can give us some more information.
>
> Hi, Arnd,
>
> I found that some newer architectures, such as ARM64 and RISC-V, do
> almost nothing in ELF_PLAT_INIT, while some older architectures, such as
> x86 and MIPS, clear most of the registers. Do you know why?
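
[Editorial sketch, not part of the thread.] The revised proposal in this message saves the number after the tracer has had its chance and falls back to regs[11] when nothing is saved. A minimal user-space model of that ordering follows. All type and function names are hypothetical, the `tracer_nr` parameter is an invented way to represent a tracer rewrite, and the GNU `?:` shorthand from the diff is spelled out in standard C.

```c
#include <string.h>

/* Hypothetical stand-ins for struct pt_regs and struct thread_info. */
struct regs_sketch {
	unsigned long regs[32];
};

struct ti_sketch {
	long syscall;
};

/* Models the proposed syscall_get_nr(): prefer the saved number, else regs[11]. */
static long get_nr_with_fallback(const struct ti_sketch *ti,
				 const struct regs_sketch *regs)
{
	return ti->syscall ? ti->syscall : (long)regs->regs[11];
}

/*
 * Walks one syscall: entry, optional tracer rewrite, a body that clobbers
 * regs[11], then the exit-time lookup. A negative tracer_nr means that no
 * tracer changed the number.
 */
static long nr_seen_at_exit(long entry_nr, long tracer_nr)
{
	struct regs_sketch regs;
	struct ti_sketch ti = { 0 };
	long nr = entry_nr;

	memset(&regs, 0, sizeof(regs));
	regs.regs[11] = (unsigned long)entry_nr;

	if (tracer_nr >= 0) {
		/* stands in for syscall_enter_from_user_mode() returning a new nr */
		nr = tracer_nr;
		regs.regs[11] = (unsigned long)nr;
	}
	ti.syscall = nr;	/* saved only after the tracer has run */

	regs.regs[11] = 0;	/* execve or rt_sigreturn style clobber */

	return get_nr_with_fallback(&ti, &regs);
}
```

Here `nr_seen_at_exit(221, -1)` gives 221 and `nr_seen_at_exit(221, 139)` gives 139, so the exit path sees the tracer's choice despite the clobber. One property the model makes visible: a saved value of 0 is indistinguishable from "nothing saved", so in that case the lookup falls through to regs[11].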
* Re: [PATCH] LoongArch: Store syscall nr in thread_info
From: Huacai Chen @ 2023-11-23 6:13 UTC
To: Hengqi Chen; +Cc: Arnd Bergmann, linux-arch, loongarch, kernel

Hi, Hengqi,

On Thu, Nov 23, 2023 at 1:49 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
>
> On Wed, Nov 22, 2023 at 3:58 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > Hi, Hengqi,
> >
> > On Wed, Nov 22, 2023 at 3:34 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> > >
> > > Hi, Huacai,
> > >
> > > On Wed, Nov 22, 2023 at 2:32 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > >
> > > > Hi, Hengqi,
> > > >
> > > > On Wed, Nov 22, 2023 at 1:14 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> > > > >
> > > > > Currently, we store the syscall number in pt_regs::regs[11], and it may be
> > > > > changed during syscall execution. Take `execve` as an example:
> > > > >
> > > > > sys_execve
> > > > >   -> do_execve
> > > > >     -> do_execveat_common
> > > > >       -> bprm_execve
> > > > >         -> exec_binprm
> > > > >           -> search_binary_handler
> > > > >             -> load_elf_binary
> > > > >               -> ELF_PLAT_INIT
> > > > >
> > > > > ELF_PLAT_INIT resets regs[11] to 0, so later in syscall_exit_to_user_mode
> > > > > we get a wrong syscall nr.
> > > > >
> > > > > Known affected syscalls include execve/execveat/rt_sigreturn. Tools
> > > > > like execsnoop do not work properly because the sys_exit_* tracepoints
> > > > > do not trigger at all.
> > > > >
> > > > > Let's store the syscall nr in thread_info instead.
> > > > Can we just modify ELF_PLAT_INIT and not clear regs[11]?
> > > >
> > >
> > > I am uncertain about the side effects of changing ELF_PLAT_INIT.
> > > From a completeness perspective, changing ELF_PLAT_INIT is suboptimal:
> > > rt_sigreturn is affected in another code path, and there may be other
> > > syscalls that I am unaware of.
> > Saving the syscall number in thread_info has more side effects: ptrace
> > allows the number to be changed during a syscall, so we would then have
> > to keep the saved syscall field and regs[11] consistent.
> >
>
> How about the change below:
>
> diff --git a/arch/loongarch/include/asm/syscall.h b/arch/loongarch/include/asm/syscall.h
> index e286dc58476e..954ba53bcc9a 100644
> --- a/arch/loongarch/include/asm/syscall.h
> +++ b/arch/loongarch/include/asm/syscall.h
> @@ -23,7 +23,9 @@ extern void *sys_call_table[];
>  static inline long syscall_get_nr(struct task_struct *task,
>  				  struct pt_regs *regs)
>  {
> -	return regs->regs[11];
> +	long nr = task_thread_info(task)->syscall;
> +
> +	return nr ? : regs->regs[11];
>  }
>
>  static inline void syscall_rollback(struct task_struct *task,
> diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c
> index b4c5acd7aa3b..553ab0d624cb 100644
> --- a/arch/loongarch/kernel/syscall.c
> +++ b/arch/loongarch/kernel/syscall.c
> @@ -53,6 +53,7 @@ void noinstr do_syscall(struct pt_regs *regs)
>  	regs->regs[4] = -ENOSYS;
>
>  	nr = syscall_enter_from_user_mode(regs, nr);
> +	current_thread_info()->syscall = nr;
>
>  	if (nr < NR_syscalls) {
>  		syscall_fn = sys_call_table[nr];
> @@ -61,4 +62,5 @@ void noinstr do_syscall(struct pt_regs *regs)
>  	}
>
>  	syscall_exit_to_user_mode(regs);
> +	current_thread_info()->syscall = 0;
>  }
>
> * ptrace can still change the syscall nr
> * sys_exit_* will also see the right syscall nr
> * this works even if rt_sigreturn clobbers all pt_regs::regs
No, I still prefer to modify ELF_PLAT_INIT; we can wait for Arnd's comments.

And do you mean that modifying ELF_PLAT_INIT cannot solve the
rt_sigreturn problem?

Huacai

>
> > And about ELF_PLAT_INIT, maybe Arnd can give us some more information.
* Re: [PATCH] LoongArch: Store syscall nr in thread_info
From: Hengqi Chen @ 2023-11-23 8:08 UTC
To: Huacai Chen; +Cc: Arnd Bergmann, linux-arch, loongarch, kernel

On Thu, Nov 23, 2023 at 2:13 PM Huacai Chen <chenhuacai@kernel.org> wrote:
>
> Hi, Hengqi,
>
> On Thu, Nov 23, 2023 at 1:49 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> >
> > On Wed, Nov 22, 2023 at 3:58 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > Hi, Hengqi,
> > >
> > > On Wed, Nov 22, 2023 at 3:34 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> > > >
> > > > Hi, Huacai,
> > > >
> > > > On Wed, Nov 22, 2023 at 2:32 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > >
> > > > > Hi, Hengqi,
> > > > >
> > > > > On Wed, Nov 22, 2023 at 1:14 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> > > > > >
> > > > > > Currently, we store the syscall number in pt_regs::regs[11], and it may be
> > > > > > changed during syscall execution. Take `execve` as an example:
> > > > > >
> > > > > > sys_execve
> > > > > >   -> do_execve
> > > > > >     -> do_execveat_common
> > > > > >       -> bprm_execve
> > > > > >         -> exec_binprm
> > > > > >           -> search_binary_handler
> > > > > >             -> load_elf_binary
> > > > > >               -> ELF_PLAT_INIT
> > > > > >
> > > > > > ELF_PLAT_INIT resets regs[11] to 0, so later in syscall_exit_to_user_mode
> > > > > > we get a wrong syscall nr.
> > > > > >
> > > > > > Known affected syscalls include execve/execveat/rt_sigreturn. Tools
> > > > > > like execsnoop do not work properly because the sys_exit_* tracepoints
> > > > > > do not trigger at all.
> > > > > >
> > > > > > Let's store the syscall nr in thread_info instead.
> > > > > Can we just modify ELF_PLAT_INIT and not clear regs[11]?
> > > > >
> > > >
> > > > I am uncertain about the side effects of changing ELF_PLAT_INIT.
> > > > From a completeness perspective, changing ELF_PLAT_INIT is suboptimal:
> > > > rt_sigreturn is affected in another code path, and there may be other
> > > > syscalls that I am unaware of.
> > > Saving the syscall number in thread_info has more side effects: ptrace
> > > allows the number to be changed during a syscall, so we would then have
> > > to keep the saved syscall field and regs[11] consistent.
> > >
> >
> > How about the change below:
> >
> > diff --git a/arch/loongarch/include/asm/syscall.h b/arch/loongarch/include/asm/syscall.h
> > index e286dc58476e..954ba53bcc9a 100644
> > --- a/arch/loongarch/include/asm/syscall.h
> > +++ b/arch/loongarch/include/asm/syscall.h
> > @@ -23,7 +23,9 @@ extern void *sys_call_table[];
> >  static inline long syscall_get_nr(struct task_struct *task,
> >  				  struct pt_regs *regs)
> >  {
> > -	return regs->regs[11];
> > +	long nr = task_thread_info(task)->syscall;
> > +
> > +	return nr ? : regs->regs[11];
> >  }
> >
> >  static inline void syscall_rollback(struct task_struct *task,
> > diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c
> > index b4c5acd7aa3b..553ab0d624cb 100644
> > --- a/arch/loongarch/kernel/syscall.c
> > +++ b/arch/loongarch/kernel/syscall.c
> > @@ -53,6 +53,7 @@ void noinstr do_syscall(struct pt_regs *regs)
> >  	regs->regs[4] = -ENOSYS;
> >
> >  	nr = syscall_enter_from_user_mode(regs, nr);
> > +	current_thread_info()->syscall = nr;
> >
> >  	if (nr < NR_syscalls) {
> >  		syscall_fn = sys_call_table[nr];
> > @@ -61,4 +62,5 @@ void noinstr do_syscall(struct pt_regs *regs)
> >  	}
> >
> >  	syscall_exit_to_user_mode(regs);
> > +	current_thread_info()->syscall = 0;
> >  }
> >
> > * ptrace can still change the syscall nr
> > * sys_exit_* will also see the right syscall nr
> > * this works even if rt_sigreturn clobbers all pt_regs::regs
> No, I still prefer to modify ELF_PLAT_INIT; we can wait for Arnd's comments.
>

OK, I am in no hurry. Anyway, we know the root cause. :)

> And do you mean that modifying ELF_PLAT_INIT cannot solve the
> rt_sigreturn problem?
>

Right, see https://elixir.bootlin.com/linux/latest/source/arch/loongarch/kernel/signal.c#L807

> Huacai
>
> >
> > > And about ELF_PLAT_INIT, maybe Arnd can give us some more information.
> > >
> > > Hi, Arnd,
> > >
> > > I found that some newer architectures, such as ARM64 and RISC-V, do
> > > almost nothing in ELF_PLAT_INIT, while some older architectures, such as
> > > x86 and MIPS, clear most of the registers. Do you know why?
* Re: [PATCH] LoongArch: Store syscall nr in thread_info
From: Huacai Chen @ 2023-11-23 8:25 UTC
To: Hengqi Chen; +Cc: Arnd Bergmann, linux-arch, loongarch, kernel

On Thu, Nov 23, 2023 at 4:09 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
>
> On Thu, Nov 23, 2023 at 2:13 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> >
> > Hi, Hengqi,
> >
> > On Thu, Nov 23, 2023 at 1:49 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> > >
> > > On Wed, Nov 22, 2023 at 3:58 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > >
> > > > Hi, Hengqi,
> > > >
> > > > On Wed, Nov 22, 2023 at 3:34 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> > > > >
> > > > > Hi, Huacai,
> > > > >
> > > > > On Wed, Nov 22, 2023 at 2:32 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > >
> > > > > > Hi, Hengqi,
> > > > > >
> > > > > > On Wed, Nov 22, 2023 at 1:14 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> > > > > > >
> > > > > > > Currently, we store the syscall number in pt_regs::regs[11], and it may be
> > > > > > > changed during syscall execution. Take `execve` as an example:
> > > > > > >
> > > > > > > sys_execve
> > > > > > >   -> do_execve
> > > > > > >     -> do_execveat_common
> > > > > > >       -> bprm_execve
> > > > > > >         -> exec_binprm
> > > > > > >           -> search_binary_handler
> > > > > > >             -> load_elf_binary
> > > > > > >               -> ELF_PLAT_INIT
> > > > > > >
> > > > > > > ELF_PLAT_INIT resets regs[11] to 0, so later in syscall_exit_to_user_mode
> > > > > > > we get a wrong syscall nr.
> > > > > > >
> > > > > > > Known affected syscalls include execve/execveat/rt_sigreturn. Tools
> > > > > > > like execsnoop do not work properly because the sys_exit_* tracepoints
> > > > > > > do not trigger at all.
> > > > > > >
> > > > > > > Let's store the syscall nr in thread_info instead.
> > > > > > Can we just modify ELF_PLAT_INIT and not clear regs[11]?
> > > > > >
> > > > >
> > > > > I am uncertain about the side effects of changing ELF_PLAT_INIT.
> > > > > From a completeness perspective, changing ELF_PLAT_INIT is suboptimal:
> > > > > rt_sigreturn is affected in another code path, and there may be other
> > > > > syscalls that I am unaware of.
> > > > Saving the syscall number in thread_info has more side effects: ptrace
> > > > allows the number to be changed during a syscall, so we would then have
> > > > to keep the saved syscall field and regs[11] consistent.
> > > >
> > >
> > > How about the change below:
> > >
> > > diff --git a/arch/loongarch/include/asm/syscall.h b/arch/loongarch/include/asm/syscall.h
> > > index e286dc58476e..954ba53bcc9a 100644
> > > --- a/arch/loongarch/include/asm/syscall.h
> > > +++ b/arch/loongarch/include/asm/syscall.h
> > > @@ -23,7 +23,9 @@ extern void *sys_call_table[];
> > >  static inline long syscall_get_nr(struct task_struct *task,
> > >  				  struct pt_regs *regs)
> > >  {
> > > -	return regs->regs[11];
> > > +	long nr = task_thread_info(task)->syscall;
> > > +
> > > +	return nr ? : regs->regs[11];
> > >  }
> > >
> > >  static inline void syscall_rollback(struct task_struct *task,
> > > diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c
> > > index b4c5acd7aa3b..553ab0d624cb 100644
> > > --- a/arch/loongarch/kernel/syscall.c
> > > +++ b/arch/loongarch/kernel/syscall.c
> > > @@ -53,6 +53,7 @@ void noinstr do_syscall(struct pt_regs *regs)
> > >  	regs->regs[4] = -ENOSYS;
> > >
> > >  	nr = syscall_enter_from_user_mode(regs, nr);
> > > +	current_thread_info()->syscall = nr;
> > >
> > >  	if (nr < NR_syscalls) {
> > >  		syscall_fn = sys_call_table[nr];
> > > @@ -61,4 +62,5 @@ void noinstr do_syscall(struct pt_regs *regs)
> > >  	}
> > >
> > >  	syscall_exit_to_user_mode(regs);
> > > +	current_thread_info()->syscall = 0;
> > >  }
> > >
> > > * ptrace can still change the syscall nr
> > > * sys_exit_* will also see the right syscall nr
> > > * this works even if rt_sigreturn clobbers all pt_regs::regs
> > No, I still prefer to modify ELF_PLAT_INIT; we can wait for Arnd's comments.
> >
>
> OK, I am in no hurry. Anyway, we know the root cause. :)
>
> > And do you mean that modifying ELF_PLAT_INIT cannot solve the
> > rt_sigreturn problem?
> >
>
> Right, see https://elixir.bootlin.com/linux/latest/source/arch/loongarch/kernel/signal.c#L807
Is this the expected behavior for rt_sigreturn()? Otherwise, I think
RISC-V has the same problem. And if we really need the 'correct'
syscall number, we can overwrite regs[11] in sys_rt_sigreturn().

And another question: do you have any updates on the BPF system
hang problem? :)

Huacai

>
> > Huacai
> >
> > >
> > > > And about ELF_PLAT_INIT, maybe Arnd can give us some more information.
> > > >
> > > > Hi, Arnd,
> > > >
> > > > I found that some newer architectures, such as ARM64 and RISC-V, do
> > > > almost nothing in ELF_PLAT_INIT, while some older architectures, such as
> > > > x86 and MIPS, clear most of the registers. Do you know why?
* Re: [PATCH] LoongArch: Store syscall nr in thread_info
From: Hengqi Chen @ 2023-11-23 14:39 UTC
To: Huacai Chen; +Cc: Arnd Bergmann, linux-arch, loongarch, kernel

On Thu, Nov 23, 2023 at 4:25 PM Huacai Chen <chenhuacai@kernel.org> wrote:
>
> On Thu, Nov 23, 2023 at 4:09 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> >
> > On Thu, Nov 23, 2023 at 2:13 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > >
> > > Hi, Hengqi,
> > >
> > > On Thu, Nov 23, 2023 at 1:49 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> > > >
> > > > On Wed, Nov 22, 2023 at 3:58 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > >
> > > > > Hi, Hengqi,
> > > > >
> > > > > On Wed, Nov 22, 2023 at 3:34 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> > > > > >
> > > > > > Hi, Huacai,
> > > > > >
> > > > > > On Wed, Nov 22, 2023 at 2:32 PM Huacai Chen <chenhuacai@kernel.org> wrote:
> > > > > > >
> > > > > > > Hi, Hengqi,
> > > > > > >
> > > > > > > On Wed, Nov 22, 2023 at 1:14 PM Hengqi Chen <hengqi.chen@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Currently, we store the syscall number in pt_regs::regs[11], and it may be
> > > > > > > > changed during syscall execution. Take `execve` as an example:
> > > > > > > >
> > > > > > > > sys_execve
> > > > > > > >   -> do_execve
> > > > > > > >     -> do_execveat_common
> > > > > > > >       -> bprm_execve
> > > > > > > >         -> exec_binprm
> > > > > > > >           -> search_binary_handler
> > > > > > > >             -> load_elf_binary
> > > > > > > >               -> ELF_PLAT_INIT
> > > > > > > >
> > > > > > > > ELF_PLAT_INIT resets regs[11] to 0, so later in syscall_exit_to_user_mode
> > > > > > > > we get a wrong syscall nr.
> > > > > > > >
> > > > > > > > Known affected syscalls include execve/execveat/rt_sigreturn. Tools
> > > > > > > > like execsnoop do not work properly because the sys_exit_* tracepoints
> > > > > > > > do not trigger at all.
> > > > > > > >
> > > > > > > > Let's store the syscall nr in thread_info instead.
> > > > > > > Can we just modify ELF_PLAT_INIT and not clear regs[11]?
> > > > > > >
> > > > > >
> > > > > > I am uncertain about the side effects of changing ELF_PLAT_INIT.
> > > > > > From a completeness perspective, changing ELF_PLAT_INIT is suboptimal:
> > > > > > rt_sigreturn is affected in another code path, and there may be other
> > > > > > syscalls that I am unaware of.
> > > > > Saving the syscall number in thread_info has more side effects: ptrace
> > > > > allows the number to be changed during a syscall, so we would then have
> > > > > to keep the saved syscall field and regs[11] consistent.
> > > > >
> > > >
> > > > How about the change below:
> > > >
> > > > diff --git a/arch/loongarch/include/asm/syscall.h b/arch/loongarch/include/asm/syscall.h
> > > > index e286dc58476e..954ba53bcc9a 100644
> > > > --- a/arch/loongarch/include/asm/syscall.h
> > > > +++ b/arch/loongarch/include/asm/syscall.h
> > > > @@ -23,7 +23,9 @@ extern void *sys_call_table[];
> > > >  static inline long syscall_get_nr(struct task_struct *task,
> > > >  				  struct pt_regs *regs)
> > > >  {
> > > > -	return regs->regs[11];
> > > > +	long nr = task_thread_info(task)->syscall;
> > > > +
> > > > +	return nr ? : regs->regs[11];
> > > >  }
> > > >
> > > >  static inline void syscall_rollback(struct task_struct *task,
> > > > diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c
> > > > index b4c5acd7aa3b..553ab0d624cb 100644
> > > > --- a/arch/loongarch/kernel/syscall.c
> > > > +++ b/arch/loongarch/kernel/syscall.c
> > > > @@ -53,6 +53,7 @@ void noinstr do_syscall(struct pt_regs *regs)
> > > >  	regs->regs[4] = -ENOSYS;
> > > >
> > > >  	nr = syscall_enter_from_user_mode(regs, nr);
> > > > +	current_thread_info()->syscall = nr;
> > > >
> > > >  	if (nr < NR_syscalls) {
> > > >  		syscall_fn = sys_call_table[nr];
> > > > @@ -61,4 +62,5 @@ void noinstr do_syscall(struct pt_regs *regs)
> > > >  	}
> > > >
> > > >  	syscall_exit_to_user_mode(regs);
> > > > +	current_thread_info()->syscall = 0;
> > > >  }
> > > >
> > > > * ptrace can still change the syscall nr
> > > > * sys_exit_* will also see the right syscall nr
> > > > * this works even if rt_sigreturn clobbers all pt_regs::regs
> > > No, I still prefer to modify ELF_PLAT_INIT; we can wait for Arnd's comments.
> > >
> >
> > OK, I am in no hurry. Anyway, we know the root cause. :)
> >
> > > And do you mean that modifying ELF_PLAT_INIT cannot solve the
> > > rt_sigreturn problem?
> > >
> >
> > Right, see https://elixir.bootlin.com/linux/latest/source/arch/loongarch/kernel/signal.c#L807
> Is this the expected behavior for rt_sigreturn()? Otherwise, I think
> RISC-V has the same problem. And if we really need the 'correct'
> syscall number, we can overwrite regs[11] in sys_rt_sigreturn().
>

I checked x86 and arm64; both have the same issue, and it seems no one cares.

> And another question: do you have any updates on the BPF system
> hang problem? :)
>

Let's discuss this in a new thread.

> Huacai
>
> >
> > > Huacai
> > >
> > > >
> > > > > And about ELF_PLAT_INIT, maybe Arnd can give us some more information.
> > > > > > > > > > Hi, Arnd, > > > > > > > > > > I found some new architectures, such as ARM64 and RISC-V, just do > > > > > nearly nothing in ELF_PLAT_INIT, while some old architectures, such as > > > > > x86 and MIPS, clear most of the registers, do you know why? > > > > > > > > > > Huacai > > > > > > > > > > > > > > > > > > Huacai > > > > > > > > > > > > > > > > > > > > > > > Fixes: be769645a2aef ("LoongArch: Add system call support") > > > > > > > > Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> > > > > > > > > --- > > > > > > > > arch/loongarch/include/asm/syscall.h | 2 +- > > > > > > > > arch/loongarch/kernel/syscall.c | 1 + > > > > > > > > 2 files changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > diff --git a/arch/loongarch/include/asm/syscall.h b/arch/loongarch/include/asm/syscall.h > > > > > > > > index e286dc58476e..2317d674b92a 100644 > > > > > > > > --- a/arch/loongarch/include/asm/syscall.h > > > > > > > > +++ b/arch/loongarch/include/asm/syscall.h > > > > > > > > @@ -23,7 +23,7 @@ extern void *sys_call_table[]; > > > > > > > > static inline long syscall_get_nr(struct task_struct *task, > > > > > > > > struct pt_regs *regs) > > > > > > > > { > > > > > > > > - return regs->regs[11]; > > > > > > > > + return task_thread_info(task)->syscall; > > > > > > > > } > > > > > > > > > > > > > > > > static inline void syscall_rollback(struct task_struct *task, > > > > > > > > diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c > > > > > > > > index b4c5acd7aa3b..2783e33cf276 100644 > > > > > > > > --- a/arch/loongarch/kernel/syscall.c > > > > > > > > +++ b/arch/loongarch/kernel/syscall.c > > > > > > > > @@ -52,6 +52,7 @@ void noinstr do_syscall(struct pt_regs *regs) > > > > > > > > regs->orig_a0 = regs->regs[4]; > > > > > > > > regs->regs[4] = -ENOSYS; > > > > > > > > > > > > > > > > + task_thread_info(current)->syscall = nr; > > > > > > > > nr = syscall_enter_from_user_mode(regs, nr); > > > > > > > > > > > > > 
> > > if (nr < NR_syscalls) { > > > > > > > > -- > > > > > > > > 2.42.0 > > > > > > > > > > > >
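Editorial sketch, not part of the thread: the fallback proposed in the quoted diff (prefer a cached number in thread_info, otherwise read regs[11]) can be modelled in plain userspace C. The `*_model` structs, the `model_*` helpers and the value 221 for execve are assumptions made for illustration only; this is not kernel code and does not call any kernel API.

```c
#include <assert.h>
#include <stddef.h>

#define NR_EXECVE 221L  /* assumed: execve's number in the generic syscall table */

/* Userspace stand-ins for the kernel structures; not the real definitions. */
struct pt_regs_model { unsigned long regs[32]; };
struct thread_info_model { long syscall; };  /* 0 means "no cached number" */

/* Mirrors the proposed syscall_get_nr(): prefer the cached copy, else regs[11]. */
static long model_syscall_get_nr(const struct thread_info_model *ti,
                                 const struct pt_regs_model *regs)
{
    return ti->syscall ? ti->syscall : (long)regs->regs[11];
}

/* Walks one execve-like syscall and returns the number seen on the exit path. */
static long model_execve_exit_nr(int use_cache)
{
    struct pt_regs_model regs = { .regs = { [11] = NR_EXECVE } };
    struct thread_info_model ti = { .syscall = 0 };

    if (use_cache)
        ti.syscall = (long)regs.regs[11];  /* cached right after entry work */

    regs.regs[11] = 0;  /* models ELF_PLAT_INIT clearing the register during exec */

    return model_syscall_get_nr(&ti, &regs);
}
```

Calling `model_execve_exit_nr(1)` yields the original number, while `model_execve_exit_nr(0)` yields 0. The model also makes one property of the proposal visible: because 0 doubles as "unset", a syscall whose number is 0 always takes the regs[11] fallback path.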
* Re: [PATCH] LoongArch: Store syscall nr in thread_info 2023-11-23 14:39 ` Hengqi Chen @ 2023-12-03 3:17 ` Huacai Chen 2023-12-04 1:55 ` Hengqi Chen 0 siblings, 1 reply; 10+ messages in thread From: Huacai Chen @ 2023-12-03 3:17 UTC (permalink / raw) To: Hengqi Chen; +Cc: Arnd Bergmann, linux-arch, loongarch, kernel Hi, Hengqi, On Thu, Nov 23, 2023 at 10:39 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > On Thu, Nov 23, 2023 at 4:25 PM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > On Thu, Nov 23, 2023 at 4:09 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > On Thu, Nov 23, 2023 at 2:13 PM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > Hi, Hengqi, > > > > > > > > On Thu, Nov 23, 2023 at 1:49 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > > > > > On Wed, Nov 22, 2023 at 3:58 PM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > Hi, Hengqi, > > > > > > > > > > > > On Wed, Nov 22, 2023 at 3:34 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > > > > > > > > > Hi, Huacai, > > > > > > > > > > > > > > On Wed, Nov 22, 2023 at 2:32 PM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > Hi, Hengqi, > > > > > > > > > > > > > > > > On Wed, Nov 22, 2023 at 1:14 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > > > > > > > > > > > > > Currently, we store syscall number in pt_regs::regs[11] and it may be > > > > > > > > > changed during syscall execution. Take `execve` as an example: > > > > > > > > > > > > > > > > > > sys_execve > > > > > > > > > -> do_execve > > > > > > > > > -> do_execveat_common > > > > > > > > > -> bprm_execve > > > > > > > > > -> exec_binprm > > > > > > > > > -> search_binary_handler > > > > > > > > > -> load_elf_binary > > > > > > > > > -> ELF_PLAT_INIT > > > > > > > > > > > > > > > > > > ELF_PLAT_INIT reset regs[11] to 0, later in syscall_exit_to_user_mode > > > > > > > > > we get a wrong syscall nr. 
> > > > > > > > > > > > > > > > > > Known affected syscalls includes execve/execveat/rt_sigreturn. Tools > > > > > > > > > like execsnoop do not work properly because the sys_exit_* tracepoints > > > > > > > > > does not trigger at all. > > > > > > > > > > > > > > > > > > Let's store syscall nr in thread_info instead. > > > > > > > > Can we just modify ELF_PLAT_INIT and not clear regs[11]? > > > > > > > > > > > > > > > > > > > > > > I am uncertain about the side effects of changing ELF_PLAT_INIT. > > > > > > > From a completeness perspective, changing ELF_PLAT_INIT is suboptimal, > > > > > > > rt_sigreturn is affected in another code path, and there may be other > > > > > > > syscalls that I am unaware of. > > > > > > Save syscall number in thread_info has more side effects, because > > > > > > ptrace allows us to change the number during syscall, then we should > > > > > > keep consistency between syscall and regs[11]. > > > > > > > > > > > > > > > > How about the change below: > > > > > > > > > > diff --git a/arch/loongarch/include/asm/syscall.h > > > > > b/arch/loongarch/include/asm/syscall.h > > > > > index e286dc58476e..954ba53bcc9a 100644 > > > > > --- a/arch/loongarch/include/asm/syscall.h > > > > > +++ b/arch/loongarch/include/asm/syscall.h > > > > > @@ -23,7 +23,9 @@ extern void *sys_call_table[]; > > > > > static inline long syscall_get_nr(struct task_struct *task, > > > > > struct pt_regs *regs) > > > > > { > > > > > - return regs->regs[11]; > > > > > + long nr = task_thread_info(task)->syscall; > > > > > + > > > > > + return nr ? 
: regs->regs[11]; > > > > > } > > > > > > > > > > static inline void syscall_rollback(struct task_struct *task, > > > > > diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c > > > > > index b4c5acd7aa3b..553ab0d624cb 100644 > > > > > --- a/arch/loongarch/kernel/syscall.c > > > > > +++ b/arch/loongarch/kernel/syscall.c > > > > > @@ -53,6 +53,7 @@ void noinstr do_syscall(struct pt_regs *regs) > > > > > regs->regs[4] = -ENOSYS; > > > > > > > > > > nr = syscall_enter_from_user_mode(regs, nr); > > > > > + current_thread_info()->syscall = nr; > > > > > > > > > > if (nr < NR_syscalls) { > > > > > syscall_fn = sys_call_table[nr]; > > > > > @@ -61,4 +62,5 @@ void noinstr do_syscall(struct pt_regs *regs) > > > > > } > > > > > > > > > > syscall_exit_to_user_mode(regs); > > > > > + current_thread_info()->syscall = 0; > > > > > } > > > > > > > > > > * allow ptrace to change syscall nr > > > > > * sys_exit_* will also see the right syscall nr > > > > > * this works even if rt_sigreturn clobbers all pt_regs::regs > > > > No, I still prefer to modify ELF_PLAT_INIT, we can wait Arnd's comments. > > > > > > > > > > OK, I am not eager, anyway, we know the root cause. :) > > > > > > > And, do you mean modifying ELF_PLAT_INIT cannot solve the > > > > rt_sigreturn's problem? > > > > > > > > > > Right, see https://elixir.bootlin.com/linux/latest/source/arch/loongarch/kernel/signal.c#L807 > > Is this the expected behavior for rt_sigreturn()? Otherwise I think > > RISC-V has the same problem. And if we really need the 'correct' > > syscall number, we can overwrite regs[11] in sys_rt_sigreturn(). > > > > I checked x86 and arm64; both have the same issue, and it seems no one cares. Does the rt_sigreturn issue affect execsnoop? Huacai > > > And another question: do you have any updates about the BPF system > > hang problem? :) > > > > Let's discuss this on a new thread. 
> > > Huacai > > > > > > > > > Huacai > > > > > > > > > > > > > > > And about ELF_PLAT_INIT, maybe Arnd can give us some more information. > > > > > > > > > > > > Hi, Arnd, > > > > > > > > > > > > I found some new architectures, such as ARM64 and RISC-V, just do > > > > > > nearly nothing in ELF_PLAT_INIT, while some old architectures, such as > > > > > > x86 and MIPS, clear most of the registers, do you know why? > > > > > > > > > > > > Huacai > > > > > > > > > > > > > > > > > > > > > Huacai > > > > > > > > > > > > > > > > > > > > > > > > > > Fixes: be769645a2aef ("LoongArch: Add system call support") > > > > > > > > > Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> > > > > > > > > > --- > > > > > > > > > arch/loongarch/include/asm/syscall.h | 2 +- > > > > > > > > > arch/loongarch/kernel/syscall.c | 1 + > > > > > > > > > 2 files changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > diff --git a/arch/loongarch/include/asm/syscall.h b/arch/loongarch/include/asm/syscall.h > > > > > > > > > index e286dc58476e..2317d674b92a 100644 > > > > > > > > > --- a/arch/loongarch/include/asm/syscall.h > > > > > > > > > +++ b/arch/loongarch/include/asm/syscall.h > > > > > > > > > @@ -23,7 +23,7 @@ extern void *sys_call_table[]; > > > > > > > > > static inline long syscall_get_nr(struct task_struct *task, > > > > > > > > > struct pt_regs *regs) > > > > > > > > > { > > > > > > > > > - return regs->regs[11]; > > > > > > > > > + return task_thread_info(task)->syscall; > > > > > > > > > } > > > > > > > > > > > > > > > > > > static inline void syscall_rollback(struct task_struct *task, > > > > > > > > > diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c > > > > > > > > > index b4c5acd7aa3b..2783e33cf276 100644 > > > > > > > > > --- a/arch/loongarch/kernel/syscall.c > > > > > > > > > +++ b/arch/loongarch/kernel/syscall.c > > > > > > > > > @@ -52,6 +52,7 @@ void noinstr do_syscall(struct pt_regs *regs) > > > > > > > > > regs->orig_a0 = 
regs->regs[4]; > > > > > > > > > regs->regs[4] = -ENOSYS; > > > > > > > > > > > > > > > > > > + task_thread_info(current)->syscall = nr; > > > > > > > > > nr = syscall_enter_from_user_mode(regs, nr); > > > > > > > > > > > > > > > > > > if (nr < NR_syscalls) { > > > > > > > > > -- > > > > > > > > > 2.42.0 > > > > > > > > > > > > > > >
* Re: [PATCH] LoongArch: Store syscall nr in thread_info 2023-12-03 3:17 ` Huacai Chen @ 2023-12-04 1:55 ` Hengqi Chen 2023-12-04 2:16 ` Huacai Chen 0 siblings, 1 reply; 10+ messages in thread From: Hengqi Chen @ 2023-12-04 1:55 UTC (permalink / raw) To: Huacai Chen; +Cc: Arnd Bergmann, linux-arch, loongarch, kernel On Sun, Dec 3, 2023 at 11:17 AM Huacai Chen <chenhuacai@kernel.org> wrote: > > Hi, Hengqi, > > On Thu, Nov 23, 2023 at 10:39 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > On Thu, Nov 23, 2023 at 4:25 PM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > On Thu, Nov 23, 2023 at 4:09 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > > > On Thu, Nov 23, 2023 at 2:13 PM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > Hi, Hengqi, > > > > > > > > > > On Thu, Nov 23, 2023 at 1:49 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > > > > > > > On Wed, Nov 22, 2023 at 3:58 PM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > Hi, Hengqi, > > > > > > > > > > > > > > On Wed, Nov 22, 2023 at 3:34 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > > > > > > > > > > > Hi, Huacai, > > > > > > > > > > > > > > > > On Wed, Nov 22, 2023 at 2:32 PM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > Hi, Hengqi, > > > > > > > > > > > > > > > > > > On Wed, Nov 22, 2023 at 1:14 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > Currently, we store syscall number in pt_regs::regs[11] and it may be > > > > > > > > > > changed during syscall execution. 
Take `execve` as an example: > > > > > > > > > > > > > > > > > > > > sys_execve > > > > > > > > > > -> do_execve > > > > > > > > > > -> do_execveat_common > > > > > > > > > > -> bprm_execve > > > > > > > > > > -> exec_binprm > > > > > > > > > > -> search_binary_handler > > > > > > > > > > -> load_elf_binary > > > > > > > > > > -> ELF_PLAT_INIT > > > > > > > > > > > > > > > > > > > > ELF_PLAT_INIT reset regs[11] to 0, later in syscall_exit_to_user_mode > > > > > > > > > > we get a wrong syscall nr. > > > > > > > > > > > > > > > > > > > > Known affected syscalls includes execve/execveat/rt_sigreturn. Tools > > > > > > > > > > like execsnoop do not work properly because the sys_exit_* tracepoints > > > > > > > > > > does not trigger at all. > > > > > > > > > > > > > > > > > > > > Let's store syscall nr in thread_info instead. > > > > > > > > > Can we just modify ELF_PLAT_INIT and not clear regs[11]? > > > > > > > > > > > > > > > > > > > > > > > > > I am uncertain about the side effects of changing ELF_PLAT_INIT. > > > > > > > > From a completeness perspective, changing ELF_PLAT_INIT is suboptimal, > > > > > > > > rt_sigreturn is affected in another code path, and there may be other > > > > > > > > syscalls that I am unaware of. > > > > > > > Save syscall number in thread_info has more side effects, because > > > > > > > ptrace allows us to change the number during syscall, then we should > > > > > > > keep consistency between syscall and regs[11]. 
> > > > > > > > > > > > > > > > > > > How about the change below: > > > > > > > > > > > > diff --git a/arch/loongarch/include/asm/syscall.h > > > > > > b/arch/loongarch/include/asm/syscall.h > > > > > > index e286dc58476e..954ba53bcc9a 100644 > > > > > > --- a/arch/loongarch/include/asm/syscall.h > > > > > > +++ b/arch/loongarch/include/asm/syscall.h > > > > > > @@ -23,7 +23,9 @@ extern void *sys_call_table[]; > > > > > > static inline long syscall_get_nr(struct task_struct *task, > > > > > > struct pt_regs *regs) > > > > > > { > > > > > > - return regs->regs[11]; > > > > > > + long nr = task_thread_info(task)->syscall; > > > > > > + > > > > > > + return nr ? : regs->regs[11]; > > > > > > } > > > > > > > > > > > > static inline void syscall_rollback(struct task_struct *task, > > > > > > diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c > > > > > > index b4c5acd7aa3b..553ab0d624cb 100644 > > > > > > --- a/arch/loongarch/kernel/syscall.c > > > > > > +++ b/arch/loongarch/kernel/syscall.c > > > > > > @@ -53,6 +53,7 @@ void noinstr do_syscall(struct pt_regs *regs) > > > > > > regs->regs[4] = -ENOSYS; > > > > > > > > > > > > nr = syscall_enter_from_user_mode(regs, nr); > > > > > > + current_thread_info()->syscall = nr; > > > > > > > > > > > > if (nr < NR_syscalls) { > > > > > > syscall_fn = sys_call_table[nr]; > > > > > > @@ -61,4 +62,5 @@ void noinstr do_syscall(struct pt_regs *regs) > > > > > > } > > > > > > > > > > > > syscall_exit_to_user_mode(regs); > > > > > > + current_thread_info()->syscall = 0; > > > > > > } > > > > > > > > > > > > * allow ptrace to change syscall nr > > > > > > * sys_exit_* will also see the right syscall nr > > > > > > * this works even if rt_sigreturn clobbers all pt_regs::regs > > > > > No, I still prefer to modify ELF_PLAT_INIT, we can wait Arnd's comments. > > > > > > > > > > > > > OK, I am not eager, anyway, we know the root cause. 
:) > > > > > > > > > And, do you mean modifying ELF_PLAT_INIT cannot solve the > > > > > rt_sigreturn's problem? > > > > > > > > > > > > > Right, see https://elixir.bootlin.com/linux/latest/source/arch/loongarch/kernel/signal.c#L807 > > > Is this the expected behavior for rt_sigreturn()? Otherwise I think > > > RISC-V has the same problem. And if we really need the 'correct' > > > syscall number, we can overwrite regs[11] in sys_rt_sigreturn(). > > > > > > > I checked x86 and arm64; both have the same issue, and it seems no one cares. > Does the rt_sigreturn issue affect execsnoop? > No, execsnoop relies only on the sys_enter_execve/sys_exit_execve tracepoints. > Huacai > > > > > > And another question: do you have any updates about the BPF system > > > hang problem? :) > > > > > > > Let's discuss this on a new thread. > > > > > Huacai > > > > > > > > > > > > Huacai > > > > > > > > > > > > > > > > > > And about ELF_PLAT_INIT, maybe Arnd can give us some more information. > > > > > > > > > > > > > > Hi, Arnd, > > > > > > > > > > > > > > I found some new architectures, such as ARM64 and RISC-V, just do > > > > > > > nearly nothing in ELF_PLAT_INIT, while some old architectures, such as > > > > > > > x86 and MIPS, clear most of the registers, do you know why? 
> > > > > > > > > > > > > > Huacai > > > > > > > > > > > > > > > > > > > > > > > > Huacai > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fixes: be769645a2aef ("LoongArch: Add system call support") > > > > > > > > > > Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> > > > > > > > > > > --- > > > > > > > > > > arch/loongarch/include/asm/syscall.h | 2 +- > > > > > > > > > > arch/loongarch/kernel/syscall.c | 1 + > > > > > > > > > > 2 files changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > > > diff --git a/arch/loongarch/include/asm/syscall.h b/arch/loongarch/include/asm/syscall.h > > > > > > > > > > index e286dc58476e..2317d674b92a 100644 > > > > > > > > > > --- a/arch/loongarch/include/asm/syscall.h > > > > > > > > > > +++ b/arch/loongarch/include/asm/syscall.h > > > > > > > > > > @@ -23,7 +23,7 @@ extern void *sys_call_table[]; > > > > > > > > > > static inline long syscall_get_nr(struct task_struct *task, > > > > > > > > > > struct pt_regs *regs) > > > > > > > > > > { > > > > > > > > > > - return regs->regs[11]; > > > > > > > > > > + return task_thread_info(task)->syscall; > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > static inline void syscall_rollback(struct task_struct *task, > > > > > > > > > > diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c > > > > > > > > > > index b4c5acd7aa3b..2783e33cf276 100644 > > > > > > > > > > --- a/arch/loongarch/kernel/syscall.c > > > > > > > > > > +++ b/arch/loongarch/kernel/syscall.c > > > > > > > > > > @@ -52,6 +52,7 @@ void noinstr do_syscall(struct pt_regs *regs) > > > > > > > > > > regs->orig_a0 = regs->regs[4]; > > > > > > > > > > regs->regs[4] = -ENOSYS; > > > > > > > > > > > > > > > > > > > > + task_thread_info(current)->syscall = nr; > > > > > > > > > > nr = syscall_enter_from_user_mode(regs, nr); > > > > > > > > > > > > > > > > > > > > if (nr < NR_syscalls) { > > > > > > > > > > -- > > > > > > > > > > 2.42.0 > > > > > > > > > > > > > > > 
> > >
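Editorial sketch, not part of the thread: a toy model of why a lost syscall number matters to a tool that pairs an entry event with an exit event for the same syscall. Exit hooks here are simply counters indexed by syscall number; `exit_hits`, `MODEL_NR_MAX`, the `model_*` helpers and the value 221 for execve are all assumptions for illustration, and nothing below reproduces the kernel's actual tracing code.

```c
#include <assert.h>

#define NR_EXECVE    221   /* assumed: execve's number in the generic syscall table */
#define MODEL_NR_MAX 512   /* hypothetical table size, for this model only */

/* One counter per syscall number, standing in for per-syscall exit hooks. */
static int exit_hits[MODEL_NR_MAX];

/* Exit-side dispatch: the hook that fires is chosen by the number read now. */
static void model_trace_sys_exit(long nr_seen_at_exit)
{
    if (nr_seen_at_exit >= 0 && nr_seen_at_exit < MODEL_NR_MAX)
        exit_hits[nr_seen_at_exit]++;
}

/* Returns how many times the execve exit hook fired for one execve call. */
static int model_execve_exit_hits(int a7_cleared_during_exec)
{
    long a7 = NR_EXECVE;      /* regs[11] holds the number at syscall entry */

    exit_hits[NR_EXECVE] = 0;
    if (a7_cleared_during_exec)
        a7 = 0;               /* number lost before the exit path runs */
    model_trace_sys_exit(a7);
    return exit_hits[NR_EXECVE];
}
```

In this model the execve exit hook fires once when the number survives and never when it is cleared, which is the shape of the symptom described for the sys_exit_execve side.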
* Re: [PATCH] LoongArch: Store syscall nr in thread_info 2023-12-04 1:55 ` Hengqi Chen @ 2023-12-04 2:16 ` Huacai Chen 2023-12-04 5:39 ` Hengqi Chen 0 siblings, 1 reply; 10+ messages in thread From: Huacai Chen @ 2023-12-04 2:16 UTC (permalink / raw) To: Hengqi Chen; +Cc: Arnd Bergmann, linux-arch, loongarch, kernel Hi, Hengqi, On Mon, Dec 4, 2023 at 9:56 AM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > On Sun, Dec 3, 2023 at 11:17 AM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > Hi, Hengqi, > > > > On Thu, Nov 23, 2023 at 10:39 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > On Thu, Nov 23, 2023 at 4:25 PM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > On Thu, Nov 23, 2023 at 4:09 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > > > > > On Thu, Nov 23, 2023 at 2:13 PM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > Hi, Hengqi, > > > > > > > > > > > > On Thu, Nov 23, 2023 at 1:49 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > > > > > > > > > On Wed, Nov 22, 2023 at 3:58 PM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > Hi, Hengqi, > > > > > > > > > > > > > > > > On Wed, Nov 22, 2023 at 3:34 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > > > > > > > > > > > > > Hi, Huacai, > > > > > > > > > > > > > > > > > > On Wed, Nov 22, 2023 at 2:32 PM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > Hi, Hengqi, > > > > > > > > > > > > > > > > > > > > On Wed, Nov 22, 2023 at 1:14 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > Currently, we store syscall number in pt_regs::regs[11] and it may be > > > > > > > > > > > changed during syscall execution. 
Take `execve` as an example: > > > > > > > > > > > > > > > > > > > > > > sys_execve > > > > > > > > > > > -> do_execve > > > > > > > > > > > -> do_execveat_common > > > > > > > > > > > -> bprm_execve > > > > > > > > > > > -> exec_binprm > > > > > > > > > > > -> search_binary_handler > > > > > > > > > > > -> load_elf_binary > > > > > > > > > > > -> ELF_PLAT_INIT > > > > > > > > > > > > > > > > > > > > > > ELF_PLAT_INIT reset regs[11] to 0, later in syscall_exit_to_user_mode > > > > > > > > > > > we get a wrong syscall nr. > > > > > > > > > > > > > > > > > > > > > > Known affected syscalls includes execve/execveat/rt_sigreturn. Tools > > > > > > > > > > > like execsnoop do not work properly because the sys_exit_* tracepoints > > > > > > > > > > > does not trigger at all. > > > > > > > > > > > > > > > > > > > > > > Let's store syscall nr in thread_info instead. > > > > > > > > > > Can we just modify ELF_PLAT_INIT and not clear regs[11]? > > > > > > > > > > > > > > > > > > > > > > > > > > > > I am uncertain about the side effects of changing ELF_PLAT_INIT. > > > > > > > > > From a completeness perspective, changing ELF_PLAT_INIT is suboptimal, > > > > > > > > > rt_sigreturn is affected in another code path, and there may be other > > > > > > > > > syscalls that I am unaware of. > > > > > > > > Save syscall number in thread_info has more side effects, because > > > > > > > > ptrace allows us to change the number during syscall, then we should > > > > > > > > keep consistency between syscall and regs[11]. 
> > > > > > > > > > > > > > > > > > > > > > How about the change below: > > > > > > > > > > > > > > diff --git a/arch/loongarch/include/asm/syscall.h > > > > > > > b/arch/loongarch/include/asm/syscall.h > > > > > > > index e286dc58476e..954ba53bcc9a 100644 > > > > > > > --- a/arch/loongarch/include/asm/syscall.h > > > > > > > +++ b/arch/loongarch/include/asm/syscall.h > > > > > > > @@ -23,7 +23,9 @@ extern void *sys_call_table[]; > > > > > > > static inline long syscall_get_nr(struct task_struct *task, > > > > > > > struct pt_regs *regs) > > > > > > > { > > > > > > > - return regs->regs[11]; > > > > > > > + long nr = task_thread_info(task)->syscall; > > > > > > > + > > > > > > > + return nr ? : regs->regs[11]; > > > > > > > } > > > > > > > > > > > > > > static inline void syscall_rollback(struct task_struct *task, > > > > > > > diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c > > > > > > > index b4c5acd7aa3b..553ab0d624cb 100644 > > > > > > > --- a/arch/loongarch/kernel/syscall.c > > > > > > > +++ b/arch/loongarch/kernel/syscall.c > > > > > > > @@ -53,6 +53,7 @@ void noinstr do_syscall(struct pt_regs *regs) > > > > > > > regs->regs[4] = -ENOSYS; > > > > > > > > > > > > > > nr = syscall_enter_from_user_mode(regs, nr); > > > > > > > + current_thread_info()->syscall = nr; > > > > > > > > > > > > > > if (nr < NR_syscalls) { > > > > > > > syscall_fn = sys_call_table[nr]; > > > > > > > @@ -61,4 +62,5 @@ void noinstr do_syscall(struct pt_regs *regs) > > > > > > > } > > > > > > > > > > > > > > syscall_exit_to_user_mode(regs); > > > > > > > + current_thread_info()->syscall = 0; > > > > > > > } > > > > > > > > > > > > > > * allow ptrace to change syscall nr > > > > > > > * sys_exit_* will also see the right syscall nr > > > > > > > * this works even if rt_sigreturn clobbers all pt_regs::regs > > > > > > No, I still prefer to modify ELF_PLAT_INIT, we can wait Arnd's comments. 
> > > > > > > > > > > > > > > > OK, I am not eager, anyway, we know the root cause. :) > > > > > > > > > > > And, do you mean modifying ELF_PLAT_INIT cannot solve the > > > > > > rt_sigreturn's problem? > > > > > > > > > > > > > > > > Right, see https://elixir.bootlin.com/linux/latest/source/arch/loongarch/kernel/signal.c#L807 > > > > Is this the expected behavior for rt_sigreturn()? Otherwise I think > > > > RISC-V has the same problem. And if we really need the 'correct' > > > > syscall number, we can overwrite regs[11] in sys_rt_sigreturn(). > > > > > > > > > > I checked x86 and arm64; both have the same issue, and it seems no one cares. > > Does the rt_sigreturn issue affect execsnoop? > > > > No, execsnoop relies only on the sys_enter_execve/sys_exit_execve tracepoints. Then I suggest the one-line patch below. Although Arnd has not responded yet, I believe it has no side effects (because ARM64 and RISC-V clear nothing here): diff --git a/arch/loongarch/include/asm/elf.h b/arch/loongarch/include/asm/elf.h index b9a4ab54285c..45a2a2f7a27f 100644 --- a/arch/loongarch/include/asm/elf.h +++ b/arch/loongarch/include/asm/elf.h @@ -293,7 +293,7 @@ extern const char *__elf_platform; #define ELF_PLAT_INIT(_r, load_addr) do { \ _r->regs[1] = _r->regs[2] = _r->regs[3] = _r->regs[4] = 0; \ _r->regs[5] = _r->regs[6] = _r->regs[7] = _r->regs[8] = 0; \ - _r->regs[9] = _r->regs[10] = _r->regs[11] = _r->regs[12] = 0; \ + _r->regs[9] = _r->regs[10] /* syscall number */ = _r->regs[12] = 0; \ _r->regs[13] = _r->regs[14] = _r->regs[15] = _r->regs[16] = 0; \ _r->regs[17] = _r->regs[18] = _r->regs[19] = _r->regs[20] = 0; \ _r->regs[21] = _r->regs[22] = _r->regs[23] = _r->regs[24] = 0; \ > > > Huacai > > > > > > > > > And another question: do you have any updates about the BPF system > > > > hang problem? :) > > > > > > > > > > Let's discuss this on a new thread. 
> > > > > > > Huacai > > > > > > > > > > > > > > > Huacai > > > > > > > > > > > > > > > > > > > > > And about ELF_PLAT_INIT, maybe Arnd can give us some more information. > > > > > > > > > > > > > > > > Hi, Arnd, > > > > > > > > > > > > > > > > I found some new architectures, such as ARM64 and RISC-V, just do > > > > > > > > nearly nothing in ELF_PLAT_INIT, while some old architectures, such as > > > > > > > > x86 and MIPS, clear most of the registers, do you know why? > > > > > > > > > > > > > > > > Huacai > > > > > > > > > > > > > > > > > > > > > > > > > > > Huacai > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Fixes: be769645a2aef ("LoongArch: Add system call support") > > > > > > > > > > > Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> > > > > > > > > > > > --- > > > > > > > > > > > arch/loongarch/include/asm/syscall.h | 2 +- > > > > > > > > > > > arch/loongarch/kernel/syscall.c | 1 + > > > > > > > > > > > 2 files changed, 2 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > > > > > diff --git a/arch/loongarch/include/asm/syscall.h b/arch/loongarch/include/asm/syscall.h > > > > > > > > > > > index e286dc58476e..2317d674b92a 100644 > > > > > > > > > > > --- a/arch/loongarch/include/asm/syscall.h > > > > > > > > > > > +++ b/arch/loongarch/include/asm/syscall.h > > > > > > > > > > > @@ -23,7 +23,7 @@ extern void *sys_call_table[]; > > > > > > > > > > > static inline long syscall_get_nr(struct task_struct *task, > > > > > > > > > > > struct pt_regs *regs) > > > > > > > > > > > { > > > > > > > > > > > - return regs->regs[11]; > > > > > > > > > > > + return task_thread_info(task)->syscall; > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > static inline void syscall_rollback(struct task_struct *task, > > > > > > > > > > > diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c > > > > > > > > > > > index b4c5acd7aa3b..2783e33cf276 100644 > > > > > > > > > > > --- 
a/arch/loongarch/kernel/syscall.c > > > > > > > > > > > +++ b/arch/loongarch/kernel/syscall.c > > > > > > > > > > > @@ -52,6 +52,7 @@ void noinstr do_syscall(struct pt_regs *regs) > > > > > > > > > > > regs->orig_a0 = regs->regs[4]; > > > > > > > > > > > regs->regs[4] = -ENOSYS; > > > > > > > > > > > > > > > > > > > > > > + task_thread_info(current)->syscall = nr; > > > > > > > > > > > nr = syscall_enter_from_user_mode(regs, nr); > > > > > > > > > > > > > > > > > > > > > > if (nr < NR_syscalls) { > > > > > > > > > > > -- > > > > > > > > > > > 2.42.0 > > > > > > > > > > > > > > > > > > > > >
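Editorial sketch, not part of the thread: a userspace model of the one-line ELF_PLAT_INIT change suggested above, showing a chained assignment that skips regs[11] while still clearing its neighbours. `struct pt_regs_model`, `MODEL_ELF_PLAT_INIT` and the `model_*` helpers are hypothetical names; only the first three rows of the macro are modelled, and the real macro takes a second argument that is omitted here.

```c
#include <assert.h>
#include <string.h>

/* Userspace stand-in for the kernel's register frame; not the real type. */
struct pt_regs_model { unsigned long regs[32]; };

/*
 * Same shape as the first three rows of the macro in the patch, with
 * regs[11] left out of the chained assignment. The comment sits between
 * an operand and the next '=', so the chain stays a valid C expression.
 */
#define MODEL_ELF_PLAT_INIT(_r) do {                                          \
	(_r)->regs[1] = (_r)->regs[2] = (_r)->regs[3] = (_r)->regs[4] = 0;    \
	(_r)->regs[5] = (_r)->regs[6] = (_r)->regs[7] = (_r)->regs[8] = 0;    \
	(_r)->regs[9] = (_r)->regs[10] /* regs[11] kept */ = (_r)->regs[12] = 0; \
} while (0)

/* Returns regs[11] after the model macro has run on a dirty frame. */
static unsigned long model_nr_after_exec(unsigned long nr)
{
	struct pt_regs_model regs;

	memset(&regs, 0xff, sizeof(regs));  /* start from non-zero garbage */
	regs.regs[11] = nr;
	MODEL_ELF_PLAT_INIT(&regs);
	return regs.regs[11];
}

/* Returns 1 if regs[1..10] and regs[12] were all cleared, else 0. */
static int model_neighbours_cleared(void)
{
	struct pt_regs_model regs;
	int i;

	memset(&regs, 0xff, sizeof(regs));
	MODEL_ELF_PLAT_INIT(&regs);
	for (i = 1; i <= 12; i++)
		if (i != 11 && regs.regs[i] != 0)
			return 0;
	return 1;
}
```

The placement of the comment is the point of the sketch: writing `= /* ... */ =` leaves two consecutive assignment operators once the comment is removed by the preprocessor, which does not compile, whereas placing the comment after an operand keeps the chain intact.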
* Re: [PATCH] LoongArch: Store syscall nr in thread_info 2023-12-04 2:16 ` Huacai Chen @ 2023-12-04 5:39 ` Hengqi Chen 0 siblings, 0 replies; 10+ messages in thread From: Hengqi Chen @ 2023-12-04 5:39 UTC (permalink / raw) To: Huacai Chen; +Cc: Arnd Bergmann, linux-arch, loongarch, kernel Hi, Huacai, Thanks for the suggestion; I have just sent the fix. Cheers, --- Hengqi On Mon, Dec 4, 2023 at 10:17 AM Huacai Chen <chenhuacai@kernel.org> wrote: > > Hi, Hengqi, > > On Mon, Dec 4, 2023 at 9:56 AM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > On Sun, Dec 3, 2023 at 11:17 AM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > Hi, Hengqi, > > > > > > On Thu, Nov 23, 2023 at 10:39 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > > > On Thu, Nov 23, 2023 at 4:25 PM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > On Thu, Nov 23, 2023 at 4:09 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > > > > > > > On Thu, Nov 23, 2023 at 2:13 PM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > Hi, Hengqi, > > > > > > > > > > > > > > On Thu, Nov 23, 2023 at 1:49 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > > > > > > > > > > > On Wed, Nov 22, 2023 at 3:58 PM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > Hi, Hengqi, > > > > > > > > > > > > > > > > > > On Wed, Nov 22, 2023 at 3:34 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > Hi, Huacai, > > > > > > > > > > > > > > > > > > > > On Wed, Nov 22, 2023 at 2:32 PM Huacai Chen <chenhuacai@kernel.org> wrote: > > > > > > > > > > > > > > > > > > > > > > Hi, Hengqi, > > > > > > > > > > > > > > > > > > > > > > On Wed, Nov 22, 2023 at 1:14 PM Hengqi Chen <hengqi.chen@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > Currently, we store syscall number in pt_regs::regs[11] and it may be > > > > > > > > > > > > changed during syscall execution. 
Take `execve` as an example:
> > > > > > > > > > > >
> > > > > > > > > > > >     sys_execve
> > > > > > > > > > > >     -> do_execve
> > > > > > > > > > > >     -> do_execveat_common
> > > > > > > > > > > >     -> bprm_execve
> > > > > > > > > > > >     -> exec_binprm
> > > > > > > > > > > >     -> search_binary_handler
> > > > > > > > > > > >     -> load_elf_binary
> > > > > > > > > > > >     -> ELF_PLAT_INIT
> > > > > > > > > > > >
> > > > > > > > > > > > ELF_PLAT_INIT reset regs[11] to 0, later in syscall_exit_to_user_mode
> > > > > > > > > > > > we get a wrong syscall nr.
> > > > > > > > > > > >
> > > > > > > > > > > > Known affected syscalls includes execve/execveat/rt_sigreturn. Tools
> > > > > > > > > > > > like execsnoop do not work properly because the sys_exit_* tracepoints
> > > > > > > > > > > > does not trigger at all.
> > > > > > > > > > > >
> > > > > > > > > > > > Let's store syscall nr in thread_info instead.
> > > > > > > > > > > Can we just modify ELF_PLAT_INIT and not clear regs[11]?
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I am uncertain about the side effects of changing ELF_PLAT_INIT.
> > > > > > > > > > From a completeness perspective, changing ELF_PLAT_INIT is suboptimal,
> > > > > > > > > > rt_sigreturn is affected in another code path, and there may be other
> > > > > > > > > > syscalls that I am unaware of.
> > > > > > > > > Save syscall number in thread_info has more side effects, because
> > > > > > > > > ptrace allows us to change the number during syscall, then we should
> > > > > > > > > keep consistency between syscall and regs[11].
> > > > > > > > >
> > > > > > > >
> > > > > > > > How about the change below:
> > > > > > > >
> > > > > > > > diff --git a/arch/loongarch/include/asm/syscall.h
> > > > > > > > b/arch/loongarch/include/asm/syscall.h
> > > > > > > > index e286dc58476e..954ba53bcc9a 100644
> > > > > > > > --- a/arch/loongarch/include/asm/syscall.h
> > > > > > > > +++ b/arch/loongarch/include/asm/syscall.h
> > > > > > > > @@ -23,7 +23,9 @@ extern void *sys_call_table[];
> > > > > > > >  static inline long syscall_get_nr(struct task_struct *task,
> > > > > > > >                                    struct pt_regs *regs)
> > > > > > > >  {
> > > > > > > > -        return regs->regs[11];
> > > > > > > > +        long nr = task_thread_info(task)->syscall;
> > > > > > > > +
> > > > > > > > +        return nr ? : regs->regs[11];
> > > > > > > >  }
> > > > > > > >
> > > > > > > >  static inline void syscall_rollback(struct task_struct *task,
> > > > > > > > diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c
> > > > > > > > index b4c5acd7aa3b..553ab0d624cb 100644
> > > > > > > > --- a/arch/loongarch/kernel/syscall.c
> > > > > > > > +++ b/arch/loongarch/kernel/syscall.c
> > > > > > > > @@ -53,6 +53,7 @@ void noinstr do_syscall(struct pt_regs *regs)
> > > > > > > >          regs->regs[4] = -ENOSYS;
> > > > > > > >
> > > > > > > >          nr = syscall_enter_from_user_mode(regs, nr);
> > > > > > > > +        current_thread_info()->syscall = nr;
> > > > > > > >
> > > > > > > >          if (nr < NR_syscalls) {
> > > > > > > >                  syscall_fn = sys_call_table[nr];
> > > > > > > > @@ -61,4 +62,5 @@ void noinstr do_syscall(struct pt_regs *regs)
> > > > > > > >          }
> > > > > > > >
> > > > > > > >          syscall_exit_to_user_mode(regs);
> > > > > > > > +        current_thread_info()->syscall = 0;
> > > > > > > >  }
> > > > > > > >
> > > > > > > > * allow ptrace to change syscall nr
> > > > > > > > * sys_exit_* will also see the right syscall nr
> > > > > > > > * this works even if rt_sigreturn clobbers all pt_regs::regs
> > > > > > > No, I still prefer to modify ELF_PLAT_INIT, we can wait
> > > > > > > Arnd's comments.
> > > > > > >
> > > > > >
> > > > > > OK, I am not eager, anyway, we know the root cause. :)
> > > > > >
> > > > > > > And, do you mean modifying ELF_PLAT_INIT cannot solve the
> > > > > > > rt_sigreturn's problem?
> > > > > > >
> > > > > >
> > > > > > Right, see https://elixir.bootlin.com/linux/latest/source/arch/loongarch/kernel/signal.c#L807
> > > > > Is this the expected behavior for rt_sigreturn()? Otherwise I think
> > > > > RISC-V has the same problem. And if we really need the 'correct'
> > > > > syscall number, we can overwrite regs[11] in sys_rt_sigreturn().
> > > > >
> > > >
> > > > I check with x86 and arm64, both have the same issue, seems no one care.
> > > Does the rt_sigreturn issue affect execsnoop?
> > >
> >
> > No, execsnoop only relies on sys_enter_execve/sys_exit_execve.
> Then I suggest the below one line patch, though Arnd still no
> response, I believe it has no side effect (because ARM64 and RISC-V
> clear nothing here):
>
> diff --git a/arch/loongarch/include/asm/elf.h b/arch/loongarch/include/asm/elf.h
> index b9a4ab54285c..45a2a2f7a27f 100644
> --- a/arch/loongarch/include/asm/elf.h
> +++ b/arch/loongarch/include/asm/elf.h
> @@ -293,7 +293,7 @@ extern const char *__elf_platform;
>  #define ELF_PLAT_INIT(_r, load_addr) do { \
>          _r->regs[1] = _r->regs[2] = _r->regs[3] = _r->regs[4] = 0; \
>          _r->regs[5] = _r->regs[6] = _r->regs[7] = _r->regs[8] = 0; \
> -        _r->regs[9] = _r->regs[10] = _r->regs[11] = _r->regs[12] = 0; \
> +        _r->regs[9] = _r->regs[10] /* syscall */ = _r->regs[12] = 0; \
>          _r->regs[13] = _r->regs[14] = _r->regs[15] = _r->regs[16] = 0; \
>          _r->regs[17] = _r->regs[18] = _r->regs[19] = _r->regs[20] = 0; \
>          _r->regs[21] = _r->regs[22] = _r->regs[23] = _r->regs[24] = 0; \
>
> >
> > > Huacai
> > >
> > > >
> > > > > And another question: do you have any updates about the BPF system
> > > > > hang problem? :)
> > > > >
> > > >
> > > > Let's discuss this on a new thread.
> > > >
> > > > > Huacai
> > > > >
> > > > > >
> > > > > > > Huacai
> > > > > > >
> > > > > > > >
> > > > > > > > > And about ELF_PLAT_INIT, maybe Arnd can give us some more information.
> > > > > > > > >
> > > > > > > > > Hi, Arnd,
> > > > > > > > >
> > > > > > > > > I found some new architectures, such as ARM64 and RISC-V, just do
> > > > > > > > > nearly nothing in ELF_PLAT_INIT, while some old architectures, such as
> > > > > > > > > x86 and MIPS, clear most of the registers, do you know why?
> > > > > > > > >
> > > > > > > > > Huacai
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > Huacai
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Fixes: be769645a2aef ("LoongArch: Add system call support")
> > > > > > > > > > > > Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
> > > > > > > > > > > > ---
> > > > > > > > > > > >  arch/loongarch/include/asm/syscall.h | 2 +-
> > > > > > > > > > > >  arch/loongarch/kernel/syscall.c      | 1 +
> > > > > > > > > > > >  2 files changed, 2 insertions(+), 1 deletion(-)
> > > > > > > > > > > >
> > > > > > > > > > > > diff --git a/arch/loongarch/include/asm/syscall.h b/arch/loongarch/include/asm/syscall.h
> > > > > > > > > > > > index e286dc58476e..2317d674b92a 100644
> > > > > > > > > > > > --- a/arch/loongarch/include/asm/syscall.h
> > > > > > > > > > > > +++ b/arch/loongarch/include/asm/syscall.h
> > > > > > > > > > > > @@ -23,7 +23,7 @@ extern void *sys_call_table[];
> > > > > > > > > > > >  static inline long syscall_get_nr(struct task_struct *task,
> > > > > > > > > > > >                                    struct pt_regs *regs)
> > > > > > > > > > > >  {
> > > > > > > > > > > > -        return regs->regs[11];
> > > > > > > > > > > > +        return task_thread_info(task)->syscall;
> > > > > > > > > > > >  }
> > > > > > > > > > > >
> > > > > > > > > > > >  static inline void syscall_rollback(struct task_struct *task,
> > > > > > > > > > > > diff --git a/arch/loongarch/kernel/syscall.c b/arch/loongarch/kernel/syscall.c
> > > > > > > > > > > > index b4c5acd7aa3b..2783e33cf276 100644
> > > > > > > > > > > > --- a/arch/loongarch/kernel/syscall.c
> > > > > > > > > > > > +++ b/arch/loongarch/kernel/syscall.c
> > > > > > > > > > > > @@ -52,6 +52,7 @@ void noinstr do_syscall(struct pt_regs *regs)
> > > > > > > > > > > >          regs->orig_a0 = regs->regs[4];
> > > > > > > > > > > >          regs->regs[4] = -ENOSYS;
> > > > > > > > > > > >
> > > > > > > > > > > > +        task_thread_info(current)->syscall = nr;
> > > > > > > > > > > >          nr = syscall_enter_from_user_mode(regs, nr);
> > > > > > > > > > > >
> > > > > > > > > > > >          if (nr < NR_syscalls) {
> > > > > > > > > > > > --
> > > > > > > > > > > > 2.42.0
> > > > > > > > > > > >
> > > > > > > > > > > >

^ permalink raw reply [flat|nested] 10+ messages in thread
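
The execve case discussed above can be modeled in user space. What follows is a minimal sketch under stated assumptions, not kernel code: every `toy_*` / `TOY_*` name is hypothetical, only the register slots 1..24 that appear in the quoted ELF_PLAT_INIT hunk are modeled, and 221 is assumed to be the execve number from the generic syscall table. It compares what a regs[11]-based syscall_get_nr() would read at syscall exit when the ELF_PLAT_INIT-like step clears every slot and when it skips slot 11.

```c
#include <stddef.h>

#define TOY_NR_REG        11   /* regs[11] (a7) carries the syscall number */
#define TOY_LAST_CLEARED  24   /* highest slot visible in the quoted hunk */
#define TOY_NR_EXECVE     221  /* assumption: execve in the generic table */

/* Toy stand-in for pt_regs: only the general-purpose slots are modeled. */
struct toy_pt_regs {
	unsigned long regs[32];
};

/* Models the current behavior: every slot, including regs[11], is cleared. */
static void toy_plat_init_clear_all(struct toy_pt_regs *r)
{
	for (size_t i = 1; i <= TOY_LAST_CLEARED; i++)
		r->regs[i] = 0;
}

/* Models the one-line change: same clearing, but regs[11] is left alone. */
static void toy_plat_init_keep_nr(struct toy_pt_regs *r)
{
	for (size_t i = 1; i <= TOY_LAST_CLEARED; i++) {
		if (i != TOY_NR_REG)
			r->regs[i] = 0;
	}
}

/* Models a syscall_get_nr() that reads the number back from regs[11]. */
static long toy_syscall_get_nr(const struct toy_pt_regs *r)
{
	return (long)r->regs[TOY_NR_REG];
}

/*
 * Walks the three steps of an execve-like syscall: the number is placed
 * in regs[11] at entry, plat_init runs in the middle (as ELF_PLAT_INIT
 * does under load_elf_binary), and the exit path reads the number back.
 */
static long toy_nr_seen_at_exit(void (*plat_init)(struct toy_pt_regs *),
				long nr)
{
	struct toy_pt_regs r = { .regs = { 0 } };

	r.regs[TOY_NR_REG] = (unsigned long)nr;
	plat_init(&r);
	return toy_syscall_get_nr(&r);
}
```

In this model, `toy_nr_seen_at_exit(toy_plat_init_clear_all, TOY_NR_EXECVE)` reads back 0, while the variant that skips slot 11 reads back the original number. rt_sigreturn restores registers through a different path and is not covered by this sketch.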
end of thread, other threads:[~2023-12-04 5:39 UTC | newest]

Thread overview: 10+ messages
     [not found] <20231121070209.210934-1-hengqi.chen@gmail.com>
     [not found] ` <CAAhV-H7SwSRDh8Ui2xVb1ncoaEQVd=dugphcBemkeaPNGQX2qw@mail.gmail.com>
     [not found]   ` <CAEyhmHRYghT5iFiLByUmC=AjdygiBWU8TH3joSyyWibu0Ki2xw@mail.gmail.com>
2023-11-22  7:58     ` [PATCH] LoongArch: Store syscall nr in thread_info Huacai Chen
2023-11-23  5:49       ` Hengqi Chen
2023-11-23  6:13         ` Huacai Chen
2023-11-23  8:08           ` Hengqi Chen
2023-11-23  8:25             ` Huacai Chen
2023-11-23 14:39               ` Hengqi Chen
2023-12-03  3:17                 ` Huacai Chen
2023-12-04  1:55                   ` Hengqi Chen
2023-12-04  2:16                     ` Huacai Chen
2023-12-04  5:39                       ` Hengqi Chen