* [PATCH 0/4] riscv: entry: further clean up and VMAP_STACK fix
@ 2022-09-25 17:53 ` Jisheng Zhang
0 siblings, 0 replies; 32+ messages in thread
From: Jisheng Zhang @ 2022-09-25 17:53 UTC (permalink / raw)
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Guo Ren,
Nathan Chancellor, Nick Desaulniers
Cc: linux-riscv, linux-kernel, llvm
I planed to do similar generic entry transaction as Guo Ren did[1], and
I have some commits in local. Since Guo has sent out the series, so I
dropped my version and just provide those in my local repo but missing
in Guo's series. However, this doesn't mean this series depends on
Guo's series, in fact except the first one, the remaining three patches
are independent on generic entry.
NOTE: patch4 can also clean up arch/riscv/kernel/mcount-dyn.S as well
but there's a trivial difference in the context saving, I dunno whether
is it better to clean up mcount-dyn.S too, any suggestions are welcome.
[1]https://lore.kernel.org/linux-riscv/20220918155246.1203293-1-guoren@kernel.org/T/#t
Jisheng Zhang (4):
riscv: remove extra level wrappers of trace_hardirqs_{on,off}
riscv: consolidate ret_from_kernel_thread into ret_from_fork
riscv: fix race when vmap stack overflow and remove shadow_stack
riscv: entry: consolidate general regs saving into save_gp
arch/riscv/include/asm/asm-prototypes.h | 1 -
arch/riscv/include/asm/thread_info.h | 4 +-
arch/riscv/kernel/Makefile | 2 -
arch/riscv/kernel/asm-offsets.c | 1 +
arch/riscv/kernel/entry.S | 150 +++++++-----------------
arch/riscv/kernel/process.c | 5 +-
arch/riscv/kernel/trace_irq.c | 27 -----
arch/riscv/kernel/trace_irq.h | 11 --
arch/riscv/kernel/traps.c | 15 +--
9 files changed, 47 insertions(+), 169 deletions(-)
delete mode 100644 arch/riscv/kernel/trace_irq.c
delete mode 100644 arch/riscv/kernel/trace_irq.h
--
2.34.1
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 32+ messages in thread* [PATCH 0/4] riscv: entry: further clean up and VMAP_STACK fix @ 2022-09-25 17:53 ` Jisheng Zhang 0 siblings, 0 replies; 32+ messages in thread From: Jisheng Zhang @ 2022-09-25 17:53 UTC (permalink / raw) To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Guo Ren, Nathan Chancellor, Nick Desaulniers Cc: linux-riscv, linux-kernel, llvm I planed to do similar generic entry transaction as Guo Ren did[1], and I have some commits in local. Since Guo has sent out the series, so I dropped my version and just provide those in my local repo but missing in Guo's series. However, this doesn't mean this series depends on Guo's series, in fact except the first one, the remaining three patches are independent on generic entry. NOTE: patch4 can also clean up arch/riscv/kernel/mcount-dyn.S as well but there's a trivial difference in the context saving, I dunno whether is it better to clean up mcount-dyn.S too, any suggestions are welcome. [1]https://lore.kernel.org/linux-riscv/20220918155246.1203293-1-guoren@kernel.org/T/#t Jisheng Zhang (4): riscv: remove extra level wrappers of trace_hardirqs_{on,off} riscv: consolidate ret_from_kernel_thread into ret_from_fork riscv: fix race when vmap stack overflow and remove shadow_stack riscv: entry: consolidate general regs saving into save_gp arch/riscv/include/asm/asm-prototypes.h | 1 - arch/riscv/include/asm/thread_info.h | 4 +- arch/riscv/kernel/Makefile | 2 - arch/riscv/kernel/asm-offsets.c | 1 + arch/riscv/kernel/entry.S | 150 +++++++----------------- arch/riscv/kernel/process.c | 5 +- arch/riscv/kernel/trace_irq.c | 27 ----- arch/riscv/kernel/trace_irq.h | 11 -- arch/riscv/kernel/traps.c | 15 +-- 9 files changed, 47 insertions(+), 169 deletions(-) delete mode 100644 arch/riscv/kernel/trace_irq.c delete mode 100644 arch/riscv/kernel/trace_irq.h -- 2.34.1 ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 1/4] riscv: remove extra level wrappers of trace_hardirqs_{on,off} 2022-09-25 17:53 ` Jisheng Zhang @ 2022-09-25 17:53 ` Jisheng Zhang -1 siblings, 0 replies; 32+ messages in thread From: Jisheng Zhang @ 2022-09-25 17:53 UTC (permalink / raw) To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Guo Ren, Nathan Chancellor, Nick Desaulniers Cc: linux-riscv, linux-kernel, llvm Since riscv is converted to generic entry, there's no need for the extra wrappers of trace_hardirqs_{on,off}. Tested with llvm + irqsoff. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> --- arch/riscv/kernel/Makefile | 2 -- arch/riscv/kernel/trace_irq.c | 27 --------------------------- arch/riscv/kernel/trace_irq.h | 11 ----------- 3 files changed, 40 deletions(-) delete mode 100644 arch/riscv/kernel/trace_irq.c delete mode 100644 arch/riscv/kernel/trace_irq.h diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile index 01da14e21019..11ee206cc235 100644 --- a/arch/riscv/kernel/Makefile +++ b/arch/riscv/kernel/Makefile @@ -69,8 +69,6 @@ obj-$(CONFIG_CPU_PM) += suspend_entry.o suspend.o obj-$(CONFIG_FUNCTION_TRACER) += mcount.o ftrace.o obj-$(CONFIG_DYNAMIC_FTRACE) += mcount-dyn.o -obj-$(CONFIG_TRACE_IRQFLAGS) += trace_irq.o - obj-$(CONFIG_PERF_EVENTS) += perf_callchain.o obj-$(CONFIG_HAVE_PERF_REGS) += perf_regs.o obj-$(CONFIG_RISCV_SBI) += sbi.o diff --git a/arch/riscv/kernel/trace_irq.c b/arch/riscv/kernel/trace_irq.c deleted file mode 100644 index 095ac976d7da..000000000000 --- a/arch/riscv/kernel/trace_irq.c +++ /dev/null @@ -1,27 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* - * Copyright (C) 2022 Changbin Du <changbin.du@gmail.com> - */ - -#include <linux/irqflags.h> -#include <linux/kprobes.h> -#include "trace_irq.h" - -/* - * trace_hardirqs_on/off require the caller to setup frame pointer properly. - * Otherwise, CALLER_ADDR1 might trigger an pagging exception in kernel. - * Here we add one extra level so they can be safely called by low - * level entry code which $fp is used for other purpose. - */ - -void __trace_hardirqs_on(void) -{ - trace_hardirqs_on(); -} -NOKPROBE_SYMBOL(__trace_hardirqs_on); - -void __trace_hardirqs_off(void) -{ - trace_hardirqs_off(); -} -NOKPROBE_SYMBOL(__trace_hardirqs_off); diff --git a/arch/riscv/kernel/trace_irq.h b/arch/riscv/kernel/trace_irq.h deleted file mode 100644 index 99fe67377e5e..000000000000 --- a/arch/riscv/kernel/trace_irq.h +++ /dev/null @@ -1,11 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* - * Copyright (C) 2022 Changbin Du <changbin.du@gmail.com> - */ -#ifndef __TRACE_IRQ_H -#define __TRACE_IRQ_H - -void __trace_hardirqs_on(void); -void __trace_hardirqs_off(void); - -#endif /* __TRACE_IRQ_H */ -- 2.34.1 _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 1/4] riscv: remove extra level wrappers of trace_hardirqs_{on,off} @ 2022-09-25 17:53 ` Jisheng Zhang 0 siblings, 0 replies; 32+ messages in thread From: Jisheng Zhang @ 2022-09-25 17:53 UTC (permalink / raw) To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Guo Ren, Nathan Chancellor, Nick Desaulniers Cc: linux-riscv, linux-kernel, llvm Since riscv is converted to generic entry, there's no need for the extra wrappers of trace_hardirqs_{on,off}. Tested with llvm + irqsoff. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> --- arch/riscv/kernel/Makefile | 2 -- arch/riscv/kernel/trace_irq.c | 27 --------------------------- arch/riscv/kernel/trace_irq.h | 11 ----------- 3 files changed, 40 deletions(-) delete mode 100644 arch/riscv/kernel/trace_irq.c delete mode 100644 arch/riscv/kernel/trace_irq.h diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile index 01da14e21019..11ee206cc235 100644 --- a/arch/riscv/kernel/Makefile +++ b/arch/riscv/kernel/Makefile @@ -69,8 +69,6 @@ obj-$(CONFIG_CPU_PM) += suspend_entry.o suspend.o obj-$(CONFIG_FUNCTION_TRACER) += mcount.o ftrace.o obj-$(CONFIG_DYNAMIC_FTRACE) += mcount-dyn.o -obj-$(CONFIG_TRACE_IRQFLAGS) += trace_irq.o - obj-$(CONFIG_PERF_EVENTS) += perf_callchain.o obj-$(CONFIG_HAVE_PERF_REGS) += perf_regs.o obj-$(CONFIG_RISCV_SBI) += sbi.o diff --git a/arch/riscv/kernel/trace_irq.c b/arch/riscv/kernel/trace_irq.c deleted file mode 100644 index 095ac976d7da..000000000000 --- a/arch/riscv/kernel/trace_irq.c +++ /dev/null @@ -1,27 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* - * Copyright (C) 2022 Changbin Du <changbin.du@gmail.com> - */ - -#include <linux/irqflags.h> -#include <linux/kprobes.h> -#include "trace_irq.h" - -/* - * trace_hardirqs_on/off require the caller to setup frame pointer properly. - * Otherwise, CALLER_ADDR1 might trigger an pagging exception in kernel. - * Here we add one extra level so they can be safely called by low - * level entry code which $fp is used for other purpose. - */ - -void __trace_hardirqs_on(void) -{ - trace_hardirqs_on(); -} -NOKPROBE_SYMBOL(__trace_hardirqs_on); - -void __trace_hardirqs_off(void) -{ - trace_hardirqs_off(); -} -NOKPROBE_SYMBOL(__trace_hardirqs_off); diff --git a/arch/riscv/kernel/trace_irq.h b/arch/riscv/kernel/trace_irq.h deleted file mode 100644 index 99fe67377e5e..000000000000 --- a/arch/riscv/kernel/trace_irq.h +++ /dev/null @@ -1,11 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* - * Copyright (C) 2022 Changbin Du <changbin.du@gmail.com> - */ -#ifndef __TRACE_IRQ_H -#define __TRACE_IRQ_H - -void __trace_hardirqs_on(void); -void __trace_hardirqs_off(void); - -#endif /* __TRACE_IRQ_H */ -- 2.34.1 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 1/4] riscv: remove extra level wrappers of trace_hardirqs_{on,off} 2022-09-25 17:53 ` Jisheng Zhang @ 2022-09-25 23:18 ` Guo Ren -1 siblings, 0 replies; 32+ messages in thread From: Guo Ren @ 2022-09-25 23:18 UTC (permalink / raw) To: Jisheng Zhang Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm Reviewed-by: Guo Ren <guoren@kernel.org> I would involve the patch in the generic entry series, okay? On Mon, Sep 26, 2022 at 2:03 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > Since riscv is converted to generic entry, there's no need for the > extra wrappers of trace_hardirqs_{on,off}. > > Tested with llvm + irqsoff. > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org> > --- > arch/riscv/kernel/Makefile | 2 -- > arch/riscv/kernel/trace_irq.c | 27 --------------------------- > arch/riscv/kernel/trace_irq.h | 11 ----------- > 3 files changed, 40 deletions(-) > delete mode 100644 arch/riscv/kernel/trace_irq.c > delete mode 100644 arch/riscv/kernel/trace_irq.h > > diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile > index 01da14e21019..11ee206cc235 100644 > --- a/arch/riscv/kernel/Makefile > +++ b/arch/riscv/kernel/Makefile > @@ -69,8 +69,6 @@ obj-$(CONFIG_CPU_PM) += suspend_entry.o suspend.o > obj-$(CONFIG_FUNCTION_TRACER) += mcount.o ftrace.o > obj-$(CONFIG_DYNAMIC_FTRACE) += mcount-dyn.o > > -obj-$(CONFIG_TRACE_IRQFLAGS) += trace_irq.o > - > obj-$(CONFIG_PERF_EVENTS) += perf_callchain.o > obj-$(CONFIG_HAVE_PERF_REGS) += perf_regs.o > obj-$(CONFIG_RISCV_SBI) += sbi.o > diff --git a/arch/riscv/kernel/trace_irq.c b/arch/riscv/kernel/trace_irq.c > deleted file mode 100644 > index 095ac976d7da..000000000000 > --- a/arch/riscv/kernel/trace_irq.c > +++ /dev/null > @@ -1,27 +0,0 @@ > -// SPDX-License-Identifier: GPL-2.0 > -/* > - * Copyright (C) 2022 Changbin Du <changbin.du@gmail.com> > - */ > - > -#include <linux/irqflags.h> > -#include <linux/kprobes.h> > -#include "trace_irq.h" > - > -/* > - * trace_hardirqs_on/off require the caller to setup frame pointer properly. > - * Otherwise, CALLER_ADDR1 might trigger an pagging exception in kernel. > - * Here we add one extra level so they can be safely called by low > - * level entry code which $fp is used for other purpose. > - */ > - > -void __trace_hardirqs_on(void) > -{ > - trace_hardirqs_on(); > -} > -NOKPROBE_SYMBOL(__trace_hardirqs_on); > - > -void __trace_hardirqs_off(void) > -{ > - trace_hardirqs_off(); > -} > -NOKPROBE_SYMBOL(__trace_hardirqs_off); > diff --git a/arch/riscv/kernel/trace_irq.h b/arch/riscv/kernel/trace_irq.h > deleted file mode 100644 > index 99fe67377e5e..000000000000 > --- a/arch/riscv/kernel/trace_irq.h > +++ /dev/null > @@ -1,11 +0,0 @@ > -/* SPDX-License-Identifier: GPL-2.0 */ > -/* > - * Copyright (C) 2022 Changbin Du <changbin.du@gmail.com> > - */ > -#ifndef __TRACE_IRQ_H > -#define __TRACE_IRQ_H > - > -void __trace_hardirqs_on(void); > -void __trace_hardirqs_off(void); > - > -#endif /* __TRACE_IRQ_H */ > -- > 2.34.1 > -- Best Regards Guo Ren _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 1/4] riscv: remove extra level wrappers of trace_hardirqs_{on,off} @ 2022-09-25 23:18 ` Guo Ren 0 siblings, 0 replies; 32+ messages in thread From: Guo Ren @ 2022-09-25 23:18 UTC (permalink / raw) To: Jisheng Zhang Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm Reviewed-by: Guo Ren <guoren@kernel.org> I would involve the patch in the generic entry series, okay? On Mon, Sep 26, 2022 at 2:03 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > Since riscv is converted to generic entry, there's no need for the > extra wrappers of trace_hardirqs_{on,off}. > > Tested with llvm + irqsoff. > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org> > --- > arch/riscv/kernel/Makefile | 2 -- > arch/riscv/kernel/trace_irq.c | 27 --------------------------- > arch/riscv/kernel/trace_irq.h | 11 ----------- > 3 files changed, 40 deletions(-) > delete mode 100644 arch/riscv/kernel/trace_irq.c > delete mode 100644 arch/riscv/kernel/trace_irq.h > > diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile > index 01da14e21019..11ee206cc235 100644 > --- a/arch/riscv/kernel/Makefile > +++ b/arch/riscv/kernel/Makefile > @@ -69,8 +69,6 @@ obj-$(CONFIG_CPU_PM) += suspend_entry.o suspend.o > obj-$(CONFIG_FUNCTION_TRACER) += mcount.o ftrace.o > obj-$(CONFIG_DYNAMIC_FTRACE) += mcount-dyn.o > > -obj-$(CONFIG_TRACE_IRQFLAGS) += trace_irq.o > - > obj-$(CONFIG_PERF_EVENTS) += perf_callchain.o > obj-$(CONFIG_HAVE_PERF_REGS) += perf_regs.o > obj-$(CONFIG_RISCV_SBI) += sbi.o > diff --git a/arch/riscv/kernel/trace_irq.c b/arch/riscv/kernel/trace_irq.c > deleted file mode 100644 > index 095ac976d7da..000000000000 > --- a/arch/riscv/kernel/trace_irq.c > +++ /dev/null > @@ -1,27 +0,0 @@ > -// SPDX-License-Identifier: GPL-2.0 > -/* > - * Copyright (C) 2022 Changbin Du <changbin.du@gmail.com> > - */ > - > -#include <linux/irqflags.h> > -#include <linux/kprobes.h> > -#include "trace_irq.h" > - > -/* > - * trace_hardirqs_on/off require the caller to setup frame pointer properly. > - * Otherwise, CALLER_ADDR1 might trigger an pagging exception in kernel. > - * Here we add one extra level so they can be safely called by low > - * level entry code which $fp is used for other purpose. > - */ > - > -void __trace_hardirqs_on(void) > -{ > - trace_hardirqs_on(); > -} > -NOKPROBE_SYMBOL(__trace_hardirqs_on); > - > -void __trace_hardirqs_off(void) > -{ > - trace_hardirqs_off(); > -} > -NOKPROBE_SYMBOL(__trace_hardirqs_off); > diff --git a/arch/riscv/kernel/trace_irq.h b/arch/riscv/kernel/trace_irq.h > deleted file mode 100644 > index 99fe67377e5e..000000000000 > --- a/arch/riscv/kernel/trace_irq.h > +++ /dev/null > @@ -1,11 +0,0 @@ > -/* SPDX-License-Identifier: GPL-2.0 */ > -/* > - * Copyright (C) 2022 Changbin Du <changbin.du@gmail.com> > - */ > -#ifndef __TRACE_IRQ_H > -#define __TRACE_IRQ_H > - > -void __trace_hardirqs_on(void); > -void __trace_hardirqs_off(void); > - > -#endif /* __TRACE_IRQ_H */ > -- > 2.34.1 > -- Best Regards Guo Ren ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 2/4] riscv: consolidate ret_from_kernel_thread into ret_from_fork 2022-09-25 17:53 ` Jisheng Zhang @ 2022-09-25 17:53 ` Jisheng Zhang -1 siblings, 0 replies; 32+ messages in thread From: Jisheng Zhang @ 2022-09-25 17:53 UTC (permalink / raw) To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Guo Ren, Nathan Chancellor, Nick Desaulniers Cc: linux-riscv, linux-kernel, llvm The ret_from_kernel_thread() behaves similarly with ret_from_fork(), the only difference is whether call the fn(arg) or not, this can be acchieved by testing fn is NULL or not, I.E s0 is 0 or not. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> --- arch/riscv/kernel/entry.S | 11 +++-------- arch/riscv/kernel/process.c | 5 ++--- 2 files changed, 5 insertions(+), 11 deletions(-) diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 2207cf44a3bc..a3e1ed2fa2ac 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -323,20 +323,15 @@ END(handle_kernel_stack_overflow) ENTRY(ret_from_fork) call schedule_tail - move a0, sp /* pt_regs */ - la ra, ret_from_exception - tail syscall_exit_to_user_mode -ENDPROC(ret_from_fork) - -ENTRY(ret_from_kernel_thread) - call schedule_tail + beqz s0, 1f /* not from kernel thread */ /* Call fn(arg) */ move a0, s1 jalr s0 +1: move a0, sp /* pt_regs */ la ra, ret_from_exception tail syscall_exit_to_user_mode -ENDPROC(ret_from_kernel_thread) +ENDPROC(ret_from_fork) #ifdef CONFIG_IRQ_STACKS ENTRY(call_on_stack) diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c index ceb9ebab6558..67e7cd123ceb 100644 --- a/arch/riscv/kernel/process.c +++ b/arch/riscv/kernel/process.c @@ -34,7 +34,6 @@ EXPORT_SYMBOL(__stack_chk_guard); #endif extern asmlinkage void ret_from_fork(void); -extern asmlinkage void ret_from_kernel_thread(void); void arch_cpu_idle(void) { @@ -172,7 +171,6 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) /* Supervisor/Machine, irqs on: */ childregs->status = SR_PP | SR_PIE; - p->thread.ra = (unsigned long)ret_from_kernel_thread; p->thread.s[0] = (unsigned long)args->fn; p->thread.s[1] = (unsigned long)args->fn_arg; } else { @@ -182,8 +180,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) if (clone_flags & CLONE_SETTLS) childregs->tp = tls; childregs->a0 = 0; /* Return value of fork() */ - p->thread.ra = (unsigned long)ret_from_fork; + p->thread.s[0] = 0; } + p->thread.ra = (unsigned long)ret_from_fork; p->thread.sp = (unsigned long)childregs; /* kernel sp */ return 0; } -- 2.34.1 _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 2/4] riscv: consolidate ret_from_kernel_thread into ret_from_fork @ 2022-09-25 17:53 ` Jisheng Zhang 0 siblings, 0 replies; 32+ messages in thread From: Jisheng Zhang @ 2022-09-25 17:53 UTC (permalink / raw) To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Guo Ren, Nathan Chancellor, Nick Desaulniers Cc: linux-riscv, linux-kernel, llvm The ret_from_kernel_thread() behaves similarly with ret_from_fork(), the only difference is whether call the fn(arg) or not, this can be acchieved by testing fn is NULL or not, I.E s0 is 0 or not. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> --- arch/riscv/kernel/entry.S | 11 +++-------- arch/riscv/kernel/process.c | 5 ++--- 2 files changed, 5 insertions(+), 11 deletions(-) diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 2207cf44a3bc..a3e1ed2fa2ac 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -323,20 +323,15 @@ END(handle_kernel_stack_overflow) ENTRY(ret_from_fork) call schedule_tail - move a0, sp /* pt_regs */ - la ra, ret_from_exception - tail syscall_exit_to_user_mode -ENDPROC(ret_from_fork) - -ENTRY(ret_from_kernel_thread) - call schedule_tail + beqz s0, 1f /* not from kernel thread */ /* Call fn(arg) */ move a0, s1 jalr s0 +1: move a0, sp /* pt_regs */ la ra, ret_from_exception tail syscall_exit_to_user_mode -ENDPROC(ret_from_kernel_thread) +ENDPROC(ret_from_fork) #ifdef CONFIG_IRQ_STACKS ENTRY(call_on_stack) diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c index ceb9ebab6558..67e7cd123ceb 100644 --- a/arch/riscv/kernel/process.c +++ b/arch/riscv/kernel/process.c @@ -34,7 +34,6 @@ EXPORT_SYMBOL(__stack_chk_guard); #endif extern asmlinkage void ret_from_fork(void); -extern asmlinkage void ret_from_kernel_thread(void); void arch_cpu_idle(void) { @@ -172,7 +171,6 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) /* Supervisor/Machine, irqs on: */ childregs->status = SR_PP | SR_PIE; - p->thread.ra = (unsigned long)ret_from_kernel_thread; p->thread.s[0] = (unsigned long)args->fn; p->thread.s[1] = (unsigned long)args->fn_arg; } else { @@ -182,8 +180,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) if (clone_flags & CLONE_SETTLS) childregs->tp = tls; childregs->a0 = 0; /* Return value of fork() */ - p->thread.ra = (unsigned long)ret_from_fork; + p->thread.s[0] = 0; } + p->thread.ra = (unsigned long)ret_from_fork; p->thread.sp = (unsigned long)childregs; /* kernel sp */ return 0; } -- 2.34.1 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 2/4] riscv: consolidate ret_from_kernel_thread into ret_from_fork 2022-09-25 17:53 ` Jisheng Zhang @ 2022-09-25 23:25 ` Guo Ren -1 siblings, 0 replies; 32+ messages in thread From: Guo Ren @ 2022-09-25 23:25 UTC (permalink / raw) To: Jisheng Zhang Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm On Mon, Sep 26, 2022 at 2:03 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > The ret_from_kernel_thread() behaves similarly with ret_from_fork(), > the only difference is whether call the fn(arg) or not, this can be > acchieved by testing fn is NULL or not, I.E s0 is 0 or not. > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org> > --- > arch/riscv/kernel/entry.S | 11 +++-------- > arch/riscv/kernel/process.c | 5 ++--- > 2 files changed, 5 insertions(+), 11 deletions(-) > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > index 2207cf44a3bc..a3e1ed2fa2ac 100644 > --- a/arch/riscv/kernel/entry.S > +++ b/arch/riscv/kernel/entry.S > @@ -323,20 +323,15 @@ END(handle_kernel_stack_overflow) > > ENTRY(ret_from_fork) > call schedule_tail > - move a0, sp /* pt_regs */ > - la ra, ret_from_exception > - tail syscall_exit_to_user_mode > -ENDPROC(ret_from_fork) > - > -ENTRY(ret_from_kernel_thread) > - call schedule_tail > + beqz s0, 1f /* not from kernel thread */ We can't use s0 as condition for ret_from_fork/ret_from_kernel_thread. The s0=0 is also okay for ret_from_fork. /* p->thread holds context to be restored by __switch_to() */ if (unlikely(args->fn)) { /* Kernel thread */ memset(childregs, 0, sizeof(struct pt_regs)); childregs->gp = gp_in_global; /* Supervisor/Machine, irqs on: */ childregs->status = SR_PP | SR_PIE; p->thread.ra = (unsigned long)ret_from_kernel_thread; p->thread.s[0] = (unsigned long)args->fn; p->thread.s[1] = (unsigned long)args->fn_arg; } else { *childregs = *(current_pt_regs()); ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ if (usp) /* User fork */ childregs->sp = usp; if (clone_flags & CLONE_SETTLS) childregs->tp = tls; childregs->a0 = 0; /* Return value of fork() */ p->thread.ra = (unsigned long)ret_from_fork; } p->thread.sp = (unsigned long)childregs; /* kernel sp */ > /* Call fn(arg) */ > move a0, s1 > jalr s0 > +1: > move a0, sp /* pt_regs */ > la ra, ret_from_exception > tail syscall_exit_to_user_mode > -ENDPROC(ret_from_kernel_thread) > +ENDPROC(ret_from_fork) > > #ifdef CONFIG_IRQ_STACKS > ENTRY(call_on_stack) > diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c > index ceb9ebab6558..67e7cd123ceb 100644 > --- a/arch/riscv/kernel/process.c > +++ b/arch/riscv/kernel/process.c > @@ -34,7 +34,6 @@ EXPORT_SYMBOL(__stack_chk_guard); > #endif > > extern asmlinkage void ret_from_fork(void); > -extern asmlinkage void ret_from_kernel_thread(void); > > void arch_cpu_idle(void) > { > @@ -172,7 +171,6 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > /* Supervisor/Machine, irqs on: */ > childregs->status = SR_PP | SR_PIE; > > - p->thread.ra = (unsigned long)ret_from_kernel_thread; > p->thread.s[0] = (unsigned long)args->fn; > p->thread.s[1] = (unsigned long)args->fn_arg; > } else { > @@ -182,8 +180,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > if (clone_flags & CLONE_SETTLS) > childregs->tp = tls; > childregs->a0 = 0; /* Return value of fork() */ > - p->thread.ra = (unsigned long)ret_from_fork; > + p->thread.s[0] = 0; > } > + p->thread.ra = (unsigned long)ret_from_fork; > p->thread.sp = (unsigned long)childregs; /* kernel sp */ > return 0; > } > -- > 2.34.1 > -- Best Regards Guo Ren _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/4] riscv: consolidate ret_from_kernel_thread into ret_from_fork @ 2022-09-25 23:25 ` Guo Ren 0 siblings, 0 replies; 32+ messages in thread From: Guo Ren @ 2022-09-25 23:25 UTC (permalink / raw) To: Jisheng Zhang Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm On Mon, Sep 26, 2022 at 2:03 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > The ret_from_kernel_thread() behaves similarly with ret_from_fork(), > the only difference is whether call the fn(arg) or not, this can be > acchieved by testing fn is NULL or not, I.E s0 is 0 or not. > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org> > --- > arch/riscv/kernel/entry.S | 11 +++-------- > arch/riscv/kernel/process.c | 5 ++--- > 2 files changed, 5 insertions(+), 11 deletions(-) > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > index 2207cf44a3bc..a3e1ed2fa2ac 100644 > --- a/arch/riscv/kernel/entry.S > +++ b/arch/riscv/kernel/entry.S > @@ -323,20 +323,15 @@ END(handle_kernel_stack_overflow) > > ENTRY(ret_from_fork) > call schedule_tail > - move a0, sp /* pt_regs */ > - la ra, ret_from_exception > - tail syscall_exit_to_user_mode > -ENDPROC(ret_from_fork) > - > -ENTRY(ret_from_kernel_thread) > - call schedule_tail > + beqz s0, 1f /* not from kernel thread */ We can't use s0 as condition for ret_from_fork/ret_from_kernel_thread. The s0=0 is also okay for ret_from_fork. /* p->thread holds context to be restored by __switch_to() */ if (unlikely(args->fn)) { /* Kernel thread */ memset(childregs, 0, sizeof(struct pt_regs)); childregs->gp = gp_in_global; /* Supervisor/Machine, irqs on: */ childregs->status = SR_PP | SR_PIE; p->thread.ra = (unsigned long)ret_from_kernel_thread; p->thread.s[0] = (unsigned long)args->fn; p->thread.s[1] = (unsigned long)args->fn_arg; } else { *childregs = *(current_pt_regs()); ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ if (usp) /* User fork */ childregs->sp = usp; if (clone_flags & CLONE_SETTLS) childregs->tp = tls; childregs->a0 = 0; /* Return value of fork() */ p->thread.ra = (unsigned long)ret_from_fork; } p->thread.sp = (unsigned long)childregs; /* kernel sp */ > /* Call fn(arg) */ > move a0, s1 > jalr s0 > +1: > move a0, sp /* pt_regs */ > la ra, ret_from_exception > tail syscall_exit_to_user_mode > -ENDPROC(ret_from_kernel_thread) > +ENDPROC(ret_from_fork) > > #ifdef CONFIG_IRQ_STACKS > ENTRY(call_on_stack) > diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c > index ceb9ebab6558..67e7cd123ceb 100644 > --- a/arch/riscv/kernel/process.c > +++ b/arch/riscv/kernel/process.c > @@ -34,7 +34,6 @@ EXPORT_SYMBOL(__stack_chk_guard); > #endif > > extern asmlinkage void ret_from_fork(void); > -extern asmlinkage void ret_from_kernel_thread(void); > > void arch_cpu_idle(void) > { > @@ -172,7 +171,6 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > /* Supervisor/Machine, irqs on: */ > childregs->status = SR_PP | SR_PIE; > > - p->thread.ra = (unsigned long)ret_from_kernel_thread; > p->thread.s[0] = (unsigned long)args->fn; > p->thread.s[1] = (unsigned long)args->fn_arg; > } else { > @@ -182,8 +180,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > if (clone_flags & CLONE_SETTLS) > childregs->tp = tls; > childregs->a0 = 0; /* Return value of fork() */ > - p->thread.ra = (unsigned long)ret_from_fork; > + p->thread.s[0] = 0; > } > + p->thread.ra = (unsigned long)ret_from_fork; > p->thread.sp = (unsigned long)childregs; /* kernel sp */ > return 0; > } > -- > 2.34.1 > -- Best Regards Guo Ren ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/4] riscv: consolidate ret_from_kernel_thread into ret_from_fork 2022-09-25 23:25 ` Guo Ren @ 2022-09-26 16:05 ` Jisheng Zhang -1 siblings, 0 replies; 32+ messages in thread From: Jisheng Zhang @ 2022-09-26 16:05 UTC (permalink / raw) To: Guo Ren Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm On Mon, Sep 26, 2022 at 07:25:30AM +0800, Guo Ren wrote: > On Mon, Sep 26, 2022 at 2:03 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > > > The ret_from_kernel_thread() behaves similarly with ret_from_fork(), > > the only difference is whether call the fn(arg) or not, this can be > > acchieved by testing fn is NULL or not, I.E s0 is 0 or not. > > > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org> > > --- > > arch/riscv/kernel/entry.S | 11 +++-------- > > arch/riscv/kernel/process.c | 5 ++--- > > 2 files changed, 5 insertions(+), 11 deletions(-) > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > index 2207cf44a3bc..a3e1ed2fa2ac 100644 > > --- a/arch/riscv/kernel/entry.S > > +++ b/arch/riscv/kernel/entry.S > > @@ -323,20 +323,15 @@ END(handle_kernel_stack_overflow) > > > > ENTRY(ret_from_fork) > > call schedule_tail > > - move a0, sp /* pt_regs */ > > - la ra, ret_from_exception > > - tail syscall_exit_to_user_mode > > -ENDPROC(ret_from_fork) > > - > > -ENTRY(ret_from_kernel_thread) > > - call schedule_tail > > + beqz s0, 1f /* not from kernel thread */ Hi Guo, > We can't use s0 as condition for ret_from_fork/ret_from_kernel_thread. > The s0=0 is also okay for ret_from_fork. IIUC, in ret_from_fork, the s0 comes p->thread.s[0] rather than s0 in pt_regs. > > /* p->thread holds context to be restored by __switch_to() */ > if (unlikely(args->fn)) { > /* Kernel thread */ > memset(childregs, 0, sizeof(struct pt_regs)); > childregs->gp = gp_in_global; > /* Supervisor/Machine, irqs on: */ > childregs->status = SR_PP | SR_PIE; > > p->thread.ra = (unsigned long)ret_from_kernel_thread; > p->thread.s[0] = (unsigned long)args->fn; > p->thread.s[1] = (unsigned long)args->fn_arg; > } else { > *childregs = *(current_pt_regs()); > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > if (usp) /* User fork */ > childregs->sp = usp; > if (clone_flags & CLONE_SETTLS) > childregs->tp = tls; > childregs->a0 = 0; /* Return value of fork() */ > p->thread.ra = (unsigned long)ret_from_fork; > } > p->thread.sp = (unsigned long)childregs; /* kernel sp */ > <snip> > > @@ -182,8 +180,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > > if (clone_flags & CLONE_SETTLS) > > childregs->tp = tls; > > childregs->a0 = 0; /* Return value of fork() */ > > - p->thread.ra = (unsigned long)ret_from_fork; > > + p->thread.s[0] = 0; Here we assign 0 to p->thread.s[0] Thanks _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/4] riscv: consolidate ret_from_kernel_thread into ret_from_fork @ 2022-09-26 16:05 ` Jisheng Zhang 0 siblings, 0 replies; 32+ messages in thread From: Jisheng Zhang @ 2022-09-26 16:05 UTC (permalink / raw) To: Guo Ren Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm On Mon, Sep 26, 2022 at 07:25:30AM +0800, Guo Ren wrote: > On Mon, Sep 26, 2022 at 2:03 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > > > The ret_from_kernel_thread() behaves similarly with ret_from_fork(), > > the only difference is whether call the fn(arg) or not, this can be > > acchieved by testing fn is NULL or not, I.E s0 is 0 or not. > > > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org> > > --- > > arch/riscv/kernel/entry.S | 11 +++-------- > > arch/riscv/kernel/process.c | 5 ++--- > > 2 files changed, 5 insertions(+), 11 deletions(-) > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > index 2207cf44a3bc..a3e1ed2fa2ac 100644 > > --- a/arch/riscv/kernel/entry.S > > +++ b/arch/riscv/kernel/entry.S > > @@ -323,20 +323,15 @@ END(handle_kernel_stack_overflow) > > > > ENTRY(ret_from_fork) > > call schedule_tail > > - move a0, sp /* pt_regs */ > > - la ra, ret_from_exception > > - tail syscall_exit_to_user_mode > > -ENDPROC(ret_from_fork) > > - > > -ENTRY(ret_from_kernel_thread) > > - call schedule_tail > > + beqz s0, 1f /* not from kernel thread */ Hi Guo, > We can't use s0 as condition for ret_from_fork/ret_from_kernel_thread. > The s0=0 is also okay for ret_from_fork. IIUC, in ret_from_fork, the s0 comes p->thread.s[0] rather than s0 in pt_regs. > > /* p->thread holds context to be restored by __switch_to() */ > if (unlikely(args->fn)) { > /* Kernel thread */ > memset(childregs, 0, sizeof(struct pt_regs)); > childregs->gp = gp_in_global; > /* Supervisor/Machine, irqs on: */ > childregs->status = SR_PP | SR_PIE; > > p->thread.ra = (unsigned long)ret_from_kernel_thread; > p->thread.s[0] = (unsigned long)args->fn; > p->thread.s[1] = (unsigned long)args->fn_arg; > } else { > *childregs = *(current_pt_regs()); > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > if (usp) /* User fork */ > childregs->sp = usp; > if (clone_flags & CLONE_SETTLS) > childregs->tp = tls; > childregs->a0 = 0; /* Return value of fork() */ > p->thread.ra = (unsigned long)ret_from_fork; > } > p->thread.sp = (unsigned long)childregs; /* kernel sp */ > <snip> > > @@ -182,8 +180,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > > if (clone_flags & CLONE_SETTLS) > > childregs->tp = tls; > > childregs->a0 = 0; /* Return value of fork() */ > > - p->thread.ra = (unsigned long)ret_from_fork; > > + p->thread.s[0] = 0; Here we assign 0 to p->thread.s[0] Thanks ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/4] riscv: consolidate ret_from_kernel_thread into ret_from_fork 2022-09-26 16:05 ` Jisheng Zhang @ 2022-09-26 23:55 ` Guo Ren -1 siblings, 0 replies; 32+ messages in thread From: Guo Ren @ 2022-09-26 23:55 UTC (permalink / raw) To: Jisheng Zhang Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm On Tue, Sep 27, 2022 at 12:14 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > On Mon, Sep 26, 2022 at 07:25:30AM +0800, Guo Ren wrote: > > On Mon, Sep 26, 2022 at 2:03 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > > > > > The ret_from_kernel_thread() behaves similarly with ret_from_fork(), > > > the only difference is whether call the fn(arg) or not, this can be > > > acchieved by testing fn is NULL or not, I.E s0 is 0 or not. > > > > > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org> > > > --- > > > arch/riscv/kernel/entry.S | 11 +++-------- > > > arch/riscv/kernel/process.c | 5 ++--- > > > 2 files changed, 5 insertions(+), 11 deletions(-) > > > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > > index 2207cf44a3bc..a3e1ed2fa2ac 100644 > > > --- a/arch/riscv/kernel/entry.S > > > +++ b/arch/riscv/kernel/entry.S > > > @@ -323,20 +323,15 @@ END(handle_kernel_stack_overflow) > > > > > > ENTRY(ret_from_fork) > > > call schedule_tail > > > - move a0, sp /* pt_regs */ > > > - la ra, ret_from_exception > > > - tail syscall_exit_to_user_mode > > > -ENDPROC(ret_from_fork) > > > - > > > -ENTRY(ret_from_kernel_thread) > > > - call schedule_tail > > > + beqz s0, 1f /* not from kernel thread */ > > Hi Guo, > > > We can't use s0 as condition for ret_from_fork/ret_from_kernel_thread. > > The s0=0 is also okay for ret_from_fork. > > IIUC, in ret_from_fork, the s0 comes p->thread.s[0] rather than s0 in > pt_regs. Yes, you are correct. > > > > > /* p->thread holds context to be restored by __switch_to() */ > > if (unlikely(args->fn)) { > > /* Kernel thread */ > > memset(childregs, 0, sizeof(struct pt_regs)); > > childregs->gp = gp_in_global; > > /* Supervisor/Machine, irqs on: */ > > childregs->status = SR_PP | SR_PIE; > > > > p->thread.ra = (unsigned long)ret_from_kernel_thread; > > p->thread.s[0] = (unsigned long)args->fn; > > p->thread.s[1] = (unsigned long)args->fn_arg; > > } else { > > *childregs = *(current_pt_regs()); > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Oh, I'm wrong, It's switch_to_restore -> exception_restore. > > if (usp) /* User fork */ > > childregs->sp = usp; > > if (clone_flags & CLONE_SETTLS) > > childregs->tp = tls; > > childregs->a0 = 0; /* Return value of fork() */ > > p->thread.ra = (unsigned long)ret_from_fork; > > } > > p->thread.sp = (unsigned long)childregs; /* kernel sp */ > > > > <snip> > > > > @@ -182,8 +180,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > > > if (clone_flags & CLONE_SETTLS) > > > childregs->tp = tls; > > > childregs->a0 = 0; /* Return value of fork() */ > > > - p->thread.ra = (unsigned long)ret_from_fork; > > > + p->thread.s[0] = 0; > > Here we assign 0 to p->thread.s[0] I missed that. Merge thread & fork is not a good idea, and using fp as the flag is so implicit. ➜ linux git:(rv64sv32) grep ret_from_fork arch -r | grep entry.S arch/arc/kernel/entry.S:ENTRY(ret_from_fork) arch/arc/kernel/entry.S:END(ret_from_fork) arch/csky/kernel/entry.S:ENTRY(ret_from_fork) arch/x86/kernel/process_32.c: * the task-switch, and shows up in ret_from_fork in entry.S, arch/alpha/kernel/entry.S: .globl ret_from_fork arch/alpha/kernel/entry.S: .ent ret_from_fork arch/alpha/kernel/entry.S:ret_from_fork: arch/alpha/kernel/entry.S:.end ret_from_fork arch/loongarch/kernel/entry.S:SYM_CODE_START(ret_from_fork) arch/loongarch/kernel/entry.S:SYM_CODE_END(ret_from_fork) arch/hexagon/kernel/vm_entry.S: .globl ret_from_fork arch/hexagon/kernel/vm_entry.S:ret_from_fork: arch/microblaze/kernel/entry.S: (copy_thread makes ret_from_fork the return address in each new thread's arch/microblaze/kernel/entry.S:C_ENTRY(ret_from_fork): arch/m68k/kernel/entry.S:ENTRY(ret_from_fork) arch/arm64/kernel/entry.S:SYM_CODE_START(ret_from_fork) arch/arm64/kernel/entry.S:SYM_CODE_END(ret_from_fork) arch/arm64/kernel/entry.S:NOKPROBE(ret_from_fork) arch/riscv/kernel/entry.S:ENTRY(ret_from_fork) arch/riscv/kernel/entry.S:ENDPROC(ret_from_fork) arch/s390/kernel/entry.S:# a new process exits the kernel with ret_from_fork arch/s390/kernel/entry.S:ENTRY(ret_from_fork) arch/s390/kernel/entry.S: brasl %r14,__ret_from_fork arch/s390/kernel/entry.S:ENDPROC(ret_from_fork) arch/mips/kernel/entry.S:FEXPORT(ret_from_fork) arch/openrisc/kernel/entry.S: /* All syscalls return here... just pay attention to ret_from_fork arch/openrisc/kernel/entry.S:ENTRY(ret_from_fork) arch/openrisc/kernel/entry.S: * that may be either schedule(), ret_from_fork(), or arch/nios2/kernel/entry.S:ENTRY(ret_from_fork) arch/xtensa/kernel/entry.S:ENTRY(ret_from_fork) arch/xtensa/kernel/entry.S:ENDPROC(ret_from_fork) arch/sparc/kernel/entry.S: .globl ret_from_fork arch/sparc/kernel/entry.S:ret_from_fork: ➜ linux git:(rv64sv32) grep ret_from_kernel_thread arch -r | grep entry.S arch/csky/kernel/entry.S:ENTRY(ret_from_kernel_thread) arch/alpha/kernel/entry.S: .globl ret_from_kernel_thread arch/alpha/kernel/entry.S: .ent ret_from_kernel_thread arch/alpha/kernel/entry.S:ret_from_kernel_thread: arch/alpha/kernel/entry.S:.end ret_from_kernel_thread arch/parisc/kernel/entry.S:ENTRY(ret_from_kernel_thread) arch/parisc/kernel/entry.S:END(ret_from_kernel_thread) arch/loongarch/kernel/entry.S:SYM_CODE_START(ret_from_kernel_thread) arch/loongarch/kernel/entry.S:SYM_CODE_END(ret_from_kernel_thread) arch/microblaze/kernel/entry.S:C_ENTRY(ret_from_kernel_thread): arch/m68k/kernel/entry.S:ENTRY(ret_from_kernel_thread) arch/riscv/kernel/entry.S:ENTRY(ret_from_kernel_thread) arch/riscv/kernel/entry.S:ENDPROC(ret_from_kernel_thread) arch/mips/kernel/entry.S:FEXPORT(ret_from_kernel_thread) arch/openrisc/kernel/entry.S: * ret_from_kernel_thread(). If we are returning to a new thread, arch/nios2/kernel/entry.S:ENTRY(ret_from_kernel_thread) arch/xtensa/kernel/entry.S:ENTRY(ret_from_kernel_thread) arch/xtensa/kernel/entry.S:ENDPROC(ret_from_kernel_thread) arch/sparc/kernel/entry.S: .globl ret_from_kernel_thread arch/sparc/kernel/entry.S:ret_from_kernel_thread: Many architectures use a similar style. If you want to continue the patch, I think you should first rename ret_from_fork properly, and give an explicit flag definition, not just setting fp = 0. > > Thanks -- Best Regards Guo Ren _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/4] riscv: consolidate ret_from_kernel_thread into ret_from_fork @ 2022-09-26 23:55 ` Guo Ren 0 siblings, 0 replies; 32+ messages in thread From: Guo Ren @ 2022-09-26 23:55 UTC (permalink / raw) To: Jisheng Zhang Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm On Tue, Sep 27, 2022 at 12:14 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > On Mon, Sep 26, 2022 at 07:25:30AM +0800, Guo Ren wrote: > > On Mon, Sep 26, 2022 at 2:03 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > > > > > The ret_from_kernel_thread() behaves similarly with ret_from_fork(), > > > the only difference is whether call the fn(arg) or not, this can be > > > acchieved by testing fn is NULL or not, I.E s0 is 0 or not. > > > > > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org> > > > --- > > > arch/riscv/kernel/entry.S | 11 +++-------- > > > arch/riscv/kernel/process.c | 5 ++--- > > > 2 files changed, 5 insertions(+), 11 deletions(-) > > > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > > index 2207cf44a3bc..a3e1ed2fa2ac 100644 > > > --- a/arch/riscv/kernel/entry.S > > > +++ b/arch/riscv/kernel/entry.S > > > @@ -323,20 +323,15 @@ END(handle_kernel_stack_overflow) > > > > > > ENTRY(ret_from_fork) > > > call schedule_tail > > > - move a0, sp /* pt_regs */ > > > - la ra, ret_from_exception > > > - tail syscall_exit_to_user_mode > > > -ENDPROC(ret_from_fork) > > > - > > > -ENTRY(ret_from_kernel_thread) > > > - call schedule_tail > > > + beqz s0, 1f /* not from kernel thread */ > > Hi Guo, > > > We can't use s0 as condition for ret_from_fork/ret_from_kernel_thread. > > The s0=0 is also okay for ret_from_fork. > > IIUC, in ret_from_fork, the s0 comes p->thread.s[0] rather than s0 in > pt_regs. Yes, you are correct. > > > > > /* p->thread holds context to be restored by __switch_to() */ > > if (unlikely(args->fn)) { > > /* Kernel thread */ > > memset(childregs, 0, sizeof(struct pt_regs)); > > childregs->gp = gp_in_global; > > /* Supervisor/Machine, irqs on: */ > > childregs->status = SR_PP | SR_PIE; > > > > p->thread.ra = (unsigned long)ret_from_kernel_thread; > > p->thread.s[0] = (unsigned long)args->fn; > > p->thread.s[1] = (unsigned long)args->fn_arg; > > } else { > > *childregs = *(current_pt_regs()); > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Oh, I'm wrong, It's switch_to_restore -> exception_restore. > > if (usp) /* User fork */ > > childregs->sp = usp; > > if (clone_flags & CLONE_SETTLS) > > childregs->tp = tls; > > childregs->a0 = 0; /* Return value of fork() */ > > p->thread.ra = (unsigned long)ret_from_fork; > > } > > p->thread.sp = (unsigned long)childregs; /* kernel sp */ > > > > <snip> > > > > @@ -182,8 +180,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > > > if (clone_flags & CLONE_SETTLS) > > > childregs->tp = tls; > > > childregs->a0 = 0; /* Return value of fork() */ > > > - p->thread.ra = (unsigned long)ret_from_fork; > > > + p->thread.s[0] = 0; > > Here we assign 0 to p->thread.s[0] I missed that. Merge thread & fork is not a good idea, and using fp as the flag is so implicit. ➜ linux git:(rv64sv32) grep ret_from_fork arch -r | grep entry.S arch/arc/kernel/entry.S:ENTRY(ret_from_fork) arch/arc/kernel/entry.S:END(ret_from_fork) arch/csky/kernel/entry.S:ENTRY(ret_from_fork) arch/x86/kernel/process_32.c: * the task-switch, and shows up in ret_from_fork in entry.S, arch/alpha/kernel/entry.S: .globl ret_from_fork arch/alpha/kernel/entry.S: .ent ret_from_fork arch/alpha/kernel/entry.S:ret_from_fork: arch/alpha/kernel/entry.S:.end ret_from_fork arch/loongarch/kernel/entry.S:SYM_CODE_START(ret_from_fork) arch/loongarch/kernel/entry.S:SYM_CODE_END(ret_from_fork) arch/hexagon/kernel/vm_entry.S: .globl ret_from_fork arch/hexagon/kernel/vm_entry.S:ret_from_fork: arch/microblaze/kernel/entry.S: (copy_thread makes ret_from_fork the return address in each new thread's arch/microblaze/kernel/entry.S:C_ENTRY(ret_from_fork): arch/m68k/kernel/entry.S:ENTRY(ret_from_fork) arch/arm64/kernel/entry.S:SYM_CODE_START(ret_from_fork) arch/arm64/kernel/entry.S:SYM_CODE_END(ret_from_fork) arch/arm64/kernel/entry.S:NOKPROBE(ret_from_fork) arch/riscv/kernel/entry.S:ENTRY(ret_from_fork) arch/riscv/kernel/entry.S:ENDPROC(ret_from_fork) arch/s390/kernel/entry.S:# a new process exits the kernel with ret_from_fork arch/s390/kernel/entry.S:ENTRY(ret_from_fork) arch/s390/kernel/entry.S: brasl %r14,__ret_from_fork arch/s390/kernel/entry.S:ENDPROC(ret_from_fork) arch/mips/kernel/entry.S:FEXPORT(ret_from_fork) arch/openrisc/kernel/entry.S: /* All syscalls return here... just pay attention to ret_from_fork arch/openrisc/kernel/entry.S:ENTRY(ret_from_fork) arch/openrisc/kernel/entry.S: * that may be either schedule(), ret_from_fork(), or arch/nios2/kernel/entry.S:ENTRY(ret_from_fork) arch/xtensa/kernel/entry.S:ENTRY(ret_from_fork) arch/xtensa/kernel/entry.S:ENDPROC(ret_from_fork) arch/sparc/kernel/entry.S: .globl ret_from_fork arch/sparc/kernel/entry.S:ret_from_fork: ➜ linux git:(rv64sv32) grep ret_from_kernel_thread arch -r | grep entry.S arch/csky/kernel/entry.S:ENTRY(ret_from_kernel_thread) arch/alpha/kernel/entry.S: .globl ret_from_kernel_thread arch/alpha/kernel/entry.S: .ent ret_from_kernel_thread arch/alpha/kernel/entry.S:ret_from_kernel_thread: arch/alpha/kernel/entry.S:.end ret_from_kernel_thread arch/parisc/kernel/entry.S:ENTRY(ret_from_kernel_thread) arch/parisc/kernel/entry.S:END(ret_from_kernel_thread) arch/loongarch/kernel/entry.S:SYM_CODE_START(ret_from_kernel_thread) arch/loongarch/kernel/entry.S:SYM_CODE_END(ret_from_kernel_thread) arch/microblaze/kernel/entry.S:C_ENTRY(ret_from_kernel_thread): arch/m68k/kernel/entry.S:ENTRY(ret_from_kernel_thread) arch/riscv/kernel/entry.S:ENTRY(ret_from_kernel_thread) arch/riscv/kernel/entry.S:ENDPROC(ret_from_kernel_thread) arch/mips/kernel/entry.S:FEXPORT(ret_from_kernel_thread) arch/openrisc/kernel/entry.S: * ret_from_kernel_thread(). If we are returning to a new thread, arch/nios2/kernel/entry.S:ENTRY(ret_from_kernel_thread) arch/xtensa/kernel/entry.S:ENTRY(ret_from_kernel_thread) arch/xtensa/kernel/entry.S:ENDPROC(ret_from_kernel_thread) arch/sparc/kernel/entry.S: .globl ret_from_kernel_thread arch/sparc/kernel/entry.S:ret_from_kernel_thread: Many architectures use a similar style. If you want to continue the patch, I think you should first rename ret_from_fork properly, and give an explicit flag definition, not just setting fp = 0. > > Thanks -- Best Regards Guo Ren ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/4] riscv: consolidate ret_from_kernel_thread into ret_from_fork 2022-09-26 23:55 ` Guo Ren @ 2022-09-28 16:40 ` Jisheng Zhang -1 siblings, 0 replies; 32+ messages in thread From: Jisheng Zhang @ 2022-09-28 16:40 UTC (permalink / raw) To: Guo Ren Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm On Tue, Sep 27, 2022 at 07:55:27AM +0800, Guo Ren wrote: > On Tue, Sep 27, 2022 at 12:14 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > > > On Mon, Sep 26, 2022 at 07:25:30AM +0800, Guo Ren wrote: > > > On Mon, Sep 26, 2022 at 2:03 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > > > > > > > The ret_from_kernel_thread() behaves similarly with ret_from_fork(), > > > > the only difference is whether call the fn(arg) or not, this can be > > > > acchieved by testing fn is NULL or not, I.E s0 is 0 or not. > > > > > > > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org> > > > > --- > > > > arch/riscv/kernel/entry.S | 11 +++-------- > > > > arch/riscv/kernel/process.c | 5 ++--- > > > > 2 files changed, 5 insertions(+), 11 deletions(-) > > > > > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > > > index 2207cf44a3bc..a3e1ed2fa2ac 100644 > > > > --- a/arch/riscv/kernel/entry.S > > > > +++ b/arch/riscv/kernel/entry.S > > > > @@ -323,20 +323,15 @@ END(handle_kernel_stack_overflow) > > > > > > > > ENTRY(ret_from_fork) > > > > call schedule_tail > > > > - move a0, sp /* pt_regs */ > > > > - la ra, ret_from_exception > > > > - tail syscall_exit_to_user_mode > > > > -ENDPROC(ret_from_fork) > > > > - > > > > -ENTRY(ret_from_kernel_thread) > > > > - call schedule_tail > > > > + beqz s0, 1f /* not from kernel thread */ > > > > Hi Guo, > > > > > We can't use s0 as condition for ret_from_fork/ret_from_kernel_thread. > > > The s0=0 is also okay for ret_from_fork. > > > > IIUC, in ret_from_fork, the s0 comes p->thread.s[0] rather than s0 in > > pt_regs. > Yes, you are correct. > > > > > > > > > /* p->thread holds context to be restored by __switch_to() */ > > > if (unlikely(args->fn)) { > > > /* Kernel thread */ > > > memset(childregs, 0, sizeof(struct pt_regs)); > > > childregs->gp = gp_in_global; > > > /* Supervisor/Machine, irqs on: */ > > > childregs->status = SR_PP | SR_PIE; > > > > > > p->thread.ra = (unsigned long)ret_from_kernel_thread; > > > p->thread.s[0] = (unsigned long)args->fn; > > > p->thread.s[1] = (unsigned long)args->fn_arg; > > > } else { > > > *childregs = *(current_pt_regs()); > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > Oh, I'm wrong, It's switch_to_restore -> exception_restore. > > > > if (usp) /* User fork */ > > > childregs->sp = usp; > > > if (clone_flags & CLONE_SETTLS) > > > childregs->tp = tls; > > > childregs->a0 = 0; /* Return value of fork() */ > > > p->thread.ra = (unsigned long)ret_from_fork; > > > } > > > p->thread.sp = (unsigned long)childregs; /* kernel sp */ > > > > > > > <snip> > > > > > > @@ -182,8 +180,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > > > > if (clone_flags & CLONE_SETTLS) > > > > childregs->tp = tls; > > > > childregs->a0 = 0; /* Return value of fork() */ > > > > - p->thread.ra = (unsigned long)ret_from_fork; > > > > + p->thread.s[0] = 0; > > > > Here we assign 0 to p->thread.s[0] > I missed that. > > Merge thread & fork is not a good idea, and using fp as the flag is so implicit. > > ➜ linux git:(rv64sv32) grep ret_from_fork arch -r | grep entry.S > arch/arc/kernel/entry.S:ENTRY(ret_from_fork) > arch/arc/kernel/entry.S:END(ret_from_fork) > arch/csky/kernel/entry.S:ENTRY(ret_from_fork) > arch/x86/kernel/process_32.c: * the task-switch, and shows up in > ret_from_fork in entry.S, > arch/alpha/kernel/entry.S: .globl ret_from_fork > arch/alpha/kernel/entry.S: .ent ret_from_fork > arch/alpha/kernel/entry.S:ret_from_fork: > arch/alpha/kernel/entry.S:.end ret_from_fork > arch/loongarch/kernel/entry.S:SYM_CODE_START(ret_from_fork) > arch/loongarch/kernel/entry.S:SYM_CODE_END(ret_from_fork) > arch/hexagon/kernel/vm_entry.S: .globl ret_from_fork > arch/hexagon/kernel/vm_entry.S:ret_from_fork: > arch/microblaze/kernel/entry.S: (copy_thread makes ret_from_fork the > return address in each new thread's > arch/microblaze/kernel/entry.S:C_ENTRY(ret_from_fork): > arch/m68k/kernel/entry.S:ENTRY(ret_from_fork) > arch/arm64/kernel/entry.S:SYM_CODE_START(ret_from_fork) > arch/arm64/kernel/entry.S:SYM_CODE_END(ret_from_fork) > arch/arm64/kernel/entry.S:NOKPROBE(ret_from_fork) > arch/riscv/kernel/entry.S:ENTRY(ret_from_fork) > arch/riscv/kernel/entry.S:ENDPROC(ret_from_fork) > arch/s390/kernel/entry.S:# a new process exits the kernel with ret_from_fork > arch/s390/kernel/entry.S:ENTRY(ret_from_fork) > arch/s390/kernel/entry.S: brasl %r14,__ret_from_fork > arch/s390/kernel/entry.S:ENDPROC(ret_from_fork) > arch/mips/kernel/entry.S:FEXPORT(ret_from_fork) > arch/openrisc/kernel/entry.S: /* All syscalls return here... just > pay attention to ret_from_fork > arch/openrisc/kernel/entry.S:ENTRY(ret_from_fork) > arch/openrisc/kernel/entry.S: * that may be either schedule(), > ret_from_fork(), or > arch/nios2/kernel/entry.S:ENTRY(ret_from_fork) > arch/xtensa/kernel/entry.S:ENTRY(ret_from_fork) > arch/xtensa/kernel/entry.S:ENDPROC(ret_from_fork) > arch/sparc/kernel/entry.S: .globl ret_from_fork > arch/sparc/kernel/entry.S:ret_from_fork: > ➜ linux git:(rv64sv32) grep ret_from_kernel_thread arch -r | grep entry.S > arch/csky/kernel/entry.S:ENTRY(ret_from_kernel_thread) > arch/alpha/kernel/entry.S: .globl ret_from_kernel_thread > arch/alpha/kernel/entry.S: .ent ret_from_kernel_thread > arch/alpha/kernel/entry.S:ret_from_kernel_thread: > arch/alpha/kernel/entry.S:.end ret_from_kernel_thread > arch/parisc/kernel/entry.S:ENTRY(ret_from_kernel_thread) > arch/parisc/kernel/entry.S:END(ret_from_kernel_thread) > arch/loongarch/kernel/entry.S:SYM_CODE_START(ret_from_kernel_thread) > arch/loongarch/kernel/entry.S:SYM_CODE_END(ret_from_kernel_thread) > arch/microblaze/kernel/entry.S:C_ENTRY(ret_from_kernel_thread): > arch/m68k/kernel/entry.S:ENTRY(ret_from_kernel_thread) > arch/riscv/kernel/entry.S:ENTRY(ret_from_kernel_thread) > arch/riscv/kernel/entry.S:ENDPROC(ret_from_kernel_thread) > arch/mips/kernel/entry.S:FEXPORT(ret_from_kernel_thread) > arch/openrisc/kernel/entry.S: * ret_from_kernel_thread(). If we > are returning to a new thread, > arch/nios2/kernel/entry.S:ENTRY(ret_from_kernel_thread) > arch/xtensa/kernel/entry.S:ENTRY(ret_from_kernel_thread) > arch/xtensa/kernel/entry.S:ENDPROC(ret_from_kernel_thread) > arch/sparc/kernel/entry.S: .globl ret_from_kernel_thread > arch/sparc/kernel/entry.S:ret_from_kernel_thread: > > Many architectures use a similar style. If you want to continue the > patch, I think you should first rename ret_from_fork properly, and > give an explicit flag definition, not just setting fp = 0. > Above list also shows many architectures don't have a ret_from_kernel_thread, I think the reason is simple it behaves similarly as ret_from_fork. As for flag, IMHO, we may missed something as clearing the s[12] array in thread_struct when user fork, because s[12] may contain random kernel memory content, which may be finally leaked to userspace. This is a security hole. A trivial patch of memset(0) can fix it, after this fix, checking the s[0] is straightforward. diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c index 67e7cd123ceb..50a0f7e4327c 100644 --- a/arch/riscv/kernel/process.c +++ b/arch/riscv/kernel/process.c @@ -174,6 +174,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) p->thread.s[0] = (unsigned long)args->fn; p->thread.s[1] = (unsigned long)args->fn_arg; } else { + memset(&p->thread.s, 0, sizeof(p->thread.s)); *childregs = *(current_pt_regs()); if (usp) /* User fork */ childregs->sp = usp; _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 2/4] riscv: consolidate ret_from_kernel_thread into ret_from_fork @ 2022-09-28 16:40 ` Jisheng Zhang 0 siblings, 0 replies; 32+ messages in thread From: Jisheng Zhang @ 2022-09-28 16:40 UTC (permalink / raw) To: Guo Ren Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm On Tue, Sep 27, 2022 at 07:55:27AM +0800, Guo Ren wrote: > On Tue, Sep 27, 2022 at 12:14 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > > > On Mon, Sep 26, 2022 at 07:25:30AM +0800, Guo Ren wrote: > > > On Mon, Sep 26, 2022 at 2:03 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > > > > > > > The ret_from_kernel_thread() behaves similarly with ret_from_fork(), > > > > the only difference is whether call the fn(arg) or not, this can be > > > > acchieved by testing fn is NULL or not, I.E s0 is 0 or not. > > > > > > > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org> > > > > --- > > > > arch/riscv/kernel/entry.S | 11 +++-------- > > > > arch/riscv/kernel/process.c | 5 ++--- > > > > 2 files changed, 5 insertions(+), 11 deletions(-) > > > > > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > > > index 2207cf44a3bc..a3e1ed2fa2ac 100644 > > > > --- a/arch/riscv/kernel/entry.S > > > > +++ b/arch/riscv/kernel/entry.S > > > > @@ -323,20 +323,15 @@ END(handle_kernel_stack_overflow) > > > > > > > > ENTRY(ret_from_fork) > > > > call schedule_tail > > > > - move a0, sp /* pt_regs */ > > > > - la ra, ret_from_exception > > > > - tail syscall_exit_to_user_mode > > > > -ENDPROC(ret_from_fork) > > > > - > > > > -ENTRY(ret_from_kernel_thread) > > > > - call schedule_tail > > > > + beqz s0, 1f /* not from kernel thread */ > > > > Hi Guo, > > > > > We can't use s0 as condition for ret_from_fork/ret_from_kernel_thread. > > > The s0=0 is also okay for ret_from_fork. > > > > IIUC, in ret_from_fork, the s0 comes p->thread.s[0] rather than s0 in > > pt_regs. > Yes, you are correct. > > > > > > > > > /* p->thread holds context to be restored by __switch_to() */ > > > if (unlikely(args->fn)) { > > > /* Kernel thread */ > > > memset(childregs, 0, sizeof(struct pt_regs)); > > > childregs->gp = gp_in_global; > > > /* Supervisor/Machine, irqs on: */ > > > childregs->status = SR_PP | SR_PIE; > > > > > > p->thread.ra = (unsigned long)ret_from_kernel_thread; > > > p->thread.s[0] = (unsigned long)args->fn; > > > p->thread.s[1] = (unsigned long)args->fn_arg; > > > } else { > > > *childregs = *(current_pt_regs()); > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > Oh, I'm wrong, It's switch_to_restore -> exception_restore. > > > > if (usp) /* User fork */ > > > childregs->sp = usp; > > > if (clone_flags & CLONE_SETTLS) > > > childregs->tp = tls; > > > childregs->a0 = 0; /* Return value of fork() */ > > > p->thread.ra = (unsigned long)ret_from_fork; > > > } > > > p->thread.sp = (unsigned long)childregs; /* kernel sp */ > > > > > > > <snip> > > > > > > @@ -182,8 +180,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > > > > if (clone_flags & CLONE_SETTLS) > > > > childregs->tp = tls; > > > > childregs->a0 = 0; /* Return value of fork() */ > > > > - p->thread.ra = (unsigned long)ret_from_fork; > > > > + p->thread.s[0] = 0; > > > > Here we assign 0 to p->thread.s[0] > I missed that. > > Merge thread & fork is not a good idea, and using fp as the flag is so implicit. > > ➜ linux git:(rv64sv32) grep ret_from_fork arch -r | grep entry.S > arch/arc/kernel/entry.S:ENTRY(ret_from_fork) > arch/arc/kernel/entry.S:END(ret_from_fork) > arch/csky/kernel/entry.S:ENTRY(ret_from_fork) > arch/x86/kernel/process_32.c: * the task-switch, and shows up in > ret_from_fork in entry.S, > arch/alpha/kernel/entry.S: .globl ret_from_fork > arch/alpha/kernel/entry.S: .ent ret_from_fork > arch/alpha/kernel/entry.S:ret_from_fork: > arch/alpha/kernel/entry.S:.end ret_from_fork > arch/loongarch/kernel/entry.S:SYM_CODE_START(ret_from_fork) > arch/loongarch/kernel/entry.S:SYM_CODE_END(ret_from_fork) > arch/hexagon/kernel/vm_entry.S: .globl ret_from_fork > arch/hexagon/kernel/vm_entry.S:ret_from_fork: > arch/microblaze/kernel/entry.S: (copy_thread makes ret_from_fork the > return address in each new thread's > arch/microblaze/kernel/entry.S:C_ENTRY(ret_from_fork): > arch/m68k/kernel/entry.S:ENTRY(ret_from_fork) > arch/arm64/kernel/entry.S:SYM_CODE_START(ret_from_fork) > arch/arm64/kernel/entry.S:SYM_CODE_END(ret_from_fork) > arch/arm64/kernel/entry.S:NOKPROBE(ret_from_fork) > arch/riscv/kernel/entry.S:ENTRY(ret_from_fork) > arch/riscv/kernel/entry.S:ENDPROC(ret_from_fork) > arch/s390/kernel/entry.S:# a new process exits the kernel with ret_from_fork > arch/s390/kernel/entry.S:ENTRY(ret_from_fork) > arch/s390/kernel/entry.S: brasl %r14,__ret_from_fork > arch/s390/kernel/entry.S:ENDPROC(ret_from_fork) > arch/mips/kernel/entry.S:FEXPORT(ret_from_fork) > arch/openrisc/kernel/entry.S: /* All syscalls return here... just > pay attention to ret_from_fork > arch/openrisc/kernel/entry.S:ENTRY(ret_from_fork) > arch/openrisc/kernel/entry.S: * that may be either schedule(), > ret_from_fork(), or > arch/nios2/kernel/entry.S:ENTRY(ret_from_fork) > arch/xtensa/kernel/entry.S:ENTRY(ret_from_fork) > arch/xtensa/kernel/entry.S:ENDPROC(ret_from_fork) > arch/sparc/kernel/entry.S: .globl ret_from_fork > arch/sparc/kernel/entry.S:ret_from_fork: > ➜ linux git:(rv64sv32) grep ret_from_kernel_thread arch -r | grep entry.S > arch/csky/kernel/entry.S:ENTRY(ret_from_kernel_thread) > arch/alpha/kernel/entry.S: .globl ret_from_kernel_thread > arch/alpha/kernel/entry.S: .ent ret_from_kernel_thread > arch/alpha/kernel/entry.S:ret_from_kernel_thread: > arch/alpha/kernel/entry.S:.end ret_from_kernel_thread > arch/parisc/kernel/entry.S:ENTRY(ret_from_kernel_thread) > arch/parisc/kernel/entry.S:END(ret_from_kernel_thread) > arch/loongarch/kernel/entry.S:SYM_CODE_START(ret_from_kernel_thread) > arch/loongarch/kernel/entry.S:SYM_CODE_END(ret_from_kernel_thread) > arch/microblaze/kernel/entry.S:C_ENTRY(ret_from_kernel_thread): > arch/m68k/kernel/entry.S:ENTRY(ret_from_kernel_thread) > arch/riscv/kernel/entry.S:ENTRY(ret_from_kernel_thread) > arch/riscv/kernel/entry.S:ENDPROC(ret_from_kernel_thread) > arch/mips/kernel/entry.S:FEXPORT(ret_from_kernel_thread) > arch/openrisc/kernel/entry.S: * ret_from_kernel_thread(). If we > are returning to a new thread, > arch/nios2/kernel/entry.S:ENTRY(ret_from_kernel_thread) > arch/xtensa/kernel/entry.S:ENTRY(ret_from_kernel_thread) > arch/xtensa/kernel/entry.S:ENDPROC(ret_from_kernel_thread) > arch/sparc/kernel/entry.S: .globl ret_from_kernel_thread > arch/sparc/kernel/entry.S:ret_from_kernel_thread: > > Many architectures use a similar style. If you want to continue the > patch, I think you should first rename ret_from_fork properly, and > give an explicit flag definition, not just setting fp = 0. > Above list also shows many architectures don't have a ret_from_kernel_thread, I think the reason is simple it behaves similarly as ret_from_fork. As for flag, IMHO, we may missed something as clearing the s[12] array in thread_struct when user fork, because s[12] may contain random kernel memory content, which may be finally leaked to userspace. This is a security hole. A trivial patch of memset(0) can fix it, after this fix, checking the s[0] is straightforward. diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c index 67e7cd123ceb..50a0f7e4327c 100644 --- a/arch/riscv/kernel/process.c +++ b/arch/riscv/kernel/process.c @@ -174,6 +174,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) p->thread.s[0] = (unsigned long)args->fn; p->thread.s[1] = (unsigned long)args->fn_arg; } else { + memset(&p->thread.s, 0, sizeof(p->thread.s)); *childregs = *(current_pt_regs()); if (usp) /* User fork */ childregs->sp = usp; ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 2/4] riscv: consolidate ret_from_kernel_thread into ret_from_fork 2022-09-28 16:40 ` Jisheng Zhang @ 2022-09-30 11:42 ` Guo Ren -1 siblings, 0 replies; 32+ messages in thread From: Guo Ren @ 2022-09-30 11:42 UTC (permalink / raw) To: Jisheng Zhang Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm On Thu, Sep 29, 2022 at 12:49 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > On Tue, Sep 27, 2022 at 07:55:27AM +0800, Guo Ren wrote: > > On Tue, Sep 27, 2022 at 12:14 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > > > > > On Mon, Sep 26, 2022 at 07:25:30AM +0800, Guo Ren wrote: > > > > On Mon, Sep 26, 2022 at 2:03 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > > > > > > > > > The ret_from_kernel_thread() behaves similarly with ret_from_fork(), > > > > > the only difference is whether call the fn(arg) or not, this can be > > > > > acchieved by testing fn is NULL or not, I.E s0 is 0 or not. > > > > > > > > > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org> > > > > > --- > > > > > arch/riscv/kernel/entry.S | 11 +++-------- > > > > > arch/riscv/kernel/process.c | 5 ++--- > > > > > 2 files changed, 5 insertions(+), 11 deletions(-) > > > > > > > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > > > > index 2207cf44a3bc..a3e1ed2fa2ac 100644 > > > > > --- a/arch/riscv/kernel/entry.S > > > > > +++ b/arch/riscv/kernel/entry.S > > > > > @@ -323,20 +323,15 @@ END(handle_kernel_stack_overflow) > > > > > > > > > > ENTRY(ret_from_fork) > > > > > call schedule_tail > > > > > - move a0, sp /* pt_regs */ > > > > > - la ra, ret_from_exception > > > > > - tail syscall_exit_to_user_mode > > > > > -ENDPROC(ret_from_fork) > > > > > - > > > > > -ENTRY(ret_from_kernel_thread) > > > > > - call schedule_tail > > > > > + beqz s0, 1f /* not from kernel thread */ > > > > > > Hi Guo, > > > > > > > We can't use s0 as condition for ret_from_fork/ret_from_kernel_thread. > > > > The s0=0 is also okay for ret_from_fork. > > > > > > IIUC, in ret_from_fork, the s0 comes p->thread.s[0] rather than s0 in > > > pt_regs. > > Yes, you are correct. > > > > > > > > > > > > > /* p->thread holds context to be restored by __switch_to() */ > > > > if (unlikely(args->fn)) { > > > > /* Kernel thread */ > > > > memset(childregs, 0, sizeof(struct pt_regs)); > > > > childregs->gp = gp_in_global; > > > > /* Supervisor/Machine, irqs on: */ > > > > childregs->status = SR_PP | SR_PIE; > > > > > > > > p->thread.ra = (unsigned long)ret_from_kernel_thread; > > > > p->thread.s[0] = (unsigned long)args->fn; > > > > p->thread.s[1] = (unsigned long)args->fn_arg; > > > > } else { > > > > *childregs = *(current_pt_regs()); > > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > Oh, I'm wrong, It's switch_to_restore -> exception_restore. > > > > > > if (usp) /* User fork */ > > > > childregs->sp = usp; > > > > if (clone_flags & CLONE_SETTLS) > > > > childregs->tp = tls; > > > > childregs->a0 = 0; /* Return value of fork() */ > > > > p->thread.ra = (unsigned long)ret_from_fork; > > > > } > > > > p->thread.sp = (unsigned long)childregs; /* kernel sp */ > > > > > > > > > > <snip> > > > > > > > > @@ -182,8 +180,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > > > > > if (clone_flags & CLONE_SETTLS) > > > > > childregs->tp = tls; > > > > > childregs->a0 = 0; /* Return value of fork() */ > > > > > - p->thread.ra = (unsigned long)ret_from_fork; > > > > > + p->thread.s[0] = 0; > > > > > > Here we assign 0 to p->thread.s[0] > > I missed that. > > > > Merge thread & fork is not a good idea, and using fp as the flag is so implicit. > > > > ➜ linux git:(rv64sv32) grep ret_from_fork arch -r | grep entry.S > > arch/arc/kernel/entry.S:ENTRY(ret_from_fork) > > arch/arc/kernel/entry.S:END(ret_from_fork) > > arch/csky/kernel/entry.S:ENTRY(ret_from_fork) > > arch/x86/kernel/process_32.c: * the task-switch, and shows up in > > ret_from_fork in entry.S, > > arch/alpha/kernel/entry.S: .globl ret_from_fork > > arch/alpha/kernel/entry.S: .ent ret_from_fork > > arch/alpha/kernel/entry.S:ret_from_fork: > > arch/alpha/kernel/entry.S:.end ret_from_fork > > arch/loongarch/kernel/entry.S:SYM_CODE_START(ret_from_fork) > > arch/loongarch/kernel/entry.S:SYM_CODE_END(ret_from_fork) > > arch/hexagon/kernel/vm_entry.S: .globl ret_from_fork > > arch/hexagon/kernel/vm_entry.S:ret_from_fork: > > arch/microblaze/kernel/entry.S: (copy_thread makes ret_from_fork the > > return address in each new thread's > > arch/microblaze/kernel/entry.S:C_ENTRY(ret_from_fork): > > arch/m68k/kernel/entry.S:ENTRY(ret_from_fork) > > arch/arm64/kernel/entry.S:SYM_CODE_START(ret_from_fork) > > arch/arm64/kernel/entry.S:SYM_CODE_END(ret_from_fork) > > arch/arm64/kernel/entry.S:NOKPROBE(ret_from_fork) > > arch/riscv/kernel/entry.S:ENTRY(ret_from_fork) > > arch/riscv/kernel/entry.S:ENDPROC(ret_from_fork) > > arch/s390/kernel/entry.S:# a new process exits the kernel with ret_from_fork > > arch/s390/kernel/entry.S:ENTRY(ret_from_fork) > > arch/s390/kernel/entry.S: brasl %r14,__ret_from_fork > > arch/s390/kernel/entry.S:ENDPROC(ret_from_fork) > > arch/mips/kernel/entry.S:FEXPORT(ret_from_fork) > > arch/openrisc/kernel/entry.S: /* All syscalls return here... just > > pay attention to ret_from_fork > > arch/openrisc/kernel/entry.S:ENTRY(ret_from_fork) > > arch/openrisc/kernel/entry.S: * that may be either schedule(), > > ret_from_fork(), or > > arch/nios2/kernel/entry.S:ENTRY(ret_from_fork) > > arch/xtensa/kernel/entry.S:ENTRY(ret_from_fork) > > arch/xtensa/kernel/entry.S:ENDPROC(ret_from_fork) > > arch/sparc/kernel/entry.S: .globl ret_from_fork > > arch/sparc/kernel/entry.S:ret_from_fork: > > ➜ linux git:(rv64sv32) grep ret_from_kernel_thread arch -r | grep entry.S > > arch/csky/kernel/entry.S:ENTRY(ret_from_kernel_thread) > > arch/alpha/kernel/entry.S: .globl ret_from_kernel_thread > > arch/alpha/kernel/entry.S: .ent ret_from_kernel_thread > > arch/alpha/kernel/entry.S:ret_from_kernel_thread: > > arch/alpha/kernel/entry.S:.end ret_from_kernel_thread > > arch/parisc/kernel/entry.S:ENTRY(ret_from_kernel_thread) > > arch/parisc/kernel/entry.S:END(ret_from_kernel_thread) > > arch/loongarch/kernel/entry.S:SYM_CODE_START(ret_from_kernel_thread) > > arch/loongarch/kernel/entry.S:SYM_CODE_END(ret_from_kernel_thread) > > arch/microblaze/kernel/entry.S:C_ENTRY(ret_from_kernel_thread): > > arch/m68k/kernel/entry.S:ENTRY(ret_from_kernel_thread) > > arch/riscv/kernel/entry.S:ENTRY(ret_from_kernel_thread) > > arch/riscv/kernel/entry.S:ENDPROC(ret_from_kernel_thread) > > arch/mips/kernel/entry.S:FEXPORT(ret_from_kernel_thread) > > arch/openrisc/kernel/entry.S: * ret_from_kernel_thread(). If we > > are returning to a new thread, > > arch/nios2/kernel/entry.S:ENTRY(ret_from_kernel_thread) > > arch/xtensa/kernel/entry.S:ENTRY(ret_from_kernel_thread) > > arch/xtensa/kernel/entry.S:ENDPROC(ret_from_kernel_thread) > > arch/sparc/kernel/entry.S: .globl ret_from_kernel_thread > > arch/sparc/kernel/entry.S:ret_from_kernel_thread: > > > > Many architectures use a similar style. If you want to continue the > > patch, I think you should first rename ret_from_fork properly, and > > give an explicit flag definition, not just setting fp = 0. > > > > Above list also shows many architectures don't have a > ret_from_kernel_thread, I think the reason is simple it behaves > similarly as ret_from_fork. After looking at x86 & arm64, you've convinced me. Acked-by: Guo Ren <guoren@kernel.org> > As for flag, IMHO, we may missed something as clearing the s[12] > array in thread_struct when user fork, because s[12] may contain > random kernel memory content, which may be finally leaked to > userspace. This is a security hole. > > A trivial patch of memset(0) can fix it, after this fix, checking the > s[0] is straightforward. > > diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c > index 67e7cd123ceb..50a0f7e4327c 100644 > --- a/arch/riscv/kernel/process.c > +++ b/arch/riscv/kernel/process.c > @@ -174,6 +174,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > p->thread.s[0] = (unsigned long)args->fn; > p->thread.s[1] = (unsigned long)args->fn_arg; > } else { > + memset(&p->thread.s, 0, sizeof(p->thread.s)); Good catch. s[12] may leave some information about the kernel. It could be a separate patch with a Fixes flag. > *childregs = *(current_pt_regs()); > if (usp) /* User fork */ > childregs->sp = usp; > -- Best Regards Guo Ren _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/4] riscv: consolidate ret_from_kernel_thread into ret_from_fork @ 2022-09-30 11:42 ` Guo Ren 0 siblings, 0 replies; 32+ messages in thread From: Guo Ren @ 2022-09-30 11:42 UTC (permalink / raw) To: Jisheng Zhang Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm On Thu, Sep 29, 2022 at 12:49 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > On Tue, Sep 27, 2022 at 07:55:27AM +0800, Guo Ren wrote: > > On Tue, Sep 27, 2022 at 12:14 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > > > > > On Mon, Sep 26, 2022 at 07:25:30AM +0800, Guo Ren wrote: > > > > On Mon, Sep 26, 2022 at 2:03 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > > > > > > > > > The ret_from_kernel_thread() behaves similarly with ret_from_fork(), > > > > > the only difference is whether call the fn(arg) or not, this can be > > > > > acchieved by testing fn is NULL or not, I.E s0 is 0 or not. > > > > > > > > > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org> > > > > > --- > > > > > arch/riscv/kernel/entry.S | 11 +++-------- > > > > > arch/riscv/kernel/process.c | 5 ++--- > > > > > 2 files changed, 5 insertions(+), 11 deletions(-) > > > > > > > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > > > > index 2207cf44a3bc..a3e1ed2fa2ac 100644 > > > > > --- a/arch/riscv/kernel/entry.S > > > > > +++ b/arch/riscv/kernel/entry.S > > > > > @@ -323,20 +323,15 @@ END(handle_kernel_stack_overflow) > > > > > > > > > > ENTRY(ret_from_fork) > > > > > call schedule_tail > > > > > - move a0, sp /* pt_regs */ > > > > > - la ra, ret_from_exception > > > > > - tail syscall_exit_to_user_mode > > > > > -ENDPROC(ret_from_fork) > > > > > - > > > > > -ENTRY(ret_from_kernel_thread) > > > > > - call schedule_tail > > > > > + beqz s0, 1f /* not from kernel thread */ > > > > > > Hi Guo, > > > > > > > We can't use s0 as condition for ret_from_fork/ret_from_kernel_thread. > > > > The s0=0 is also okay for ret_from_fork. > > > > > > IIUC, in ret_from_fork, the s0 comes p->thread.s[0] rather than s0 in > > > pt_regs. > > Yes, you are correct. > > > > > > > > > > > > > /* p->thread holds context to be restored by __switch_to() */ > > > > if (unlikely(args->fn)) { > > > > /* Kernel thread */ > > > > memset(childregs, 0, sizeof(struct pt_regs)); > > > > childregs->gp = gp_in_global; > > > > /* Supervisor/Machine, irqs on: */ > > > > childregs->status = SR_PP | SR_PIE; > > > > > > > > p->thread.ra = (unsigned long)ret_from_kernel_thread; > > > > p->thread.s[0] = (unsigned long)args->fn; > > > > p->thread.s[1] = (unsigned long)args->fn_arg; > > > > } else { > > > > *childregs = *(current_pt_regs()); > > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > Oh, I'm wrong, It's switch_to_restore -> exception_restore. > > > > > > if (usp) /* User fork */ > > > > childregs->sp = usp; > > > > if (clone_flags & CLONE_SETTLS) > > > > childregs->tp = tls; > > > > childregs->a0 = 0; /* Return value of fork() */ > > > > p->thread.ra = (unsigned long)ret_from_fork; > > > > } > > > > p->thread.sp = (unsigned long)childregs; /* kernel sp */ > > > > > > > > > > <snip> > > > > > > > > @@ -182,8 +180,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > > > > > if (clone_flags & CLONE_SETTLS) > > > > > childregs->tp = tls; > > > > > childregs->a0 = 0; /* Return value of fork() */ > > > > > - p->thread.ra = (unsigned long)ret_from_fork; > > > > > + p->thread.s[0] = 0; > > > > > > Here we assign 0 to p->thread.s[0] > > I missed that. > > > > Merge thread & fork is not a good idea, and using fp as the flag is so implicit. > > > > ➜ linux git:(rv64sv32) grep ret_from_fork arch -r | grep entry.S > > arch/arc/kernel/entry.S:ENTRY(ret_from_fork) > > arch/arc/kernel/entry.S:END(ret_from_fork) > > arch/csky/kernel/entry.S:ENTRY(ret_from_fork) > > arch/x86/kernel/process_32.c: * the task-switch, and shows up in > > ret_from_fork in entry.S, > > arch/alpha/kernel/entry.S: .globl ret_from_fork > > arch/alpha/kernel/entry.S: .ent ret_from_fork > > arch/alpha/kernel/entry.S:ret_from_fork: > > arch/alpha/kernel/entry.S:.end ret_from_fork > > arch/loongarch/kernel/entry.S:SYM_CODE_START(ret_from_fork) > > arch/loongarch/kernel/entry.S:SYM_CODE_END(ret_from_fork) > > arch/hexagon/kernel/vm_entry.S: .globl ret_from_fork > > arch/hexagon/kernel/vm_entry.S:ret_from_fork: > > arch/microblaze/kernel/entry.S: (copy_thread makes ret_from_fork the > > return address in each new thread's > > arch/microblaze/kernel/entry.S:C_ENTRY(ret_from_fork): > > arch/m68k/kernel/entry.S:ENTRY(ret_from_fork) > > arch/arm64/kernel/entry.S:SYM_CODE_START(ret_from_fork) > > arch/arm64/kernel/entry.S:SYM_CODE_END(ret_from_fork) > > arch/arm64/kernel/entry.S:NOKPROBE(ret_from_fork) > > arch/riscv/kernel/entry.S:ENTRY(ret_from_fork) > > arch/riscv/kernel/entry.S:ENDPROC(ret_from_fork) > > arch/s390/kernel/entry.S:# a new process exits the kernel with ret_from_fork > > arch/s390/kernel/entry.S:ENTRY(ret_from_fork) > > arch/s390/kernel/entry.S: brasl %r14,__ret_from_fork > > arch/s390/kernel/entry.S:ENDPROC(ret_from_fork) > > arch/mips/kernel/entry.S:FEXPORT(ret_from_fork) > > arch/openrisc/kernel/entry.S: /* All syscalls return here... just > > pay attention to ret_from_fork > > arch/openrisc/kernel/entry.S:ENTRY(ret_from_fork) > > arch/openrisc/kernel/entry.S: * that may be either schedule(), > > ret_from_fork(), or > > arch/nios2/kernel/entry.S:ENTRY(ret_from_fork) > > arch/xtensa/kernel/entry.S:ENTRY(ret_from_fork) > > arch/xtensa/kernel/entry.S:ENDPROC(ret_from_fork) > > arch/sparc/kernel/entry.S: .globl ret_from_fork > > arch/sparc/kernel/entry.S:ret_from_fork: > > ➜ linux git:(rv64sv32) grep ret_from_kernel_thread arch -r | grep entry.S > > arch/csky/kernel/entry.S:ENTRY(ret_from_kernel_thread) > > arch/alpha/kernel/entry.S: .globl ret_from_kernel_thread > > arch/alpha/kernel/entry.S: .ent ret_from_kernel_thread > > arch/alpha/kernel/entry.S:ret_from_kernel_thread: > > arch/alpha/kernel/entry.S:.end ret_from_kernel_thread > > arch/parisc/kernel/entry.S:ENTRY(ret_from_kernel_thread) > > arch/parisc/kernel/entry.S:END(ret_from_kernel_thread) > > arch/loongarch/kernel/entry.S:SYM_CODE_START(ret_from_kernel_thread) > > arch/loongarch/kernel/entry.S:SYM_CODE_END(ret_from_kernel_thread) > > arch/microblaze/kernel/entry.S:C_ENTRY(ret_from_kernel_thread): > > arch/m68k/kernel/entry.S:ENTRY(ret_from_kernel_thread) > > arch/riscv/kernel/entry.S:ENTRY(ret_from_kernel_thread) > > arch/riscv/kernel/entry.S:ENDPROC(ret_from_kernel_thread) > > arch/mips/kernel/entry.S:FEXPORT(ret_from_kernel_thread) > > arch/openrisc/kernel/entry.S: * ret_from_kernel_thread(). If we > > are returning to a new thread, > > arch/nios2/kernel/entry.S:ENTRY(ret_from_kernel_thread) > > arch/xtensa/kernel/entry.S:ENTRY(ret_from_kernel_thread) > > arch/xtensa/kernel/entry.S:ENDPROC(ret_from_kernel_thread) > > arch/sparc/kernel/entry.S: .globl ret_from_kernel_thread > > arch/sparc/kernel/entry.S:ret_from_kernel_thread: > > > > Many architectures use a similar style. If you want to continue the > > patch, I think you should first rename ret_from_fork properly, and > > give an explicit flag definition, not just setting fp = 0. > > > > Above list also shows many architectures don't have a > ret_from_kernel_thread, I think the reason is simple it behaves > similarly as ret_from_fork. After looking at x86 & arm64, you've convinced me. Acked-by: Guo Ren <guoren@kernel.org> > As for flag, IMHO, we may missed something as clearing the s[12] > array in thread_struct when user fork, because s[12] may contain > random kernel memory content, which may be finally leaked to > userspace. This is a security hole. > > A trivial patch of memset(0) can fix it, after this fix, checking the > s[0] is straightforward. > > diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c > index 67e7cd123ceb..50a0f7e4327c 100644 > --- a/arch/riscv/kernel/process.c > +++ b/arch/riscv/kernel/process.c > @@ -174,6 +174,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > p->thread.s[0] = (unsigned long)args->fn; > p->thread.s[1] = (unsigned long)args->fn_arg; > } else { > + memset(&p->thread.s, 0, sizeof(p->thread.s)); Good catch. s[12] may leave some information about the kernel. It could be a separate patch with a Fixes flag. > *childregs = *(current_pt_regs()); > if (usp) /* User fork */ > childregs->sp = usp; > -- Best Regards Guo Ren ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 3/4] riscv: fix race when vmap stack overflow and remove shadow_stack 2022-09-25 17:53 ` Jisheng Zhang @ 2022-09-25 17:53 ` Jisheng Zhang -1 siblings, 0 replies; 32+ messages in thread From: Jisheng Zhang @ 2022-09-25 17:53 UTC (permalink / raw) To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Guo Ren, Nathan Chancellor, Nick Desaulniers Cc: linux-riscv, linux-kernel, llvm Currently, when detecting vmap stack overflow, riscv firstly switches to the so called shadow stack, then use this shadow stack to call the get_overflow_stack() to get the overflow stack. However, there's a race here if two or more harts use the same shadow stack at the same time. To solve this race, we rely on two facts: 1. the kernel thread pointer I.E "tp" register can still be gotten from the CSR_SCRATCH register, thus we can clobber tp under the condition that we restore tp from CSR_SCRATCH later. 2. Once vmap stack overflow happen, panic is coming soon, no performance concern at all, so we don't need to define the overflow stack as percpu var, we can simplify it into an array. Thus we can use tp as a tmp register to get the cpu id to calculate the offset of overflow stack for each cpu w/o shadow stack any more. Thus the race condition is removed as side effect. NOTE: we can use similar mechanism to let each cpu use different shadow stack to fix the race codition, but if we can remove shadow stack usage totally, why not. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection") --- arch/riscv/include/asm/asm-prototypes.h | 1 - arch/riscv/include/asm/thread_info.h | 4 +- arch/riscv/kernel/asm-offsets.c | 1 + arch/riscv/kernel/entry.S | 54 ++++--------------------- arch/riscv/kernel/traps.c | 15 +------ 5 files changed, 11 insertions(+), 64 deletions(-) diff --git a/arch/riscv/include/asm/asm-prototypes.h b/arch/riscv/include/asm/asm-prototypes.h index ef386fcf3939..4a06fa0f6493 100644 --- a/arch/riscv/include/asm/asm-prototypes.h +++ b/arch/riscv/include/asm/asm-prototypes.h @@ -25,7 +25,6 @@ DECLARE_DO_ERROR_INFO(do_trap_ecall_s); DECLARE_DO_ERROR_INFO(do_trap_ecall_m); DECLARE_DO_ERROR_INFO(do_trap_break); -asmlinkage unsigned long get_overflow_stack(void); asmlinkage void handle_bad_stack(struct pt_regs *regs); #endif /* _ASM_RISCV_PROTOTYPES_H */ diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h index c970d41dc4c6..c604a5212a73 100644 --- a/arch/riscv/include/asm/thread_info.h +++ b/arch/riscv/include/asm/thread_info.h @@ -28,14 +28,12 @@ #define THREAD_SHIFT (PAGE_SHIFT + THREAD_SIZE_ORDER) #define OVERFLOW_STACK_SIZE SZ_4K -#define SHADOW_OVERFLOW_STACK_SIZE (1024) +#define OVERFLOW_STACK_SHIFT 12 #define IRQ_STACK_SIZE THREAD_SIZE #ifndef __ASSEMBLY__ -extern long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE / sizeof(long)]; - #include <asm/processor.h> #include <asm/csr.h> diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c index df9444397908..62bf3bacc322 100644 --- a/arch/riscv/kernel/asm-offsets.c +++ b/arch/riscv/kernel/asm-offsets.c @@ -37,6 +37,7 @@ void asm_offsets(void) OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, thread_info.preempt_count); OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp); OFFSET(TASK_TI_USER_SP, task_struct, thread_info.user_sp); + OFFSET(TASK_TI_CPU, task_struct, thread_info.cpu); OFFSET(TASK_THREAD_F0, task_struct, thread.fstate.f[0]); OFFSET(TASK_THREAD_F1, task_struct, thread.fstate.f[1]); diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index a3e1ed2fa2ac..442d93beffcf 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -223,54 +223,14 @@ END(ret_from_exception) #ifdef CONFIG_VMAP_STACK ENTRY(handle_kernel_stack_overflow) - la sp, shadow_stack - addi sp, sp, SHADOW_OVERFLOW_STACK_SIZE - - //save caller register to shadow stack - addi sp, sp, -(PT_SIZE_ON_STACK) - REG_S x1, PT_RA(sp) - REG_S x5, PT_T0(sp) - REG_S x6, PT_T1(sp) - REG_S x7, PT_T2(sp) - REG_S x10, PT_A0(sp) - REG_S x11, PT_A1(sp) - REG_S x12, PT_A2(sp) - REG_S x13, PT_A3(sp) - REG_S x14, PT_A4(sp) - REG_S x15, PT_A5(sp) - REG_S x16, PT_A6(sp) - REG_S x17, PT_A7(sp) - REG_S x28, PT_T3(sp) - REG_S x29, PT_T4(sp) - REG_S x30, PT_T5(sp) - REG_S x31, PT_T6(sp) - - la ra, restore_caller_reg - tail get_overflow_stack - -restore_caller_reg: - //save per-cpu overflow stack - REG_S a0, -8(sp) - //restore caller register from shadow_stack - REG_L x1, PT_RA(sp) - REG_L x5, PT_T0(sp) - REG_L x6, PT_T1(sp) - REG_L x7, PT_T2(sp) - REG_L x10, PT_A0(sp) - REG_L x11, PT_A1(sp) - REG_L x12, PT_A2(sp) - REG_L x13, PT_A3(sp) - REG_L x14, PT_A4(sp) - REG_L x15, PT_A5(sp) - REG_L x16, PT_A6(sp) - REG_L x17, PT_A7(sp) - REG_L x28, PT_T3(sp) - REG_L x29, PT_T4(sp) - REG_L x30, PT_T5(sp) - REG_L x31, PT_T6(sp) + la sp, overflow_stack + /* use tp as tmp register since we can restore it from CSR_SCRATCH */ + REG_L tp, TASK_TI_CPU(tp) + addi tp, tp, 1 + slli tp, tp, OVERFLOW_STACK_SHIFT + add sp, sp, tp + csrr tp, CSR_SCRATCH - //load per-cpu overflow stack - REG_L sp, -8(sp) addi sp, sp, -(PT_SIZE_ON_STACK) //save context to overflow stack diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 73f06cd149d9..2a2a977c1eff 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -216,23 +216,12 @@ int is_valid_bugaddr(unsigned long pc) #endif /* CONFIG_GENERIC_BUG */ #ifdef CONFIG_VMAP_STACK -static DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], - overflow_stack)__aligned(16); -/* - * shadow stack, handled_ kernel_ stack_ overflow(in kernel/entry.S) is used - * to get per-cpu overflow stack(get_overflow_stack). - */ -long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE/sizeof(long)]; -asmlinkage unsigned long get_overflow_stack(void) -{ - return (unsigned long)this_cpu_ptr(overflow_stack) + - OVERFLOW_STACK_SIZE; -} +unsigned long overflow_stack[NR_CPUS][OVERFLOW_STACK_SIZE/sizeof(long)] __aligned(16); asmlinkage void handle_bad_stack(struct pt_regs *regs) { unsigned long tsk_stk = (unsigned long)current->stack; - unsigned long ovf_stk = (unsigned long)this_cpu_ptr(overflow_stack); + unsigned long ovf_stk = (unsigned long)overflow_stack[raw_smp_processor_id()]; console_verbose(); -- 2.34.1 _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 3/4] riscv: fix race when vmap stack overflow and remove shadow_stack @ 2022-09-25 17:53 ` Jisheng Zhang 0 siblings, 0 replies; 32+ messages in thread From: Jisheng Zhang @ 2022-09-25 17:53 UTC (permalink / raw) To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Guo Ren, Nathan Chancellor, Nick Desaulniers Cc: linux-riscv, linux-kernel, llvm Currently, when detecting vmap stack overflow, riscv firstly switches to the so called shadow stack, then use this shadow stack to call the get_overflow_stack() to get the overflow stack. However, there's a race here if two or more harts use the same shadow stack at the same time. To solve this race, we rely on two facts: 1. the kernel thread pointer I.E "tp" register can still be gotten from the CSR_SCRATCH register, thus we can clobber tp under the condition that we restore tp from CSR_SCRATCH later. 2. Once vmap stack overflow happen, panic is coming soon, no performance concern at all, so we don't need to define the overflow stack as percpu var, we can simplify it into an array. Thus we can use tp as a tmp register to get the cpu id to calculate the offset of overflow stack for each cpu w/o shadow stack any more. Thus the race condition is removed as side effect. NOTE: we can use similar mechanism to let each cpu use different shadow stack to fix the race codition, but if we can remove shadow stack usage totally, why not. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection") --- arch/riscv/include/asm/asm-prototypes.h | 1 - arch/riscv/include/asm/thread_info.h | 4 +- arch/riscv/kernel/asm-offsets.c | 1 + arch/riscv/kernel/entry.S | 54 ++++--------------------- arch/riscv/kernel/traps.c | 15 +------ 5 files changed, 11 insertions(+), 64 deletions(-) diff --git a/arch/riscv/include/asm/asm-prototypes.h b/arch/riscv/include/asm/asm-prototypes.h index ef386fcf3939..4a06fa0f6493 100644 --- a/arch/riscv/include/asm/asm-prototypes.h +++ b/arch/riscv/include/asm/asm-prototypes.h @@ -25,7 +25,6 @@ DECLARE_DO_ERROR_INFO(do_trap_ecall_s); DECLARE_DO_ERROR_INFO(do_trap_ecall_m); DECLARE_DO_ERROR_INFO(do_trap_break); -asmlinkage unsigned long get_overflow_stack(void); asmlinkage void handle_bad_stack(struct pt_regs *regs); #endif /* _ASM_RISCV_PROTOTYPES_H */ diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h index c970d41dc4c6..c604a5212a73 100644 --- a/arch/riscv/include/asm/thread_info.h +++ b/arch/riscv/include/asm/thread_info.h @@ -28,14 +28,12 @@ #define THREAD_SHIFT (PAGE_SHIFT + THREAD_SIZE_ORDER) #define OVERFLOW_STACK_SIZE SZ_4K -#define SHADOW_OVERFLOW_STACK_SIZE (1024) +#define OVERFLOW_STACK_SHIFT 12 #define IRQ_STACK_SIZE THREAD_SIZE #ifndef __ASSEMBLY__ -extern long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE / sizeof(long)]; - #include <asm/processor.h> #include <asm/csr.h> diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c index df9444397908..62bf3bacc322 100644 --- a/arch/riscv/kernel/asm-offsets.c +++ b/arch/riscv/kernel/asm-offsets.c @@ -37,6 +37,7 @@ void asm_offsets(void) OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, thread_info.preempt_count); OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp); OFFSET(TASK_TI_USER_SP, task_struct, thread_info.user_sp); + OFFSET(TASK_TI_CPU, task_struct, thread_info.cpu); OFFSET(TASK_THREAD_F0, task_struct, thread.fstate.f[0]); OFFSET(TASK_THREAD_F1, task_struct, thread.fstate.f[1]); diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index a3e1ed2fa2ac..442d93beffcf 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -223,54 +223,14 @@ END(ret_from_exception) #ifdef CONFIG_VMAP_STACK ENTRY(handle_kernel_stack_overflow) - la sp, shadow_stack - addi sp, sp, SHADOW_OVERFLOW_STACK_SIZE - - //save caller register to shadow stack - addi sp, sp, -(PT_SIZE_ON_STACK) - REG_S x1, PT_RA(sp) - REG_S x5, PT_T0(sp) - REG_S x6, PT_T1(sp) - REG_S x7, PT_T2(sp) - REG_S x10, PT_A0(sp) - REG_S x11, PT_A1(sp) - REG_S x12, PT_A2(sp) - REG_S x13, PT_A3(sp) - REG_S x14, PT_A4(sp) - REG_S x15, PT_A5(sp) - REG_S x16, PT_A6(sp) - REG_S x17, PT_A7(sp) - REG_S x28, PT_T3(sp) - REG_S x29, PT_T4(sp) - REG_S x30, PT_T5(sp) - REG_S x31, PT_T6(sp) - - la ra, restore_caller_reg - tail get_overflow_stack - -restore_caller_reg: - //save per-cpu overflow stack - REG_S a0, -8(sp) - //restore caller register from shadow_stack - REG_L x1, PT_RA(sp) - REG_L x5, PT_T0(sp) - REG_L x6, PT_T1(sp) - REG_L x7, PT_T2(sp) - REG_L x10, PT_A0(sp) - REG_L x11, PT_A1(sp) - REG_L x12, PT_A2(sp) - REG_L x13, PT_A3(sp) - REG_L x14, PT_A4(sp) - REG_L x15, PT_A5(sp) - REG_L x16, PT_A6(sp) - REG_L x17, PT_A7(sp) - REG_L x28, PT_T3(sp) - REG_L x29, PT_T4(sp) - REG_L x30, PT_T5(sp) - REG_L x31, PT_T6(sp) + la sp, overflow_stack + /* use tp as tmp register since we can restore it from CSR_SCRATCH */ + REG_L tp, TASK_TI_CPU(tp) + addi tp, tp, 1 + slli tp, tp, OVERFLOW_STACK_SHIFT + add sp, sp, tp + csrr tp, CSR_SCRATCH - //load per-cpu overflow stack - REG_L sp, -8(sp) addi sp, sp, -(PT_SIZE_ON_STACK) //save context to overflow stack diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c index 73f06cd149d9..2a2a977c1eff 100644 --- a/arch/riscv/kernel/traps.c +++ b/arch/riscv/kernel/traps.c @@ -216,23 +216,12 @@ int is_valid_bugaddr(unsigned long pc) #endif /* CONFIG_GENERIC_BUG */ #ifdef CONFIG_VMAP_STACK -static DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], - overflow_stack)__aligned(16); -/* - * shadow stack, handled_ kernel_ stack_ overflow(in kernel/entry.S) is used - * to get per-cpu overflow stack(get_overflow_stack). - */ -long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE/sizeof(long)]; -asmlinkage unsigned long get_overflow_stack(void) -{ - return (unsigned long)this_cpu_ptr(overflow_stack) + - OVERFLOW_STACK_SIZE; -} +unsigned long overflow_stack[NR_CPUS][OVERFLOW_STACK_SIZE/sizeof(long)] __aligned(16); asmlinkage void handle_bad_stack(struct pt_regs *regs) { unsigned long tsk_stk = (unsigned long)current->stack; - unsigned long ovf_stk = (unsigned long)this_cpu_ptr(overflow_stack); + unsigned long ovf_stk = (unsigned long)overflow_stack[raw_smp_processor_id()]; console_verbose(); -- 2.34.1 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 3/4] riscv: fix race when vmap stack overflow and remove shadow_stack 2022-09-25 17:53 ` Jisheng Zhang @ 2022-09-25 23:46 ` Guo Ren -1 siblings, 0 replies; 32+ messages in thread From: Guo Ren @ 2022-09-25 23:46 UTC (permalink / raw) To: Jisheng Zhang Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm Good job, it's a good fixup. On Mon, Sep 26, 2022 at 2:03 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > Currently, when detecting vmap stack overflow, riscv firstly switches > to the so called shadow stack, then use this shadow stack to call the > get_overflow_stack() to get the overflow stack. However, there's > a race here if two or more harts use the same shadow stack at the same > time. > > To solve this race, we rely on two facts: > 1. the kernel thread pointer I.E "tp" register can still be gotten from > the CSR_SCRATCH register, thus we can clobber tp under the condition > that we restore tp from CSR_SCRATCH later. > > 2. Once vmap stack overflow happen, panic is coming soon, no > performance concern at all, so we don't need to define the overflow > stack as percpu var, we can simplify it into an array. > > Thus we can use tp as a tmp register to get the cpu id to calculate > the offset of overflow stack for each cpu w/o shadow stack any more. > Thus the race condition is removed as side effect. > > NOTE: we can use similar mechanism to let each cpu use different shadow > stack to fix the race codition, but if we can remove shadow stack usage > totally, why not. > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org> > Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection") > --- > arch/riscv/include/asm/asm-prototypes.h | 1 - > arch/riscv/include/asm/thread_info.h | 4 +- > arch/riscv/kernel/asm-offsets.c | 1 + > arch/riscv/kernel/entry.S | 54 ++++--------------------- > arch/riscv/kernel/traps.c | 15 +------ > 5 files changed, 11 insertions(+), 64 deletions(-) > > diff --git a/arch/riscv/include/asm/asm-prototypes.h b/arch/riscv/include/asm/asm-prototypes.h > index ef386fcf3939..4a06fa0f6493 100644 > --- a/arch/riscv/include/asm/asm-prototypes.h > +++ b/arch/riscv/include/asm/asm-prototypes.h > @@ -25,7 +25,6 @@ DECLARE_DO_ERROR_INFO(do_trap_ecall_s); > DECLARE_DO_ERROR_INFO(do_trap_ecall_m); > DECLARE_DO_ERROR_INFO(do_trap_break); > > -asmlinkage unsigned long get_overflow_stack(void); > asmlinkage void handle_bad_stack(struct pt_regs *regs); > > #endif /* _ASM_RISCV_PROTOTYPES_H */ > diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h > index c970d41dc4c6..c604a5212a73 100644 > --- a/arch/riscv/include/asm/thread_info.h > +++ b/arch/riscv/include/asm/thread_info.h > @@ -28,14 +28,12 @@ > > #define THREAD_SHIFT (PAGE_SHIFT + THREAD_SIZE_ORDER) > #define OVERFLOW_STACK_SIZE SZ_4K > -#define SHADOW_OVERFLOW_STACK_SIZE (1024) > +#define OVERFLOW_STACK_SHIFT 12 > > #define IRQ_STACK_SIZE THREAD_SIZE > > #ifndef __ASSEMBLY__ > > -extern long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE / sizeof(long)]; > - > #include <asm/processor.h> > #include <asm/csr.h> > > diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c > index df9444397908..62bf3bacc322 100644 > --- a/arch/riscv/kernel/asm-offsets.c > +++ b/arch/riscv/kernel/asm-offsets.c > @@ -37,6 +37,7 @@ void asm_offsets(void) > OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, thread_info.preempt_count); > OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp); > OFFSET(TASK_TI_USER_SP, task_struct, thread_info.user_sp); > + OFFSET(TASK_TI_CPU, task_struct, thread_info.cpu); > > OFFSET(TASK_THREAD_F0, task_struct, thread.fstate.f[0]); > OFFSET(TASK_THREAD_F1, task_struct, thread.fstate.f[1]); > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > index a3e1ed2fa2ac..442d93beffcf 100644 > --- a/arch/riscv/kernel/entry.S > +++ b/arch/riscv/kernel/entry.S > @@ -223,54 +223,14 @@ END(ret_from_exception) > > #ifdef CONFIG_VMAP_STACK > ENTRY(handle_kernel_stack_overflow) > - la sp, shadow_stack > - addi sp, sp, SHADOW_OVERFLOW_STACK_SIZE > - > - //save caller register to shadow stack > - addi sp, sp, -(PT_SIZE_ON_STACK) > - REG_S x1, PT_RA(sp) > - REG_S x5, PT_T0(sp) > - REG_S x6, PT_T1(sp) > - REG_S x7, PT_T2(sp) > - REG_S x10, PT_A0(sp) > - REG_S x11, PT_A1(sp) > - REG_S x12, PT_A2(sp) > - REG_S x13, PT_A3(sp) > - REG_S x14, PT_A4(sp) > - REG_S x15, PT_A5(sp) > - REG_S x16, PT_A6(sp) > - REG_S x17, PT_A7(sp) > - REG_S x28, PT_T3(sp) > - REG_S x29, PT_T4(sp) > - REG_S x30, PT_T5(sp) > - REG_S x31, PT_T6(sp) > - > - la ra, restore_caller_reg > - tail get_overflow_stack > - > -restore_caller_reg: > - //save per-cpu overflow stack > - REG_S a0, -8(sp) > - //restore caller register from shadow_stack > - REG_L x1, PT_RA(sp) > - REG_L x5, PT_T0(sp) > - REG_L x6, PT_T1(sp) > - REG_L x7, PT_T2(sp) > - REG_L x10, PT_A0(sp) > - REG_L x11, PT_A1(sp) > - REG_L x12, PT_A2(sp) > - REG_L x13, PT_A3(sp) > - REG_L x14, PT_A4(sp) > - REG_L x15, PT_A5(sp) > - REG_L x16, PT_A6(sp) > - REG_L x17, PT_A7(sp) > - REG_L x28, PT_T3(sp) > - REG_L x29, PT_T4(sp) > - REG_L x30, PT_T5(sp) > - REG_L x31, PT_T6(sp) > + la sp, overflow_stack > + /* use tp as tmp register since we can restore it from CSR_SCRATCH */ > + REG_L tp, TASK_TI_CPU(tp) > + addi tp, tp, 1 > + slli tp, tp, OVERFLOW_STACK_SHIFT > + add sp, sp, tp > + csrr tp, CSR_SCRATCH > > - //load per-cpu overflow stack > - REG_L sp, -8(sp) > addi sp, sp, -(PT_SIZE_ON_STACK) > > //save context to overflow stack > diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c > index 73f06cd149d9..2a2a977c1eff 100644 > --- a/arch/riscv/kernel/traps.c > +++ b/arch/riscv/kernel/traps.c > @@ -216,23 +216,12 @@ int is_valid_bugaddr(unsigned long pc) > #endif /* CONFIG_GENERIC_BUG */ > > #ifdef CONFIG_VMAP_STACK > -static DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], > - overflow_stack)__aligned(16); > -/* > - * shadow stack, handled_ kernel_ stack_ overflow(in kernel/entry.S) is used > - * to get per-cpu overflow stack(get_overflow_stack). > - */ > -long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE/sizeof(long)]; > -asmlinkage unsigned long get_overflow_stack(void) > -{ > - return (unsigned long)this_cpu_ptr(overflow_stack) + > - OVERFLOW_STACK_SIZE; > -} > +unsigned long overflow_stack[NR_CPUS][OVERFLOW_STACK_SIZE/sizeof(long)] __aligned(16); Agree with using a custom per CPU definition for easy entry.S implementation. Reviewed-by: Guo Ren <guoren@kernel.org> > > asmlinkage void handle_bad_stack(struct pt_regs *regs) > { > unsigned long tsk_stk = (unsigned long)current->stack; > - unsigned long ovf_stk = (unsigned long)this_cpu_ptr(overflow_stack); > + unsigned long ovf_stk = (unsigned long)overflow_stack[raw_smp_processor_id()]; > > console_verbose(); > > -- > 2.34.1 > -- Best Regards Guo Ren _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/4] riscv: fix race when vmap stack overflow and remove shadow_stack @ 2022-09-25 23:46 ` Guo Ren 0 siblings, 0 replies; 32+ messages in thread From: Guo Ren @ 2022-09-25 23:46 UTC (permalink / raw) To: Jisheng Zhang Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm Good job, it's a good fixup. On Mon, Sep 26, 2022 at 2:03 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > Currently, when detecting vmap stack overflow, riscv firstly switches > to the so called shadow stack, then use this shadow stack to call the > get_overflow_stack() to get the overflow stack. However, there's > a race here if two or more harts use the same shadow stack at the same > time. > > To solve this race, we rely on two facts: > 1. the kernel thread pointer I.E "tp" register can still be gotten from > the CSR_SCRATCH register, thus we can clobber tp under the condition > that we restore tp from CSR_SCRATCH later. > > 2. Once vmap stack overflow happen, panic is coming soon, no > performance concern at all, so we don't need to define the overflow > stack as percpu var, we can simplify it into an array. > > Thus we can use tp as a tmp register to get the cpu id to calculate > the offset of overflow stack for each cpu w/o shadow stack any more. > Thus the race condition is removed as side effect. > > NOTE: we can use similar mechanism to let each cpu use different shadow > stack to fix the race codition, but if we can remove shadow stack usage > totally, why not. > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org> > Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection") > --- > arch/riscv/include/asm/asm-prototypes.h | 1 - > arch/riscv/include/asm/thread_info.h | 4 +- > arch/riscv/kernel/asm-offsets.c | 1 + > arch/riscv/kernel/entry.S | 54 ++++--------------------- > arch/riscv/kernel/traps.c | 15 +------ > 5 files changed, 11 insertions(+), 64 deletions(-) > > diff --git a/arch/riscv/include/asm/asm-prototypes.h b/arch/riscv/include/asm/asm-prototypes.h > index ef386fcf3939..4a06fa0f6493 100644 > --- a/arch/riscv/include/asm/asm-prototypes.h > +++ b/arch/riscv/include/asm/asm-prototypes.h > @@ -25,7 +25,6 @@ DECLARE_DO_ERROR_INFO(do_trap_ecall_s); > DECLARE_DO_ERROR_INFO(do_trap_ecall_m); > DECLARE_DO_ERROR_INFO(do_trap_break); > > -asmlinkage unsigned long get_overflow_stack(void); > asmlinkage void handle_bad_stack(struct pt_regs *regs); > > #endif /* _ASM_RISCV_PROTOTYPES_H */ > diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h > index c970d41dc4c6..c604a5212a73 100644 > --- a/arch/riscv/include/asm/thread_info.h > +++ b/arch/riscv/include/asm/thread_info.h > @@ -28,14 +28,12 @@ > > #define THREAD_SHIFT (PAGE_SHIFT + THREAD_SIZE_ORDER) > #define OVERFLOW_STACK_SIZE SZ_4K > -#define SHADOW_OVERFLOW_STACK_SIZE (1024) > +#define OVERFLOW_STACK_SHIFT 12 > > #define IRQ_STACK_SIZE THREAD_SIZE > > #ifndef __ASSEMBLY__ > > -extern long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE / sizeof(long)]; > - > #include <asm/processor.h> > #include <asm/csr.h> > > diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c > index df9444397908..62bf3bacc322 100644 > --- a/arch/riscv/kernel/asm-offsets.c > +++ b/arch/riscv/kernel/asm-offsets.c > @@ -37,6 +37,7 @@ void asm_offsets(void) > OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, thread_info.preempt_count); > OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp); > OFFSET(TASK_TI_USER_SP, task_struct, thread_info.user_sp); > + OFFSET(TASK_TI_CPU, task_struct, thread_info.cpu); > > OFFSET(TASK_THREAD_F0, task_struct, thread.fstate.f[0]); > OFFSET(TASK_THREAD_F1, task_struct, thread.fstate.f[1]); > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > index a3e1ed2fa2ac..442d93beffcf 100644 > --- a/arch/riscv/kernel/entry.S > +++ b/arch/riscv/kernel/entry.S > @@ -223,54 +223,14 @@ END(ret_from_exception) > > #ifdef CONFIG_VMAP_STACK > ENTRY(handle_kernel_stack_overflow) > - la sp, shadow_stack > - addi sp, sp, SHADOW_OVERFLOW_STACK_SIZE > - > - //save caller register to shadow stack > - addi sp, sp, -(PT_SIZE_ON_STACK) > - REG_S x1, PT_RA(sp) > - REG_S x5, PT_T0(sp) > - REG_S x6, PT_T1(sp) > - REG_S x7, PT_T2(sp) > - REG_S x10, PT_A0(sp) > - REG_S x11, PT_A1(sp) > - REG_S x12, PT_A2(sp) > - REG_S x13, PT_A3(sp) > - REG_S x14, PT_A4(sp) > - REG_S x15, PT_A5(sp) > - REG_S x16, PT_A6(sp) > - REG_S x17, PT_A7(sp) > - REG_S x28, PT_T3(sp) > - REG_S x29, PT_T4(sp) > - REG_S x30, PT_T5(sp) > - REG_S x31, PT_T6(sp) > - > - la ra, restore_caller_reg > - tail get_overflow_stack > - > -restore_caller_reg: > - //save per-cpu overflow stack > - REG_S a0, -8(sp) > - //restore caller register from shadow_stack > - REG_L x1, PT_RA(sp) > - REG_L x5, PT_T0(sp) > - REG_L x6, PT_T1(sp) > - REG_L x7, PT_T2(sp) > - REG_L x10, PT_A0(sp) > - REG_L x11, PT_A1(sp) > - REG_L x12, PT_A2(sp) > - REG_L x13, PT_A3(sp) > - REG_L x14, PT_A4(sp) > - REG_L x15, PT_A5(sp) > - REG_L x16, PT_A6(sp) > - REG_L x17, PT_A7(sp) > - REG_L x28, PT_T3(sp) > - REG_L x29, PT_T4(sp) > - REG_L x30, PT_T5(sp) > - REG_L x31, PT_T6(sp) > + la sp, overflow_stack > + /* use tp as tmp register since we can restore it from CSR_SCRATCH */ > + REG_L tp, TASK_TI_CPU(tp) > + addi tp, tp, 1 > + slli tp, tp, OVERFLOW_STACK_SHIFT > + add sp, sp, tp > + csrr tp, CSR_SCRATCH > > - //load per-cpu overflow stack > - REG_L sp, -8(sp) > addi sp, sp, -(PT_SIZE_ON_STACK) > > //save context to overflow stack > diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c > index 73f06cd149d9..2a2a977c1eff 100644 > --- a/arch/riscv/kernel/traps.c > +++ b/arch/riscv/kernel/traps.c > @@ -216,23 +216,12 @@ int is_valid_bugaddr(unsigned long pc) > #endif /* CONFIG_GENERIC_BUG */ > > #ifdef CONFIG_VMAP_STACK > -static DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], > - overflow_stack)__aligned(16); > -/* > - * shadow stack, handled_ kernel_ stack_ overflow(in kernel/entry.S) is used > - * to get per-cpu overflow stack(get_overflow_stack). > - */ > -long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE/sizeof(long)]; > -asmlinkage unsigned long get_overflow_stack(void) > -{ > - return (unsigned long)this_cpu_ptr(overflow_stack) + > - OVERFLOW_STACK_SIZE; > -} > +unsigned long overflow_stack[NR_CPUS][OVERFLOW_STACK_SIZE/sizeof(long)] __aligned(16); Agree with using a custom per CPU definition for easy entry.S implementation. Reviewed-by: Guo Ren <guoren@kernel.org> > > asmlinkage void handle_bad_stack(struct pt_regs *regs) > { > unsigned long tsk_stk = (unsigned long)current->stack; > - unsigned long ovf_stk = (unsigned long)this_cpu_ptr(overflow_stack); > + unsigned long ovf_stk = (unsigned long)overflow_stack[raw_smp_processor_id()]; > > console_verbose(); > > -- > 2.34.1 > -- Best Regards Guo Ren ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/4] riscv: fix race when vmap stack overflow and remove shadow_stack 2022-09-25 23:46 ` Guo Ren @ 2022-09-27 0:18 ` Jisheng Zhang -1 siblings, 0 replies; 32+ messages in thread From: Jisheng Zhang @ 2022-09-27 0:18 UTC (permalink / raw) To: Guo Ren Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm > > #ifdef CONFIG_VMAP_STACK > > -static DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], > > - overflow_stack)__aligned(16); > > -/* > > - * shadow stack, handled_ kernel_ stack_ overflow(in kernel/entry.S) is used > > - * to get per-cpu overflow stack(get_overflow_stack). > > - */ > > -long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE/sizeof(long)]; > > -asmlinkage unsigned long get_overflow_stack(void) > > -{ > > - return (unsigned long)this_cpu_ptr(overflow_stack) + > > - OVERFLOW_STACK_SIZE; > > -} > > +unsigned long overflow_stack[NR_CPUS][OVERFLOW_STACK_SIZE/sizeof(long)] __aligned(16); If NR_CPUS is large, there's a non-trival memory waste, I have a solution for this case, will send a new version today. Thanks _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/4] riscv: fix race when vmap stack overflow and remove shadow_stack @ 2022-09-27 0:18 ` Jisheng Zhang 0 siblings, 0 replies; 32+ messages in thread From: Jisheng Zhang @ 2022-09-27 0:18 UTC (permalink / raw) To: Guo Ren Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm > > #ifdef CONFIG_VMAP_STACK > > -static DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], > > - overflow_stack)__aligned(16); > > -/* > > - * shadow stack, handled_ kernel_ stack_ overflow(in kernel/entry.S) is used > > - * to get per-cpu overflow stack(get_overflow_stack). > > - */ > > -long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE/sizeof(long)]; > > -asmlinkage unsigned long get_overflow_stack(void) > > -{ > > - return (unsigned long)this_cpu_ptr(overflow_stack) + > > - OVERFLOW_STACK_SIZE; > > -} > > +unsigned long overflow_stack[NR_CPUS][OVERFLOW_STACK_SIZE/sizeof(long)] __aligned(16); If NR_CPUS is large, there's a non-trival memory waste, I have a solution for this case, will send a new version today. Thanks ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/4] riscv: fix race when vmap stack overflow and remove shadow_stack 2022-09-27 0:18 ` Jisheng Zhang @ 2022-09-27 2:00 ` Guo Ren -1 siblings, 0 replies; 32+ messages in thread From: Guo Ren @ 2022-09-27 2:00 UTC (permalink / raw) To: Jisheng Zhang Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm On Tue, Sep 27, 2022 at 8:28 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > > > > #ifdef CONFIG_VMAP_STACK > > > -static DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], > > > - overflow_stack)__aligned(16); > > > -/* > > > - * shadow stack, handled_ kernel_ stack_ overflow(in kernel/entry.S) is used > > > - * to get per-cpu overflow stack(get_overflow_stack). > > > - */ > > > -long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE/sizeof(long)]; > > > -asmlinkage unsigned long get_overflow_stack(void) > > > -{ > > > - return (unsigned long)this_cpu_ptr(overflow_stack) + > > > - OVERFLOW_STACK_SIZE; > > > -} > > > +unsigned long overflow_stack[NR_CPUS][OVERFLOW_STACK_SIZE/sizeof(long)] __aligned(16); > > If NR_CPUS is large, there's a non-trival memory waste, I have a > solution for this case, will send a new version today. Er... Yes, we can't bypass the percpu mechanism. I also forgot the percpu basic concept. In the end, I prefer the previous solution, maybe just simply giving an atomic flag would be okay. (But we only have one register (sp) which could be used, it seems not simple.) > > Thanks -- Best Regards Guo Ren _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/4] riscv: fix race when vmap stack overflow and remove shadow_stack @ 2022-09-27 2:00 ` Guo Ren 0 siblings, 0 replies; 32+ messages in thread From: Guo Ren @ 2022-09-27 2:00 UTC (permalink / raw) To: Jisheng Zhang Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm On Tue, Sep 27, 2022 at 8:28 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > > > > #ifdef CONFIG_VMAP_STACK > > > -static DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], > > > - overflow_stack)__aligned(16); > > > -/* > > > - * shadow stack, handled_ kernel_ stack_ overflow(in kernel/entry.S) is used > > > - * to get per-cpu overflow stack(get_overflow_stack). > > > - */ > > > -long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE/sizeof(long)]; > > > -asmlinkage unsigned long get_overflow_stack(void) > > > -{ > > > - return (unsigned long)this_cpu_ptr(overflow_stack) + > > > - OVERFLOW_STACK_SIZE; > > > -} > > > +unsigned long overflow_stack[NR_CPUS][OVERFLOW_STACK_SIZE/sizeof(long)] __aligned(16); > > If NR_CPUS is large, there's a non-trival memory waste, I have a > solution for this case, will send a new version today. Er... Yes, we can't bypass the percpu mechanism. I also forgot the percpu basic concept. In the end, I prefer the previous solution, maybe just simply giving an atomic flag would be okay. (But we only have one register (sp) which could be used, it seems not simple.) > > Thanks -- Best Regards Guo Ren ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/4] riscv: fix race when vmap stack overflow and remove shadow_stack 2022-09-25 17:53 ` Jisheng Zhang @ 2022-11-23 17:02 ` Deepak Gupta -1 siblings, 0 replies; 32+ messages in thread From: Deepak Gupta @ 2022-11-23 17:02 UTC (permalink / raw) To: Jisheng Zhang Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Guo Ren, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm On Mon, Sep 26, 2022 at 01:53:55AM +0800, Jisheng Zhang wrote: >Currently, when detecting vmap stack overflow, riscv firstly switches >to the so called shadow stack, then use this shadow stack to call the >get_overflow_stack() to get the overflow stack. However, there's >a race here if two or more harts use the same shadow stack at the same >time. > >To solve this race, we rely on two facts: >1. the kernel thread pointer I.E "tp" register can still be gotten from >the CSR_SCRATCH register, thus we can clobber tp under the condition >that we restore tp from CSR_SCRATCH later. > >2. Once vmap stack overflow happen, panic is coming soon, no >performance concern at all, so we don't need to define the overflow >stack as percpu var, we can simplify it into an array. > >Thus we can use tp as a tmp register to get the cpu id to calculate >the offset of overflow stack for each cpu w/o shadow stack any more. >Thus the race condition is removed as side effect. > >NOTE: we can use similar mechanism to let each cpu use different shadow >stack to fix the race codition, but if we can remove shadow stack usage >totally, why not. I've a different patch to solve the same problem. Using an asm macro which allows obtaining per cpu symbols. Thus we can switch to overflow_stack directly from entry.S Patch coming soon. > >Signed-off-by: Jisheng Zhang <jszhang@kernel.org> >Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection") >--- > arch/riscv/include/asm/asm-prototypes.h | 1 - > arch/riscv/include/asm/thread_info.h | 4 +- > arch/riscv/kernel/asm-offsets.c | 1 + > arch/riscv/kernel/entry.S | 54 ++++--------------------- > arch/riscv/kernel/traps.c | 15 +------ > 5 files changed, 11 insertions(+), 64 deletions(-) > >diff --git a/arch/riscv/include/asm/asm-prototypes.h b/arch/riscv/include/asm/asm-prototypes.h >index ef386fcf3939..4a06fa0f6493 100644 >--- a/arch/riscv/include/asm/asm-prototypes.h >+++ b/arch/riscv/include/asm/asm-prototypes.h >@@ -25,7 +25,6 @@ DECLARE_DO_ERROR_INFO(do_trap_ecall_s); > DECLARE_DO_ERROR_INFO(do_trap_ecall_m); > DECLARE_DO_ERROR_INFO(do_trap_break); > >-asmlinkage unsigned long get_overflow_stack(void); > asmlinkage void handle_bad_stack(struct pt_regs *regs); > > #endif /* _ASM_RISCV_PROTOTYPES_H */ >diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h >index c970d41dc4c6..c604a5212a73 100644 >--- a/arch/riscv/include/asm/thread_info.h >+++ b/arch/riscv/include/asm/thread_info.h >@@ -28,14 +28,12 @@ > > #define THREAD_SHIFT (PAGE_SHIFT + THREAD_SIZE_ORDER) > #define OVERFLOW_STACK_SIZE SZ_4K >-#define SHADOW_OVERFLOW_STACK_SIZE (1024) >+#define OVERFLOW_STACK_SHIFT 12 > > #define IRQ_STACK_SIZE THREAD_SIZE > > #ifndef __ASSEMBLY__ > >-extern long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE / sizeof(long)]; >- > #include <asm/processor.h> > #include <asm/csr.h> > >diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c >index df9444397908..62bf3bacc322 100644 >--- a/arch/riscv/kernel/asm-offsets.c >+++ b/arch/riscv/kernel/asm-offsets.c >@@ -37,6 +37,7 @@ void asm_offsets(void) > OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, thread_info.preempt_count); > OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp); > OFFSET(TASK_TI_USER_SP, task_struct, thread_info.user_sp); >+ OFFSET(TASK_TI_CPU, task_struct, thread_info.cpu); > > OFFSET(TASK_THREAD_F0, task_struct, thread.fstate.f[0]); > OFFSET(TASK_THREAD_F1, task_struct, thread.fstate.f[1]); >diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S >index a3e1ed2fa2ac..442d93beffcf 100644 >--- a/arch/riscv/kernel/entry.S >+++ b/arch/riscv/kernel/entry.S >@@ -223,54 +223,14 @@ END(ret_from_exception) > > #ifdef CONFIG_VMAP_STACK > ENTRY(handle_kernel_stack_overflow) >- la sp, shadow_stack >- addi sp, sp, SHADOW_OVERFLOW_STACK_SIZE >- >- //save caller register to shadow stack >- addi sp, sp, -(PT_SIZE_ON_STACK) >- REG_S x1, PT_RA(sp) >- REG_S x5, PT_T0(sp) >- REG_S x6, PT_T1(sp) >- REG_S x7, PT_T2(sp) >- REG_S x10, PT_A0(sp) >- REG_S x11, PT_A1(sp) >- REG_S x12, PT_A2(sp) >- REG_S x13, PT_A3(sp) >- REG_S x14, PT_A4(sp) >- REG_S x15, PT_A5(sp) >- REG_S x16, PT_A6(sp) >- REG_S x17, PT_A7(sp) >- REG_S x28, PT_T3(sp) >- REG_S x29, PT_T4(sp) >- REG_S x30, PT_T5(sp) >- REG_S x31, PT_T6(sp) >- >- la ra, restore_caller_reg >- tail get_overflow_stack >- >-restore_caller_reg: >- //save per-cpu overflow stack >- REG_S a0, -8(sp) >- //restore caller register from shadow_stack >- REG_L x1, PT_RA(sp) >- REG_L x5, PT_T0(sp) >- REG_L x6, PT_T1(sp) >- REG_L x7, PT_T2(sp) >- REG_L x10, PT_A0(sp) >- REG_L x11, PT_A1(sp) >- REG_L x12, PT_A2(sp) >- REG_L x13, PT_A3(sp) >- REG_L x14, PT_A4(sp) >- REG_L x15, PT_A5(sp) >- REG_L x16, PT_A6(sp) >- REG_L x17, PT_A7(sp) >- REG_L x28, PT_T3(sp) >- REG_L x29, PT_T4(sp) >- REG_L x30, PT_T5(sp) >- REG_L x31, PT_T6(sp) >+ la sp, overflow_stack >+ /* use tp as tmp register since we can restore it from CSR_SCRATCH */ >+ REG_L tp, TASK_TI_CPU(tp) >+ addi tp, tp, 1 >+ slli tp, tp, OVERFLOW_STACK_SHIFT >+ add sp, sp, tp >+ csrr tp, CSR_SCRATCH > >- //load per-cpu overflow stack >- REG_L sp, -8(sp) > addi sp, sp, -(PT_SIZE_ON_STACK) > > //save context to overflow stack >diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c >index 73f06cd149d9..2a2a977c1eff 100644 >--- a/arch/riscv/kernel/traps.c >+++ b/arch/riscv/kernel/traps.c >@@ -216,23 +216,12 @@ int is_valid_bugaddr(unsigned long pc) > #endif /* CONFIG_GENERIC_BUG */ > > #ifdef CONFIG_VMAP_STACK >-static DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], >- overflow_stack)__aligned(16); >-/* >- * shadow stack, handled_ kernel_ stack_ overflow(in kernel/entry.S) is used >- * to get per-cpu overflow stack(get_overflow_stack). >- */ >-long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE/sizeof(long)]; >-asmlinkage unsigned long get_overflow_stack(void) >-{ >- return (unsigned long)this_cpu_ptr(overflow_stack) + >- OVERFLOW_STACK_SIZE; >-} >+unsigned long overflow_stack[NR_CPUS][OVERFLOW_STACK_SIZE/sizeof(long)] __aligned(16); > > asmlinkage void handle_bad_stack(struct pt_regs *regs) > { > unsigned long tsk_stk = (unsigned long)current->stack; >- unsigned long ovf_stk = (unsigned long)this_cpu_ptr(overflow_stack); >+ unsigned long ovf_stk = (unsigned long)overflow_stack[raw_smp_processor_id()]; > > console_verbose(); > >-- >2.34.1 > _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/4] riscv: fix race when vmap stack overflow and remove shadow_stack @ 2022-11-23 17:02 ` Deepak Gupta 0 siblings, 0 replies; 32+ messages in thread From: Deepak Gupta @ 2022-11-23 17:02 UTC (permalink / raw) To: Jisheng Zhang Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Guo Ren, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm On Mon, Sep 26, 2022 at 01:53:55AM +0800, Jisheng Zhang wrote: >Currently, when detecting vmap stack overflow, riscv firstly switches >to the so called shadow stack, then use this shadow stack to call the >get_overflow_stack() to get the overflow stack. However, there's >a race here if two or more harts use the same shadow stack at the same >time. > >To solve this race, we rely on two facts: >1. the kernel thread pointer I.E "tp" register can still be gotten from >the CSR_SCRATCH register, thus we can clobber tp under the condition >that we restore tp from CSR_SCRATCH later. > >2. Once vmap stack overflow happen, panic is coming soon, no >performance concern at all, so we don't need to define the overflow >stack as percpu var, we can simplify it into an array. > >Thus we can use tp as a tmp register to get the cpu id to calculate >the offset of overflow stack for each cpu w/o shadow stack any more. >Thus the race condition is removed as side effect. > >NOTE: we can use similar mechanism to let each cpu use different shadow >stack to fix the race codition, but if we can remove shadow stack usage >totally, why not. I've a different patch to solve the same problem. Using an asm macro which allows obtaining per cpu symbols. Thus we can switch to overflow_stack directly from entry.S Patch coming soon. > >Signed-off-by: Jisheng Zhang <jszhang@kernel.org> >Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection") >--- > arch/riscv/include/asm/asm-prototypes.h | 1 - > arch/riscv/include/asm/thread_info.h | 4 +- > arch/riscv/kernel/asm-offsets.c | 1 + > arch/riscv/kernel/entry.S | 54 ++++--------------------- > arch/riscv/kernel/traps.c | 15 +------ > 5 files changed, 11 insertions(+), 64 deletions(-) > >diff --git a/arch/riscv/include/asm/asm-prototypes.h b/arch/riscv/include/asm/asm-prototypes.h >index ef386fcf3939..4a06fa0f6493 100644 >--- a/arch/riscv/include/asm/asm-prototypes.h >+++ b/arch/riscv/include/asm/asm-prototypes.h >@@ -25,7 +25,6 @@ DECLARE_DO_ERROR_INFO(do_trap_ecall_s); > DECLARE_DO_ERROR_INFO(do_trap_ecall_m); > DECLARE_DO_ERROR_INFO(do_trap_break); > >-asmlinkage unsigned long get_overflow_stack(void); > asmlinkage void handle_bad_stack(struct pt_regs *regs); > > #endif /* _ASM_RISCV_PROTOTYPES_H */ >diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h >index c970d41dc4c6..c604a5212a73 100644 >--- a/arch/riscv/include/asm/thread_info.h >+++ b/arch/riscv/include/asm/thread_info.h >@@ -28,14 +28,12 @@ > > #define THREAD_SHIFT (PAGE_SHIFT + THREAD_SIZE_ORDER) > #define OVERFLOW_STACK_SIZE SZ_4K >-#define SHADOW_OVERFLOW_STACK_SIZE (1024) >+#define OVERFLOW_STACK_SHIFT 12 > > #define IRQ_STACK_SIZE THREAD_SIZE > > #ifndef __ASSEMBLY__ > >-extern long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE / sizeof(long)]; >- > #include <asm/processor.h> > #include <asm/csr.h> > >diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c >index df9444397908..62bf3bacc322 100644 >--- a/arch/riscv/kernel/asm-offsets.c >+++ b/arch/riscv/kernel/asm-offsets.c >@@ -37,6 +37,7 @@ void asm_offsets(void) > OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, thread_info.preempt_count); > OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp); > OFFSET(TASK_TI_USER_SP, task_struct, thread_info.user_sp); >+ OFFSET(TASK_TI_CPU, task_struct, thread_info.cpu); > > OFFSET(TASK_THREAD_F0, task_struct, thread.fstate.f[0]); > OFFSET(TASK_THREAD_F1, task_struct, thread.fstate.f[1]); >diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S >index a3e1ed2fa2ac..442d93beffcf 100644 >--- a/arch/riscv/kernel/entry.S >+++ b/arch/riscv/kernel/entry.S >@@ -223,54 +223,14 @@ END(ret_from_exception) > > #ifdef CONFIG_VMAP_STACK > ENTRY(handle_kernel_stack_overflow) >- la sp, shadow_stack >- addi sp, sp, SHADOW_OVERFLOW_STACK_SIZE >- >- //save caller register to shadow stack >- addi sp, sp, -(PT_SIZE_ON_STACK) >- REG_S x1, PT_RA(sp) >- REG_S x5, PT_T0(sp) >- REG_S x6, PT_T1(sp) >- REG_S x7, PT_T2(sp) >- REG_S x10, PT_A0(sp) >- REG_S x11, PT_A1(sp) >- REG_S x12, PT_A2(sp) >- REG_S x13, PT_A3(sp) >- REG_S x14, PT_A4(sp) >- REG_S x15, PT_A5(sp) >- REG_S x16, PT_A6(sp) >- REG_S x17, PT_A7(sp) >- REG_S x28, PT_T3(sp) >- REG_S x29, PT_T4(sp) >- REG_S x30, PT_T5(sp) >- REG_S x31, PT_T6(sp) >- >- la ra, restore_caller_reg >- tail get_overflow_stack >- >-restore_caller_reg: >- //save per-cpu overflow stack >- REG_S a0, -8(sp) >- //restore caller register from shadow_stack >- REG_L x1, PT_RA(sp) >- REG_L x5, PT_T0(sp) >- REG_L x6, PT_T1(sp) >- REG_L x7, PT_T2(sp) >- REG_L x10, PT_A0(sp) >- REG_L x11, PT_A1(sp) >- REG_L x12, PT_A2(sp) >- REG_L x13, PT_A3(sp) >- REG_L x14, PT_A4(sp) >- REG_L x15, PT_A5(sp) >- REG_L x16, PT_A6(sp) >- REG_L x17, PT_A7(sp) >- REG_L x28, PT_T3(sp) >- REG_L x29, PT_T4(sp) >- REG_L x30, PT_T5(sp) >- REG_L x31, PT_T6(sp) >+ la sp, overflow_stack >+ /* use tp as tmp register since we can restore it from CSR_SCRATCH */ >+ REG_L tp, TASK_TI_CPU(tp) >+ addi tp, tp, 1 >+ slli tp, tp, OVERFLOW_STACK_SHIFT >+ add sp, sp, tp >+ csrr tp, CSR_SCRATCH > >- //load per-cpu overflow stack >- REG_L sp, -8(sp) > addi sp, sp, -(PT_SIZE_ON_STACK) > > //save context to overflow stack >diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c >index 73f06cd149d9..2a2a977c1eff 100644 >--- a/arch/riscv/kernel/traps.c >+++ b/arch/riscv/kernel/traps.c >@@ -216,23 +216,12 @@ int is_valid_bugaddr(unsigned long pc) > #endif /* CONFIG_GENERIC_BUG */ > > #ifdef CONFIG_VMAP_STACK >-static DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], >- overflow_stack)__aligned(16); >-/* >- * shadow stack, handled_ kernel_ stack_ overflow(in kernel/entry.S) is used >- * to get per-cpu overflow stack(get_overflow_stack). >- */ >-long shadow_stack[SHADOW_OVERFLOW_STACK_SIZE/sizeof(long)]; >-asmlinkage unsigned long get_overflow_stack(void) >-{ >- return (unsigned long)this_cpu_ptr(overflow_stack) + >- OVERFLOW_STACK_SIZE; >-} >+unsigned long overflow_stack[NR_CPUS][OVERFLOW_STACK_SIZE/sizeof(long)] __aligned(16); > > asmlinkage void handle_bad_stack(struct pt_regs *regs) > { > unsigned long tsk_stk = (unsigned long)current->stack; >- unsigned long ovf_stk = (unsigned long)this_cpu_ptr(overflow_stack); >+ unsigned long ovf_stk = (unsigned long)overflow_stack[raw_smp_processor_id()]; > > console_verbose(); > >-- >2.34.1 > ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 4/4] riscv: entry: consolidate general regs saving into save_gp 2022-09-25 17:53 ` Jisheng Zhang @ 2022-09-25 17:53 ` Jisheng Zhang -1 siblings, 0 replies; 32+ messages in thread From: Jisheng Zhang @ 2022-09-25 17:53 UTC (permalink / raw) To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Guo Ren, Nathan Chancellor, Nick Desaulniers Cc: linux-riscv, linux-kernel, llvm Consolidate the saving GPs(except sp and tp into save_gp macro. No functional change. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> --- arch/riscv/kernel/entry.S | 85 ++++++++++++++------------------------- 1 file changed, 31 insertions(+), 54 deletions(-) diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 442d93beffcf..04e11d257ad6 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -14,31 +14,8 @@ #include <asm/asm-offsets.h> #include <asm/errata_list.h> -ENTRY(handle_exception) - /* - * If coming from userspace, preserve the user thread pointer and load - * the kernel thread pointer. If we came from the kernel, the scratch - * register will contain 0, and we should continue on the current TP. - */ - csrrw tp, CSR_SCRATCH, tp - bnez tp, _save_context - -_restore_kernel_tpsp: - csrr tp, CSR_SCRATCH - REG_S sp, TASK_TI_KERNEL_SP(tp) - -#ifdef CONFIG_VMAP_STACK - addi sp, sp, -(PT_SIZE_ON_STACK) - srli sp, sp, THREAD_SHIFT - andi sp, sp, 0x1 - bnez sp, handle_kernel_stack_overflow - REG_L sp, TASK_TI_KERNEL_SP(tp) -#endif - -_save_context: - REG_S sp, TASK_TI_USER_SP(tp) - REG_L sp, TASK_TI_KERNEL_SP(tp) - addi sp, sp, -(PT_SIZE_ON_STACK) + /* save all GPs except sp and tp */ + .macro save_gp REG_S x1, PT_RA(sp) REG_S x3, PT_GP(sp) REG_S x5, PT_T0(sp) @@ -68,6 +45,34 @@ _save_context: REG_S x29, PT_T4(sp) REG_S x30, PT_T5(sp) REG_S x31, PT_T6(sp) + .endm + +ENTRY(handle_exception) + /* + * If coming from userspace, preserve the user thread pointer and load + * the kernel thread pointer. If we came from the kernel, the scratch + * register will contain 0, and we should continue on the current TP. + */ + csrrw tp, CSR_SCRATCH, tp + bnez tp, _save_context + +_restore_kernel_tpsp: + csrr tp, CSR_SCRATCH + REG_S sp, TASK_TI_KERNEL_SP(tp) + +#ifdef CONFIG_VMAP_STACK + addi sp, sp, -(PT_SIZE_ON_STACK) + srli sp, sp, THREAD_SHIFT + andi sp, sp, 0x1 + bnez sp, handle_kernel_stack_overflow + REG_L sp, TASK_TI_KERNEL_SP(tp) +#endif + +_save_context: + REG_S sp, TASK_TI_USER_SP(tp) + REG_L sp, TASK_TI_KERNEL_SP(tp) + addi sp, sp, -(PT_SIZE_ON_STACK) + save_gp /* * Disable user-mode memory access as it should only be set in the @@ -234,35 +239,7 @@ ENTRY(handle_kernel_stack_overflow) addi sp, sp, -(PT_SIZE_ON_STACK) //save context to overflow stack - REG_S x1, PT_RA(sp) - REG_S x3, PT_GP(sp) - REG_S x5, PT_T0(sp) - REG_S x6, PT_T1(sp) - REG_S x7, PT_T2(sp) - REG_S x8, PT_S0(sp) - REG_S x9, PT_S1(sp) - REG_S x10, PT_A0(sp) - REG_S x11, PT_A1(sp) - REG_S x12, PT_A2(sp) - REG_S x13, PT_A3(sp) - REG_S x14, PT_A4(sp) - REG_S x15, PT_A5(sp) - REG_S x16, PT_A6(sp) - REG_S x17, PT_A7(sp) - REG_S x18, PT_S2(sp) - REG_S x19, PT_S3(sp) - REG_S x20, PT_S4(sp) - REG_S x21, PT_S5(sp) - REG_S x22, PT_S6(sp) - REG_S x23, PT_S7(sp) - REG_S x24, PT_S8(sp) - REG_S x25, PT_S9(sp) - REG_S x26, PT_S10(sp) - REG_S x27, PT_S11(sp) - REG_S x28, PT_T3(sp) - REG_S x29, PT_T4(sp) - REG_S x30, PT_T5(sp) - REG_S x31, PT_T6(sp) + save_gp REG_L s0, TASK_TI_KERNEL_SP(tp) csrr s1, CSR_STATUS -- 2.34.1 _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 4/4] riscv: entry: consolidate general regs saving into save_gp @ 2022-09-25 17:53 ` Jisheng Zhang 0 siblings, 0 replies; 32+ messages in thread From: Jisheng Zhang @ 2022-09-25 17:53 UTC (permalink / raw) To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Guo Ren, Nathan Chancellor, Nick Desaulniers Cc: linux-riscv, linux-kernel, llvm Consolidate the saving GPs(except sp and tp into save_gp macro. No functional change. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> --- arch/riscv/kernel/entry.S | 85 ++++++++++++++------------------------- 1 file changed, 31 insertions(+), 54 deletions(-) diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 442d93beffcf..04e11d257ad6 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -14,31 +14,8 @@ #include <asm/asm-offsets.h> #include <asm/errata_list.h> -ENTRY(handle_exception) - /* - * If coming from userspace, preserve the user thread pointer and load - * the kernel thread pointer. If we came from the kernel, the scratch - * register will contain 0, and we should continue on the current TP. - */ - csrrw tp, CSR_SCRATCH, tp - bnez tp, _save_context - -_restore_kernel_tpsp: - csrr tp, CSR_SCRATCH - REG_S sp, TASK_TI_KERNEL_SP(tp) - -#ifdef CONFIG_VMAP_STACK - addi sp, sp, -(PT_SIZE_ON_STACK) - srli sp, sp, THREAD_SHIFT - andi sp, sp, 0x1 - bnez sp, handle_kernel_stack_overflow - REG_L sp, TASK_TI_KERNEL_SP(tp) -#endif - -_save_context: - REG_S sp, TASK_TI_USER_SP(tp) - REG_L sp, TASK_TI_KERNEL_SP(tp) - addi sp, sp, -(PT_SIZE_ON_STACK) + /* save all GPs except sp and tp */ + .macro save_gp REG_S x1, PT_RA(sp) REG_S x3, PT_GP(sp) REG_S x5, PT_T0(sp) @@ -68,6 +45,34 @@ _save_context: REG_S x29, PT_T4(sp) REG_S x30, PT_T5(sp) REG_S x31, PT_T6(sp) + .endm + +ENTRY(handle_exception) + /* + * If coming from userspace, preserve the user thread pointer and load + * the kernel thread pointer. If we came from the kernel, the scratch + * register will contain 0, and we should continue on the current TP. + */ + csrrw tp, CSR_SCRATCH, tp + bnez tp, _save_context + +_restore_kernel_tpsp: + csrr tp, CSR_SCRATCH + REG_S sp, TASK_TI_KERNEL_SP(tp) + +#ifdef CONFIG_VMAP_STACK + addi sp, sp, -(PT_SIZE_ON_STACK) + srli sp, sp, THREAD_SHIFT + andi sp, sp, 0x1 + bnez sp, handle_kernel_stack_overflow + REG_L sp, TASK_TI_KERNEL_SP(tp) +#endif + +_save_context: + REG_S sp, TASK_TI_USER_SP(tp) + REG_L sp, TASK_TI_KERNEL_SP(tp) + addi sp, sp, -(PT_SIZE_ON_STACK) + save_gp /* * Disable user-mode memory access as it should only be set in the @@ -234,35 +239,7 @@ ENTRY(handle_kernel_stack_overflow) addi sp, sp, -(PT_SIZE_ON_STACK) //save context to overflow stack - REG_S x1, PT_RA(sp) - REG_S x3, PT_GP(sp) - REG_S x5, PT_T0(sp) - REG_S x6, PT_T1(sp) - REG_S x7, PT_T2(sp) - REG_S x8, PT_S0(sp) - REG_S x9, PT_S1(sp) - REG_S x10, PT_A0(sp) - REG_S x11, PT_A1(sp) - REG_S x12, PT_A2(sp) - REG_S x13, PT_A3(sp) - REG_S x14, PT_A4(sp) - REG_S x15, PT_A5(sp) - REG_S x16, PT_A6(sp) - REG_S x17, PT_A7(sp) - REG_S x18, PT_S2(sp) - REG_S x19, PT_S3(sp) - REG_S x20, PT_S4(sp) - REG_S x21, PT_S5(sp) - REG_S x22, PT_S6(sp) - REG_S x23, PT_S7(sp) - REG_S x24, PT_S8(sp) - REG_S x25, PT_S9(sp) - REG_S x26, PT_S10(sp) - REG_S x27, PT_S11(sp) - REG_S x28, PT_T3(sp) - REG_S x29, PT_T4(sp) - REG_S x30, PT_T5(sp) - REG_S x31, PT_T6(sp) + save_gp REG_L s0, TASK_TI_KERNEL_SP(tp) csrr s1, CSR_STATUS -- 2.34.1 ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 4/4] riscv: entry: consolidate general regs saving into save_gp 2022-09-25 17:53 ` Jisheng Zhang @ 2022-09-25 23:29 ` Guo Ren -1 siblings, 0 replies; 32+ messages in thread From: Guo Ren @ 2022-09-25 23:29 UTC (permalink / raw) To: Jisheng Zhang Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm Acked-by: Guo Ren <kernel.org> For the save & restore pair, I also suggest adding restore_gp. On Mon, Sep 26, 2022 at 2:03 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > Consolidate the saving GPs(except sp and tp into save_gp macro. > No functional change. > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org> > --- > arch/riscv/kernel/entry.S | 85 ++++++++++++++------------------------- > 1 file changed, 31 insertions(+), 54 deletions(-) > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > index 442d93beffcf..04e11d257ad6 100644 > --- a/arch/riscv/kernel/entry.S > +++ b/arch/riscv/kernel/entry.S > @@ -14,31 +14,8 @@ > #include <asm/asm-offsets.h> > #include <asm/errata_list.h> > > -ENTRY(handle_exception) > - /* > - * If coming from userspace, preserve the user thread pointer and load > - * the kernel thread pointer. If we came from the kernel, the scratch > - * register will contain 0, and we should continue on the current TP. > - */ > - csrrw tp, CSR_SCRATCH, tp > - bnez tp, _save_context > - > -_restore_kernel_tpsp: > - csrr tp, CSR_SCRATCH > - REG_S sp, TASK_TI_KERNEL_SP(tp) > - > -#ifdef CONFIG_VMAP_STACK > - addi sp, sp, -(PT_SIZE_ON_STACK) > - srli sp, sp, THREAD_SHIFT > - andi sp, sp, 0x1 > - bnez sp, handle_kernel_stack_overflow > - REG_L sp, TASK_TI_KERNEL_SP(tp) > -#endif > - > -_save_context: > - REG_S sp, TASK_TI_USER_SP(tp) > - REG_L sp, TASK_TI_KERNEL_SP(tp) > - addi sp, sp, -(PT_SIZE_ON_STACK) > + /* save all GPs except sp and tp */ > + .macro save_gp > REG_S x1, PT_RA(sp) > REG_S x3, PT_GP(sp) > REG_S x5, PT_T0(sp) > @@ -68,6 +45,34 @@ _save_context: > REG_S x29, PT_T4(sp) > REG_S x30, PT_T5(sp) > REG_S x31, PT_T6(sp) > + .endm > + > +ENTRY(handle_exception) > + /* > + * If coming from userspace, preserve the user thread pointer and load > + * the kernel thread pointer. If we came from the kernel, the scratch > + * register will contain 0, and we should continue on the current TP. > + */ > + csrrw tp, CSR_SCRATCH, tp > + bnez tp, _save_context > + > +_restore_kernel_tpsp: > + csrr tp, CSR_SCRATCH > + REG_S sp, TASK_TI_KERNEL_SP(tp) > + > +#ifdef CONFIG_VMAP_STACK > + addi sp, sp, -(PT_SIZE_ON_STACK) > + srli sp, sp, THREAD_SHIFT > + andi sp, sp, 0x1 > + bnez sp, handle_kernel_stack_overflow > + REG_L sp, TASK_TI_KERNEL_SP(tp) > +#endif > + > +_save_context: > + REG_S sp, TASK_TI_USER_SP(tp) > + REG_L sp, TASK_TI_KERNEL_SP(tp) > + addi sp, sp, -(PT_SIZE_ON_STACK) > + save_gp > > /* > * Disable user-mode memory access as it should only be set in the > @@ -234,35 +239,7 @@ ENTRY(handle_kernel_stack_overflow) > addi sp, sp, -(PT_SIZE_ON_STACK) > > //save context to overflow stack > - REG_S x1, PT_RA(sp) > - REG_S x3, PT_GP(sp) > - REG_S x5, PT_T0(sp) > - REG_S x6, PT_T1(sp) > - REG_S x7, PT_T2(sp) > - REG_S x8, PT_S0(sp) > - REG_S x9, PT_S1(sp) > - REG_S x10, PT_A0(sp) > - REG_S x11, PT_A1(sp) > - REG_S x12, PT_A2(sp) > - REG_S x13, PT_A3(sp) > - REG_S x14, PT_A4(sp) > - REG_S x15, PT_A5(sp) > - REG_S x16, PT_A6(sp) > - REG_S x17, PT_A7(sp) > - REG_S x18, PT_S2(sp) > - REG_S x19, PT_S3(sp) > - REG_S x20, PT_S4(sp) > - REG_S x21, PT_S5(sp) > - REG_S x22, PT_S6(sp) > - REG_S x23, PT_S7(sp) > - REG_S x24, PT_S8(sp) > - REG_S x25, PT_S9(sp) > - REG_S x26, PT_S10(sp) > - REG_S x27, PT_S11(sp) > - REG_S x28, PT_T3(sp) > - REG_S x29, PT_T4(sp) > - REG_S x30, PT_T5(sp) > - REG_S x31, PT_T6(sp) > + save_gp > > REG_L s0, TASK_TI_KERNEL_SP(tp) > csrr s1, CSR_STATUS > -- > 2.34.1 > -- Best Regards Guo Ren _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 4/4] riscv: entry: consolidate general regs saving into save_gp @ 2022-09-25 23:29 ` Guo Ren 0 siblings, 0 replies; 32+ messages in thread From: Guo Ren @ 2022-09-25 23:29 UTC (permalink / raw) To: Jisheng Zhang Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Nathan Chancellor, Nick Desaulniers, linux-riscv, linux-kernel, llvm Acked-by: Guo Ren <kernel.org> For the save & restore pair, I also suggest adding restore_gp. On Mon, Sep 26, 2022 at 2:03 AM Jisheng Zhang <jszhang@kernel.org> wrote: > > Consolidate the saving GPs(except sp and tp into save_gp macro. > No functional change. > > Signed-off-by: Jisheng Zhang <jszhang@kernel.org> > --- > arch/riscv/kernel/entry.S | 85 ++++++++++++++------------------------- > 1 file changed, 31 insertions(+), 54 deletions(-) > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > index 442d93beffcf..04e11d257ad6 100644 > --- a/arch/riscv/kernel/entry.S > +++ b/arch/riscv/kernel/entry.S > @@ -14,31 +14,8 @@ > #include <asm/asm-offsets.h> > #include <asm/errata_list.h> > > -ENTRY(handle_exception) > - /* > - * If coming from userspace, preserve the user thread pointer and load > - * the kernel thread pointer. If we came from the kernel, the scratch > - * register will contain 0, and we should continue on the current TP. > - */ > - csrrw tp, CSR_SCRATCH, tp > - bnez tp, _save_context > - > -_restore_kernel_tpsp: > - csrr tp, CSR_SCRATCH > - REG_S sp, TASK_TI_KERNEL_SP(tp) > - > -#ifdef CONFIG_VMAP_STACK > - addi sp, sp, -(PT_SIZE_ON_STACK) > - srli sp, sp, THREAD_SHIFT > - andi sp, sp, 0x1 > - bnez sp, handle_kernel_stack_overflow > - REG_L sp, TASK_TI_KERNEL_SP(tp) > -#endif > - > -_save_context: > - REG_S sp, TASK_TI_USER_SP(tp) > - REG_L sp, TASK_TI_KERNEL_SP(tp) > - addi sp, sp, -(PT_SIZE_ON_STACK) > + /* save all GPs except sp and tp */ > + .macro save_gp > REG_S x1, PT_RA(sp) > REG_S x3, PT_GP(sp) > REG_S x5, PT_T0(sp) > @@ -68,6 +45,34 @@ _save_context: > REG_S x29, PT_T4(sp) > REG_S x30, PT_T5(sp) > REG_S x31, PT_T6(sp) > + .endm > + > +ENTRY(handle_exception) > + /* > + * If coming from userspace, preserve the user thread pointer and load > + * the kernel thread pointer. If we came from the kernel, the scratch > + * register will contain 0, and we should continue on the current TP. > + */ > + csrrw tp, CSR_SCRATCH, tp > + bnez tp, _save_context > + > +_restore_kernel_tpsp: > + csrr tp, CSR_SCRATCH > + REG_S sp, TASK_TI_KERNEL_SP(tp) > + > +#ifdef CONFIG_VMAP_STACK > + addi sp, sp, -(PT_SIZE_ON_STACK) > + srli sp, sp, THREAD_SHIFT > + andi sp, sp, 0x1 > + bnez sp, handle_kernel_stack_overflow > + REG_L sp, TASK_TI_KERNEL_SP(tp) > +#endif > + > +_save_context: > + REG_S sp, TASK_TI_USER_SP(tp) > + REG_L sp, TASK_TI_KERNEL_SP(tp) > + addi sp, sp, -(PT_SIZE_ON_STACK) > + save_gp > > /* > * Disable user-mode memory access as it should only be set in the > @@ -234,35 +239,7 @@ ENTRY(handle_kernel_stack_overflow) > addi sp, sp, -(PT_SIZE_ON_STACK) > > //save context to overflow stack > - REG_S x1, PT_RA(sp) > - REG_S x3, PT_GP(sp) > - REG_S x5, PT_T0(sp) > - REG_S x6, PT_T1(sp) > - REG_S x7, PT_T2(sp) > - REG_S x8, PT_S0(sp) > - REG_S x9, PT_S1(sp) > - REG_S x10, PT_A0(sp) > - REG_S x11, PT_A1(sp) > - REG_S x12, PT_A2(sp) > - REG_S x13, PT_A3(sp) > - REG_S x14, PT_A4(sp) > - REG_S x15, PT_A5(sp) > - REG_S x16, PT_A6(sp) > - REG_S x17, PT_A7(sp) > - REG_S x18, PT_S2(sp) > - REG_S x19, PT_S3(sp) > - REG_S x20, PT_S4(sp) > - REG_S x21, PT_S5(sp) > - REG_S x22, PT_S6(sp) > - REG_S x23, PT_S7(sp) > - REG_S x24, PT_S8(sp) > - REG_S x25, PT_S9(sp) > - REG_S x26, PT_S10(sp) > - REG_S x27, PT_S11(sp) > - REG_S x28, PT_T3(sp) > - REG_S x29, PT_T4(sp) > - REG_S x30, PT_T5(sp) > - REG_S x31, PT_T6(sp) > + save_gp > > REG_L s0, TASK_TI_KERNEL_SP(tp) > csrr s1, CSR_STATUS > -- > 2.34.1 > -- Best Regards Guo Ren ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2022-11-23 17:03 UTC | newest]
Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-09-25 17:53 [PATCH 0/4] riscv: entry: further clean up and VMAP_STACK fix Jisheng Zhang
2022-09-25 17:53 ` Jisheng Zhang
2022-09-25 17:53 ` [PATCH 1/4] riscv: remove extra level wrappers of trace_hardirqs_{on,off} Jisheng Zhang
2022-09-25 17:53 ` Jisheng Zhang
2022-09-25 23:18 ` Guo Ren
2022-09-25 23:18 ` Guo Ren
2022-09-25 17:53 ` [PATCH 2/4] riscv: consolidate ret_from_kernel_thread into ret_from_fork Jisheng Zhang
2022-09-25 17:53 ` Jisheng Zhang
2022-09-25 23:25 ` Guo Ren
2022-09-25 23:25 ` Guo Ren
2022-09-26 16:05 ` Jisheng Zhang
2022-09-26 16:05 ` Jisheng Zhang
2022-09-26 23:55 ` Guo Ren
2022-09-26 23:55 ` Guo Ren
2022-09-28 16:40 ` Jisheng Zhang
2022-09-28 16:40 ` Jisheng Zhang
2022-09-30 11:42 ` Guo Ren
2022-09-30 11:42 ` Guo Ren
2022-09-25 17:53 ` [PATCH 3/4] riscv: fix race when vmap stack overflow and remove shadow_stack Jisheng Zhang
2022-09-25 17:53 ` Jisheng Zhang
2022-09-25 23:46 ` Guo Ren
2022-09-25 23:46 ` Guo Ren
2022-09-27 0:18 ` Jisheng Zhang
2022-09-27 0:18 ` Jisheng Zhang
2022-09-27 2:00 ` Guo Ren
2022-09-27 2:00 ` Guo Ren
2022-11-23 17:02 ` Deepak Gupta
2022-11-23 17:02 ` Deepak Gupta
2022-09-25 17:53 ` [PATCH 4/4] riscv: entry: consolidate general regs saving into save_gp Jisheng Zhang
2022-09-25 17:53 ` Jisheng Zhang
2022-09-25 23:29 ` Guo Ren
2022-09-25 23:29 ` Guo Ren
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.