* [PATCH] riscv: Improve exception and system call latency
@ 2023-12-25 4:00 Anton Blanchard
2023-12-25 10:09 ` Jisheng Zhang
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Anton Blanchard @ 2023-12-25 4:00 UTC (permalink / raw)
To: paul.walmsley, palmer, aou; +Cc: linux-riscv, linux-kernel, Anton Blanchard
Many CPUs implement return address branch prediction as a stack. The
RISCV architecture refers to this as a return address stack (RAS). If
this gets corrupted then the CPU will mispredict at least one but
potentally many function returns.
There are two issues with the current RISCV exception code:
- We are using the alternate link stack (x5/t0) for the indirect branch
which makes the hardware think this is a function return. This will
corrupt the RAS.
- We modify the return address of handle_exception to point to
ret_from_exception. This will also corrupt the RAS.
Testing the null system call latency before and after the patch:
Visionfive2 (StarFive JH7110 / U74)
baseline: 189.87 ns
patched: 176.76 ns
Lichee pi 4a (T-Head TH1520 / C910)
baseline: 666.58 ns
patched: 636.90 ns
Just over 7% on the U74 and just over 4% on the C910.
Signed-off-by: Anton Blanchard <antonb@tenstorrent.com>
---
This introduces some complexity in the stackframe walk code. PowerPC
resolves the multiple exception exit paths issue by placing a value into
the exception stack frame (basically the word "REGS") that the stack frame
code can look for. Perhaps something to look at.
arch/riscv/kernel/entry.S | 21 ++++++++++++++-------
arch/riscv/kernel/stacktrace.c | 14 +++++++++++++-
2 files changed, 27 insertions(+), 8 deletions(-)
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index 54ca4564a926..89af35edbf6c 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -84,7 +84,6 @@ SYM_CODE_START(handle_exception)
scs_load_current_if_task_changed s5
move a0, sp /* pt_regs */
- la ra, ret_from_exception
/*
* MSB of cause differentiates between
@@ -93,7 +92,10 @@ SYM_CODE_START(handle_exception)
bge s4, zero, 1f
/* Handle interrupts */
- tail do_irq
+ call do_irq
+.globl ret_from_irq_exception
+ret_from_irq_exception:
+ j ret_from_exception
1:
/* Handle other exceptions */
slli t0, s4, RISCV_LGPTR
@@ -101,11 +103,16 @@ SYM_CODE_START(handle_exception)
la t2, excp_vect_table_end
add t0, t1, t0
/* Check if exception code lies within bounds */
- bgeu t0, t2, 1f
- REG_L t0, 0(t0)
- jr t0
-1:
- tail do_trap_unknown
+ bgeu t0, t2, 3f
+ REG_L t1, 0(t0)
+2: jalr ra,t1
+.globl ret_from_other_exception
+ret_from_other_exception:
+ j ret_from_exception
+3:
+
+ la t1, do_trap_unknown
+ j 2b
SYM_CODE_END(handle_exception)
/*
diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
index 64a9c093aef9..b9cd131bbc4c 100644
--- a/arch/riscv/kernel/stacktrace.c
+++ b/arch/riscv/kernel/stacktrace.c
@@ -17,6 +17,18 @@
#ifdef CONFIG_FRAME_POINTER
extern asmlinkage void ret_from_exception(void);
+extern asmlinkage void ret_from_irq_exception(void);
+extern asmlinkage void ret_from_other_exception(void);
+
+static inline bool is_exception_frame(unsigned long pc)
+{
+ if ((pc == (unsigned long)ret_from_exception) ||
+ (pc == (unsigned long)ret_from_irq_exception) ||
+ (pc == (unsigned long)ret_from_other_exception))
+ return true;
+
+ return false;
+}
void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
bool (*fn)(void *, unsigned long), void *arg)
@@ -62,7 +74,7 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
fp = frame->fp;
pc = ftrace_graph_ret_addr(current, NULL, frame->ra,
&frame->ra);
- if (pc == (unsigned long)ret_from_exception) {
+ if (is_exception_frame(pc)) {
if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc)))
break;
--
2.25.1
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply related [flat|nested] 11+ messages in thread* Re: [PATCH] riscv: Improve exception and system call latency 2023-12-25 4:00 [PATCH] riscv: Improve exception and system call latency Anton Blanchard @ 2023-12-25 10:09 ` Jisheng Zhang 2023-12-26 3:56 ` Guo Ren 2024-06-07 6:13 ` [PATCH v2] " Cyril Bur 2 siblings, 0 replies; 11+ messages in thread From: Jisheng Zhang @ 2023-12-25 10:09 UTC (permalink / raw) To: Anton Blanchard; +Cc: paul.walmsley, palmer, aou, linux-riscv, linux-kernel On Sun, Dec 24, 2023 at 08:00:18PM -0800, Anton Blanchard wrote: > Many CPUs implement return address branch prediction as a stack. The > RISCV architecture refers to this as a return address stack (RAS). If > this gets corrupted then the CPU will mispredict at least one but > potentally many function returns. > > There are two issues with the current RISCV exception code: > > - We are using the alternate link stack (x5/t0) for the indirect branch > which makes the hardware think this is a function return. This will > corrupt the RAS. > > - We modify the return address of handle_exception to point to > ret_from_exception. This will also corrupt the RAS. > > Testing the null system call latency before and after the patch: > > Visionfive2 (StarFive JH7110 / U74) > baseline: 189.87 ns > patched: 176.76 ns > > Lichee pi 4a (T-Head TH1520 / C910) > baseline: 666.58 ns > patched: 636.90 ns > > Just over 7% on the U74 and just over 4% on the C910. Nice improvement! > > Signed-off-by: Anton Blanchard <antonb@tenstorrent.com> > --- > > This introduces some complexity in the stackframe walk code. PowerPC > resolves the multiple exception exit paths issue by placing a value into > the exception stack frame (basically the word "REGS") that the stack frame > code can look for. Perhaps something to look at. > > arch/riscv/kernel/entry.S | 21 ++++++++++++++------- > arch/riscv/kernel/stacktrace.c | 14 +++++++++++++- > 2 files changed, 27 insertions(+), 8 deletions(-) > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > index 54ca4564a926..89af35edbf6c 100644 > --- a/arch/riscv/kernel/entry.S > +++ b/arch/riscv/kernel/entry.S > @@ -84,7 +84,6 @@ SYM_CODE_START(handle_exception) > scs_load_current_if_task_changed s5 > > move a0, sp /* pt_regs */ > - la ra, ret_from_exception > > /* > * MSB of cause differentiates between > @@ -93,7 +92,10 @@ SYM_CODE_START(handle_exception) > bge s4, zero, 1f > > /* Handle interrupts */ > - tail do_irq > + call do_irq > +.globl ret_from_irq_exception > +ret_from_irq_exception: > + j ret_from_exception > 1: > /* Handle other exceptions */ > slli t0, s4, RISCV_LGPTR > @@ -101,11 +103,16 @@ SYM_CODE_START(handle_exception) > la t2, excp_vect_table_end > add t0, t1, t0 > /* Check if exception code lies within bounds */ > - bgeu t0, t2, 1f > - REG_L t0, 0(t0) > - jr t0 > -1: > - tail do_trap_unknown > + bgeu t0, t2, 3f > + REG_L t1, 0(t0) > +2: jalr ra,t1 can be simplified to jalr t1 with the above change, Reviewed-by: Jisheng Zhang <jszhang@kernel.org> > +.globl ret_from_other_exception > +ret_from_other_exception: > + j ret_from_exception > +3: > + > + la t1, do_trap_unknown > + j 2b > SYM_CODE_END(handle_exception) > > /* > diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c > index 64a9c093aef9..b9cd131bbc4c 100644 > --- a/arch/riscv/kernel/stacktrace.c > +++ b/arch/riscv/kernel/stacktrace.c > @@ -17,6 +17,18 @@ > #ifdef CONFIG_FRAME_POINTER > > extern asmlinkage void ret_from_exception(void); > +extern asmlinkage void ret_from_irq_exception(void); > +extern asmlinkage void ret_from_other_exception(void); > + > +static inline bool is_exception_frame(unsigned long pc) > +{ > + if ((pc == (unsigned long)ret_from_exception) || > + (pc == (unsigned long)ret_from_irq_exception) || > + (pc == (unsigned long)ret_from_other_exception)) > + return true; > + > + return false; > +} > > void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, > bool (*fn)(void *, unsigned long), void *arg) > @@ -62,7 +74,7 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, > fp = frame->fp; > pc = ftrace_graph_ret_addr(current, NULL, frame->ra, > &frame->ra); > - if (pc == (unsigned long)ret_from_exception) { > + if (is_exception_frame(pc)) { > if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc))) > break; > > -- > 2.25.1 > > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] riscv: Improve exception and system call latency 2023-12-25 4:00 [PATCH] riscv: Improve exception and system call latency Anton Blanchard 2023-12-25 10:09 ` Jisheng Zhang @ 2023-12-26 3:56 ` Guo Ren 2024-06-03 4:38 ` Cyril Bur 2024-06-07 6:13 ` [PATCH v2] " Cyril Bur 2 siblings, 1 reply; 11+ messages in thread From: Guo Ren @ 2023-12-26 3:56 UTC (permalink / raw) To: Anton Blanchard; +Cc: paul.walmsley, palmer, aou, linux-riscv, linux-kernel On Sun, Dec 24, 2023 at 08:00:18PM -0800, Anton Blanchard wrote: > Many CPUs implement return address branch prediction as a stack. The > RISCV architecture refers to this as a return address stack (RAS). If > this gets corrupted then the CPU will mispredict at least one but > potentally many function returns. > > There are two issues with the current RISCV exception code: > > - We are using the alternate link stack (x5/t0) for the indirect branch > which makes the hardware think this is a function return. This will > corrupt the RAS. > > - We modify the return address of handle_exception to point to > ret_from_exception. This will also corrupt the RAS. > > Testing the null system call latency before and after the patch: > > Visionfive2 (StarFive JH7110 / U74) > baseline: 189.87 ns > patched: 176.76 ns > > Lichee pi 4a (T-Head TH1520 / C910) > baseline: 666.58 ns > patched: 636.90 ns > > Just over 7% on the U74 and just over 4% on the C910. Yes, the wrong "jalr zero, t0/ra" would pop RAS and destroy the RAS layout of the hardware for the userspace. How about giving a fake push for the RAS to connect "jalr zero, ra" of sub-function call return? I'm curious if you could measure the difference with only one RAS misprediction. diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 54ca4564a926..94c7d2be35d0 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -93,7 +93,8 @@ SYM_CODE_START(handle_exception) bge s4, zero, 1f /* Handle interrupts */ - tail do_irq + auipc t0, do_irq + jalr t0, t0 1: /* Handle other exceptions */ slli t0, s4, RISCV_LGPTR @@ -103,9 +104,10 @@ SYM_CODE_START(handle_exception) /* Check if exception code lies within bounds */ bgeu t0, t2, 1f REG_L t0, 0(t0) - jr t0 + jalr t0, t0 1: - tail do_trap_unknown + auipc t0, do_trap_unknown + jalr t0, t0 SYM_CODE_END(handle_exception) You could prepare a deeper userspace stack calling if you want better measurement results. > > Signed-off-by: Anton Blanchard <antonb@tenstorrent.com> > Reviewed-by: Jisheng Zhang <jszhang@kernel.org> > --- > > This introduces some complexity in the stackframe walk code. PowerPC > resolves the multiple exception exit paths issue by placing a value into > the exception stack frame (basically the word "REGS") that the stack frame > code can look for. Perhaps something to look at. > > arch/riscv/kernel/entry.S | 21 ++++++++++++++------- > arch/riscv/kernel/stacktrace.c | 14 +++++++++++++- > 2 files changed, 27 insertions(+), 8 deletions(-) > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > index 54ca4564a926..89af35edbf6c 100644 > --- a/arch/riscv/kernel/entry.S > +++ b/arch/riscv/kernel/entry.S > @@ -84,7 +84,6 @@ SYM_CODE_START(handle_exception) > scs_load_current_if_task_changed s5 > > move a0, sp /* pt_regs */ > - la ra, ret_from_exception > > /* > * MSB of cause differentiates between > @@ -93,7 +92,10 @@ SYM_CODE_START(handle_exception) > bge s4, zero, 1f > > /* Handle interrupts */ > - tail do_irq > + call do_irq > +.globl ret_from_irq_exception > +ret_from_irq_exception: > + j ret_from_exception > 1: > /* Handle other exceptions */ > slli t0, s4, RISCV_LGPTR > @@ -101,11 +103,16 @@ SYM_CODE_START(handle_exception) > la t2, excp_vect_table_end > add t0, t1, t0 > /* Check if exception code lies within bounds */ > - bgeu t0, t2, 1f > - REG_L t0, 0(t0) > - jr t0 > -1: > - tail do_trap_unknown > + bgeu t0, t2, 3f > + REG_L t1, 0(t0) > +2: jalr ra,t1 > +.globl ret_from_other_exception > +ret_from_other_exception: > + j ret_from_exception > +3: > + > + la t1, do_trap_unknown > + j 2b > SYM_CODE_END(handle_exception) > > /* > diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c > index 64a9c093aef9..b9cd131bbc4c 100644 > --- a/arch/riscv/kernel/stacktrace.c > +++ b/arch/riscv/kernel/stacktrace.c > @@ -17,6 +17,18 @@ > #ifdef CONFIG_FRAME_POINTER > > extern asmlinkage void ret_from_exception(void); > +extern asmlinkage void ret_from_irq_exception(void); > +extern asmlinkage void ret_from_other_exception(void); > + > +static inline bool is_exception_frame(unsigned long pc) > +{ > + if ((pc == (unsigned long)ret_from_exception) || > + (pc == (unsigned long)ret_from_irq_exception) || > + (pc == (unsigned long)ret_from_other_exception)) > + return true; > + > + return false; > +} We needn't put too many .globl in the entry.S, and just check that pc is in SYM_CODE_START/END(handle_exception), then entry.S would be cleaner: diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 54ca4564a926..d452d5f12b1b 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -84,7 +84,6 @@ SYM_CODE_START(handle_exception) scs_load_current_if_task_changed s5 move a0, sp /* pt_regs */ /* * MSB of cause differentiates between @@ -93,7 +92,8 @@ SYM_CODE_START(handle_exception) bge s4, zero, 1f /* Handle interrupts */ call do_irq j ret_from_exception 1: /* Handle other exceptions */ slli t0, s4, RISCV_LGPTR @@ -102,10 +102,12 @@ SYM_CODE_START(handle_exception) add t0, t1, t0 /* Check if exception code lies within bounds */ bgeu t0, t2, 1f REG_L ra, 0(t0) jalr ra, ra j ret_from_exception 1: call do_trap_unknown j ret_from_exception SYM_CODE_END(handle_exception) > > void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, > bool (*fn)(void *, unsigned long), void *arg) > @@ -62,7 +74,7 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, > fp = frame->fp; > pc = ftrace_graph_ret_addr(current, NULL, frame->ra, > &frame->ra); > - if (pc == (unsigned long)ret_from_exception) { > + if (is_exception_frame(pc)) { > if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc))) > break; > > -- > 2.25.1 > > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] riscv: Improve exception and system call latency 2023-12-26 3:56 ` Guo Ren @ 2024-06-03 4:38 ` Cyril Bur 2024-06-03 6:39 ` Guo Ren 0 siblings, 1 reply; 11+ messages in thread From: Cyril Bur @ 2024-06-03 4:38 UTC (permalink / raw) To: Guo Ren Cc: Anton Blanchard, paul.walmsley, palmer, aou, linux-riscv, linux-kernel [ apologies, I think my mailer is going to mess up the formatting ] On 26 Dec 2023, at 2:56 PM, Guo Ren <guoren@kernel.org> wrote: On Sun, Dec 24, 2023 at 08:00:18PM -0800, Anton Blanchard wrote: Many CPUs implement return address branch prediction as a stack. The RISCV architecture refers to this as a return address stack (RAS). If this gets corrupted then the CPU will mispredict at least one but potentally many function returns. There are two issues with the current RISCV exception code: - We are using the alternate link stack (x5/t0) for the indirect branch which makes the hardware think this is a function return. This will corrupt the RAS. - We modify the return address of handle_exception to point to ret_from_exception. This will also corrupt the RAS. Testing the null system call latency before and after the patch: Visionfive2 (StarFive JH7110 / U74) baseline: 189.87 ns patched: 176.76 ns Lichee pi 4a (T-Head TH1520 / C910) baseline: 666.58 ns patched: 636.90 ns Just over 7% on the U74 and just over 4% on the C910. Yes, the wrong "jalr zero, t0/ra" would pop RAS and destroy the RAS layout of the hardware for the userspace. How about giving a fake push for the RAS to connect "jalr zero, ra" of sub-function call return? I'm curious if you could measure the difference with only one RAS misprediction. diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 54ca4564a926..94c7d2be35d0 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -93,7 +93,8 @@ SYM_CODE_START(handle_exception) bge s4, zero, 1f /* Handle interrupts */ - tail do_irq + auipc t0, do_irq + jalr t0, t0 1: /* Handle other exceptions */ slli t0, s4, RISCV_LGPTR @@ -103,9 +104,10 @@ SYM_CODE_START(handle_exception) /* Check if exception code lies within bounds */ bgeu t0, t2, 1f REG_L t0, 0(t0) - jr t0 + jalr t0, t0 1: - tail do_trap_unknown + auipc t0, do_trap_unknown + jalr t0, t0 SYM_CODE_END(handle_exception) You could prepare a deeper userspace stack calling if you want better measurement results. Signed-off-by: Anton Blanchard <antonb@tenstorrent.com> Reviewed-by: Jisheng Zhang <jszhang@kernel.org> --- This introduces some complexity in the stackframe walk code. PowerPC resolves the multiple exception exit paths issue by placing a value into the exception stack frame (basically the word "REGS") that the stack frame code can look for. Perhaps something to look at. arch/riscv/kernel/entry.S | 21 ++++++++++++++------- arch/riscv/kernel/stacktrace.c | 14 +++++++++++++- 2 files changed, 27 insertions(+), 8 deletions(-) diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 54ca4564a926..89af35edbf6c 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -84,7 +84,6 @@ SYM_CODE_START(handle_exception) scs_load_current_if_task_changed s5 move a0, sp /* pt_regs */ - la ra, ret_from_exception /* * MSB of cause differentiates between @@ -93,7 +92,10 @@ SYM_CODE_START(handle_exception) bge s4, zero, 1f /* Handle interrupts */ - tail do_irq + call do_irq +.globl ret_from_irq_exception +ret_from_irq_exception: + j ret_from_exception 1: /* Handle other exceptions */ slli t0, s4, RISCV_LGPTR @@ -101,11 +103,16 @@ SYM_CODE_START(handle_exception) la t2, excp_vect_table_end add t0, t1, t0 /* Check if exception code lies within bounds */ - bgeu t0, t2, 1f - REG_L t0, 0(t0) - jr t0 -1: - tail do_trap_unknown + bgeu t0, t2, 3f + REG_L t1, 0(t0) +2: jalr ra,t1 +.globl ret_from_other_exception +ret_from_other_exception: + j ret_from_exception +3: + + la t1, do_trap_unknown + j 2b SYM_CODE_END(handle_exception) /* diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c index 64a9c093aef9..b9cd131bbc4c 100644 --- a/arch/riscv/kernel/stacktrace.c +++ b/arch/riscv/kernel/stacktrace.c @@ -17,6 +17,18 @@ #ifdef CONFIG_FRAME_POINTER extern asmlinkage void ret_from_exception(void); +extern asmlinkage void ret_from_irq_exception(void); +extern asmlinkage void ret_from_other_exception(void); + +static inline bool is_exception_frame(unsigned long pc) +{ + if ((pc == (unsigned long)ret_from_exception) || + (pc == (unsigned long)ret_from_irq_exception) || + (pc == (unsigned long)ret_from_other_exception)) + return true; + + return false; +} We needn't put too many .globl in the entry.S, and just check that pc is in SYM_CODE_START/END(handle_exception), then entry.S would be cleaner: Hi Guo, I've taken this patch over from Anton, mostly just to tidy it up. I'd like to incorporate what you mention here but I'm not sure how to achieve it. Have I missed something obvious? As things currently stand there doesn't seem to be a way to get the end (or size) of handle_exception in C code. Your advice is greatly appreciated, Thanks, Cyril diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 54ca4564a926..d452d5f12b1b 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -84,7 +84,6 @@ SYM_CODE_START(handle_exception) scs_load_current_if_task_changed s5 move a0, sp /* pt_regs */ /* * MSB of cause differentiates between @@ -93,7 +92,8 @@ SYM_CODE_START(handle_exception) bge s4, zero, 1f /* Handle interrupts */ call do_irq j ret_from_exception 1: /* Handle other exceptions */ slli t0, s4, RISCV_LGPTR @@ -102,10 +102,12 @@ SYM_CODE_START(handle_exception) add t0, t1, t0 /* Check if exception code lies within bounds */ bgeu t0, t2, 1f REG_L ra, 0(t0) jalr ra, ra j ret_from_exception 1: call do_trap_unknown j ret_from_exception SYM_CODE_END(handle_exception) void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, bool (*fn)(void *, unsigned long), void *arg) @@ -62,7 +74,7 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, fp = frame->fp; pc = ftrace_graph_ret_addr(current, NULL, frame->ra, &frame->ra); - if (pc == (unsigned long)ret_from_exception) { + if (is_exception_frame(pc)) { if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc))) break; -- 2.25.1 _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] riscv: Improve exception and system call latency 2024-06-03 4:38 ` Cyril Bur @ 2024-06-03 6:39 ` Guo Ren 2024-06-04 8:15 ` [CAUTION - External Sender] " Cyril Bur 0 siblings, 1 reply; 11+ messages in thread From: Guo Ren @ 2024-06-03 6:39 UTC (permalink / raw) To: Cyril Bur Cc: Anton Blanchard, paul.walmsley, palmer, aou, linux-riscv, linux-kernel On Mon, Jun 3, 2024 at 12:38 PM Cyril Bur <cyrilbur@tenstorrent.com> wrote: > > [ apologies, I think my mailer is going to mess up the formatting ] > > On 26 Dec 2023, at 2:56 PM, Guo Ren <guoren@kernel.org> wrote: > > On Sun, Dec 24, 2023 at 08:00:18PM -0800, Anton Blanchard wrote: > > Many CPUs implement return address branch prediction as a stack. The > RISCV architecture refers to this as a return address stack (RAS). If > this gets corrupted then the CPU will mispredict at least one but > potentally many function returns. > > There are two issues with the current RISCV exception code: > > - We are using the alternate link stack (x5/t0) for the indirect branch > which makes the hardware think this is a function return. This will > corrupt the RAS. > > - We modify the return address of handle_exception to point to > ret_from_exception. This will also corrupt the RAS. > > Testing the null system call latency before and after the patch: > > Visionfive2 (StarFive JH7110 / U74) > baseline: 189.87 ns > patched: 176.76 ns > > Lichee pi 4a (T-Head TH1520 / C910) > baseline: 666.58 ns > patched: 636.90 ns > > Just over 7% on the U74 and just over 4% on the C910. > > > Yes, the wrong "jalr zero, t0/ra" would pop RAS and destroy the RAS > layout of the hardware for the userspace. How about giving a fake push > for the RAS to connect "jalr zero, ra" of sub-function call return? I'm > curious if you could measure the difference with only one RAS > misprediction. > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > index 54ca4564a926..94c7d2be35d0 100644 > --- a/arch/riscv/kernel/entry.S > +++ b/arch/riscv/kernel/entry.S > @@ -93,7 +93,8 @@ SYM_CODE_START(handle_exception) > bge s4, zero, 1f > > /* Handle interrupts */ > - tail do_irq > + auipc t0, do_irq > + jalr t0, t0 > 1: > /* Handle other exceptions */ > slli t0, s4, RISCV_LGPTR > @@ -103,9 +104,10 @@ SYM_CODE_START(handle_exception) > /* Check if exception code lies within bounds */ > bgeu t0, t2, 1f > REG_L t0, 0(t0) > - jr t0 > + jalr t0, t0 > 1: > - tail do_trap_unknown > + auipc t0, do_trap_unknown > + jalr t0, t0 > SYM_CODE_END(handle_exception) > > You could prepare a deeper userspace stack calling if you want better > measurement results. > > > Signed-off-by: Anton Blanchard <antonb@tenstorrent.com> > Reviewed-by: Jisheng Zhang <jszhang@kernel.org> > --- > > This introduces some complexity in the stackframe walk code. PowerPC > resolves the multiple exception exit paths issue by placing a value into > the exception stack frame (basically the word "REGS") that the stack frame > code can look for. Perhaps something to look at. > > arch/riscv/kernel/entry.S | 21 ++++++++++++++------- > arch/riscv/kernel/stacktrace.c | 14 +++++++++++++- > 2 files changed, 27 insertions(+), 8 deletions(-) > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > index 54ca4564a926..89af35edbf6c 100644 > --- a/arch/riscv/kernel/entry.S > +++ b/arch/riscv/kernel/entry.S > @@ -84,7 +84,6 @@ SYM_CODE_START(handle_exception) > scs_load_current_if_task_changed s5 > > move a0, sp /* pt_regs */ > - la ra, ret_from_exception > > /* > * MSB of cause differentiates between > @@ -93,7 +92,10 @@ SYM_CODE_START(handle_exception) > bge s4, zero, 1f > > /* Handle interrupts */ > - tail do_irq > + call do_irq > +.globl ret_from_irq_exception > +ret_from_irq_exception: > + j ret_from_exception > 1: > /* Handle other exceptions */ > slli t0, s4, RISCV_LGPTR > @@ -101,11 +103,16 @@ SYM_CODE_START(handle_exception) > la t2, excp_vect_table_end > add t0, t1, t0 > /* Check if exception code lies within bounds */ > - bgeu t0, t2, 1f > - REG_L t0, 0(t0) > - jr t0 > -1: > - tail do_trap_unknown > + bgeu t0, t2, 3f > + REG_L t1, 0(t0) > +2: jalr ra,t1 > +.globl ret_from_other_exception > +ret_from_other_exception: > + j ret_from_exception > +3: > + > + la t1, do_trap_unknown > + j 2b > SYM_CODE_END(handle_exception) > > /* > diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c > index 64a9c093aef9..b9cd131bbc4c 100644 > --- a/arch/riscv/kernel/stacktrace.c > +++ b/arch/riscv/kernel/stacktrace.c > @@ -17,6 +17,18 @@ > #ifdef CONFIG_FRAME_POINTER > > extern asmlinkage void ret_from_exception(void); > +extern asmlinkage void ret_from_irq_exception(void); > +extern asmlinkage void ret_from_other_exception(void); > + > +static inline bool is_exception_frame(unsigned long pc) > +{ > + if ((pc == (unsigned long)ret_from_exception) || > + (pc == (unsigned long)ret_from_irq_exception) || > + (pc == (unsigned long)ret_from_other_exception)) > + return true; > + > + return false; > +} > > We needn't put too many .globl in the entry.S, and just check that pc is > in SYM_CODE_START/END(handle_exception), then entry.S would be cleaner: > > Hi Guo, > > I've taken this patch over from Anton, mostly just to tidy it up. I'd > like to incorporate > what you mention here but I'm not sure how to achieve it. Have I > missed something > obvious? As things currently stand there doesn't seem to be a way to get the end > (or size) of handle_exception in C code. "just check that pc is in SYM_CODE_START/END(handle_exception)." Sorry, I think my previous description is wrong. Instead, "We needn't modify anything in stacktrace.c because we keep ra = ret_from_exception." I want only cleaner and smaller modifications to the entry.S to satisfy RAS prediction performance requirements. > > Your advice is greatly appreciated, > > Thanks, > > Cyril > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > index 54ca4564a926..d452d5f12b1b 100644 > --- a/arch/riscv/kernel/entry.S > +++ b/arch/riscv/kernel/entry.S > @@ -84,7 +84,6 @@ SYM_CODE_START(handle_exception) > scs_load_current_if_task_changed s5 > > move a0, sp /* pt_regs */ > > /* > * MSB of cause differentiates between > @@ -93,7 +92,8 @@ SYM_CODE_START(handle_exception) > bge s4, zero, 1f > > /* Handle interrupts */ > call do_irq > j ret_from_exception > 1: > /* Handle other exceptions */ > slli t0, s4, RISCV_LGPTR > @@ -102,10 +102,12 @@ SYM_CODE_START(handle_exception) > add t0, t1, t0 > /* Check if exception code lies within bounds */ > bgeu t0, t2, 1f > REG_L ra, 0(t0) > jalr ra, ra > j ret_from_exception > 1: > call do_trap_unknown > j ret_from_exception > SYM_CODE_END(handle_exception) > > > > void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, > bool (*fn)(void *, unsigned long), void *arg) > @@ -62,7 +74,7 @@ void notrace walk_stackframe(struct task_struct > *task, struct pt_regs *regs, > fp = frame->fp; > pc = ftrace_graph_ret_addr(current, NULL, frame->ra, > &frame->ra); > - if (pc == (unsigned long)ret_from_exception) { > + if (is_exception_frame(pc)) { > if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc))) > break; > > -- > 2.25.1 > > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv -- Best Regards Guo Ren _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [CAUTION - External Sender] Re: [PATCH] riscv: Improve exception and system call latency 2024-06-03 6:39 ` Guo Ren @ 2024-06-04 8:15 ` Cyril Bur 2024-06-05 5:52 ` Guo Ren 0 siblings, 1 reply; 11+ messages in thread From: Cyril Bur @ 2024-06-04 8:15 UTC (permalink / raw) To: Guo Ren Cc: Anton Blanchard, paul.walmsley, palmer, aou, linux-riscv, linux-kernel On Mon, Jun 3, 2024 at 4:39 PM Guo Ren <guoren@kernel.org> wrote: > > On Mon, Jun 3, 2024 at 12:38 PM Cyril Bur <cyrilbur@tenstorrent.com> wrote: > > > > [ apologies, I think my mailer is going to mess up the formatting ] > > > > On 26 Dec 2023, at 2:56 PM, Guo Ren <guoren@kernel.org> wrote: > > > > On Sun, Dec 24, 2023 at 08:00:18PM -0800, Anton Blanchard wrote: > > > > Many CPUs implement return address branch prediction as a stack. The > > RISCV architecture refers to this as a return address stack (RAS). If > > this gets corrupted then the CPU will mispredict at least one but > > potentally many function returns. > > > > There are two issues with the current RISCV exception code: > > > > - We are using the alternate link stack (x5/t0) for the indirect branch > > which makes the hardware think this is a function return. This will > > corrupt the RAS. > > > > - We modify the return address of handle_exception to point to > > ret_from_exception. This will also corrupt the RAS. > > > > Testing the null system call latency before and after the patch: > > > > Visionfive2 (StarFive JH7110 / U74) > > baseline: 189.87 ns > > patched: 176.76 ns > > > > Lichee pi 4a (T-Head TH1520 / C910) > > baseline: 666.58 ns > > patched: 636.90 ns > > > > Just over 7% on the U74 and just over 4% on the C910. > > > > > > Yes, the wrong "jalr zero, t0/ra" would pop RAS and destroy the RAS > > layout of the hardware for the userspace. How about giving a fake push > > for the RAS to connect "jalr zero, ra" of sub-function call return? I'm > > curious if you could measure the difference with only one RAS > > misprediction. > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > index 54ca4564a926..94c7d2be35d0 100644 > > --- a/arch/riscv/kernel/entry.S > > +++ b/arch/riscv/kernel/entry.S > > @@ -93,7 +93,8 @@ SYM_CODE_START(handle_exception) > > bge s4, zero, 1f > > > > /* Handle interrupts */ > > - tail do_irq > > + auipc t0, do_irq > > + jalr t0, t0 > > 1: > > /* Handle other exceptions */ > > slli t0, s4, RISCV_LGPTR > > @@ -103,9 +104,10 @@ SYM_CODE_START(handle_exception) > > /* Check if exception code lies within bounds */ > > bgeu t0, t2, 1f > > REG_L t0, 0(t0) > > - jr t0 > > + jalr t0, t0 > > 1: > > - tail do_trap_unknown > > + auipc t0, do_trap_unknown > > + jalr t0, t0 > > SYM_CODE_END(handle_exception) > > > > You could prepare a deeper userspace stack calling if you want better > > measurement results. > > > > > > Signed-off-by: Anton Blanchard <antonb@tenstorrent.com> > > Reviewed-by: Jisheng Zhang <jszhang@kernel.org> > > --- > > > > This introduces some complexity in the stackframe walk code. PowerPC > > resolves the multiple exception exit paths issue by placing a value into > > the exception stack frame (basically the word "REGS") that the stack frame > > code can look for. Perhaps something to look at. > > > > arch/riscv/kernel/entry.S | 21 ++++++++++++++------- > > arch/riscv/kernel/stacktrace.c | 14 +++++++++++++- > > 2 files changed, 27 insertions(+), 8 deletions(-) > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > index 54ca4564a926..89af35edbf6c 100644 > > --- a/arch/riscv/kernel/entry.S > > +++ b/arch/riscv/kernel/entry.S > > @@ -84,7 +84,6 @@ SYM_CODE_START(handle_exception) > > scs_load_current_if_task_changed s5 > > > > move a0, sp /* pt_regs */ > > - la ra, ret_from_exception > > > > /* > > * MSB of cause differentiates between > > @@ -93,7 +92,10 @@ SYM_CODE_START(handle_exception) > > bge s4, zero, 1f > > > > /* Handle interrupts */ > > - tail do_irq > > + call do_irq > > +.globl ret_from_irq_exception > > +ret_from_irq_exception: > > + j ret_from_exception > > 1: > > /* Handle other exceptions */ > > slli t0, s4, RISCV_LGPTR > > @@ -101,11 +103,16 @@ SYM_CODE_START(handle_exception) > > la t2, excp_vect_table_end > > add t0, t1, t0 > > /* Check if exception code lies within bounds */ > > - bgeu t0, t2, 1f > > - REG_L t0, 0(t0) > > - jr t0 > > -1: > > - tail do_trap_unknown > > + bgeu t0, t2, 3f > > + REG_L t1, 0(t0) > > +2: jalr ra,t1 > > +.globl ret_from_other_exception > > +ret_from_other_exception: > > + j ret_from_exception > > +3: > > + > > + la t1, do_trap_unknown > > + j 2b > > SYM_CODE_END(handle_exception) > > > > /* > > diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c > > index 64a9c093aef9..b9cd131bbc4c 100644 > > --- a/arch/riscv/kernel/stacktrace.c > > +++ b/arch/riscv/kernel/stacktrace.c > > @@ -17,6 +17,18 @@ > > #ifdef CONFIG_FRAME_POINTER > > > > extern asmlinkage void ret_from_exception(void); > > +extern asmlinkage void ret_from_irq_exception(void); > > +extern asmlinkage void ret_from_other_exception(void); > > + > > +static inline bool is_exception_frame(unsigned long pc) > > +{ > > + if ((pc == (unsigned long)ret_from_exception) || > > + (pc == (unsigned long)ret_from_irq_exception) || > > + (pc == (unsigned long)ret_from_other_exception)) > > + return true; > > + > > + return false; > > +} > > > > We needn't put too many .globl in the entry.S, and just check that pc is > > in SYM_CODE_START/END(handle_exception), then entry.S would be cleaner: > > > > Hi Guo, > > > > I've taken this patch over from Anton, mostly just to tidy it up. I'd > > like to incorporate > > what you mention here but I'm not sure how to achieve it. Have I > > missed something > > obvious? As things currently stand there doesn't seem to be a way to get the end > > (or size) of handle_exception in C code. > "just check that pc is in SYM_CODE_START/END(handle_exception)." > Sorry, I think my previous description is wrong. > > Instead, "We needn't modify anything in stacktrace.c because we keep > ra = ret_from_exception." > > I want only cleaner and smaller modifications to the entry.S to > satisfy RAS prediction performance requirements. > I completely agree with keeping entry.S as clean as possible. I'm trying to understand what you mean. Isn't the point of the patch to remove ra = ret_from_exception? I'm not sure but maybe we can leave entry.S as it is and the check in stacktrace.c can become a check for pc == handle_exception? > > > > > Your advice is greatly appreciated, > > > > Thanks, > > > > Cyril > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > index 54ca4564a926..d452d5f12b1b 100644 > > --- a/arch/riscv/kernel/entry.S > > +++ b/arch/riscv/kernel/entry.S > > @@ -84,7 +84,6 @@ SYM_CODE_START(handle_exception) > > scs_load_current_if_task_changed s5 > > > > move a0, sp /* pt_regs */ > > > > /* > > * MSB of cause differentiates between > > @@ -93,7 +92,8 @@ SYM_CODE_START(handle_exception) > > bge s4, zero, 1f > > > > /* Handle interrupts */ > > call do_irq > > j ret_from_exception > > 1: > > /* Handle other exceptions */ > > slli t0, s4, RISCV_LGPTR > > @@ -102,10 +102,12 @@ SYM_CODE_START(handle_exception) > > add t0, t1, t0 > > /* Check if exception code lies within bounds */ > > bgeu t0, t2, 1f > > REG_L ra, 0(t0) > > jalr ra, ra > > j ret_from_exception > > 1: > > call do_trap_unknown > > j ret_from_exception > > SYM_CODE_END(handle_exception) > > > > > > > > void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, > > bool (*fn)(void *, unsigned long), void *arg) > > @@ -62,7 +74,7 @@ void notrace walk_stackframe(struct task_struct > > *task, struct pt_regs *regs, > > fp = frame->fp; > > pc = ftrace_graph_ret_addr(current, NULL, frame->ra, > > &frame->ra); > > - if (pc == (unsigned long)ret_from_exception) { > > + if (is_exception_frame(pc)) { > > if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc))) > > break; > > > > -- > > 2.25.1 > > > > > > _______________________________________________ > > linux-riscv mailing list > > linux-riscv@lists.infradead.org > > http://lists.infradead.org/mailman/listinfo/linux-riscv > > > > -- > Best Regards > Guo Ren _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [CAUTION - External Sender] Re: [PATCH] riscv: Improve exception and system call latency 2024-06-04 8:15 ` [CAUTION - External Sender] " Cyril Bur @ 2024-06-05 5:52 ` Guo Ren 2024-06-05 5:53 ` Guo Ren 0 siblings, 1 reply; 11+ messages in thread From: Guo Ren @ 2024-06-05 5:52 UTC (permalink / raw) To: Cyril Bur Cc: Anton Blanchard, paul.walmsley, palmer, aou, linux-riscv, linux-kernel On Tue, Jun 4, 2024 at 4:16 PM Cyril Bur <cyrilbur@tenstorrent.com> wrote: > > On Mon, Jun 3, 2024 at 4:39 PM Guo Ren <guoren@kernel.org> wrote: > > > > On Mon, Jun 3, 2024 at 12:38 PM Cyril Bur <cyrilbur@tenstorrent.com> wrote: > > > > > > [ apologies, I think my mailer is going to mess up the formatting ] > > > > > > On 26 Dec 2023, at 2:56 PM, Guo Ren <guoren@kernel.org> wrote: > > > > > > On Sun, Dec 24, 2023 at 08:00:18PM -0800, Anton Blanchard wrote: > > > > > > Many CPUs implement return address branch prediction as a stack. The > > > RISCV architecture refers to this as a return address stack (RAS). If > > > this gets corrupted then the CPU will mispredict at least one but > > > potentally many function returns. > > > > > > There are two issues with the current RISCV exception code: > > > > > > - We are using the alternate link stack (x5/t0) for the indirect branch > > > which makes the hardware think this is a function return. This will > > > corrupt the RAS. > > > > > > - We modify the return address of handle_exception to point to > > > ret_from_exception. This will also corrupt the RAS. > > > > > > Testing the null system call latency before and after the patch: > > > > > > Visionfive2 (StarFive JH7110 / U74) > > > baseline: 189.87 ns > > > patched: 176.76 ns > > > > > > Lichee pi 4a (T-Head TH1520 / C910) > > > baseline: 666.58 ns > > > patched: 636.90 ns > > > > > > Just over 7% on the U74 and just over 4% on the C910. > > > > > > > > > Yes, the wrong "jalr zero, t0/ra" would pop RAS and destroy the RAS > > > layout of the hardware for the userspace. How about giving a fake push > > > for the RAS to connect "jalr zero, ra" of sub-function call return? I'm > > > curious if you could measure the difference with only one RAS > > > misprediction. > > > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > > index 54ca4564a926..94c7d2be35d0 100644 > > > --- a/arch/riscv/kernel/entry.S > > > +++ b/arch/riscv/kernel/entry.S > > > @@ -93,7 +93,8 @@ SYM_CODE_START(handle_exception) > > > bge s4, zero, 1f > > > > > > /* Handle interrupts */ > > > - tail do_irq > > > + auipc t0, do_irq > > > + jalr t0, t0 > > > 1: > > > /* Handle other exceptions */ > > > slli t0, s4, RISCV_LGPTR > > > @@ -103,9 +104,10 @@ SYM_CODE_START(handle_exception) > > > /* Check if exception code lies within bounds */ > > > bgeu t0, t2, 1f > > > REG_L t0, 0(t0) > > > - jr t0 > > > + jalr t0, t0 > > > 1: > > > - tail do_trap_unknown > > > + auipc t0, do_trap_unknown > > > + jalr t0, t0 > > > SYM_CODE_END(handle_exception) > > > > > > You could prepare a deeper userspace stack calling if you want better > > > measurement results. > > > > > > > > > Signed-off-by: Anton Blanchard <antonb@tenstorrent.com> > > > Reviewed-by: Jisheng Zhang <jszhang@kernel.org> > > > --- > > > > > > This introduces some complexity in the stackframe walk code. PowerPC > > > resolves the multiple exception exit paths issue by placing a value into > > > the exception stack frame (basically the word "REGS") that the stack frame > > > code can look for. Perhaps something to look at. > > > > > > arch/riscv/kernel/entry.S | 21 ++++++++++++++------- > > > arch/riscv/kernel/stacktrace.c | 14 +++++++++++++- > > > 2 files changed, 27 insertions(+), 8 deletions(-) > > > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > > index 54ca4564a926..89af35edbf6c 100644 > > > --- a/arch/riscv/kernel/entry.S > > > +++ b/arch/riscv/kernel/entry.S > > > @@ -84,7 +84,6 @@ SYM_CODE_START(handle_exception) > > > scs_load_current_if_task_changed s5 > > > > > > move a0, sp /* pt_regs */ > > > - la ra, ret_from_exception > > > > > > /* > > > * MSB of cause differentiates between > > > @@ -93,7 +92,10 @@ SYM_CODE_START(handle_exception) > > > bge s4, zero, 1f > > > > > > /* Handle interrupts */ > > > - tail do_irq > > > + call do_irq > > > +.globl ret_from_irq_exception > > > +ret_from_irq_exception: > > > + j ret_from_exception > > > 1: > > > /* Handle other exceptions */ > > > slli t0, s4, RISCV_LGPTR > > > @@ -101,11 +103,16 @@ SYM_CODE_START(handle_exception) > > > la t2, excp_vect_table_end > > > add t0, t1, t0 > > > /* Check if exception code lies within bounds */ > > > - bgeu t0, t2, 1f > > > - REG_L t0, 0(t0) > > > - jr t0 > > > -1: > > > - tail do_trap_unknown > > > + bgeu t0, t2, 3f > > > + REG_L t1, 0(t0) > > > +2: jalr ra,t1 > > > +.globl ret_from_other_exception > > > +ret_from_other_exception: > > > + j ret_from_exception > > > +3: > > > + > > > + la t1, do_trap_unknown > > > + j 2b > > > SYM_CODE_END(handle_exception) > > > > > > /* > > > diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c > > > index 64a9c093aef9..b9cd131bbc4c 100644 > > > --- a/arch/riscv/kernel/stacktrace.c > > > +++ b/arch/riscv/kernel/stacktrace.c > > > @@ -17,6 +17,18 @@ > > > #ifdef CONFIG_FRAME_POINTER > > > > > > extern asmlinkage void ret_from_exception(void); > > > +extern asmlinkage void ret_from_irq_exception(void); > > > +extern asmlinkage void ret_from_other_exception(void); > > > + > > > +static inline bool is_exception_frame(unsigned long pc) > > > +{ > > > + if ((pc == (unsigned long)ret_from_exception) || > > > + (pc == (unsigned long)ret_from_irq_exception) || > > > + (pc == (unsigned long)ret_from_other_exception)) > > > + return true; > > > + > > > + return false; > > > +} > > > > > > We needn't put too many .globl in the entry.S, and just check that pc is > > > in SYM_CODE_START/END(handle_exception), then entry.S would be cleaner: > > > > > > Hi Guo, > > > > > > I've taken this patch over from Anton, mostly just to tidy it up. I'd > > > like to incorporate > > > what you mention here but I'm not sure how to achieve it. Have I > > > missed something > > > obvious? As things currently stand there doesn't seem to be a way to get the end > > > (or size) of handle_exception in C code. > > "just check that pc is in SYM_CODE_START/END(handle_exception)." > > Sorry, I think my previous description is wrong. > > > > Instead, "We needn't modify anything in stacktrace.c because we keep > > ra = ret_from_exception." > > > > I want only cleaner and smaller modifications to the entry.S to > > satisfy RAS prediction performance requirements. > > > > I completely agree with keeping entry.S as clean as possible. > > I'm trying to understand what you mean. > Isn't the point of the patch to remove ra = ret_from_exception? > > I'm not sure but maybe we can leave entry.S as it is and the check in > stacktrace.c can become a check for pc == handle_exception? Yes, we needn't modify stacktrace.c anymore. Because we still keep "ra = handle_exception". > > > > > > > > > Your advice is greatly appreciated, > > > > > > Thanks, > > > > > > Cyril > > > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > > index 54ca4564a926..d452d5f12b1b 100644 > > > --- a/arch/riscv/kernel/entry.S > > > +++ b/arch/riscv/kernel/entry.S > > > @@ -84,7 +84,6 @@ SYM_CODE_START(handle_exception) > > > scs_load_current_if_task_changed s5 > > > > > > move a0, sp /* pt_regs */ > > > > > > /* > > > * MSB of cause differentiates between > > > @@ -93,7 +92,8 @@ SYM_CODE_START(handle_exception) > > > bge s4, zero, 1f > > > > > > /* Handle interrupts */ > > > call do_irq > > > j ret_from_exception > > > 1: > > > /* Handle other exceptions */ > > > slli t0, s4, RISCV_LGPTR > > > @@ -102,10 +102,12 @@ SYM_CODE_START(handle_exception) > > > add t0, t1, t0 > > > /* Check if exception code lies within bounds */ > > > bgeu t0, t2, 1f > > > REG_L ra, 0(t0) > > > jalr ra, ra > > > j ret_from_exception > > > 1: > > > call do_trap_unknown > > > j ret_from_exception > > > SYM_CODE_END(handle_exception) > > > > > > > > > > > > void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, > > > bool (*fn)(void *, unsigned long), void *arg) > > > @@ -62,7 +74,7 @@ void notrace walk_stackframe(struct task_struct > > > *task, struct pt_regs *regs, > > > fp = frame->fp; > > > pc = ftrace_graph_ret_addr(current, NULL, frame->ra, > > > &frame->ra); > > > - if (pc == (unsigned long)ret_from_exception) { > > > + if (is_exception_frame(pc)) { > > > if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc))) > > > break; > > > > > > -- > > > 2.25.1 > > > > > > > > > _______________________________________________ > > > linux-riscv mailing list > > > linux-riscv@lists.infradead.org > > > http://lists.infradead.org/mailman/listinfo/linux-riscv > > > > > > > > -- > > Best Regards > > Guo Ren -- Best Regards Guo Ren _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [CAUTION - External Sender] Re: [PATCH] riscv: Improve exception and system call latency 2024-06-05 5:52 ` Guo Ren @ 2024-06-05 5:53 ` Guo Ren 0 siblings, 0 replies; 11+ messages in thread From: Guo Ren @ 2024-06-05 5:53 UTC (permalink / raw) To: Cyril Bur Cc: Anton Blanchard, paul.walmsley, palmer, aou, linux-riscv, linux-kernel On Wed, Jun 5, 2024 at 1:52 PM Guo Ren <guoren@kernel.org> wrote: > > On Tue, Jun 4, 2024 at 4:16 PM Cyril Bur <cyrilbur@tenstorrent.com> wrote: > > > > On Mon, Jun 3, 2024 at 4:39 PM Guo Ren <guoren@kernel.org> wrote: > > > > > > On Mon, Jun 3, 2024 at 12:38 PM Cyril Bur <cyrilbur@tenstorrent.com> wrote: > > > > > > > > [ apologies, I think my mailer is going to mess up the formatting ] > > > > > > > > On 26 Dec 2023, at 2:56 PM, Guo Ren <guoren@kernel.org> wrote: > > > > > > > > On Sun, Dec 24, 2023 at 08:00:18PM -0800, Anton Blanchard wrote: > > > > > > > > Many CPUs implement return address branch prediction as a stack. The > > > > RISCV architecture refers to this as a return address stack (RAS). If > > > > this gets corrupted then the CPU will mispredict at least one but > > > > potentally many function returns. > > > > > > > > There are two issues with the current RISCV exception code: > > > > > > > > - We are using the alternate link stack (x5/t0) for the indirect branch > > > > which makes the hardware think this is a function return. This will > > > > corrupt the RAS. > > > > > > > > - We modify the return address of handle_exception to point to > > > > ret_from_exception. This will also corrupt the RAS. > > > > > > > > Testing the null system call latency before and after the patch: > > > > > > > > Visionfive2 (StarFive JH7110 / U74) > > > > baseline: 189.87 ns > > > > patched: 176.76 ns > > > > > > > > Lichee pi 4a (T-Head TH1520 / C910) > > > > baseline: 666.58 ns > > > > patched: 636.90 ns > > > > > > > > Just over 7% on the U74 and just over 4% on the C910. > > > > > > > > > > > > Yes, the wrong "jalr zero, t0/ra" would pop RAS and destroy the RAS > > > > layout of the hardware for the userspace. How about giving a fake push > > > > for the RAS to connect "jalr zero, ra" of sub-function call return? I'm > > > > curious if you could measure the difference with only one RAS > > > > misprediction. > > > > > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > > > index 54ca4564a926..94c7d2be35d0 100644 > > > > --- a/arch/riscv/kernel/entry.S > > > > +++ b/arch/riscv/kernel/entry.S > > > > @@ -93,7 +93,8 @@ SYM_CODE_START(handle_exception) > > > > bge s4, zero, 1f > > > > > > > > /* Handle interrupts */ > > > > - tail do_irq > > > > + auipc t0, do_irq > > > > + jalr t0, t0 > > > > 1: > > > > /* Handle other exceptions */ > > > > slli t0, s4, RISCV_LGPTR > > > > @@ -103,9 +104,10 @@ SYM_CODE_START(handle_exception) > > > > /* Check if exception code lies within bounds */ > > > > bgeu t0, t2, 1f > > > > REG_L t0, 0(t0) > > > > - jr t0 > > > > + jalr t0, t0 > > > > 1: > > > > - tail do_trap_unknown > > > > + auipc t0, do_trap_unknown > > > > + jalr t0, t0 > > > > SYM_CODE_END(handle_exception) > > > > > > > > You could prepare a deeper userspace stack calling if you want better > > > > measurement results. > > > > > > > > > > > > Signed-off-by: Anton Blanchard <antonb@tenstorrent.com> > > > > Reviewed-by: Jisheng Zhang <jszhang@kernel.org> > > > > --- > > > > > > > > This introduces some complexity in the stackframe walk code. PowerPC > > > > resolves the multiple exception exit paths issue by placing a value into > > > > the exception stack frame (basically the word "REGS") that the stack frame > > > > code can look for. Perhaps something to look at. > > > > > > > > arch/riscv/kernel/entry.S | 21 ++++++++++++++------- > > > > arch/riscv/kernel/stacktrace.c | 14 +++++++++++++- > > > > 2 files changed, 27 insertions(+), 8 deletions(-) > > > > > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > > > index 54ca4564a926..89af35edbf6c 100644 > > > > --- a/arch/riscv/kernel/entry.S > > > > +++ b/arch/riscv/kernel/entry.S > > > > @@ -84,7 +84,6 @@ SYM_CODE_START(handle_exception) > > > > scs_load_current_if_task_changed s5 > > > > > > > > move a0, sp /* pt_regs */ > > > > - la ra, ret_from_exception > > > > > > > > /* > > > > * MSB of cause differentiates between > > > > @@ -93,7 +92,10 @@ SYM_CODE_START(handle_exception) > > > > bge s4, zero, 1f > > > > > > > > /* Handle interrupts */ > > > > - tail do_irq > > > > + call do_irq > > > > +.globl ret_from_irq_exception > > > > +ret_from_irq_exception: > > > > + j ret_from_exception > > > > 1: > > > > /* Handle other exceptions */ > > > > slli t0, s4, RISCV_LGPTR > > > > @@ -101,11 +103,16 @@ SYM_CODE_START(handle_exception) > > > > la t2, excp_vect_table_end > > > > add t0, t1, t0 > > > > /* Check if exception code lies within bounds */ > > > > - bgeu t0, t2, 1f > > > > - REG_L t0, 0(t0) > > > > - jr t0 > > > > -1: > > > > - tail do_trap_unknown > > > > + bgeu t0, t2, 3f > > > > + REG_L t1, 0(t0) > > > > +2: jalr ra,t1 > > > > +.globl ret_from_other_exception > > > > +ret_from_other_exception: > > > > + j ret_from_exception > > > > +3: > > > > + > > > > + la t1, do_trap_unknown > > > > + j 2b > > > > SYM_CODE_END(handle_exception) > > > > > > > > /* > > > > diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c > > > > index 64a9c093aef9..b9cd131bbc4c 100644 > > > > --- a/arch/riscv/kernel/stacktrace.c > > > > +++ b/arch/riscv/kernel/stacktrace.c > > > > @@ -17,6 +17,18 @@ > > > > #ifdef CONFIG_FRAME_POINTER > > > > > > > > extern asmlinkage void ret_from_exception(void); > > > > +extern asmlinkage void ret_from_irq_exception(void); > > > > +extern asmlinkage void ret_from_other_exception(void); > > > > + > > > > +static inline bool is_exception_frame(unsigned long pc) > > > > +{ > > > > + if ((pc == (unsigned long)ret_from_exception) || > > > > + (pc == (unsigned long)ret_from_irq_exception) || > > > > + (pc == (unsigned long)ret_from_other_exception)) > > > > + return true; > > > > + > > > > + return false; > > > > +} > > > > > > > > We needn't put too many .globl in the entry.S, and just check that pc is > > > > in SYM_CODE_START/END(handle_exception), then entry.S would be cleaner: > > > > > > > > Hi Guo, > > > > > > > > I've taken this patch over from Anton, mostly just to tidy it up. I'd > > > > like to incorporate > > > > what you mention here but I'm not sure how to achieve it. Have I > > > > missed something > > > > obvious? As things currently stand there doesn't seem to be a way to get the end > > > > (or size) of handle_exception in C code. > > > "just check that pc is in SYM_CODE_START/END(handle_exception)." > > > Sorry, I think my previous description is wrong. > > > > > > Instead, "We needn't modify anything in stacktrace.c because we keep > > > ra = ret_from_exception." > > > > > > I want only cleaner and smaller modifications to the entry.S to > > > satisfy RAS prediction performance requirements. > > > > > > > I completely agree with keeping entry.S as clean as possible. > > > > I'm trying to understand what you mean. > > Isn't the point of the patch to remove ra = ret_from_exception? > > > > I'm not sure but maybe we can leave entry.S as it is and the check in > > stacktrace.c can become a check for pc == handle_exception? > Yes, we needn't modify stacktrace.c anymore. Because we still keep "ra > = handle_exception". Sorry for the typo. "ra = ret_from_exception." > > > > > > > > > > > > > > Your advice is greatly appreciated, > > > > > > > > Thanks, > > > > > > > > Cyril > > > > > > > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > > > > index 54ca4564a926..d452d5f12b1b 100644 > > > > --- a/arch/riscv/kernel/entry.S > > > > +++ b/arch/riscv/kernel/entry.S > > > > @@ -84,7 +84,6 @@ SYM_CODE_START(handle_exception) > > > > scs_load_current_if_task_changed s5 > > > > > > > > move a0, sp /* pt_regs */ > > > > > > > > /* > > > > * MSB of cause differentiates between > > > > @@ -93,7 +92,8 @@ SYM_CODE_START(handle_exception) > > > > bge s4, zero, 1f > > > > > > > > /* Handle interrupts */ > > > > call do_irq > > > > j ret_from_exception > > > > 1: > > > > /* Handle other exceptions */ > > > > slli t0, s4, RISCV_LGPTR > > > > @@ -102,10 +102,12 @@ SYM_CODE_START(handle_exception) > > > > add t0, t1, t0 > > > > /* Check if exception code lies within bounds */ > > > > bgeu t0, t2, 1f > > > > REG_L ra, 0(t0) > > > > jalr ra, ra > > > > j ret_from_exception > > > > 1: > > > > call do_trap_unknown > > > > j ret_from_exception > > > > SYM_CODE_END(handle_exception) > > > > > > > > > > > > > > > > void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, > > > > bool (*fn)(void *, unsigned long), void *arg) > > > > @@ -62,7 +74,7 @@ void notrace walk_stackframe(struct task_struct > > > > *task, struct pt_regs *regs, > > > > fp = frame->fp; > > > > pc = ftrace_graph_ret_addr(current, NULL, frame->ra, > > > > &frame->ra); > > > > - if (pc == (unsigned long)ret_from_exception) { > > > > + if (is_exception_frame(pc)) { > > > > if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc))) > > > > break; > > > > > > > > -- > > > > 2.25.1 > > > > > > > > > > > > _______________________________________________ > > > > linux-riscv mailing list > > > > linux-riscv@lists.infradead.org > > > > http://lists.infradead.org/mailman/listinfo/linux-riscv > > > > > > > > > > > > -- > > > Best Regards > > > Guo Ren > > > > -- > Best Regards > Guo Ren -- Best Regards Guo Ren _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2] riscv: Improve exception and system call latency 2023-12-25 4:00 [PATCH] riscv: Improve exception and system call latency Anton Blanchard 2023-12-25 10:09 ` Jisheng Zhang 2023-12-26 3:56 ` Guo Ren @ 2024-06-07 6:13 ` Cyril Bur 2024-07-11 14:40 ` Jisheng Zhang 2024-07-25 13:20 ` patchwork-bot+linux-riscv 2 siblings, 2 replies; 11+ messages in thread From: Cyril Bur @ 2024-06-07 6:13 UTC (permalink / raw) To: paul.walmsley, palmer, aou; +Cc: linux-riscv, linux-kernel, antonb From: Anton Blanchard <antonb@tenstorrent.com> Many CPUs implement return address branch prediction as a stack. The RISCV architecture refers to this as a return address stack (RAS). If this gets corrupted then the CPU will mispredict at least one but potentally many function returns. There are two issues with the current RISCV exception code: - We are using the alternate link stack (x5/t0) for the indirect branch which makes the hardware think this is a function return. This will corrupt the RAS. - We modify the return address of handle_exception to point to ret_from_exception. This will also corrupt the RAS. Testing the null system call latency before and after the patch: Visionfive2 (StarFive JH7110 / U74) baseline: 189.87 ns patched: 176.76 ns Lichee pi 4a (T-Head TH1520 / C910) baseline: 666.58 ns patched: 636.90 ns Just over 7% on the U74 and just over 4% on the C910. Signed-off-by: Anton Blanchard <antonb@tenstorrent.com> Signed-off-by: Cyril Bur <cyrilbur@tenstorrent.com> --- v2: Simplify jalr ra,t1 to jalr t1 Drop extra .globl from entry.S and use ra == handle_exception --- arch/riscv/kernel/entry.S | 17 ++++++++++------- arch/riscv/kernel/stacktrace.c | 4 ++-- 2 files changed, 12 insertions(+), 9 deletions(-) diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S index 68a24cf9481a..c933460ed3e9 100644 --- a/arch/riscv/kernel/entry.S +++ b/arch/riscv/kernel/entry.S @@ -88,7 +88,6 @@ SYM_CODE_START(handle_exception) call riscv_v_context_nesting_start #endif move a0, sp /* pt_regs */ - la ra, ret_from_exception /* * MSB of cause differentiates between @@ -97,7 +96,8 @@ SYM_CODE_START(handle_exception) bge s4, zero, 1f /* Handle interrupts */ - tail do_irq + call do_irq + j ret_from_exception 1: /* Handle other exceptions */ slli t0, s4, RISCV_LGPTR @@ -105,11 +105,14 @@ SYM_CODE_START(handle_exception) la t2, excp_vect_table_end add t0, t1, t0 /* Check if exception code lies within bounds */ - bgeu t0, t2, 1f - REG_L t0, 0(t0) - jr t0 -1: - tail do_trap_unknown + bgeu t0, t2, 3f + REG_L t1, 0(t0) +2: jalr t1 + j ret_from_exception +3: + + la t1, do_trap_unknown + j 2b SYM_CODE_END(handle_exception) ASM_NOKPROBE(handle_exception) diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c index 528ec7cc9a62..5eb3d135b717 100644 --- a/arch/riscv/kernel/stacktrace.c +++ b/arch/riscv/kernel/stacktrace.c @@ -16,7 +16,7 @@ #ifdef CONFIG_FRAME_POINTER -extern asmlinkage void ret_from_exception(void); +extern asmlinkage void handle_exception(void); static inline int fp_is_valid(unsigned long fp, unsigned long sp) { @@ -70,7 +70,7 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, fp = frame->fp; pc = ftrace_graph_ret_addr(current, NULL, frame->ra, &frame->ra); - if (pc == (unsigned long)ret_from_exception) { + if (pc == (unsigned long)handle_exception) { if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc))) break; -- 2.25.1 _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2] riscv: Improve exception and system call latency 2024-06-07 6:13 ` [PATCH v2] " Cyril Bur @ 2024-07-11 14:40 ` Jisheng Zhang 2024-07-25 13:20 ` patchwork-bot+linux-riscv 1 sibling, 0 replies; 11+ messages in thread From: Jisheng Zhang @ 2024-07-11 14:40 UTC (permalink / raw) To: Cyril Bur; +Cc: paul.walmsley, palmer, aou, linux-riscv, linux-kernel, antonb On Thu, Jun 06, 2024 at 11:13:35PM -0700, Cyril Bur wrote: > From: Anton Blanchard <antonb@tenstorrent.com> > > Many CPUs implement return address branch prediction as a stack. The > RISCV architecture refers to this as a return address stack (RAS). If > this gets corrupted then the CPU will mispredict at least one but > potentally many function returns. > > There are two issues with the current RISCV exception code: > > - We are using the alternate link stack (x5/t0) for the indirect branch > which makes the hardware think this is a function return. This will > corrupt the RAS. > > - We modify the return address of handle_exception to point to > ret_from_exception. This will also corrupt the RAS. > > Testing the null system call latency before and after the patch: > > Visionfive2 (StarFive JH7110 / U74) > baseline: 189.87 ns > patched: 176.76 ns > > Lichee pi 4a (T-Head TH1520 / C910) > baseline: 666.58 ns > patched: 636.90 ns > > Just over 7% on the U74 and just over 4% on the C910. > > Signed-off-by: Anton Blanchard <antonb@tenstorrent.com> > Signed-off-by: Cyril Bur <cyrilbur@tenstorrent.com> Tested-by: Jisheng Zhang <jszhang@kernel.org> Reviewed-by: Jisheng Zhang <jszhang@kernel.org> > --- > v2: > Simplify jalr ra,t1 to jalr t1 > Drop extra .globl from entry.S and use ra == handle_exception > --- > arch/riscv/kernel/entry.S | 17 ++++++++++------- > arch/riscv/kernel/stacktrace.c | 4 ++-- > 2 files changed, 12 insertions(+), 9 deletions(-) > > diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S > index 68a24cf9481a..c933460ed3e9 100644 > --- a/arch/riscv/kernel/entry.S > +++ b/arch/riscv/kernel/entry.S > @@ -88,7 +88,6 @@ SYM_CODE_START(handle_exception) > call riscv_v_context_nesting_start > #endif > move a0, sp /* pt_regs */ > - la ra, ret_from_exception > > /* > * MSB of cause differentiates between > @@ -97,7 +96,8 @@ SYM_CODE_START(handle_exception) > bge s4, zero, 1f > > /* Handle interrupts */ > - tail do_irq > + call do_irq > + j ret_from_exception > 1: > /* Handle other exceptions */ > slli t0, s4, RISCV_LGPTR > @@ -105,11 +105,14 @@ SYM_CODE_START(handle_exception) > la t2, excp_vect_table_end > add t0, t1, t0 > /* Check if exception code lies within bounds */ > - bgeu t0, t2, 1f > - REG_L t0, 0(t0) > - jr t0 > -1: > - tail do_trap_unknown > + bgeu t0, t2, 3f > + REG_L t1, 0(t0) > +2: jalr t1 > + j ret_from_exception > +3: > + > + la t1, do_trap_unknown > + j 2b > SYM_CODE_END(handle_exception) > ASM_NOKPROBE(handle_exception) > > diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c > index 528ec7cc9a62..5eb3d135b717 100644 > --- a/arch/riscv/kernel/stacktrace.c > +++ b/arch/riscv/kernel/stacktrace.c > @@ -16,7 +16,7 @@ > > #ifdef CONFIG_FRAME_POINTER > > -extern asmlinkage void ret_from_exception(void); > +extern asmlinkage void handle_exception(void); > > static inline int fp_is_valid(unsigned long fp, unsigned long sp) > { > @@ -70,7 +70,7 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, > fp = frame->fp; > pc = ftrace_graph_ret_addr(current, NULL, frame->ra, > &frame->ra); > - if (pc == (unsigned long)ret_from_exception) { > + if (pc == (unsigned long)handle_exception) { > if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc))) > break; > > -- > 2.25.1 > > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] riscv: Improve exception and system call latency 2024-06-07 6:13 ` [PATCH v2] " Cyril Bur 2024-07-11 14:40 ` Jisheng Zhang @ 2024-07-25 13:20 ` patchwork-bot+linux-riscv 1 sibling, 0 replies; 11+ messages in thread From: patchwork-bot+linux-riscv @ 2024-07-25 13:20 UTC (permalink / raw) To: Cyril Bur; +Cc: linux-riscv, paul.walmsley, palmer, aou, linux-kernel, antonb Hello: This patch was applied to riscv/linux.git (for-next) by Palmer Dabbelt <palmer@rivosinc.com>: On Thu, 6 Jun 2024 23:13:35 -0700 you wrote: > From: Anton Blanchard <antonb@tenstorrent.com> > > Many CPUs implement return address branch prediction as a stack. The > RISCV architecture refers to this as a return address stack (RAS). If > this gets corrupted then the CPU will mispredict at least one but > potentally many function returns. > > [...] Here is the summary with links: - [v2] riscv: Improve exception and system call latency https://git.kernel.org/riscv/c/9eb7421003d3 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html _______________________________________________ linux-riscv mailing list linux-riscv@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2024-07-25 13:20 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-12-25 4:00 [PATCH] riscv: Improve exception and system call latency Anton Blanchard 2023-12-25 10:09 ` Jisheng Zhang 2023-12-26 3:56 ` Guo Ren 2024-06-03 4:38 ` Cyril Bur 2024-06-03 6:39 ` Guo Ren 2024-06-04 8:15 ` [CAUTION - External Sender] " Cyril Bur 2024-06-05 5:52 ` Guo Ren 2024-06-05 5:53 ` Guo Ren 2024-06-07 6:13 ` [PATCH v2] " Cyril Bur 2024-07-11 14:40 ` Jisheng Zhang 2024-07-25 13:20 ` patchwork-bot+linux-riscv
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox