From: Jisheng Zhang <jszhang@kernel.org>
To: Anton Blanchard <antonb@tenstorrent.com>
Cc: paul.walmsley@sifive.com, palmer@dabbelt.com,
	aou@eecs.berkeley.edu, linux-riscv@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] riscv: Improve exception and system call latency
Date: Mon, 25 Dec 2023 18:09:11 +0800
Message-ID: <ZYlUxxeEmvewNzyL@xhacker>
In-Reply-To: <20231225040018.1660554-1-antonb@tenstorrent.com>

On Sun, Dec 24, 2023 at 08:00:18PM -0800, Anton Blanchard wrote:
> Many CPUs implement return address branch prediction as a stack. The
> RISCV architecture refers to this as a return address stack (RAS). If
> this gets corrupted then the CPU will mispredict at least one but
> potentially many function returns.
> 
> There are two issues with the current RISCV exception code:
> 
> - We are using the alternate link register (x5/t0) for the indirect branch
>   which makes the hardware think this is a function return. This will
>   corrupt the RAS.
> 
> - We modify the return address of handle_exception to point to
>   ret_from_exception. This will also corrupt the RAS.
> 
> Testing the null system call latency before and after the patch:
> 
> Visionfive2 (StarFive JH7110 / U74)
> baseline: 189.87 ns
> patched:  176.76 ns
> 
> Lichee pi 4a (T-Head TH1520 / C910)
> baseline: 666.58 ns
> patched:  636.90 ns
> 
> Just over 7% on the U74 and just over 4% on the C910.

Nice improvement!

> 
> Signed-off-by: Anton Blanchard <antonb@tenstorrent.com>
> ---
> 
> This introduces some complexity in the stackframe walk code. PowerPC
> resolves the multiple exception exit paths issue by placing a value into
> the exception stack frame (basically the word "REGS") that the stack frame
> code can look for. Perhaps something to look at.
> 
>  arch/riscv/kernel/entry.S      | 21 ++++++++++++++-------
>  arch/riscv/kernel/stacktrace.c | 14 +++++++++++++-
>  2 files changed, 27 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
> index 54ca4564a926..89af35edbf6c 100644
> --- a/arch/riscv/kernel/entry.S
> +++ b/arch/riscv/kernel/entry.S
> @@ -84,7 +84,6 @@ SYM_CODE_START(handle_exception)
>  	scs_load_current_if_task_changed s5
>  
>  	move a0, sp /* pt_regs */
> -	la ra, ret_from_exception
>  
>  	/*
>  	 * MSB of cause differentiates between
> @@ -93,7 +92,10 @@ SYM_CODE_START(handle_exception)
>  	bge s4, zero, 1f
>  
>  	/* Handle interrupts */
> -	tail do_irq
> +	call do_irq
> +.globl ret_from_irq_exception
> +ret_from_irq_exception:
> +	j ret_from_exception
>  1:
>  	/* Handle other exceptions */
>  	slli t0, s4, RISCV_LGPTR
> @@ -101,11 +103,16 @@ SYM_CODE_START(handle_exception)
>  	la t2, excp_vect_table_end
>  	add t0, t1, t0
>  	/* Check if exception code lies within bounds */
> -	bgeu t0, t2, 1f
> -	REG_L t0, 0(t0)
> -	jr t0
> -1:
> -	tail do_trap_unknown
> +	bgeu t0, t2, 3f
> +	REG_L t1, 0(t0)
> +2:	jalr ra,t1

This can be simplified to:
	jalr t1

With the above change:
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>

> +.globl ret_from_other_exception
> +ret_from_other_exception:
> +	j ret_from_exception
> +3:
> +
> +	la t1, do_trap_unknown
> +	j 2b
>  SYM_CODE_END(handle_exception)
>  
>  /*
> diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
> index 64a9c093aef9..b9cd131bbc4c 100644
> --- a/arch/riscv/kernel/stacktrace.c
> +++ b/arch/riscv/kernel/stacktrace.c
> @@ -17,6 +17,18 @@
>  #ifdef CONFIG_FRAME_POINTER
>  
>  extern asmlinkage void ret_from_exception(void);
> +extern asmlinkage void ret_from_irq_exception(void);
> +extern asmlinkage void ret_from_other_exception(void);
> +
> +static inline bool is_exception_frame(unsigned long pc)
> +{
> +	if ((pc == (unsigned long)ret_from_exception) ||
> +	    (pc == (unsigned long)ret_from_irq_exception) ||
> +	    (pc == (unsigned long)ret_from_other_exception))
> +		return true;
> +
> +	return false;
> +}
>  
>  void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
>  			     bool (*fn)(void *, unsigned long), void *arg)
> @@ -62,7 +74,7 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
>  			fp = frame->fp;
>  			pc = ftrace_graph_ret_addr(current, NULL, frame->ra,
>  						   &frame->ra);
> -			if (pc == (unsigned long)ret_from_exception) {
> +			if (is_exception_frame(pc)) {
>  				if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc)))
>  					break;
>  
> -- 
> 2.25.1
> 
> 
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv



Thread overview:
2023-12-25  4:00 [PATCH] riscv: Improve exception and system call latency Anton Blanchard
2023-12-25 10:09 ` Jisheng Zhang [this message]
2023-12-26  3:56 ` Guo Ren
2024-06-03  4:38   ` Cyril Bur
2024-06-03  6:39     ` Guo Ren
2024-06-04  8:15       ` [CAUTION - External Sender] " Cyril Bur
2024-06-05  5:52         ` Guo Ren
2024-06-05  5:53           ` Guo Ren
2024-06-07  6:13 ` [PATCH v2] " Cyril Bur
2024-07-11 14:40   ` Jisheng Zhang
2024-07-25 13:20   ` patchwork-bot+linux-riscv