linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ARM: ftrace: Don't assume stack frames are contiguous in memory
@ 2024-05-27 16:12 Ard Biesheuvel
  2024-05-28  9:26 ` Thorsten Scherer
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Ard Biesheuvel @ 2024-05-27 16:12 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: arnd, linux, linus.walleij, Ard Biesheuvel, Uwe Kleine-König,
	Justin Chen, Thorsten Scherer, Florian Fainelli, Doug Berger

From: Ard Biesheuvel <ardb@kernel.org>

The frame pointer unwinder relies on a standard layout of the stack
frame, consisting of (in downward order)

Calling frame:
  PC   <---------+
  LR             |
  SP             |
  FP             |
  .. locals ..   |
Callee frame:    |
  PC             |
  LR             |
  SP             |
  FP   ----------+

where after storing its previous value on the stack, FP is made to point
at the location of PC in the callee stack frame.  The ftrace code
assumes that this activation record is pushed first, and that any stack
space for locals is allocated below this. This would imply that the
caller's value of SP can be obtained by adding 4 to FP (which points to
PC in the calling frame).

However, recent versions of GCC appear to deviate from this rule, and so
the only reliable way to obtain the caller's value of SP is to read it
from the activation record. Since this involves a read from memory
rather than simple arithmetic, we need to use the uaccess API here which
protects against inadvertent data aborts due to corruption of data on
the stack.

The plain uaccess API is ftrace instrumented itself, so to avoid
unbounded recursion, use the __get_kernel_nofault() primitive instead.

Closes: https://lore.kernel.org/all/alp44tukzo6mvcwl4ke4ehhmojrqnv6xfcdeuliybxfjfvgd3e@gpjvwj33cc76
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Closes: https://lore.kernel.org/all/d870c149-4363-43de-b0ea-7125dec5608e@broadcom.com/
Reported-by: Justin Chen <justin.chen@broadcom.com>
Cc: Thorsten Scherer <T.Scherer@eckelmann.de>
Cc: Florian Fainelli <florian.fainelli@broadcom.com>
Cc: Doug Berger <doug.berger@broadcom.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/ftrace.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
index a0b6d1e3812f..e61591f33a6c 100644
--- a/arch/arm/kernel/ftrace.c
+++ b/arch/arm/kernel/ftrace.c
@@ -232,11 +232,24 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
 	unsigned long old;
 
 	if (unlikely(atomic_read(&current->tracing_graph_pause)))
+err_out:
 		return;
 
 	if (IS_ENABLED(CONFIG_UNWINDER_FRAME_POINTER)) {
-		/* FP points one word below parent's top of stack */
-		frame_pointer += 4;
+		/*
+		 * Usually, the stack frames are contiguous in memory but cases
+		 * have been observed where the next stack frame does not live
+		 * at 'frame_pointer + 4' as this code used to assume.
+		 *
+		 * Instead, dereference the field in the stack frame that
+		 * stores the SP of the calling frame: to avoid unbounded
+		 * recursion, this cannot involve any ftrace instrumented
+		 * functions, so use the __get_kernel_nofault() primitive
+		 * directly.
+		 */
+		__get_kernel_nofault(&frame_pointer,
+				     (unsigned long *)(frame_pointer - 8),
+				     unsigned long, err_out);
 	} else {
 		struct stackframe frame = {
 			.fp = frame_pointer,
-- 
2.45.1.288.g0e0cd299f1-goog


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] ARM: ftrace: Don't assume stack frames are contiguous in memory
  2024-05-27 16:12 [PATCH] ARM: ftrace: Don't assume stack frames are contiguous in memory Ard Biesheuvel
@ 2024-05-28  9:26 ` Thorsten Scherer
  2024-05-28 11:17 ` Linus Walleij
  2024-05-28 17:04 ` Justin Chen
  2 siblings, 0 replies; 4+ messages in thread
From: Thorsten Scherer @ 2024-05-28  9:26 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, arnd, linux, linus.walleij, Ard Biesheuvel,
	Uwe Kleine-König, Justin Chen, Florian Fainelli, Doug Berger

Hello,

On Mon, May 27, 2024 at 06:12:37PM +0200, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
> 
> The frame pointer unwinder relies on a standard layout of the stack
> frame, consisting of (in downward order)
> 
> Calling frame:
>   PC   <---------+
>   LR             |
>   SP             |
>   FP             |
>   .. locals ..   |
> Callee frame:    |
>   PC             |
>   LR             |
>   SP             |
>   FP   ----------+
> 
> where after storing its previous value on the stack, FP is made to point
> at the location of PC in the callee stack frame.  The ftrace code
> assumes that this activation record is pushed first, and that any stack
> space for locals is allocated below this. This would imply that the
> caller's value of SP can be obtained by adding 4 to FP (which points to
> PC in the calling frame).
> 
> However, recent versions of GCC appear to deviate from this rule, and so
> the only reliable way to obtain the caller's value of SP is to read it
> from the activation record. Since this involves a read from memory
> rather than simple arithmetic, we need to use the uaccess API here which
> protects against inadvertent data aborts due to corruption of data on
> the stack.
> 
> The plain uaccess API is ftrace instrumented itself, so to avoid
> unbounded recursion, use the __get_kernel_nofault() primitive instead.
> 
> Closes: https://lore.kernel.org/all/alp44tukzo6mvcwl4ke4ehhmojrqnv6xfcdeuliybxfjfvgd3e@gpjvwj33cc76
> Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> Closes: https://lore.kernel.org/all/d870c149-4363-43de-b0ea-7125dec5608e@broadcom.com/
> Reported-by: Justin Chen <justin.chen@broadcom.com>
> Cc: Thorsten Scherer <T.Scherer@eckelmann.de>
> Cc: Florian Fainelli <florian.fainelli@broadcom.com>
> Cc: Doug Berger <doug.berger@broadcom.com>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

Thank you for this patch.

This fixes the issue on our i.MX25 board (test based on v6.9).

Tested-by: Thorsten Scherer <t.scherer@eckelmann.de>

Best regards
Thorsten

> ---
>  arch/arm/kernel/ftrace.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
> index a0b6d1e3812f..e61591f33a6c 100644
> --- a/arch/arm/kernel/ftrace.c
> +++ b/arch/arm/kernel/ftrace.c
> @@ -232,11 +232,24 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
>  	unsigned long old;
>  
>  	if (unlikely(atomic_read(&current->tracing_graph_pause)))
> +err_out:
>  		return;
>  
>  	if (IS_ENABLED(CONFIG_UNWINDER_FRAME_POINTER)) {
> -		/* FP points one word below parent's top of stack */
> -		frame_pointer += 4;
> +		/*
> +		 * Usually, the stack frames are contiguous in memory but cases
> +		 * have been observed where the next stack frame does not live
> +		 * at 'frame_pointer + 4' as this code used to assume.
> +		 *
> +		 * Instead, dereference the field in the stack frame that
> +		 * stores the SP of the calling frame: to avoid unbounded
> +		 * recursion, this cannot involve any ftrace instrumented
> +		 * functions, so use the __get_kernel_nofault() primitive
> +		 * directly.
> +		 */
> +		__get_kernel_nofault(&frame_pointer,
> +				     (unsigned long *)(frame_pointer - 8),
> +				     unsigned long, err_out);
>  	} else {
>  		struct stackframe frame = {
>  			.fp = frame_pointer,
> -- 
> 2.45.1.288.g0e0cd299f1-goog
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] ARM: ftrace: Don't assume stack frames are contiguous in memory
  2024-05-27 16:12 [PATCH] ARM: ftrace: Don't assume stack frames are contiguous in memory Ard Biesheuvel
  2024-05-28  9:26 ` Thorsten Scherer
@ 2024-05-28 11:17 ` Linus Walleij
  2024-05-28 17:04 ` Justin Chen
  2 siblings, 0 replies; 4+ messages in thread
From: Linus Walleij @ 2024-05-28 11:17 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, arnd, linux, Ard Biesheuvel,
	Uwe Kleine-König, Justin Chen, Thorsten Scherer,
	Florian Fainelli, Doug Berger

On Mon, May 27, 2024 at 6:12 PM Ard Biesheuvel <ardb+git@google.com> wrote:

> From: Ard Biesheuvel <ardb@kernel.org>
>
> The frame pointer unwinder relies on a standard layout of the stack
> frame, consisting of (in downward order)
>
> Calling frame:
>   PC   <---------+
>   LR             |
>   SP             |
>   FP             |
>   .. locals ..   |
> Callee frame:    |
>   PC             |
>   LR             |
>   SP             |
>   FP   ----------+
>
> where after storing its previous value on the stack, FP is made to point
> at the location of PC in the callee stack frame.  The ftrace code
> assumes that this activation record is pushed first, and that any stack
> space for locals is allocated below this. This would imply that the
> caller's value of SP can be obtained by adding 4 to FP (which points to
> PC in the calling frame).
>
> However, recent versions of GCC appear to deviate from this rule, and so
> the only reliable way to obtain the caller's value of SP is to read it
> from the activation record. Since this involves a read from memory
> rather than simple arithmetic, we need to use the uaccess API here which
> protects against inadvertent data aborts due to corruption of data on
> the stack.
>
> The plain uaccess API is ftrace instrumented itself, so to avoid
> unbounded recursion, use the __get_kernel_nofault() primitive instead.
>
> Closes: https://lore.kernel.org/all/alp44tukzo6mvcwl4ke4ehhmojrqnv6xfcdeuliybxfjfvgd3e@gpjvwj33cc76
> Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> Closes: https://lore.kernel.org/all/d870c149-4363-43de-b0ea-7125dec5608e@broadcom.com/
> Reported-by: Justin Chen <justin.chen@broadcom.com>
> Cc: Thorsten Scherer <T.Scherer@eckelmann.de>
> Cc: Florian Fainelli <florian.fainelli@broadcom.com>
> Cc: Doug Berger <doug.berger@broadcom.com>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

Good catch and nice patch.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Yours,
Linus Walleij

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] ARM: ftrace: Don't assume stack frames are contiguous in memory
  2024-05-27 16:12 [PATCH] ARM: ftrace: Don't assume stack frames are contiguous in memory Ard Biesheuvel
  2024-05-28  9:26 ` Thorsten Scherer
  2024-05-28 11:17 ` Linus Walleij
@ 2024-05-28 17:04 ` Justin Chen
  2 siblings, 0 replies; 4+ messages in thread
From: Justin Chen @ 2024-05-28 17:04 UTC (permalink / raw)
  To: Ard Biesheuvel, linux-arm-kernel
  Cc: arnd, linux, linus.walleij, Ard Biesheuvel, Uwe Kleine-König,
	Thorsten Scherer, Florian Fainelli, Doug Berger


[-- Attachment #1.1: Type: text/plain, Size: 3541 bytes --]



On 5/27/24 9:12 AM, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
> 
> The frame pointer unwinder relies on a standard layout of the stack
> frame, consisting of (in downward order)
> 
> Calling frame:
>    PC   <---------+
>    LR             |
>    SP             |
>    FP             |
>    .. locals ..   |
> Callee frame:    |
>    PC             |
>    LR             |
>    SP             |
>    FP   ----------+
> 
> where after storing its previous value on the stack, FP is made to point
> at the location of PC in the callee stack frame.  The ftrace code
> assumes that this activation record is pushed first, and that any stack
> space for locals is allocated below this. This would imply that the
> caller's value of SP can be obtained by adding 4 to FP (which points to
> PC in the calling frame).
> 
> However, recent versions of GCC appear to deviate from this rule, and so
> the only reliable way to obtain the caller's value of SP is to read it
> from the activation record. Since this involves a read from memory
> rather than simple arithmetic, we need to use the uaccess API here which
> protects against inadvertent data aborts due to corruption of data on
> the stack.
> 
> The plain uaccess API is ftrace instrumented itself, so to avoid
> unbounded recursion, use the __get_kernel_nofault() primitive instead.

Apologies this feel through the cracks. Going back to the original 
thread, I said the first proposed fix worked, but I accidentally tested 
an old configuration. I was investigating why it didn't work and never 
got back to it. Looks like you found out why.

Thanks,
Justin

> 
> Closes: https://lore.kernel.org/all/alp44tukzo6mvcwl4ke4ehhmojrqnv6xfcdeuliybxfjfvgd3e@gpjvwj33cc76
> Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> Closes: https://lore.kernel.org/all/d870c149-4363-43de-b0ea-7125dec5608e@broadcom.com/
> Reported-by: Justin Chen <justin.chen@broadcom.com>
> Cc: Thorsten Scherer <T.Scherer@eckelmann.de>
> Cc: Florian Fainelli <florian.fainelli@broadcom.com>
> Cc: Doug Berger <doug.berger@broadcom.com>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

Tested-by: Justin Chen <justin.chen@broadcom.com>

> ---
>   arch/arm/kernel/ftrace.c | 17 +++++++++++++++--
>   1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
> index a0b6d1e3812f..e61591f33a6c 100644
> --- a/arch/arm/kernel/ftrace.c
> +++ b/arch/arm/kernel/ftrace.c
> @@ -232,11 +232,24 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
>   	unsigned long old;
>   
>   	if (unlikely(atomic_read(&current->tracing_graph_pause)))
> +err_out:
>   		return;
>   
>   	if (IS_ENABLED(CONFIG_UNWINDER_FRAME_POINTER)) {
> -		/* FP points one word below parent's top of stack */
> -		frame_pointer += 4;
> +		/*
> +		 * Usually, the stack frames are contiguous in memory but cases
> +		 * have been observed where the next stack frame does not live
> +		 * at 'frame_pointer + 4' as this code used to assume.
> +		 *
> +		 * Instead, dereference the field in the stack frame that
> +		 * stores the SP of the calling frame: to avoid unbounded
> +		 * recursion, this cannot involve any ftrace instrumented
> +		 * functions, so use the __get_kernel_nofault() primitive
> +		 * directly.
> +		 */
> +		__get_kernel_nofault(&frame_pointer,
> +				     (unsigned long *)(frame_pointer - 8),
> +				     unsigned long, err_out);
>   	} else {
>   		struct stackframe frame = {
>   			.fp = frame_pointer,

[-- Attachment #1.2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4206 bytes --]

[-- Attachment #2: Type: text/plain, Size: 176 bytes --]

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2024-05-28 17:05 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-27 16:12 [PATCH] ARM: ftrace: Don't assume stack frames are contiguous in memory Ard Biesheuvel
2024-05-28  9:26 ` Thorsten Scherer
2024-05-28 11:17 ` Linus Walleij
2024-05-28 17:04 ` Justin Chen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).