* Re: [PATCH] ARM: ftrace: Don't assume stack frames are contiguous in memory
2024-05-27 16:12 [PATCH] ARM: ftrace: Don't assume stack frames are contiguous in memory Ard Biesheuvel
@ 2024-05-28 9:26 ` Thorsten Scherer
2024-05-28 11:17 ` Linus Walleij
2024-05-28 17:04 ` Justin Chen
2 siblings, 0 replies; 4+ messages in thread
From: Thorsten Scherer @ 2024-05-28 9:26 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, arnd, linux, linus.walleij, Ard Biesheuvel,
Uwe Kleine-König, Justin Chen, Florian Fainelli, Doug Berger
Hello,
On Mon, May 27, 2024 at 06:12:37PM +0200, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> The frame pointer unwinder relies on a standard layout of the stack
> frame, consisting of (in downward order)
>
> Calling frame:
> PC <---------+
> LR |
> SP |
> FP |
> .. locals .. |
> Callee frame: |
> PC |
> LR |
> SP |
> FP ----------+
>
> where after storing its previous value on the stack, FP is made to point
> at the location of PC in the callee stack frame. The ftrace code
> assumes that this activation record is pushed first, and that any stack
> space for locals is allocated below this. This would imply that the
> caller's value of SP can be obtained by adding 4 to FP (which points to
> PC in the calling frame).
>
> However, recent versions of GCC appear to deviate from this rule, and so
> the only reliable way to obtain the caller's value of SP is to read it
> from the activation record. Since this involves a read from memory
> rather than simple arithmetic, we need to use the uaccess API here which
> protects against inadvertent data aborts due to corruption of data on
> the stack.
>
> The plain uaccess API is ftrace instrumented itself, so to avoid
> unbounded recursion, use the __get_kernel_nofault() primitive instead.
>
> Closes: https://lore.kernel.org/all/alp44tukzo6mvcwl4ke4ehhmojrqnv6xfcdeuliybxfjfvgd3e@gpjvwj33cc76
> Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> Closes: https://lore.kernel.org/all/d870c149-4363-43de-b0ea-7125dec5608e@broadcom.com/
> Reported-by: Justin Chen <justin.chen@broadcom.com>
> Cc: Thorsten Scherer <T.Scherer@eckelmann.de>
> Cc: Florian Fainelli <florian.fainelli@broadcom.com>
> Cc: Doug Berger <doug.berger@broadcom.com>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Thank you for this patch.
This fixes the issue on our i.MX25 board (test based on v6.9).
Tested-by: Thorsten Scherer <t.scherer@eckelmann.de>
Best regards
Thorsten
> ---
> arch/arm/kernel/ftrace.c | 17 +++++++++++++++--
> 1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
> index a0b6d1e3812f..e61591f33a6c 100644
> --- a/arch/arm/kernel/ftrace.c
> +++ b/arch/arm/kernel/ftrace.c
> @@ -232,11 +232,24 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
> unsigned long old;
>
> if (unlikely(atomic_read(¤t->tracing_graph_pause)))
> +err_out:
> return;
>
> if (IS_ENABLED(CONFIG_UNWINDER_FRAME_POINTER)) {
> - /* FP points one word below parent's top of stack */
> - frame_pointer += 4;
> + /*
> + * Usually, the stack frames are contiguous in memory but cases
> + * have been observed where the next stack frame does not live
> + * at 'frame_pointer + 4' as this code used to assume.
> + *
> + * Instead, dereference the field in the stack frame that
> + * stores the SP of the calling frame: to avoid unbounded
> + * recursion, this cannot involve any ftrace instrumented
> + * functions, so use the __get_kernel_nofault() primitive
> + * directly.
> + */
> + __get_kernel_nofault(&frame_pointer,
> + (unsigned long *)(frame_pointer - 8),
> + unsigned long, err_out);
> } else {
> struct stackframe frame = {
> .fp = frame_pointer,
> --
> 2.45.1.288.g0e0cd299f1-goog
>
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH] ARM: ftrace: Don't assume stack frames are contiguous in memory
2024-05-27 16:12 [PATCH] ARM: ftrace: Don't assume stack frames are contiguous in memory Ard Biesheuvel
2024-05-28 9:26 ` Thorsten Scherer
@ 2024-05-28 11:17 ` Linus Walleij
2024-05-28 17:04 ` Justin Chen
2 siblings, 0 replies; 4+ messages in thread
From: Linus Walleij @ 2024-05-28 11:17 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, arnd, linux, Ard Biesheuvel,
Uwe Kleine-König, Justin Chen, Thorsten Scherer,
Florian Fainelli, Doug Berger
On Mon, May 27, 2024 at 6:12 PM Ard Biesheuvel <ardb+git@google.com> wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> The frame pointer unwinder relies on a standard layout of the stack
> frame, consisting of (in downward order)
>
> Calling frame:
> PC <---------+
> LR |
> SP |
> FP |
> .. locals .. |
> Callee frame: |
> PC |
> LR |
> SP |
> FP ----------+
>
> where after storing its previous value on the stack, FP is made to point
> at the location of PC in the callee stack frame. The ftrace code
> assumes that this activation record is pushed first, and that any stack
> space for locals is allocated below this. This would imply that the
> caller's value of SP can be obtained by adding 4 to FP (which points to
> PC in the calling frame).
>
> However, recent versions of GCC appear to deviate from this rule, and so
> the only reliable way to obtain the caller's value of SP is to read it
> from the activation record. Since this involves a read from memory
> rather than simple arithmetic, we need to use the uaccess API here which
> protects against inadvertent data aborts due to corruption of data on
> the stack.
>
> The plain uaccess API is ftrace instrumented itself, so to avoid
> unbounded recursion, use the __get_kernel_nofault() primitive instead.
>
> Closes: https://lore.kernel.org/all/alp44tukzo6mvcwl4ke4ehhmojrqnv6xfcdeuliybxfjfvgd3e@gpjvwj33cc76
> Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> Closes: https://lore.kernel.org/all/d870c149-4363-43de-b0ea-7125dec5608e@broadcom.com/
> Reported-by: Justin Chen <justin.chen@broadcom.com>
> Cc: Thorsten Scherer <T.Scherer@eckelmann.de>
> Cc: Florian Fainelli <florian.fainelli@broadcom.com>
> Cc: Doug Berger <doug.berger@broadcom.com>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Good catch and nice patch.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Yours,
Linus Walleij
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] ARM: ftrace: Don't assume stack frames are contiguous in memory
2024-05-27 16:12 [PATCH] ARM: ftrace: Don't assume stack frames are contiguous in memory Ard Biesheuvel
2024-05-28 9:26 ` Thorsten Scherer
2024-05-28 11:17 ` Linus Walleij
@ 2024-05-28 17:04 ` Justin Chen
2 siblings, 0 replies; 4+ messages in thread
From: Justin Chen @ 2024-05-28 17:04 UTC (permalink / raw)
To: Ard Biesheuvel, linux-arm-kernel
Cc: arnd, linux, linus.walleij, Ard Biesheuvel, Uwe Kleine-König,
Thorsten Scherer, Florian Fainelli, Doug Berger
[-- Attachment #1.1: Type: text/plain, Size: 3541 bytes --]
On 5/27/24 9:12 AM, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> The frame pointer unwinder relies on a standard layout of the stack
> frame, consisting of (in downward order)
>
> Calling frame:
> PC <---------+
> LR |
> SP |
> FP |
> .. locals .. |
> Callee frame: |
> PC |
> LR |
> SP |
> FP ----------+
>
> where after storing its previous value on the stack, FP is made to point
> at the location of PC in the callee stack frame. The ftrace code
> assumes that this activation record is pushed first, and that any stack
> space for locals is allocated below this. This would imply that the
> caller's value of SP can be obtained by adding 4 to FP (which points to
> PC in the calling frame).
>
> However, recent versions of GCC appear to deviate from this rule, and so
> the only reliable way to obtain the caller's value of SP is to read it
> from the activation record. Since this involves a read from memory
> rather than simple arithmetic, we need to use the uaccess API here which
> protects against inadvertent data aborts due to corruption of data on
> the stack.
>
> The plain uaccess API is ftrace instrumented itself, so to avoid
> unbounded recursion, use the __get_kernel_nofault() primitive instead.
Apologies this feel through the cracks. Going back to the original
thread, I said the first proposed fix worked, but I accidentally tested
an old configuration. I was investigating why it didn't work and never
got back to it. Looks like you found out why.
Thanks,
Justin
>
> Closes: https://lore.kernel.org/all/alp44tukzo6mvcwl4ke4ehhmojrqnv6xfcdeuliybxfjfvgd3e@gpjvwj33cc76
> Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> Closes: https://lore.kernel.org/all/d870c149-4363-43de-b0ea-7125dec5608e@broadcom.com/
> Reported-by: Justin Chen <justin.chen@broadcom.com>
> Cc: Thorsten Scherer <T.Scherer@eckelmann.de>
> Cc: Florian Fainelli <florian.fainelli@broadcom.com>
> Cc: Doug Berger <doug.berger@broadcom.com>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Justin Chen <justin.chen@broadcom.com>
> ---
> arch/arm/kernel/ftrace.c | 17 +++++++++++++++--
> 1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
> index a0b6d1e3812f..e61591f33a6c 100644
> --- a/arch/arm/kernel/ftrace.c
> +++ b/arch/arm/kernel/ftrace.c
> @@ -232,11 +232,24 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
> unsigned long old;
>
> if (unlikely(atomic_read(¤t->tracing_graph_pause)))
> +err_out:
> return;
>
> if (IS_ENABLED(CONFIG_UNWINDER_FRAME_POINTER)) {
> - /* FP points one word below parent's top of stack */
> - frame_pointer += 4;
> + /*
> + * Usually, the stack frames are contiguous in memory but cases
> + * have been observed where the next stack frame does not live
> + * at 'frame_pointer + 4' as this code used to assume.
> + *
> + * Instead, dereference the field in the stack frame that
> + * stores the SP of the calling frame: to avoid unbounded
> + * recursion, this cannot involve any ftrace instrumented
> + * functions, so use the __get_kernel_nofault() primitive
> + * directly.
> + */
> + __get_kernel_nofault(&frame_pointer,
> + (unsigned long *)(frame_pointer - 8),
> + unsigned long, err_out);
> } else {
> struct stackframe frame = {
> .fp = frame_pointer,
[-- Attachment #1.2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4206 bytes --]
[-- Attachment #2: Type: text/plain, Size: 176 bytes --]
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 4+ messages in thread