* [PATCH 01/10] arm64: pt_regs: assert pt_regs is a multiple of 16 bytes
2024-10-10 10:15 [PATCH 00/10] arm64: stacktrace: improve unwind reporting Mark Rutland
@ 2024-10-10 10:15 ` Mark Rutland
2024-10-10 14:23 ` Mark Brown
2024-10-10 10:15 ` [PATCH 02/10] arm64: pt_regs: remove stale big-endian layout Mark Rutland
` (10 subsequent siblings)
11 siblings, 1 reply; 31+ messages in thread
From: Mark Rutland @ 2024-10-10 10:15 UTC (permalink / raw)
To: linux-arm-kernel
Cc: ardb, broonie, catalin.marinas, jpoimboe, kaleshsingh, madvenka,
mark.rutland, maz, mbenes, puranjay12, will
To ensure that the stack is correctly aligned when branching to C code,
we require struct pt_regs to be a multiple of 16 bytes, as noted in a
comment.
Add an explicit assertion for this, so that any accidental violation of
this requirement will be caught by the compiler.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Puranjay Mohan <puranjay12@gmail.com>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/ptrace.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 0abe975d68a8e..10eca3a3fed49 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -149,8 +149,7 @@ static inline unsigned long pstate_to_compat_psr(const unsigned long pstate)
/*
* This struct defines the way the registers are stored on the stack during an
- * exception. Note that sizeof(struct pt_regs) has to be a multiple of 16 (for
- * stack alignment). struct user_pt_regs must form a prefix of struct pt_regs.
+ * exception. struct user_pt_regs must form a prefix of struct pt_regs.
*/
struct pt_regs {
union {
@@ -180,6 +179,9 @@ struct pt_regs {
u64 exit_rcu;
};
+/* For correct stack alignment, pt_regs has to be a multiple of 16 bytes. */
+static_assert(IS_ALIGNED(sizeof(struct pt_regs), 16));
+
static inline bool in_syscall(struct pt_regs const *regs)
{
return regs->syscallno != NO_SYSCALL;
--
2.30.2
^ permalink raw reply related [flat|nested] 31+ messages in thread* Re: [PATCH 01/10] arm64: pt_regs: assert pt_regs is a multiple of 16 bytes
2024-10-10 10:15 ` [PATCH 01/10] arm64: pt_regs: assert pt_regs is a multiple of 16 bytes Mark Rutland
@ 2024-10-10 14:23 ` Mark Brown
0 siblings, 0 replies; 31+ messages in thread
From: Mark Brown @ 2024-10-10 14:23 UTC (permalink / raw)
To: Mark Rutland
Cc: linux-arm-kernel, ardb, catalin.marinas, jpoimboe, kaleshsingh,
madvenka, maz, mbenes, puranjay12, will
[-- Attachment #1: Type: text/plain, Size: 402 bytes --]
On Thu, Oct 10, 2024 at 11:15:01AM +0100, Mark Rutland wrote:
> To ensure that the stack is correctly aligned when branching to C code,
> we require struct pt_regs to be a multiple of 16 bytes, as noted in a
> comment.
>
> Add an explicit assertion for this, so that any accidental violation of
> this requirement will be caught by the compiler.
Reviewed-by: Mark Brown <broonie@kernel.org>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 02/10] arm64: pt_regs: remove stale big-endian layout
2024-10-10 10:15 [PATCH 00/10] arm64: stacktrace: improve unwind reporting Mark Rutland
2024-10-10 10:15 ` [PATCH 01/10] arm64: pt_regs: assert pt_regs is a multiple of 16 bytes Mark Rutland
@ 2024-10-10 10:15 ` Mark Rutland
2024-10-10 14:27 ` Mark Brown
2024-10-10 10:15 ` [PATCH 03/10] arm64: pt_regs: rename "pmr_save" -> "pmr" Mark Rutland
` (9 subsequent siblings)
11 siblings, 1 reply; 31+ messages in thread
From: Mark Rutland @ 2024-10-10 10:15 UTC (permalink / raw)
To: linux-arm-kernel
Cc: ardb, broonie, catalin.marinas, jpoimboe, kaleshsingh, madvenka,
mark.rutland, maz, mbenes, puranjay12, will
For historical reasons the layout of struct pt_regs is dependent on the
ocnfigured endianness, with the order of the 'sycallno' and 'unused2'
fields varying dependent upon whether __AARCH64EB__ is defined. We no
longer depend on the order of these two fields and can remove the
ifdeffery.
The current conditional layout was introduced in commit:
35d0e6fb4d219d64 ("arm64: syscallno is secretly an int, make it official")
At the time, this was necessary so that the entry assembly could use a
single STP instruction to save the pt_regs::{orig_x0,syscallno} fields,
without logic that was conditional on the endianness of the kernel:
| el0_svc_naked:
| stp x0, xscno, [sp, #S_ORIG_X0] // save the original x0 and syscall number
This logic was converted to C in commit:
f37099b6992a0b81 ("arm64: convert syscall trace logic to C")
Since that commit, we no longer manipulate pt_regs::orig_x0 from
assembly, and only manipulate pt_regs::syscallno as a 32-bit quantity
early in the kernel_entry assembly:
| /* Not in a syscall by default (el0_svc overwrites for real syscall) */
| .if \el == 0
| mov w21, #NO_SYSCALL
| str w21, [sp, #S_SYSCALLNO]
| .endif
Given the above, there's no longer a need for the layout of
pt_regs::{syscallno,unused2} to depend on the endianness of the kernel.
This patch removes the ifdeffery and places 'syscallno' before 'unused2'
regardless of the endianess of the kernel. At the same time, 'unused2'
is renamed to 'unused', as it is the only unused field within pt_regs.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Puranjay Mohan <puranjay12@gmail.com>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/ptrace.h | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 10eca3a3fed49..1ae671fe64d86 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -162,13 +162,9 @@ struct pt_regs {
};
};
u64 orig_x0;
-#ifdef __AARCH64EB__
- u32 unused2;
s32 syscallno;
-#else
- s32 syscallno;
- u32 unused2;
-#endif
+ u32 unused;
+
u64 sdei_ttbr1;
/* Only valid when ARM64_HAS_GIC_PRIO_MASKING is enabled. */
u64 pmr_save;
--
2.30.2
^ permalink raw reply related [flat|nested] 31+ messages in thread* Re: [PATCH 02/10] arm64: pt_regs: remove stale big-endian layout
2024-10-10 10:15 ` [PATCH 02/10] arm64: pt_regs: remove stale big-endian layout Mark Rutland
@ 2024-10-10 14:27 ` Mark Brown
2024-10-11 12:16 ` Miroslav Benes
0 siblings, 1 reply; 31+ messages in thread
From: Mark Brown @ 2024-10-10 14:27 UTC (permalink / raw)
To: Mark Rutland
Cc: linux-arm-kernel, ardb, catalin.marinas, jpoimboe, kaleshsingh,
madvenka, maz, mbenes, puranjay12, will
[-- Attachment #1: Type: text/plain, Size: 267 bytes --]
On Thu, Oct 10, 2024 at 11:15:02AM +0100, Mark Rutland wrote:
> For historical reasons the layout of struct pt_regs is dependent on the
> ocnfigured endianness, with the order of the 'sycallno' and 'unused2'
configured
Reviewed-by: Mark Brown <broonie@kernel.org>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 02/10] arm64: pt_regs: remove stale big-endian layout
2024-10-10 14:27 ` Mark Brown
@ 2024-10-11 12:16 ` Miroslav Benes
0 siblings, 0 replies; 31+ messages in thread
From: Miroslav Benes @ 2024-10-11 12:16 UTC (permalink / raw)
To: Mark Brown
Cc: Mark Rutland, linux-arm-kernel, ardb, catalin.marinas, jpoimboe,
kaleshsingh, madvenka, maz, puranjay12, will
On Thu, 10 Oct 2024, Mark Brown wrote:
> On Thu, Oct 10, 2024 at 11:15:02AM +0100, Mark Rutland wrote:
>
> > For historical reasons the layout of struct pt_regs is dependent on the
> > ocnfigured endianness, with the order of the 'sycallno' and 'unused2'
>
> configured
and 'syscallno'
Miroslav
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 03/10] arm64: pt_regs: rename "pmr_save" -> "pmr"
2024-10-10 10:15 [PATCH 00/10] arm64: stacktrace: improve unwind reporting Mark Rutland
2024-10-10 10:15 ` [PATCH 01/10] arm64: pt_regs: assert pt_regs is a multiple of 16 bytes Mark Rutland
2024-10-10 10:15 ` [PATCH 02/10] arm64: pt_regs: remove stale big-endian layout Mark Rutland
@ 2024-10-10 10:15 ` Mark Rutland
2024-10-10 14:27 ` Mark Brown
2024-10-10 10:15 ` [PATCH 04/10] arm64: pt_regs: swap 'unused' and 'pmr' fields Mark Rutland
` (8 subsequent siblings)
11 siblings, 1 reply; 31+ messages in thread
From: Mark Rutland @ 2024-10-10 10:15 UTC (permalink / raw)
To: linux-arm-kernel
Cc: ardb, broonie, catalin.marinas, jpoimboe, kaleshsingh, madvenka,
mark.rutland, maz, mbenes, puranjay12, will
The pt_regs::pmr_save field is weirdly named relative to all other
pt_regs fields, with a '_save' suffix that doesn't make anything clearer
and only leads to more typing to access the field.
Remove the '_save' suffix.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Puranjay Mohan <puranjay12@gmail.com>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/daifflags.h | 2 +-
arch/arm64/include/asm/processor.h | 2 +-
arch/arm64/include/asm/ptrace.h | 4 ++--
arch/arm64/kernel/asm-offsets.c | 2 +-
arch/arm64/kernel/entry.S | 4 ++--
arch/arm64/kernel/process.c | 2 +-
6 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/daifflags.h b/arch/arm64/include/asm/daifflags.h
index 55f57dfa8e2fe..fbb5c99eb2f9d 100644
--- a/arch/arm64/include/asm/daifflags.h
+++ b/arch/arm64/include/asm/daifflags.h
@@ -132,7 +132,7 @@ static inline void local_daif_inherit(struct pt_regs *regs)
trace_hardirqs_on();
if (system_uses_irq_prio_masking())
- gic_write_pmr(regs->pmr_save);
+ gic_write_pmr(regs->pmr);
/*
* We can't use local_daif_restore(regs->pstate) here as
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 1438424f00643..03b99164f7f46 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -293,7 +293,7 @@ static inline void start_thread_common(struct pt_regs *regs, unsigned long pc)
regs->pc = pc;
if (system_uses_irq_prio_masking())
- regs->pmr_save = GIC_PRIO_IRQON;
+ regs->pmr = GIC_PRIO_IRQON;
}
static inline void start_thread(struct pt_regs *regs, unsigned long pc,
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 1ae671fe64d86..e82e6e47ac9a4 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -167,7 +167,7 @@ struct pt_regs {
u64 sdei_ttbr1;
/* Only valid when ARM64_HAS_GIC_PRIO_MASKING is enabled. */
- u64 pmr_save;
+ u64 pmr;
u64 stackframe[2];
/* Only valid for some EL1 exceptions. */
@@ -211,7 +211,7 @@ static inline void forget_syscall(struct pt_regs *regs)
#define irqs_priority_unmasked(regs) \
(system_uses_irq_prio_masking() ? \
- (regs)->pmr_save == GIC_PRIO_IRQON : \
+ (regs)->pmr == GIC_PRIO_IRQON : \
true)
#define interrupts_enabled(regs) \
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 27de1dddb0abe..eb7fb2f9b9274 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -79,7 +79,7 @@ int main(void)
DEFINE(S_PSTATE, offsetof(struct pt_regs, pstate));
DEFINE(S_SYSCALLNO, offsetof(struct pt_regs, syscallno));
DEFINE(S_SDEI_TTBR1, offsetof(struct pt_regs, sdei_ttbr1));
- DEFINE(S_PMR_SAVE, offsetof(struct pt_regs, pmr_save));
+ DEFINE(S_PMR, offsetof(struct pt_regs, pmr));
DEFINE(S_STACKFRAME, offsetof(struct pt_regs, stackframe));
DEFINE(PT_REGS_SIZE, sizeof(struct pt_regs));
BLANK();
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 7ef0e127b149f..a84ce95ad96ce 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -315,7 +315,7 @@ alternative_if_not ARM64_HAS_GIC_PRIO_MASKING
alternative_else_nop_endif
mrs_s x20, SYS_ICC_PMR_EL1
- str x20, [sp, #S_PMR_SAVE]
+ str x20, [sp, #S_PMR]
mov x20, #GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET
msr_s SYS_ICC_PMR_EL1, x20
@@ -342,7 +342,7 @@ alternative_if_not ARM64_HAS_GIC_PRIO_MASKING
b .Lskip_pmr_restore\@
alternative_else_nop_endif
- ldr x20, [sp, #S_PMR_SAVE]
+ ldr x20, [sp, #S_PMR]
msr_s SYS_ICC_PMR_EL1, x20
/* Ensure priority change is seen by redistributor */
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 0540653fbf382..2951b5ba19378 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -227,7 +227,7 @@ void __show_regs(struct pt_regs *regs)
printk("sp : %016llx\n", sp);
if (system_uses_irq_prio_masking())
- printk("pmr_save: %08llx\n", regs->pmr_save);
+ printk("pmr: %08llx\n", regs->pmr);
i = top_reg;
--
2.30.2
^ permalink raw reply related [flat|nested] 31+ messages in thread* Re: [PATCH 03/10] arm64: pt_regs: rename "pmr_save" -> "pmr"
2024-10-10 10:15 ` [PATCH 03/10] arm64: pt_regs: rename "pmr_save" -> "pmr" Mark Rutland
@ 2024-10-10 14:27 ` Mark Brown
0 siblings, 0 replies; 31+ messages in thread
From: Mark Brown @ 2024-10-10 14:27 UTC (permalink / raw)
To: Mark Rutland
Cc: linux-arm-kernel, ardb, catalin.marinas, jpoimboe, kaleshsingh,
madvenka, maz, mbenes, puranjay12, will
[-- Attachment #1: Type: text/plain, Size: 345 bytes --]
On Thu, Oct 10, 2024 at 11:15:03AM +0100, Mark Rutland wrote:
> The pt_regs::pmr_save field is weirdly named relative to all other
> pt_regs fields, with a '_save' suffix that doesn't make anything clearer
> and only leads to more typing to access the field.
>
> Remove the '_save' suffix.
Reviewed-by: Mark Brown <broonie@kernel.org>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 04/10] arm64: pt_regs: swap 'unused' and 'pmr' fields
2024-10-10 10:15 [PATCH 00/10] arm64: stacktrace: improve unwind reporting Mark Rutland
` (2 preceding siblings ...)
2024-10-10 10:15 ` [PATCH 03/10] arm64: pt_regs: rename "pmr_save" -> "pmr" Mark Rutland
@ 2024-10-10 10:15 ` Mark Rutland
2024-10-10 14:33 ` Mark Brown
2024-10-10 10:15 ` [PATCH 05/10] arm64: use a common struct frame_record Mark Rutland
` (7 subsequent siblings)
11 siblings, 1 reply; 31+ messages in thread
From: Mark Rutland @ 2024-10-10 10:15 UTC (permalink / raw)
To: linux-arm-kernel
Cc: ardb, broonie, catalin.marinas, jpoimboe, kaleshsingh, madvenka,
mark.rutland, maz, mbenes, puranjay12, will
In subsequent patches we'll want to add an additional u64 to struct
pt_regs. To make space, this patch swaps the 'unused' and 'pmr' fields,
as the 'pmr' value only requires bits[7:0] and can safely fit into a
u32, which frees up a 64-bit unused field.
The 'lockdep_hardirqs' and 'exit_rcu' fields should eventually be moved
out of pt_regs and managed locally within entry-common.c, so I've left
those as-is for the moment.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Puranjay Mohan <puranjay12@gmail.com>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/ptrace.h | 6 +++---
arch/arm64/kernel/entry.S | 4 ++--
arch/arm64/kernel/process.c | 2 +-
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index e82e6e47ac9a4..92531aeba5310 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -163,11 +163,11 @@ struct pt_regs {
};
u64 orig_x0;
s32 syscallno;
- u32 unused;
+ u32 pmr;
u64 sdei_ttbr1;
- /* Only valid when ARM64_HAS_GIC_PRIO_MASKING is enabled. */
- u64 pmr;
+ u64 unused;
+
u64 stackframe[2];
/* Only valid for some EL1 exceptions. */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index a84ce95ad96ce..fa6d6d5ca5e02 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -315,7 +315,7 @@ alternative_if_not ARM64_HAS_GIC_PRIO_MASKING
alternative_else_nop_endif
mrs_s x20, SYS_ICC_PMR_EL1
- str x20, [sp, #S_PMR]
+ str w20, [sp, #S_PMR]
mov x20, #GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET
msr_s SYS_ICC_PMR_EL1, x20
@@ -342,7 +342,7 @@ alternative_if_not ARM64_HAS_GIC_PRIO_MASKING
b .Lskip_pmr_restore\@
alternative_else_nop_endif
- ldr x20, [sp, #S_PMR]
+ ldr w20, [sp, #S_PMR]
msr_s SYS_ICC_PMR_EL1, x20
/* Ensure priority change is seen by redistributor */
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 2951b5ba19378..c722c1be6fa59 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -227,7 +227,7 @@ void __show_regs(struct pt_regs *regs)
printk("sp : %016llx\n", sp);
if (system_uses_irq_prio_masking())
- printk("pmr: %08llx\n", regs->pmr);
+ printk("pmr: %08x\n", regs->pmr);
i = top_reg;
--
2.30.2
^ permalink raw reply related [flat|nested] 31+ messages in thread* Re: [PATCH 04/10] arm64: pt_regs: swap 'unused' and 'pmr' fields
2024-10-10 10:15 ` [PATCH 04/10] arm64: pt_regs: swap 'unused' and 'pmr' fields Mark Rutland
@ 2024-10-10 14:33 ` Mark Brown
0 siblings, 0 replies; 31+ messages in thread
From: Mark Brown @ 2024-10-10 14:33 UTC (permalink / raw)
To: Mark Rutland
Cc: linux-arm-kernel, ardb, catalin.marinas, jpoimboe, kaleshsingh,
madvenka, maz, mbenes, puranjay12, will
[-- Attachment #1: Type: text/plain, Size: 368 bytes --]
On Thu, Oct 10, 2024 at 11:15:04AM +0100, Mark Rutland wrote:
> In subsequent patches we'll want to add an additional u64 to struct
> pt_regs. To make space, this patch swaps the 'unused' and 'pmr' fields,
> as the 'pmr' value only requires bits[7:0] and can safely fit into a
> u32, which frees up a 64-bit unused field.
Reviewed-by: Mark Brown <broonie@kernel.org>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 05/10] arm64: use a common struct frame_record
2024-10-10 10:15 [PATCH 00/10] arm64: stacktrace: improve unwind reporting Mark Rutland
` (3 preceding siblings ...)
2024-10-10 10:15 ` [PATCH 04/10] arm64: pt_regs: swap 'unused' and 'pmr' fields Mark Rutland
@ 2024-10-10 10:15 ` Mark Rutland
2024-10-10 14:40 ` Mark Brown
2024-10-11 12:36 ` Miroslav Benes
2024-10-10 10:15 ` [PATCH 06/10] arm64: stacktrace: move dump_backtrace() to kunwind_stack_walk() Mark Rutland
` (6 subsequent siblings)
11 siblings, 2 replies; 31+ messages in thread
From: Mark Rutland @ 2024-10-10 10:15 UTC (permalink / raw)
To: linux-arm-kernel
Cc: ardb, broonie, catalin.marinas, jpoimboe, kaleshsingh, madvenka,
mark.rutland, maz, mbenes, puranjay12, will
Currnetly the signal handling code has its own struct frame_record,
the definition of struct pt_regs open-codes a frame record as an array,
and the kernel unwinder hard-codes frame record offsets.
Move to a common struct frame_record that can be used throughout the
kernel.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Puranjay Mohan <puranjay12@gmail.com>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/ptrace.h | 4 +++-
arch/arm64/include/asm/stacktrace/common.h | 8 +++++---
arch/arm64/include/asm/stacktrace/frame.h | 13 +++++++++++++
arch/arm64/kernel/process.c | 2 +-
arch/arm64/kernel/signal.c | 5 -----
arch/arm64/kernel/stacktrace.c | 2 +-
6 files changed, 23 insertions(+), 11 deletions(-)
create mode 100644 arch/arm64/include/asm/stacktrace/frame.h
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 92531aeba5310..89c02f85f4b11 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -98,6 +98,8 @@
#include <linux/bug.h>
#include <linux/types.h>
+#include <asm/stacktrace/frame.h>
+
/* sizeof(struct user) for AArch32 */
#define COMPAT_USER_SZ 296
@@ -168,7 +170,7 @@ struct pt_regs {
u64 sdei_ttbr1;
u64 unused;
- u64 stackframe[2];
+ struct frame_record stackframe;
/* Only valid for some EL1 exceptions. */
u64 lockdep_hardirqs;
diff --git a/arch/arm64/include/asm/stacktrace/common.h b/arch/arm64/include/asm/stacktrace/common.h
index f63dc654e545f..7fab6876e4970 100644
--- a/arch/arm64/include/asm/stacktrace/common.h
+++ b/arch/arm64/include/asm/stacktrace/common.h
@@ -137,21 +137,23 @@ static inline int unwind_consume_stack(struct unwind_state *state,
static inline int
unwind_next_frame_record(struct unwind_state *state)
{
+ struct frame_record *record;
unsigned long fp = state->fp;
int err;
if (fp & 0x7)
return -EINVAL;
- err = unwind_consume_stack(state, fp, 16);
+ err = unwind_consume_stack(state, fp, sizeof(*record));
if (err)
return err;
/*
* Record this frame record's values.
*/
- state->fp = READ_ONCE(*(unsigned long *)(fp));
- state->pc = READ_ONCE(*(unsigned long *)(fp + 8));
+ record = (struct frame_record *)fp;
+ state->fp = READ_ONCE(record->fp);
+ state->pc = READ_ONCE(record->lr);
return 0;
}
diff --git a/arch/arm64/include/asm/stacktrace/frame.h b/arch/arm64/include/asm/stacktrace/frame.h
new file mode 100644
index 0000000000000..6397bc847f147
--- /dev/null
+++ b/arch/arm64/include/asm/stacktrace/frame.h
@@ -0,0 +1,13 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_STACKTRACE_FRAME_H
+#define __ASM_STACKTRACE_FRAME_H
+
+/*
+ * A standard AAPCS64 frame record.
+ */
+struct frame_record {
+ u64 fp;
+ u64 lr;
+};
+
+#endif /* __ASM_STACKTRACE_FRAME_H */
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index c722c1be6fa59..d45fd114eac3f 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -419,7 +419,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
* For the benefit of the unwinder, set up childregs->stackframe
* as the final frame for the new task.
*/
- p->thread.cpu_context.fp = (unsigned long)childregs->stackframe;
+ p->thread.cpu_context.fp = (unsigned long)&childregs->stackframe;
ptrace_hw_copy_thread(p);
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 5619869475304..2c47f9a0e40ba 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -42,11 +42,6 @@ struct rt_sigframe {
struct ucontext uc;
};
-struct frame_record {
- u64 fp;
- u64 lr;
-};
-
struct rt_sigframe_user_layout {
struct rt_sigframe __user *sigframe;
struct frame_record __user *next_frame;
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 2729faaee4b4c..ffe8e4f549566 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -145,7 +145,7 @@ kunwind_next(struct kunwind_state *state)
int err;
/* Final frame; nothing to unwind */
- if (fp == (unsigned long)task_pt_regs(tsk)->stackframe)
+ if (fp == (unsigned long)&task_pt_regs(tsk)->stackframe)
return -ENOENT;
err = unwind_next_frame_record(&state->common);
--
2.30.2
^ permalink raw reply related [flat|nested] 31+ messages in thread* Re: [PATCH 05/10] arm64: use a common struct frame_record
2024-10-10 10:15 ` [PATCH 05/10] arm64: use a common struct frame_record Mark Rutland
@ 2024-10-10 14:40 ` Mark Brown
2024-10-11 12:36 ` Miroslav Benes
1 sibling, 0 replies; 31+ messages in thread
From: Mark Brown @ 2024-10-10 14:40 UTC (permalink / raw)
To: Mark Rutland
Cc: linux-arm-kernel, ardb, catalin.marinas, jpoimboe, kaleshsingh,
madvenka, maz, mbenes, puranjay12, will
[-- Attachment #1: Type: text/plain, Size: 421 bytes --]
On Thu, Oct 10, 2024 at 11:15:05AM +0100, Mark Rutland wrote:
> Currnetly the signal handling code has its own struct frame_record,
> the definition of struct pt_regs open-codes a frame record as an array,
> and the kernel unwinder hard-codes frame record offsets.
>
> Move to a common struct frame_record that can be used throughout the
> kernel.
Much cleaner!
Reviewed-by: Mark Brown <broonie@kernel.org>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 05/10] arm64: use a common struct frame_record
2024-10-10 10:15 ` [PATCH 05/10] arm64: use a common struct frame_record Mark Rutland
2024-10-10 14:40 ` Mark Brown
@ 2024-10-11 12:36 ` Miroslav Benes
1 sibling, 0 replies; 31+ messages in thread
From: Miroslav Benes @ 2024-10-11 12:36 UTC (permalink / raw)
To: Mark Rutland
Cc: linux-arm-kernel, ardb, broonie, catalin.marinas, jpoimboe,
kaleshsingh, madvenka, maz, puranjay12, will
On Thu, 10 Oct 2024, Mark Rutland wrote:
> Currnetly the signal handling code has its own struct frame_record,
Currently
It feels strange to send only typos but the rest really looks ok to me.
Miroslav
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 06/10] arm64: stacktrace: move dump_backtrace() to kunwind_stack_walk()
2024-10-10 10:15 [PATCH 00/10] arm64: stacktrace: improve unwind reporting Mark Rutland
` (4 preceding siblings ...)
2024-10-10 10:15 ` [PATCH 05/10] arm64: use a common struct frame_record Mark Rutland
@ 2024-10-10 10:15 ` Mark Rutland
2024-10-10 14:44 ` Mark Brown
2024-10-10 10:15 ` [PATCH 07/10] arm64: stacktrace: report source of unwind data Mark Rutland
` (5 subsequent siblings)
11 siblings, 1 reply; 31+ messages in thread
From: Mark Rutland @ 2024-10-10 10:15 UTC (permalink / raw)
To: linux-arm-kernel
Cc: ardb, broonie, catalin.marinas, jpoimboe, kaleshsingh, madvenka,
mark.rutland, maz, mbenes, puranjay12, will
Currently dump_backtrace() can only print the PC value at each step of
the unwind, as this is all the information that arch_stack_walk()
passes to the dump_backtrace_entry() callback.
In future we'd like to print some additional information, such as the
origin of entries (e.g. PC, LR, FP) and/or the reliability thereof.
In preperation for doing so, this patch moves dump_backtrace() over to
kunwind_stack_walk(), which passes the full kunwind_state to the
callback.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Puranjay Mohan <puranjay12@gmail.com>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/kernel/stacktrace.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index ffe8e4f549566..22f8709f0e763 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -294,10 +294,10 @@ noinline noinstr void arch_bpf_stack_walk(bool (*consume_entry)(void *cookie, u6
kunwind_stack_walk(arch_bpf_unwind_consume_entry, &data, current, NULL);
}
-static bool dump_backtrace_entry(void *arg, unsigned long where)
+static bool dump_backtrace_entry(const struct kunwind_state *state, void *arg)
{
char *loglvl = arg;
- printk("%s %pSb\n", loglvl, (void *)where);
+ printk("%s %pSb\n", loglvl, (void *)state->common.pc);
return true;
}
@@ -316,7 +316,7 @@ void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
return;
printk("%sCall trace:\n", loglvl);
- arch_stack_walk(dump_backtrace_entry, (void *)loglvl, tsk, regs);
+ kunwind_stack_walk(dump_backtrace_entry, (void *)loglvl, tsk, regs);
put_task_stack(tsk);
}
--
2.30.2
^ permalink raw reply related [flat|nested] 31+ messages in thread* Re: [PATCH 06/10] arm64: stacktrace: move dump_backtrace() to kunwind_stack_walk()
2024-10-10 10:15 ` [PATCH 06/10] arm64: stacktrace: move dump_backtrace() to kunwind_stack_walk() Mark Rutland
@ 2024-10-10 14:44 ` Mark Brown
0 siblings, 0 replies; 31+ messages in thread
From: Mark Brown @ 2024-10-10 14:44 UTC (permalink / raw)
To: Mark Rutland
Cc: linux-arm-kernel, ardb, catalin.marinas, jpoimboe, kaleshsingh,
madvenka, maz, mbenes, puranjay12, will
[-- Attachment #1: Type: text/plain, Size: 298 bytes --]
On Thu, Oct 10, 2024 at 11:15:06AM +0100, Mark Rutland wrote:
> Currently dump_backtrace() can only print the PC value at each step of
> the unwind, as this is all the information that arch_stack_walk()
> passes to the dump_backtrace_entry() callback.
Reviewed-by: Mark Brown <broonie@kernel.org>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 07/10] arm64: stacktrace: report source of unwind data
2024-10-10 10:15 [PATCH 00/10] arm64: stacktrace: improve unwind reporting Mark Rutland
` (5 preceding siblings ...)
2024-10-10 10:15 ` [PATCH 06/10] arm64: stacktrace: move dump_backtrace() to kunwind_stack_walk() Mark Rutland
@ 2024-10-10 10:15 ` Mark Rutland
2024-10-10 15:13 ` Mark Brown
2024-10-10 10:15 ` [PATCH 08/10] arm64: stacktrace: report recovered PCs Mark Rutland
` (4 subsequent siblings)
11 siblings, 1 reply; 31+ messages in thread
From: Mark Rutland @ 2024-10-10 10:15 UTC (permalink / raw)
To: linux-arm-kernel
Cc: ardb, broonie, catalin.marinas, jpoimboe, kaleshsingh, madvenka,
mark.rutland, maz, mbenes, puranjay12, will
When analysing a stacktrace it can be useful to know where an unwound PC
came from, as in some situations certain sources may be suspect or known
to be unreliable. In future it would also be useful to track this so
that certain unwind steps can be performed in a stateful manner. For
example when unwinding across an exception boundary, we'd ideally unwind
pt_regs::pc, then pt_regs::lr, then the next frame record.
This patch adds an enumerated set of unwind sources, tracks this during
the unwind, and updates dump_backtrace() to log these for interesting
unwind steps.
The interesting sources recorded are:
"C" - the PC came from the caller of an unwind function.
"T" - the PC came from thread_saved_pc() for a blocked task.
"P" - the PC came from a pt_regs::pc.
"U" - the PC came from an unknown source (indicates an unwinder error).
... with nothing recorded when the PC came from a frame_record::pc as
this is the vastly common case and logging this would make it difficult
to spot the more interesting cases.
For example, when triggering a backtrace via magic-sysrq + L, the CPU
handling the sysrq will have a backtrace whose first element is the
caller (C) of dump_backtrace():
| Call trace:
| show_stack+0x18/0x30 (C)
| dump_stack_lvl+0x60/0x80
| dump_stack+0x18/0x24
| nmi_cpu_backtrace+0xfc/0x140
| ...
... and other CPUs will have a backtrace whose first element is their
pt_regs::pc (P) at the instant the backtrace IPI was taken:
| Call trace:
| _raw_spin_unlock_irqrestore+0x8/0x50 (P)
| wake_up_process+0x18/0x24
| process_timeout+0x14/0x20
| call_timer_fn.isra.0+0x24/0x80
| ...
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Puranjay Mohan <puranjay12@gmail.com>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/kernel/stacktrace.c | 47 +++++++++++++++++++++++++++++++---
1 file changed, 43 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 22f8709f0e763..83ab37e13159e 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -20,6 +20,14 @@
#include <asm/stack_pointer.h>
#include <asm/stacktrace.h>
+enum kunwind_source {
+ KUNWIND_SOURCE_UNKNOWN,
+ KUNWIND_SOURCE_FRAME,
+ KUNWIND_SOURCE_CALLER,
+ KUNWIND_SOURCE_TASK,
+ KUNWIND_SOURCE_REGS_PC,
+};
+
/*
* Kernel unwind state
*
@@ -37,6 +45,7 @@ struct kunwind_state {
#ifdef CONFIG_KRETPROBES
struct llist_node *kr_cur;
#endif
+ enum kunwind_source source;
};
static __always_inline void
@@ -45,6 +54,7 @@ kunwind_init(struct kunwind_state *state,
{
unwind_init_common(&state->common);
state->task = task;
+ state->source = KUNWIND_SOURCE_UNKNOWN;
}
/*
@@ -62,6 +72,7 @@ kunwind_init_from_regs(struct kunwind_state *state,
state->common.fp = regs->regs[29];
state->common.pc = regs->pc;
+ state->source = KUNWIND_SOURCE_REGS_PC;
}
/*
@@ -79,6 +90,7 @@ kunwind_init_from_caller(struct kunwind_state *state)
state->common.fp = (unsigned long)__builtin_frame_address(1);
state->common.pc = (unsigned long)__builtin_return_address(0);
+ state->source = KUNWIND_SOURCE_CALLER;
}
/*
@@ -99,6 +111,7 @@ kunwind_init_from_task(struct kunwind_state *state,
state->common.fp = thread_saved_fp(task);
state->common.pc = thread_saved_pc(task);
+ state->source = KUNWIND_SOURCE_TASK;
}
static __always_inline int
@@ -148,9 +161,19 @@ kunwind_next(struct kunwind_state *state)
if (fp == (unsigned long)&task_pt_regs(tsk)->stackframe)
return -ENOENT;
- err = unwind_next_frame_record(&state->common);
- if (err)
- return err;
+ switch (state->source) {
+ case KUNWIND_SOURCE_FRAME:
+ case KUNWIND_SOURCE_CALLER:
+ case KUNWIND_SOURCE_TASK:
+ case KUNWIND_SOURCE_REGS_PC:
+ err = unwind_next_frame_record(&state->common);
+ if (err)
+ return err;
+ state->source = KUNWIND_SOURCE_FRAME;
+ break;
+ default:
+ return -EINVAL;
+ }
state->common.pc = ptrauth_strip_kernel_insn_pac(state->common.pc);
@@ -294,10 +317,26 @@ noinline noinstr void arch_bpf_stack_walk(bool (*consume_entry)(void *cookie, u6
kunwind_stack_walk(arch_bpf_unwind_consume_entry, &data, current, NULL);
}
+static const char *state_source_string(const struct kunwind_state *state)
+{
+ switch (state->source) {
+ case KUNWIND_SOURCE_FRAME: return NULL;
+ case KUNWIND_SOURCE_CALLER: return "C";
+ case KUNWIND_SOURCE_TASK: return "T";
+ case KUNWIND_SOURCE_REGS_PC: return "P";
+ default: return "U";
+ }
+}
+
static bool dump_backtrace_entry(const struct kunwind_state *state, void *arg)
{
+ const char *source = state_source_string(state);
char *loglvl = arg;
- printk("%s %pSb\n", loglvl, (void *)state->common.pc);
+ printk("%s %pSb%s%s%s\n", loglvl,
+ (void *)state->common.pc,
+ source ? " (" : "",
+ source ? source : "",
+ source ? ")" : "");
return true;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 31+ messages in thread* Re: [PATCH 07/10] arm64: stacktrace: report source of unwind data
2024-10-10 10:15 ` [PATCH 07/10] arm64: stacktrace: report source of unwind data Mark Rutland
@ 2024-10-10 15:13 ` Mark Brown
0 siblings, 0 replies; 31+ messages in thread
From: Mark Brown @ 2024-10-10 15:13 UTC (permalink / raw)
To: Mark Rutland
Cc: linux-arm-kernel, ardb, catalin.marinas, jpoimboe, kaleshsingh,
madvenka, maz, mbenes, puranjay12, will
[-- Attachment #1: Type: text/plain, Size: 537 bytes --]
On Thu, Oct 10, 2024 at 11:15:07AM +0100, Mark Rutland wrote:
> When analysing a stacktrace it can be useful to know where an unwound PC
> came from, as in some situations certain sources may be suspect or known
> to be unreliable. In future it would also be useful to track this so
> that certain unwind steps can be performed in a stateful manner. For
> example when unwinding across an exception boundary, we'd ideally unwind
> pt_regs::pc, then pt_regs::lr, then the next frame record.
Reviewed-by: Mark Brown <broonie@kernel.org>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 08/10] arm64: stacktrace: report recovered PCs
2024-10-10 10:15 [PATCH 00/10] arm64: stacktrace: improve unwind reporting Mark Rutland
` (6 preceding siblings ...)
2024-10-10 10:15 ` [PATCH 07/10] arm64: stacktrace: report source of unwind data Mark Rutland
@ 2024-10-10 10:15 ` Mark Rutland
2024-10-10 15:16 ` Mark Brown
2024-10-11 12:58 ` Miroslav Benes
2024-10-10 10:15 ` [PATCH 09/10] arm64: stacktrace: split unwind_consume_stack() Mark Rutland
` (3 subsequent siblings)
11 siblings, 2 replies; 31+ messages in thread
From: Mark Rutland @ 2024-10-10 10:15 UTC (permalink / raw)
To: linux-arm-kernel
Cc: ardb, broonie, catalin.marinas, jpoimboe, kaleshsingh, madvenka,
mark.rutland, maz, mbenes, puranjay12, will
When analysing a stacktrace it can be useful to know whether an unwound
PC has been rewritten by fgraph or kretprobes, as in some situations
these may be suspect or be known to be unreliable.
This patch adds flags to track when an unwind entry has recovered the PC
from fgraph and/or kretprobes, and updates dump_backtrace() to log when
this is the case.
The flags recorded are:
"F" - the PC was recovered from fgraph
"K" - the PC was recovered from kretprobes
These flags are recorded and logged in addition to the original source
of the unwound PC.
For example, with the ftrace_graph profiler enabled globally, and
kretprobes installed on generic_handle_domain_irq() and
do_interrupt_handler(), a backtrace triggered by magic-sysrq + L
reports:
| Call trace:
| show_stack+0x20/0x40 (CF)
| dump_stack_lvl+0x60/0x80 (F)
| dump_stack+0x18/0x28
| nmi_cpu_backtrace+0xfc/0x140
| nmi_trigger_cpumask_backtrace+0x1c8/0x200
| arch_trigger_cpumask_backtrace+0x20/0x40
| sysrq_handle_showallcpus+0x24/0x38 (F)
| __handle_sysrq+0xa8/0x1b0 (F)
| handle_sysrq+0x38/0x50 (F)
| pl011_int+0x460/0x5a8 (F)
| __handle_irq_event_percpu+0x60/0x220 (F)
| handle_irq_event+0x54/0xc0 (F)
| handle_fasteoi_irq+0xa8/0x1d0 (F)
| generic_handle_domain_irq+0x34/0x58 (F)
| gic_handle_irq+0x54/0x140 (FK)
| call_on_irq_stack+0x24/0x58 (F)
| do_interrupt_handler+0x88/0xa0
| el1_interrupt+0x34/0x68 (FK)
| el1h_64_irq_handler+0x18/0x28
| el1h_64_irq+0x64/0x68
| default_idle_call+0x34/0x180
| do_idle+0x204/0x268
| cpu_startup_entry+0x40/0x50 (F)
| rest_init+0xe4/0xf0
| start_kernel+0x744/0x750
| __primary_switched+0x80/0x90
Note that as these flags are reported next to the recovered PC value,
they appear on the callers of instrumented functions, e.g.
gic_handle_irq has a "K" marker because generic_handle_domain_irq was
instrumented with kretprobes and had its return addredd (gic_handle_irq)
rewritten.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Puranjay Mohan <puranjay12@gmail.com>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/kernel/stacktrace.c | 26 +++++++++++++++++++++++---
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 83ab37e13159e..f8e231683dad9 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -28,6 +28,14 @@ enum kunwind_source {
KUNWIND_SOURCE_REGS_PC,
};
+union unwind_flags {
+ unsigned long all;
+ struct {
+ unsigned long fgraph : 1,
+ kretprobe : 1;
+ };
+};
+
/*
* Kernel unwind state
*
@@ -46,6 +54,7 @@ struct kunwind_state {
struct llist_node *kr_cur;
#endif
enum kunwind_source source;
+ union unwind_flags flags;
};
static __always_inline void
@@ -55,6 +64,7 @@ kunwind_init(struct kunwind_state *state,
unwind_init_common(&state->common);
state->task = task;
state->source = KUNWIND_SOURCE_UNKNOWN;
+ state->flags.all = 0;
}
/*
@@ -127,6 +137,7 @@ kunwind_recover_return_address(struct kunwind_state *state)
if (WARN_ON_ONCE(state->common.pc == orig_pc))
return -EINVAL;
state->common.pc = orig_pc;
+ state->flags.fgraph = 1;
}
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
@@ -137,6 +148,7 @@ kunwind_recover_return_address(struct kunwind_state *state)
(void *)state->common.fp,
&state->kr_cur);
state->common.pc = orig_pc;
+ state->flags.kretprobe = 1;
}
#endif /* CONFIG_KRETPROBES */
@@ -157,6 +169,8 @@ kunwind_next(struct kunwind_state *state)
unsigned long fp = state->common.fp;
int err;
+ state->flags.all = 0;
+
/* Final frame; nothing to unwind */
if (fp == (unsigned long)&task_pt_regs(tsk)->stackframe)
return -ENOENT;
@@ -331,12 +345,18 @@ static const char *state_source_string(const struct kunwind_state *state)
static bool dump_backtrace_entry(const struct kunwind_state *state, void *arg)
{
const char *source = state_source_string(state);
+ union unwind_flags flags = state->flags;
+ bool has_info = source || flags.all;
char *loglvl = arg;
- printk("%s %pSb%s%s%s\n", loglvl,
+
+ printk("%s %pSb%s%s%s%s%s\n", loglvl,
(void *)state->common.pc,
- source ? " (" : "",
+ has_info ? " (" : "",
source ? source : "",
- source ? ")" : "");
+ flags.fgraph ? "F" : "",
+ flags.kretprobe ? "K" : "",
+ has_info ? ")" : "");
+
return true;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 31+ messages in thread* Re: [PATCH 08/10] arm64: stacktrace: report recovered PCs
2024-10-10 10:15 ` [PATCH 08/10] arm64: stacktrace: report recovered PCs Mark Rutland
@ 2024-10-10 15:16 ` Mark Brown
2024-10-11 12:58 ` Miroslav Benes
1 sibling, 0 replies; 31+ messages in thread
From: Mark Brown @ 2024-10-10 15:16 UTC (permalink / raw)
To: Mark Rutland
Cc: linux-arm-kernel, ardb, catalin.marinas, jpoimboe, kaleshsingh,
madvenka, maz, mbenes, puranjay12, will
[-- Attachment #1: Type: text/plain, Size: 348 bytes --]
On Thu, Oct 10, 2024 at 11:15:08AM +0100, Mark Rutland wrote:
> The flags recorded are:
> "F" - the PC was recovered from fgraph
> "K" - the PC was recovered from kretprobes
> These flags are recorded and logged in addition to the original source
> of the unwound PC.
This should be very useful.
Reviewed-by: Mark Brown <broonie@kernel.org>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 08/10] arm64: stacktrace: report recovered PCs
2024-10-10 10:15 ` [PATCH 08/10] arm64: stacktrace: report recovered PCs Mark Rutland
2024-10-10 15:16 ` Mark Brown
@ 2024-10-11 12:58 ` Miroslav Benes
2024-10-11 15:02 ` Mark Rutland
1 sibling, 1 reply; 31+ messages in thread
From: Miroslav Benes @ 2024-10-11 12:58 UTC (permalink / raw)
To: Mark Rutland
Cc: linux-arm-kernel, ardb, broonie, catalin.marinas, jpoimboe,
kaleshsingh, madvenka, maz, puranjay12, will
On Thu, 10 Oct 2024, Mark Rutland wrote:
> When analysing a stacktrace it can be useful to know whether an unwound
> PC has been rewritten by fgraph or kretprobes, as in some situations
> these may be suspect or be known to be unreliable.
>
> This patch adds flags to track when an unwind entry has recovered the PC
> from fgraph and/or kretprobes, and updates dump_backtrace() to log when
> this is the case.
>
> The flags recorded are:
>
> "F" - the PC was recovered from fgraph
> "K" - the PC was recovered from kretprobes
>
> These flags are recorded and logged in addition to the original source
> of the unwound PC.
>
> For example, with the ftrace_graph profiler enabled globally, and
> kretprobes installed on generic_handle_domain_irq() and
> do_interrupt_handler(), a backtrace triggered by magic-sysrq + L
> reports:
>
> | Call trace:
> | show_stack+0x20/0x40 (CF)
> | dump_stack_lvl+0x60/0x80 (F)
> | dump_stack+0x18/0x28
> | nmi_cpu_backtrace+0xfc/0x140
> | nmi_trigger_cpumask_backtrace+0x1c8/0x200
> | arch_trigger_cpumask_backtrace+0x20/0x40
> | sysrq_handle_showallcpus+0x24/0x38 (F)
> | __handle_sysrq+0xa8/0x1b0 (F)
> | handle_sysrq+0x38/0x50 (F)
> | pl011_int+0x460/0x5a8 (F)
> | __handle_irq_event_percpu+0x60/0x220 (F)
> | handle_irq_event+0x54/0xc0 (F)
> | handle_fasteoi_irq+0xa8/0x1d0 (F)
> | generic_handle_domain_irq+0x34/0x58 (F)
> | gic_handle_irq+0x54/0x140 (FK)
> | call_on_irq_stack+0x24/0x58 (F)
> | do_interrupt_handler+0x88/0xa0
> | el1_interrupt+0x34/0x68 (FK)
> | el1h_64_irq_handler+0x18/0x28
> | el1h_64_irq+0x64/0x68
> | default_idle_call+0x34/0x180
> | do_idle+0x204/0x268
> | cpu_startup_entry+0x40/0x50 (F)
> | rest_init+0xe4/0xf0
> | start_kernel+0x744/0x750
> | __primary_switched+0x80/0x90
>
> Note that as these flags are reported next to the recovered PC value,
> they appear on the callers of instrumented functions, e.g.
> gic_handle_irq has a "K" marker because generic_handle_domain_irq was
> instrumented with kretprobes and had its return addredd (gic_handle_irq)
address
Also, you might want to add () to all the functions here to keep it
consistent with the rest of the changelog even though the call trace does
not have them obviously.
Miroslav
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 08/10] arm64: stacktrace: report recovered PCs
2024-10-11 12:58 ` Miroslav Benes
@ 2024-10-11 15:02 ` Mark Rutland
0 siblings, 0 replies; 31+ messages in thread
From: Mark Rutland @ 2024-10-11 15:02 UTC (permalink / raw)
To: Miroslav Benes
Cc: linux-arm-kernel, ardb, broonie, catalin.marinas, jpoimboe,
kaleshsingh, madvenka, maz, puranjay12, will
On Fri, Oct 11, 2024 at 02:58:11PM +0200, Miroslav Benes wrote:
> On Thu, 10 Oct 2024, Mark Rutland wrote:
> > Note that as these flags are reported next to the recovered PC value,
> > they appear on the callers of instrumented functions, e.g.
> > gic_handle_irq has a "K" marker because generic_handle_domain_irq was
> > instrumented with kretprobes and had its return addredd (gic_handle_irq)
>
> address
>
> Also, you might want to add () to all the functions here to keep it
> consistent with the rest of the changelog even though the call trace does
> not have them obviously.
I was on the fence about what was clearer when I wrote this; my
preference is also to have the '()', so I'll go add that in for
consistency.
Mark.
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 09/10] arm64: stacktrace: split unwind_consume_stack()
2024-10-10 10:15 [PATCH 00/10] arm64: stacktrace: improve unwind reporting Mark Rutland
` (7 preceding siblings ...)
2024-10-10 10:15 ` [PATCH 08/10] arm64: stacktrace: report recovered PCs Mark Rutland
@ 2024-10-10 10:15 ` Mark Rutland
2024-10-11 13:11 ` Mark Brown
2024-10-10 10:15 ` [PATCH 10/10] arm64: stacktrace: unwind exception boundaries Mark Rutland
` (2 subsequent siblings)
11 siblings, 1 reply; 31+ messages in thread
From: Mark Rutland @ 2024-10-10 10:15 UTC (permalink / raw)
To: linux-arm-kernel
Cc: ardb, broonie, catalin.marinas, jpoimboe, kaleshsingh, madvenka,
mark.rutland, maz, mbenes, puranjay12, will
When unwinding stacks, we use unwind_consume_stack() to both find
whether an object (e.g. a frame record) is on an accessible stack *and*
to update the stack boundaries. This works fine today since we only care
about one type of object which does not overlap other objects.
In subsequent patches we'll want to check whether an object (e.g a frame
record) is on the stack and follow this up by accessing a larger object
containing the first (e.g. a pt_regs with an embedded frame record).
To make that pattern easier to implement, this patch reworks
unwind_find_next_stack() and unwind_consume_stack() so that the former
can be used to check if an object is on any accessible stack, and the
latter is purely used to update the stack boundaries.
As unwind_find_next_stack() is modified to also check the stack
currently being unwound, it is renamed to unwind_find_stack().
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Puranjay Mohan <puranjay12@gmail.com>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/stacktrace/common.h | 68 +++++++++++++---------
1 file changed, 39 insertions(+), 29 deletions(-)
diff --git a/arch/arm64/include/asm/stacktrace/common.h b/arch/arm64/include/asm/stacktrace/common.h
index 7fab6876e4970..821a8fdd31af4 100644
--- a/arch/arm64/include/asm/stacktrace/common.h
+++ b/arch/arm64/include/asm/stacktrace/common.h
@@ -60,13 +60,27 @@ static inline void unwind_init_common(struct unwind_state *state)
state->stack = stackinfo_get_unknown();
}
-static struct stack_info *unwind_find_next_stack(const struct unwind_state *state,
- unsigned long sp,
- unsigned long size)
+/**
+ * unwind_find_stack() - Find the accessible stack which entirely contains an
+ * object.
+ *
+ * @state: the current unwind state.
+ * @sp: the base address of the object.
+ * @size: the size of the object.
+ *
+ * Return: a pointer to the relevant stack_info if found; NULL otherwise.
+ */
+static struct stack_info *unwind_find_stack(struct unwind_state *state,
+ unsigned long sp,
+ unsigned long size)
{
- for (int i = 0; i < state->nr_stacks; i++) {
- struct stack_info *info = &state->stacks[i];
+ struct stack_info *info = &state->stack;
+ if (stackinfo_on_stack(info, sp, size))
+ return info;
+
+ for (int i = 0; i < state->nr_stacks; i++) {
+ info = &state->stacks[i];
if (stackinfo_on_stack(info, sp, size))
return info;
}
@@ -75,36 +89,31 @@ static struct stack_info *unwind_find_next_stack(const struct unwind_state *stat
}
/**
- * unwind_consume_stack() - Check if an object is on an accessible stack,
- * updating stack boundaries so that future unwind steps cannot consume this
- * object again.
+ * unwind_consume_stack() - Update stack boundaries so that future unwind steps
+ * cannot consume this object again.
*
* @state: the current unwind state.
+ * @info: the stack_info of the stack containing the object.
* @sp: the base address of the object.
* @size: the size of the object.
*
* Return: 0 upon success, an error code otherwise.
*/
-static inline int unwind_consume_stack(struct unwind_state *state,
- unsigned long sp,
- unsigned long size)
+static inline void unwind_consume_stack(struct unwind_state *state,
+ struct stack_info *info,
+ unsigned long sp,
+ unsigned long size)
{
- struct stack_info *next;
-
- if (stackinfo_on_stack(&state->stack, sp, size))
- goto found;
-
- next = unwind_find_next_stack(state, sp, size);
- if (!next)
- return -EINVAL;
+ struct stack_info tmp;
/*
* Stack transitions are strictly one-way, and once we've
* transitioned from one stack to another, it's never valid to
* unwind back to the old stack.
*
- * Remove the current stack from the list of stacks so that it cannot
- * be found on a subsequent transition.
+ * Destroy the old stack info so that it cannot be found upon a
+ * subsequent transition. If the stack has not changed, we'll
+ * immediately restore the current stack info.
*
* Note that stacks can nest in several valid orders, e.g.
*
@@ -115,16 +124,15 @@ static inline int unwind_consume_stack(struct unwind_state *state,
* ... so we do not check the specific order of stack
* transitions.
*/
- state->stack = *next;
- *next = stackinfo_get_unknown();
+ tmp = *info;
+ *info = stackinfo_get_unknown();
+ state->stack = tmp;
-found:
/*
* Future unwind steps can only consume stack above this frame record.
* Update the current stack to start immediately above it.
*/
state->stack.low = sp + size;
- return 0;
}
/**
@@ -137,16 +145,18 @@ static inline int unwind_consume_stack(struct unwind_state *state,
static inline int
unwind_next_frame_record(struct unwind_state *state)
{
+ struct stack_info *info;
struct frame_record *record;
unsigned long fp = state->fp;
- int err;
if (fp & 0x7)
return -EINVAL;
- err = unwind_consume_stack(state, fp, sizeof(*record));
- if (err)
- return err;
+ info = unwind_find_stack(state, fp, sizeof(*record));
+ if (!info)
+ return -EINVAL;
+
+ unwind_consume_stack(state, info, fp, sizeof(*record));
/*
* Record this frame record's values.
--
2.30.2
^ permalink raw reply related [flat|nested] 31+ messages in thread* Re: [PATCH 09/10] arm64: stacktrace: split unwind_consume_stack()
2024-10-10 10:15 ` [PATCH 09/10] arm64: stacktrace: split unwind_consume_stack() Mark Rutland
@ 2024-10-11 13:11 ` Mark Brown
0 siblings, 0 replies; 31+ messages in thread
From: Mark Brown @ 2024-10-11 13:11 UTC (permalink / raw)
To: Mark Rutland
Cc: linux-arm-kernel, ardb, catalin.marinas, jpoimboe, kaleshsingh,
madvenka, maz, mbenes, puranjay12, will
[-- Attachment #1: Type: text/plain, Size: 390 bytes --]
On Thu, Oct 10, 2024 at 11:15:09AM +0100, Mark Rutland wrote:
> When unwinding stacks, we use unwind_consume_stack() to both find
> whether an object (e.g. a frame record) is on an accessible stack *and*
> to update the stack boundaries. This works fine today since we only care
> about one type of object which does not overlap other objects.
Reviewed-by: Mark Brown <broonie@kernel.org>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH 10/10] arm64: stacktrace: unwind exception boundaries
2024-10-10 10:15 [PATCH 00/10] arm64: stacktrace: improve unwind reporting Mark Rutland
` (8 preceding siblings ...)
2024-10-10 10:15 ` [PATCH 09/10] arm64: stacktrace: split unwind_consume_stack() Mark Rutland
@ 2024-10-10 10:15 ` Mark Rutland
2024-10-11 15:16 ` Miroslav Benes
2024-10-12 9:22 ` Mark Brown
2024-10-11 15:19 ` [PATCH 00/10] arm64: stacktrace: improve unwind reporting Miroslav Benes
2024-10-15 11:54 ` Puranjay Mohan
11 siblings, 2 replies; 31+ messages in thread
From: Mark Rutland @ 2024-10-10 10:15 UTC (permalink / raw)
To: linux-arm-kernel
Cc: ardb, broonie, catalin.marinas, jpoimboe, kaleshsingh, madvenka,
mark.rutland, maz, mbenes, puranjay12, will
When arm64's stack unwinder encounters an exception boundary, it uses
the pt_regs::stackframe created by the entry code, which has a copy of
the PC and FP at the time the exception was taken. The unwinder doesn't
know anything about pt_regs, and reports the PC from the stackframe, but
does not report the LR.
The LR is only guaranteed to contain the return address at function call
boundaries, and can be used as a scratch register at other times, so the
LR at an exception boundary may or may not be a legitimate return
address. It would be useful to report the LR value regardless, as it can
be helpful when debugging, and in future it will be helpful for reliable
stacktrace support.
This patch changes the way we unwind across exception boundaries,
allowing both the PC and LR to be reported. The entry code creates a
frame_record_meta structure embedded within pt_regs, which the unwinder
uses to find the pt_regs. The unwinder can then extract pt_regs::pc and
pt_regs::lr as two separate unwind steps before continuing with a
regular walk of frame records.
When a PC is unwound from pt_regs::lr, dump_backtrace() will log this
with an "L" marker so that it can be identified easily. For example,
an unwind across an exception boundary will appear as follows:
| el1h_64_irq+0x6c/0x70
| _raw_spin_unlock_irqrestore+0x10/0x60 (P)
| __aarch64_insn_write+0x6c/0x90 (L)
| aarch64_insn_patch_text_nosync+0x28/0x80
... with a (P) entry for pt_regs::pc, and an (L) entry for pt_regs:lr.
Note that the LR may be stale at the point of the exception, for example,
shortly after a return:
| el1h_64_irq+0x6c/0x70
| default_idle_call+0x34/0x180 (P)
| default_idle_call+0x28/0x180 (L)
| do_idle+0x204/0x268
... where the LR points a few instructions before the current PC.
This plays nicely with all the other unwind metadata tracking. With the
ftrace_graph profiler enabled globally, and kretprobes installed on
generic_handle_domain_irq() and do_interrupt_handler(), a backtrace triggered
by magic-sysrq + L reports:
| Call trace:
| show_stack+0x20/0x40 (CF)
| dump_stack_lvl+0x60/0x80 (F)
| dump_stack+0x18/0x28
| nmi_cpu_backtrace+0xfc/0x140
| nmi_trigger_cpumask_backtrace+0x1c8/0x200
| arch_trigger_cpumask_backtrace+0x20/0x40
| sysrq_handle_showallcpus+0x24/0x38 (F)
| __handle_sysrq+0xa8/0x1b0 (F)
| handle_sysrq+0x38/0x50 (F)
| pl011_int+0x460/0x5a8 (F)
| __handle_irq_event_percpu+0x60/0x220 (F)
| handle_irq_event+0x54/0xc0 (F)
| handle_fasteoi_irq+0xa8/0x1d0 (F)
| generic_handle_domain_irq+0x34/0x58 (F)
| gic_handle_irq+0x54/0x140 (FK)
| call_on_irq_stack+0x24/0x58 (F)
| do_interrupt_handler+0x88/0xa0
| el1_interrupt+0x34/0x68 (FK)
| el1h_64_irq_handler+0x18/0x28
| el1h_64_irq+0x6c/0x70
| default_idle_call+0x34/0x180 (P)
| default_idle_call+0x28/0x180 (L)
| do_idle+0x204/0x268
| cpu_startup_entry+0x3c/0x50 (F)
| rest_init+0xe4/0xf0
| start_kernel+0x744/0x750
| __primary_switched+0x88/0x98
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Puranjay Mohan <puranjay12@gmail.com>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/ptrace.h | 4 +-
arch/arm64/include/asm/stacktrace/frame.h | 35 +++++++
arch/arm64/kernel/asm-offsets.c | 1 +
arch/arm64/kernel/entry.S | 12 ++-
arch/arm64/kernel/head.S | 3 +
arch/arm64/kernel/probes/stBV5U5j | 0
arch/arm64/kernel/process.c | 1 +
arch/arm64/kernel/stacktrace.c | 121 ++++++++++++++++++++--
8 files changed, 158 insertions(+), 19 deletions(-)
create mode 100644 arch/arm64/kernel/probes/stBV5U5j
diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index 89c02f85f4b11..47ff8654c5ec1 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -168,9 +168,7 @@ struct pt_regs {
u32 pmr;
u64 sdei_ttbr1;
- u64 unused;
-
- struct frame_record stackframe;
+ struct frame_record_meta stackframe;
/* Only valid for some EL1 exceptions. */
u64 lockdep_hardirqs;
diff --git a/arch/arm64/include/asm/stacktrace/frame.h b/arch/arm64/include/asm/stacktrace/frame.h
index 6397bc847f147..0ee0f6ba0fd86 100644
--- a/arch/arm64/include/asm/stacktrace/frame.h
+++ b/arch/arm64/include/asm/stacktrace/frame.h
@@ -3,6 +3,30 @@
#define __ASM_STACKTRACE_FRAME_H
/*
+ * - FRAME_META_TYPE_NONE
+ *
+ * This value is reserved.
+ *
+ * - FRAME_META_TYPE_FINAL
+ *
+ * The record is the last entry on the stack.
+ * Unwinding should terminate successfully.
+ *
+ * - FRAME_META_TYPE_PT_REGS
+ *
+ * The record is embedded within a struct pt_regs, recording the registers at
+ * an arbitrary point in time.
+ * Unwinding should consume pt_regs::pc, followed by pt_regs::lr.
+ *
+ * Note: all other values are reserved and should result in unwinding
+ * terminating with an error.
+ */
+#define FRAME_META_TYPE_NONE 0
+#define FRAME_META_TYPE_FINAL 1
+#define FRAME_META_TYPE_PT_REGS 2
+
+#ifndef __ASSEMBLY__
+/*
* A standard AAPCS64 frame record.
*/
struct frame_record {
@@ -10,4 +34,15 @@ struct frame_record {
u64 lr;
};
+/*
+ * A metadata frame record indicating a special unwind.
+ * The record::{fp,lr} fields must be zero to indicate the presence of
+ * metadata.
+ */
+struct frame_record_meta {
+ struct frame_record record;
+ u64 type;
+};
+#endif /* __ASSEMBLY */
+
#endif /* __ASM_STACKTRACE_FRAME_H */
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index eb7fb2f9b9274..021f04f97fde5 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -81,6 +81,7 @@ int main(void)
DEFINE(S_SDEI_TTBR1, offsetof(struct pt_regs, sdei_ttbr1));
DEFINE(S_PMR, offsetof(struct pt_regs, pmr));
DEFINE(S_STACKFRAME, offsetof(struct pt_regs, stackframe));
+ DEFINE(S_STACKFRAME_TYPE, offsetof(struct pt_regs, stackframe.type));
DEFINE(PT_REGS_SIZE, sizeof(struct pt_regs));
BLANK();
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index fa6d6d5ca5e02..5ae2a34b50bda 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -25,6 +25,7 @@
#include <asm/processor.h>
#include <asm/ptrace.h>
#include <asm/scs.h>
+#include <asm/stacktrace/frame.h>
#include <asm/thread_info.h>
#include <asm/asm-uaccess.h>
#include <asm/unistd.h>
@@ -284,15 +285,16 @@ alternative_else_nop_endif
stp lr, x21, [sp, #S_LR]
/*
- * For exceptions from EL0, create a final frame record.
- * For exceptions from EL1, create a synthetic frame record so the
- * interrupted code shows up in the backtrace.
+ * Create a metadata frame record. The unwinder will use this to
+ * identify and unwind exception boundaries.
*/
- .if \el == 0
stp xzr, xzr, [sp, #S_STACKFRAME]
+ .if \el == 0
+ mov x0, #FRAME_META_TYPE_FINAL
.else
- stp x29, x22, [sp, #S_STACKFRAME]
+ mov x0, #FRAME_META_TYPE_PT_REGS
.endif
+ str x0, [sp, #S_STACKFRAME_TYPE]
add x29, sp, #S_STACKFRAME
#ifdef CONFIG_ARM64_SW_TTBR0_PAN
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index cb68adcabe078..5ab1970ee5436 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -32,6 +32,7 @@
#include <asm/scs.h>
#include <asm/smp.h>
#include <asm/sysreg.h>
+#include <asm/stacktrace/frame.h>
#include <asm/thread_info.h>
#include <asm/virt.h>
@@ -199,6 +200,8 @@ SYM_CODE_END(preserve_boot_args)
sub sp, sp, #PT_REGS_SIZE
stp xzr, xzr, [sp, #S_STACKFRAME]
+ mov \tmp1, #FRAME_META_TYPE_FINAL
+ str \tmp1, [sp, #S_STACKFRAME_TYPE]
add x29, sp, #S_STACKFRAME
scs_load_current
diff --git a/arch/arm64/kernel/probes/stBV5U5j b/arch/arm64/kernel/probes/stBV5U5j
new file mode 100644
index 0000000000000..e69de29bb2d1d
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index d45fd114eac3f..29904c829de25 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -409,6 +409,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
*/
memset(childregs, 0, sizeof(struct pt_regs));
childregs->pstate = PSR_MODE_EL1h | PSR_IL_BIT;
+ childregs->stackframe.type = FRAME_META_TYPE_FINAL;
p->thread.cpu_context.x19 = (unsigned long)args->fn;
p->thread.cpu_context.x20 = (unsigned long)args->fn_arg;
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index f8e231683dad9..caef85462acb6 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -26,6 +26,7 @@ enum kunwind_source {
KUNWIND_SOURCE_CALLER,
KUNWIND_SOURCE_TASK,
KUNWIND_SOURCE_REGS_PC,
+ KUNWIND_SOURCE_REGS_LR,
};
union unwind_flags {
@@ -55,6 +56,7 @@ struct kunwind_state {
#endif
enum kunwind_source source;
union unwind_flags flags;
+ struct pt_regs *regs;
};
static __always_inline void
@@ -65,6 +67,7 @@ kunwind_init(struct kunwind_state *state,
state->task = task;
state->source = KUNWIND_SOURCE_UNKNOWN;
state->flags.all = 0;
+ state->regs = NULL;
}
/*
@@ -80,6 +83,7 @@ kunwind_init_from_regs(struct kunwind_state *state,
{
kunwind_init(state, current);
+ state->regs = regs;
state->common.fp = regs->regs[29];
state->common.pc = regs->pc;
state->source = KUNWIND_SOURCE_REGS_PC;
@@ -155,6 +159,103 @@ kunwind_recover_return_address(struct kunwind_state *state)
return 0;
}
+static __always_inline
+int kunwind_next_regs_pc(struct kunwind_state *state)
+{
+ struct stack_info *info;
+ unsigned long fp = state->common.fp;
+ struct pt_regs *regs;
+
+ regs = container_of((u64 *)fp, struct pt_regs, stackframe.record.fp);
+
+ info = unwind_find_stack(&state->common, (unsigned long)regs, sizeof(*regs));
+ if (!info)
+ return -EINVAL;
+
+ unwind_consume_stack(&state->common, info, (unsigned long)regs,
+ sizeof(*regs));
+
+ state->regs = regs;
+ state->common.pc = regs->pc;
+ state->common.fp = regs->regs[29];
+ state->source = KUNWIND_SOURCE_REGS_PC;
+ return 0;
+}
+
+static __always_inline int
+kunwind_next_regs_lr(struct kunwind_state *state)
+{
+ /*
+ * The stack for the regs was consumed by kunwind_next_regs_pc(), so we
+ * cannot consume that again here, but we know the regs are safe to
+ * access.
+ */
+ state->common.pc = state->regs->regs[30];
+ state->common.fp = state->regs->regs[29];
+ state->regs = NULL;
+ state->source = KUNWIND_SOURCE_REGS_LR;
+
+ return 0;
+}
+
+static __always_inline int
+kunwind_next_frame_record_meta(struct kunwind_state *state)
+{
+ struct task_struct *tsk = state->task;
+ unsigned long fp = state->common.fp;
+ struct frame_record_meta *meta;
+ struct stack_info *info;
+
+ info = unwind_find_stack(&state->common, fp, sizeof(*meta));
+ if (!info)
+ return -EINVAL;
+
+ meta = (struct frame_record_meta *)fp;
+ switch (READ_ONCE(meta->type)) {
+ case FRAME_META_TYPE_FINAL:
+ if (meta == &task_pt_regs(tsk)->stackframe)
+ return -ENOENT;
+ WARN_ON_ONCE(1);
+ return -EINVAL;
+ case FRAME_META_TYPE_PT_REGS:
+ return kunwind_next_regs_pc(state);
+ default:
+ WARN_ON_ONCE(1);
+ return -EINVAL;
+ }
+}
+
+static __always_inline int
+kunwind_next_frame_record(struct kunwind_state *state)
+{
+ unsigned long fp = state->common.fp;
+ struct frame_record *record;
+ struct stack_info *info;
+ unsigned long new_fp, new_pc;
+
+ if (fp & 0x7)
+ return -EINVAL;
+
+ info = unwind_find_stack(&state->common, fp, sizeof(*record));
+ if (!info)
+ return -EINVAL;
+
+ record = (struct frame_record *)fp;
+ new_fp = READ_ONCE(record->fp);
+ new_pc = READ_ONCE(record->lr);
+
+ if (!new_fp && !new_pc)
+ return kunwind_next_frame_record_meta(state);
+
+ unwind_consume_stack(&state->common, info, fp, sizeof(*record));
+
+ state->common.fp = new_fp;
+ state->common.pc = new_pc;
+ state->source = KUNWIND_SOURCE_FRAME;
+
+ return 0;
+}
+
/*
* Unwind from one frame record (A) to the next frame record (B).
*
@@ -165,30 +266,27 @@ kunwind_recover_return_address(struct kunwind_state *state)
static __always_inline int
kunwind_next(struct kunwind_state *state)
{
- struct task_struct *tsk = state->task;
- unsigned long fp = state->common.fp;
int err;
state->flags.all = 0;
- /* Final frame; nothing to unwind */
- if (fp == (unsigned long)&task_pt_regs(tsk)->stackframe)
- return -ENOENT;
-
switch (state->source) {
case KUNWIND_SOURCE_FRAME:
case KUNWIND_SOURCE_CALLER:
case KUNWIND_SOURCE_TASK:
+ case KUNWIND_SOURCE_REGS_LR:
+ err = kunwind_next_frame_record(state);
+ break;
case KUNWIND_SOURCE_REGS_PC:
- err = unwind_next_frame_record(&state->common);
- if (err)
- return err;
- state->source = KUNWIND_SOURCE_FRAME;
+ err = kunwind_next_regs_lr(state);
break;
default:
- return -EINVAL;
+ err = -EINVAL;
}
+ if (err)
+ return err;
+
state->common.pc = ptrauth_strip_kernel_insn_pac(state->common.pc);
return kunwind_recover_return_address(state);
@@ -338,6 +436,7 @@ static const char *state_source_string(const struct kunwind_state *state)
case KUNWIND_SOURCE_CALLER: return "C";
case KUNWIND_SOURCE_TASK: return "T";
case KUNWIND_SOURCE_REGS_PC: return "P";
+ case KUNWIND_SOURCE_REGS_LR: return "L";
default: return "U";
}
}
--
2.30.2
^ permalink raw reply related [flat|nested] 31+ messages in thread* Re: [PATCH 10/10] arm64: stacktrace: unwind exception boundaries
2024-10-10 10:15 ` [PATCH 10/10] arm64: stacktrace: unwind exception boundaries Mark Rutland
@ 2024-10-11 15:16 ` Miroslav Benes
2024-10-11 16:13 ` Mark Rutland
2024-10-12 9:22 ` Mark Brown
1 sibling, 1 reply; 31+ messages in thread
From: Miroslav Benes @ 2024-10-11 15:16 UTC (permalink / raw)
To: Mark Rutland
Cc: linux-arm-kernel, ardb, broonie, catalin.marinas, jpoimboe,
kaleshsingh, madvenka, maz, puranjay12, will
Hi,
> +static __always_inline int
> +kunwind_next_frame_record(struct kunwind_state *state)
> +{
> + unsigned long fp = state->common.fp;
> + struct frame_record *record;
> + struct stack_info *info;
> + unsigned long new_fp, new_pc;
> +
> + if (fp & 0x7)
> + return -EINVAL;
> +
> + info = unwind_find_stack(&state->common, fp, sizeof(*record));
> + if (!info)
> + return -EINVAL;
> +
> + record = (struct frame_record *)fp;
> + new_fp = READ_ONCE(record->fp);
> + new_pc = READ_ONCE(record->lr);
> +
> + if (!new_fp && !new_pc)
> + return kunwind_next_frame_record_meta(state);
> +
> + unwind_consume_stack(&state->common, info, fp, sizeof(*record));
> +
> + state->common.fp = new_fp;
> + state->common.pc = new_pc;
> + state->source = KUNWIND_SOURCE_FRAME;
> +
> + return 0;
> +}
> +
> /*
> * Unwind from one frame record (A) to the next frame record (B).
> *
> @@ -165,30 +266,27 @@ kunwind_recover_return_address(struct kunwind_state *state)
> static __always_inline int
> kunwind_next(struct kunwind_state *state)
> {
> - struct task_struct *tsk = state->task;
> - unsigned long fp = state->common.fp;
> int err;
>
> state->flags.all = 0;
>
> - /* Final frame; nothing to unwind */
> - if (fp == (unsigned long)&task_pt_regs(tsk)->stackframe)
> - return -ENOENT;
> -
> switch (state->source) {
> case KUNWIND_SOURCE_FRAME:
> case KUNWIND_SOURCE_CALLER:
> case KUNWIND_SOURCE_TASK:
> + case KUNWIND_SOURCE_REGS_LR:
> + err = kunwind_next_frame_record(state);
> + break;
> case KUNWIND_SOURCE_REGS_PC:
> - err = unwind_next_frame_record(&state->common);
> - if (err)
> - return err;
> - state->source = KUNWIND_SOURCE_FRAME;
> + err = kunwind_next_regs_lr(state);
the remaining users of unwind_next_frame_record() after this change are in
KVM. How does it work there? What is the difference?
Miroslav
^ permalink raw reply [flat|nested] 31+ messages in thread* Re: [PATCH 10/10] arm64: stacktrace: unwind exception boundaries
2024-10-11 15:16 ` Miroslav Benes
@ 2024-10-11 16:13 ` Mark Rutland
2024-10-11 16:34 ` Miroslav Benes
0 siblings, 1 reply; 31+ messages in thread
From: Mark Rutland @ 2024-10-11 16:13 UTC (permalink / raw)
To: Miroslav Benes
Cc: linux-arm-kernel, ardb, broonie, catalin.marinas, jpoimboe,
kaleshsingh, madvenka, maz, puranjay12, will
On Fri, Oct 11, 2024 at 05:16:23PM +0200, Miroslav Benes wrote:
> Hi,
>
> > +static __always_inline int
> > +kunwind_next_frame_record(struct kunwind_state *state)
> > +{
> > + unsigned long fp = state->common.fp;
> > + struct frame_record *record;
> > + struct stack_info *info;
> > + unsigned long new_fp, new_pc;
> > +
> > + if (fp & 0x7)
> > + return -EINVAL;
> > +
> > + info = unwind_find_stack(&state->common, fp, sizeof(*record));
> > + if (!info)
> > + return -EINVAL;
> > +
> > + record = (struct frame_record *)fp;
> > + new_fp = READ_ONCE(record->fp);
> > + new_pc = READ_ONCE(record->lr);
> > +
> > + if (!new_fp && !new_pc)
> > + return kunwind_next_frame_record_meta(state);
> > +
> > + unwind_consume_stack(&state->common, info, fp, sizeof(*record));
> > +
> > + state->common.fp = new_fp;
> > + state->common.pc = new_pc;
> > + state->source = KUNWIND_SOURCE_FRAME;
> > +
> > + return 0;
> > +}
> > +
> > /*
> > * Unwind from one frame record (A) to the next frame record (B).
> > *
> > @@ -165,30 +266,27 @@ kunwind_recover_return_address(struct kunwind_state *state)
> > static __always_inline int
> > kunwind_next(struct kunwind_state *state)
> > {
> > - struct task_struct *tsk = state->task;
> > - unsigned long fp = state->common.fp;
> > int err;
> >
> > state->flags.all = 0;
> >
> > - /* Final frame; nothing to unwind */
> > - if (fp == (unsigned long)&task_pt_regs(tsk)->stackframe)
> > - return -ENOENT;
> > -
> > switch (state->source) {
> > case KUNWIND_SOURCE_FRAME:
> > case KUNWIND_SOURCE_CALLER:
> > case KUNWIND_SOURCE_TASK:
> > + case KUNWIND_SOURCE_REGS_LR:
> > + err = kunwind_next_frame_record(state);
> > + break;
> > case KUNWIND_SOURCE_REGS_PC:
> > - err = unwind_next_frame_record(&state->common);
> > - if (err)
> > - return err;
> > - state->source = KUNWIND_SOURCE_FRAME;
> > + err = kunwind_next_regs_lr(state);
>
> the remaining users of unwind_next_frame_record() after this change are in
> KVM. How does it work there?
The short of it is those are unchanged by this series, and the behaviour
of the KVM stacktrace code is unchanged -- it will continue to not
report exception boundaries.
The KVM hyp code doesn't (currently) use the frame_record_meta records
at exception boundaries, so the KVM stacktrace code won't see those and
only need to unwind regular frame records.
It would be nice to improve that, but IIUC it shouldn't matter for
RELAIBLE_STACKTRACE; all calls to hyp code from the main kernel are
synchronous, and aside from a (fatal) hyp_panic(), executing in a
regular kernel context ensures there's no hyp context to unwind.
> What is the difference?
The kunwind_next_frame_record() function will identify and unwind any
frame_record_meta frames, while unwind_next_frame_record() only handles
regular frame records. Since the KVM hyp code doesn't use
frame_record_meta frames, using unwind_next_frame_record() is
sufficient.
I originally wanted the kunwind code call unwind_next_frame_record() and
then handle any frame_record_meta, but that was painful due to the state
that alters (e.g. the way we track which stack ranges have been
consumed), and duplicating the easrly parts of
unwind_next_frame_record() was simpler overall.
As a general naming convention, the unwind_*() functions are things that
can be shared between the main kernel image and KVM, and the kunwind_*()
functions are specific to the main kernel image.
I'm happy to change the name to more clearly distinguish
kunwind_next_frame_record() from unwind_next_frame_record(), if that
would help?
Mark.
^ permalink raw reply [flat|nested] 31+ messages in thread* Re: [PATCH 10/10] arm64: stacktrace: unwind exception boundaries
2024-10-11 16:13 ` Mark Rutland
@ 2024-10-11 16:34 ` Miroslav Benes
0 siblings, 0 replies; 31+ messages in thread
From: Miroslav Benes @ 2024-10-11 16:34 UTC (permalink / raw)
To: Mark Rutland
Cc: linux-arm-kernel, ardb, broonie, catalin.marinas, jpoimboe,
kaleshsingh, madvenka, maz, puranjay12, will
On Fri, 11 Oct 2024, Mark Rutland wrote:
> On Fri, Oct 11, 2024 at 05:16:23PM +0200, Miroslav Benes wrote:
>
> > > switch (state->source) {
> > > case KUNWIND_SOURCE_FRAME:
> > > case KUNWIND_SOURCE_CALLER:
> > > case KUNWIND_SOURCE_TASK:
> > > + case KUNWIND_SOURCE_REGS_LR:
> > > + err = kunwind_next_frame_record(state);
> > > + break;
> > > case KUNWIND_SOURCE_REGS_PC:
> > > - err = unwind_next_frame_record(&state->common);
> > > - if (err)
> > > - return err;
> > > - state->source = KUNWIND_SOURCE_FRAME;
> > > + err = kunwind_next_regs_lr(state);
> >
> > the remaining users of unwind_next_frame_record() after this change are in
> > KVM. How does it work there?
>
> The short of it is those are unchanged by this series, and the behaviour
> of the KVM stacktrace code is unchanged -- it will continue to not
> report exception boundaries.
>
> The KVM hyp code doesn't (currently) use the frame_record_meta records
> at exception boundaries, so the KVM stacktrace code won't see those and
> only need to unwind regular frame records.
Yes.
> It would be nice to improve that, but IIUC it shouldn't matter for
> RELAIBLE_STACKTRACE; all calls to hyp code from the main kernel are
> synchronous, and aside from a (fatal) hyp_panic(), executing in a
> regular kernel context ensures there's no hyp context to unwind.
I agree. I was mainly asking if there was a reason (how exceptions are
processed for example) why it could not be unified because the first
thought was that the differences are not so big on the code level.
But yeah, out of scope for this patch set.
> > What is the difference?
>
> The kunwind_next_frame_record() function will identify and unwind any
> frame_record_meta frames, while unwind_next_frame_record() only handles
> regular frame records. Since the KVM hyp code doesn't use
> frame_record_meta frames, using unwind_next_frame_record() is
> sufficient.
>
> I originally wanted the kunwind code call unwind_next_frame_record() and
> then handle any frame_record_meta, but that was painful due to the state
> that alters (e.g. the way we track which stack ranges have been
> consumed), and duplicating the easrly parts of
> unwind_next_frame_record() was simpler overall.
Ok.
> As a general naming convention, the unwind_*() functions are things that
> can be shared between the main kernel image and KVM, and the kunwind_*()
> functions are specific to the main kernel image.
>
> I'm happy to change the name to more clearly distinguish
> kunwind_next_frame_record() from unwind_next_frame_record(), if that
> would help?
Nope, as far as I am concerned the convention makes sense and I have no
problem with that.
Thank you,
Miroslav
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 10/10] arm64: stacktrace: unwind exception boundaries
2024-10-10 10:15 ` [PATCH 10/10] arm64: stacktrace: unwind exception boundaries Mark Rutland
2024-10-11 15:16 ` Miroslav Benes
@ 2024-10-12 9:22 ` Mark Brown
1 sibling, 0 replies; 31+ messages in thread
From: Mark Brown @ 2024-10-12 9:22 UTC (permalink / raw)
To: Mark Rutland
Cc: linux-arm-kernel, ardb, catalin.marinas, jpoimboe, kaleshsingh,
madvenka, maz, mbenes, puranjay12, will
[-- Attachment #1: Type: text/plain, Size: 623 bytes --]
On Thu, Oct 10, 2024 at 11:15:10AM +0100, Mark Rutland wrote:
> When a PC is unwound from pt_regs::lr, dump_backtrace() will log this
> with an "L" marker so that it can be identified easily. For example,
> an unwind across an exception boundary will appear as follows:
> | el1h_64_irq+0x6c/0x70
> | _raw_spin_unlock_irqrestore+0x10/0x60 (P)
> | __aarch64_insn_write+0x6c/0x90 (L)
> | aarch64_insn_patch_text_nosync+0x28/0x80
> ... with a (P) entry for pt_regs::pc, and an (L) entry for pt_regs:lr.
This will be super useful.
Reviewed-by: Mark Brown <broonie@kernel.org>
modulo the stray file Miroslav mentioned.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 00/10] arm64: stacktrace: improve unwind reporting
2024-10-10 10:15 [PATCH 00/10] arm64: stacktrace: improve unwind reporting Mark Rutland
` (9 preceding siblings ...)
2024-10-10 10:15 ` [PATCH 10/10] arm64: stacktrace: unwind exception boundaries Mark Rutland
@ 2024-10-11 15:19 ` Miroslav Benes
2024-10-11 16:32 ` Mark Rutland
2024-10-15 11:54 ` Puranjay Mohan
11 siblings, 1 reply; 31+ messages in thread
From: Miroslav Benes @ 2024-10-11 15:19 UTC (permalink / raw)
To: Mark Rutland
Cc: linux-arm-kernel, ardb, broonie, catalin.marinas, jpoimboe,
kaleshsingh, madvenka, maz, puranjay12, will
Hi,
On Thu, 10 Oct 2024, Mark Rutland wrote:
> This series improves arm64's unwinder to explicitly identify exception
> boundaries, reporting both pt_regs::pc and pt_regs::lr and explicitly
> identifying the source of elements in the stacktrace. This is useful to
> humans when reviewing a stacktrace, and serves as infrastructure that
> can be used for RELIABLE_STACKTRACE in future.
>
> The first 6 patches are preparatory work that are not intended to have
> any functional impact, with patches 7 to 10 making the key changes.
> Largely this involves teaching the unwinder to track metadata for each
> unwind step, and modifying the way we manage pt_regs::stackframe so that
> exception boundaries can be identifier explcitily.
>
> With this series applied, the unwinder will report when unwind elements are not
> simply the result of a frame pointer based unwind, e.g.
>
> | Call trace:
> | show_stack+0x20/0x38 (CF)
> | dump_stack_lvl+0x60/0x80 (F)
> | dump_stack+0x18/0x28
> | nmi_cpu_backtrace+0xfc/0x140
> | nmi_trigger_cpumask_backtrace+0x1c8/0x200
> | arch_trigger_cpumask_backtrace+0x20/0x40
> | sysrq_handle_showallcpus+0x24/0x38 (F)
> | __handle_sysrq+0xa8/0x1b0 (F)
> | handle_sysrq+0x38/0x50 (F)
> | pl011_int+0x420/0x570 (F)
> | __handle_irq_event_percpu+0x60/0x220 (F)
> | handle_irq_event+0x54/0xc0 (F)
> | handle_fasteoi_irq+0xa8/0x1d0 (F)
> | generic_handle_domain_irq+0x34/0x58 (F)
> | gic_handle_irq+0x54/0x140 (FK)
> | call_on_irq_stack+0x24/0x58 (F)
> | do_interrupt_handler+0x88/0xa0
> | el1_interrupt+0x34/0x68 (F)
> | el1h_64_irq_handler+0x18/0x28
> | el1h_64_irq+0x6c/0x70
> | default_idle_call+0x34/0x180 (P)
> | default_idle_call+0x28/0x180 (L)
> | do_idle+0x204/0x268
> | cpu_startup_entry+0x3c/0x50 (F)
> | rest_init+0xe4/0xf0
> | start_kernel+0x738/0x740
> | __primary_switched+0x88/0x98
>
> ... where:
>
> * "C" indicates that the first element of the trace was the caller of an unwind
> function (vs "T" for a blocked task's stave PC, or "P" for a pt_regs::pc).
>
> * "F" indicates that the element was recovered from fgraph (and
> would otherwise have been reported as return_to_handler).
>
> * "K" indicates that the element was recovered from kretprobes (and
> would otherwise have been reported as __kretprobe_trampoline).
>
> * "P" indicates that the element was recovered from pt_regs::pc, and therefore
> this is the first element after an exception boundary.
>
> * "L" indidates that the element was recovered from pt_regs::lr, and therefore
> this may or may not be reliable depending on whether the LR was live at the
> moment the exception was taken.
>
> Mark.
>
> Mark Rutland (10):
> arm64: pt_regs: assert pt_regs is a multiple of 16 bytes
> arm64: pt_regs: remove stale big-endian layout
> arm64: pt_regs: rename "pmr_save" -> "pmr"
> arm64: pt_regs: swap 'unused' and 'pmr' fields
> arm64: use a common struct frame_record
> arm64: stacktrace: move dump_backtrace() to kunwind_stack_walk()
> arm64: stacktrace: report source of unwind data
> arm64: stacktrace: report recovered PCs
> arm64: stacktrace: split unwind_consume_stack()
> arm64: stacktrace: unwind exception boundaries
>
> arch/arm64/include/asm/daifflags.h | 2 +-
> arch/arm64/include/asm/processor.h | 2 +-
> arch/arm64/include/asm/ptrace.h | 22 ++-
> arch/arm64/include/asm/stacktrace/common.h | 74 +++++----
> arch/arm64/include/asm/stacktrace/frame.h | 48 ++++++
> arch/arm64/kernel/asm-offsets.c | 3 +-
> arch/arm64/kernel/entry.S | 16 +-
> arch/arm64/kernel/head.S | 3 +
> arch/arm64/kernel/probes/stBV5U5j | 0
> arch/arm64/kernel/process.c | 5 +-
> arch/arm64/kernel/signal.c | 5 -
> arch/arm64/kernel/stacktrace.c | 176 +++++++++++++++++++--
> 12 files changed, 287 insertions(+), 69 deletions(-)
> create mode 100644 arch/arm64/include/asm/stacktrace/frame.h
Very nice!
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
> create mode 100644 arch/arm64/kernel/probes/stBV5U5j
but this should probably disappear :)
M
^ permalink raw reply [flat|nested] 31+ messages in thread* Re: [PATCH 00/10] arm64: stacktrace: improve unwind reporting
2024-10-11 15:19 ` [PATCH 00/10] arm64: stacktrace: improve unwind reporting Miroslav Benes
@ 2024-10-11 16:32 ` Mark Rutland
0 siblings, 0 replies; 31+ messages in thread
From: Mark Rutland @ 2024-10-11 16:32 UTC (permalink / raw)
To: Miroslav Benes
Cc: linux-arm-kernel, ardb, broonie, catalin.marinas, jpoimboe,
kaleshsingh, madvenka, maz, puranjay12, will
On Fri, Oct 11, 2024 at 05:19:15PM +0200, Miroslav Benes wrote:
> On Thu, 10 Oct 2024, Mark Rutland wrote:
>
> > This series improves arm64's unwinder to explicitly identify exception
> > boundaries, reporting both pt_regs::pc and pt_regs::lr and explicitly
> > identifying the source of elements in the stacktrace. This is useful to
> > humans when reviewing a stacktrace, and serves as infrastructure that
> > can be used for RELIABLE_STACKTRACE in future.
[...]
> > ... where:
> >
> > * "C" indicates that the first element of the trace was the caller of an unwind
> > function (vs "T" for a blocked task's stave PC, or "P" for a pt_regs::pc).
Ugh, s/stave/saved/, at least that was correct in the actual commit
message that introduced it.
[...]
> > arm64: pt_regs: assert pt_regs is a multiple of 16 bytes
> > arm64: pt_regs: remove stale big-endian layout
> > arm64: pt_regs: rename "pmr_save" -> "pmr"
> > arm64: pt_regs: swap 'unused' and 'pmr' fields
> > arm64: use a common struct frame_record
> > arm64: stacktrace: move dump_backtrace() to kunwind_stack_walk()
> > arm64: stacktrace: report source of unwind data
> > arm64: stacktrace: report recovered PCs
> > arm64: stacktrace: split unwind_consume_stack()
> > arm64: stacktrace: unwind exception boundaries
> >
> > arch/arm64/include/asm/daifflags.h | 2 +-
> > arch/arm64/include/asm/processor.h | 2 +-
> > arch/arm64/include/asm/ptrace.h | 22 ++-
> > arch/arm64/include/asm/stacktrace/common.h | 74 +++++----
> > arch/arm64/include/asm/stacktrace/frame.h | 48 ++++++
> > arch/arm64/kernel/asm-offsets.c | 3 +-
> > arch/arm64/kernel/entry.S | 16 +-
> > arch/arm64/kernel/head.S | 3 +
> > arch/arm64/kernel/probes/stBV5U5j | 0
> > arch/arm64/kernel/process.c | 5 +-
> > arch/arm64/kernel/signal.c | 5 -
> > arch/arm64/kernel/stacktrace.c | 176 +++++++++++++++++++--
> > 12 files changed, 287 insertions(+), 69 deletions(-)
> > create mode 100644 arch/arm64/include/asm/stacktrace/frame.h
>
> Very nice!
>
> Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Thanks!
> > create mode 100644 arch/arm64/kernel/probes/stBV5U5j
>
> but this should probably disappear :)
Ack; gone.
I'll hold off sending a v2 with the fixups until middle/late next week
so that people get a chance to respond with anything more significant
and so that I can look over this with fresh eyes and clear out any more
typos...
In the mean time I've pushed out the fixed-up version to:
https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/stacktrace/metadata
Mark.
^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH 00/10] arm64: stacktrace: improve unwind reporting
2024-10-10 10:15 [PATCH 00/10] arm64: stacktrace: improve unwind reporting Mark Rutland
` (10 preceding siblings ...)
2024-10-11 15:19 ` [PATCH 00/10] arm64: stacktrace: improve unwind reporting Miroslav Benes
@ 2024-10-15 11:54 ` Puranjay Mohan
11 siblings, 0 replies; 31+ messages in thread
From: Puranjay Mohan @ 2024-10-15 11:54 UTC (permalink / raw)
To: Mark Rutland, linux-arm-kernel
Cc: ardb, broonie, catalin.marinas, jpoimboe, kaleshsingh, madvenka,
mark.rutland, maz, mbenes, will
[-- Attachment #1: Type: text/plain, Size: 2768 bytes --]
Mark Rutland <mark.rutland@arm.com> writes:
> This series improves arm64's unwinder to explicitly identify exception
> boundaries, reporting both pt_regs::pc and pt_regs::lr and explicitly
> identifying the source of elements in the stacktrace. This is useful to
> humans when reviewing a stacktrace, and serves as infrastructure that
> can be used for RELIABLE_STACKTRACE in future.
>
> The first 6 patches are preparatory work that are not intended to have
> any functional impact, with patches 7 to 10 making the key changes.
> Largely this involves teaching the unwinder to track metadata for each
> unwind step, and modifying the way we manage pt_regs::stackframe so that
> exception boundaries can be identifier explcitily.
>
> With this series applied, the unwinder will report when unwind elements are not
> simply the result of a frame pointer based unwind, e.g.
>
> | Call trace:
> | show_stack+0x20/0x38 (CF)
> | dump_stack_lvl+0x60/0x80 (F)
> | dump_stack+0x18/0x28
> | nmi_cpu_backtrace+0xfc/0x140
> | nmi_trigger_cpumask_backtrace+0x1c8/0x200
> | arch_trigger_cpumask_backtrace+0x20/0x40
> | sysrq_handle_showallcpus+0x24/0x38 (F)
> | __handle_sysrq+0xa8/0x1b0 (F)
> | handle_sysrq+0x38/0x50 (F)
> | pl011_int+0x420/0x570 (F)
> | __handle_irq_event_percpu+0x60/0x220 (F)
> | handle_irq_event+0x54/0xc0 (F)
> | handle_fasteoi_irq+0xa8/0x1d0 (F)
> | generic_handle_domain_irq+0x34/0x58 (F)
> | gic_handle_irq+0x54/0x140 (FK)
> | call_on_irq_stack+0x24/0x58 (F)
> | do_interrupt_handler+0x88/0xa0
> | el1_interrupt+0x34/0x68 (F)
> | el1h_64_irq_handler+0x18/0x28
> | el1h_64_irq+0x6c/0x70
> | default_idle_call+0x34/0x180 (P)
> | default_idle_call+0x28/0x180 (L)
> | do_idle+0x204/0x268
> | cpu_startup_entry+0x3c/0x50 (F)
> | rest_init+0xe4/0xf0
> | start_kernel+0x738/0x740
> | __primary_switched+0x88/0x98
>
> ... where:
>
> * "C" indicates that the first element of the trace was the caller of an unwind
> function (vs "T" for a blocked task's stave PC, or "P" for a pt_regs::pc).
>
> * "F" indicates that the element was recovered from fgraph (and
> would otherwise have been reported as return_to_handler).
>
> * "K" indicates that the element was recovered from kretprobes (and
> would otherwise have been reported as __kretprobe_trampoline).
>
> * "P" indicates that the element was recovered from pt_regs::pc, and therefore
> this is the first element after an exception boundary.
>
> * "L" indidates that the element was recovered from pt_regs::lr, and therefore
> this may or may not be reliable depending on whether the LR was live at the
> moment the exception was taken.
>
> Mark.
with all the typos reported by others fixed.
Reviewed-by: Puranjay Mohan <puranjay12@gmail.com>
Thanks,
Puranjay Mohan
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 255 bytes --]
^ permalink raw reply [flat|nested] 31+ messages in thread