linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v1 00/16] s390: SFrame user space unwinding
@ 2025-07-10 16:35 Jens Remus
  2025-07-10 16:35 ` [RFC PATCH v1 01/16] fixup! unwind_user: Add frame pointer support Jens Remus
                   ` (15 more replies)
  0 siblings, 16 replies; 42+ messages in thread
From: Jens Remus @ 2025-07-10 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt
  Cc: Jens Remus, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Andrii Nakryiko, Indu Bhagat, Jose E. Marchesi,
	Beau Belgrave, Linus Torvalds, Andrew Morton, Jens Axboe,
	Florian Weimer, Sam James

This RFC series adds s390 support for unwinding of user space using
SFrame. It is based on Josh's and Steven's work (see prerequisites
below).  The generic unwind user (sframe) frameworks are extended to
enable support for a few s390-particularities (see patches 6-9),
including unwinding of user space using back chain (see patch 12).
The latter could be broken apart as a separate patch series.

Posting as RFC so that the s390-particularities could be taken into
account in any of the prerequisite series from Steve and to obtain
early feedback to improve my patches.  I would also be fine with any
of the required infrastructure changes being integrated into the
prerequisite series.

Hopefully it was the right think to use the distribtion list from
Steve's prerequisite series.


Motivation:

On s390 unwinding using frame pointer (FP) is unsupported, because of
lack of proper s390 64-bit (s390x) ABI specification and compiler
support.  The ABI does only specify a "preferred" FP register.  Both GCC
and Clang, regardless of compiler option -fno-omit-frame-pointer, setup
the preferred FP register as late as possible, which usually is after
static stack allocation, so that the CFA cannot be deduced from the FP
without any further data, such as provided by DWARF CFI or SFrame.

In theory there is a s390-specific alternative of unwinding using
back chain (compiler option -mbackchain), but this has its own
limitations and there is currently no distribution that builds user
space with back chain.

As a consequence the Kernel stack tracer cannot unwind user space
(except if it is built with back chain).  Recording call graphs of user
space using perf is limited top stack sampling (i.e. perf record
--call-graph dwarf), which generates a fairly large amount of data and
has limitations.

Initial testing of recording call graphs using perf using the s390
support for SFrame provided by this series (on top of Josh's and
Steve's) shows that both the sampling rate and data size notably
improve:

perf record data size is greatly reduced (smaller perf.data):

  SFrame (--call-graph fp):
  # perf record -F 9999 --call-graph fp objdump -wdWF objdump
  [ perf record: Woken up 9 times to write data ]
  [ perf record: Captured and wrote 2.498 MB perf.data (10891 samples) ]

  Stack sampling (--call-graph dwarf) with a default stack size of 8192:
  # perf record -F 9999 --call-graph dwarf objdump -wdWF objdump
  [ perf record: Woken up 270 times to write data ]
  [ perf record: Captured and wrote 67.467 MB perf.data (8241 samples) ]

perf record sampling rate is a lot higher (higher number of events):

  SFrame (--call-graph fp):
  # perf record -F 99999 --call-graph fp objdump -wdWF objdump
  [ perf record: Woken up 213 times to write data ]
  [ perf record: Captured and wrote 53.167 MB perf.data (283993 samples) ]

  Stack sampling (--call-graph dwarf) with a default stack size of 8192:
  # perf record -F 99999 --call-graph dwarf objdump -wdWF objdump
  [ perf record: Woken up 2678 times to write data ]
  Warning:
  Processed 91458 events and lost 45 chunks!
  Check IO/CPU overload!
  Warning:
  Processed 102157 samples and lost 19.24%!
  [ perf record: Captured and wrote 675.513 MB perf.data (82497 samples) ]


Prerequirements:

This RFC series applies on top of Josh's and Steve's series
"[PATCH v8 00/12] unwind_deferred: Implement sframe handling":
https://lore.kernel.org/all/20250708021115.894007410@kernel.org/
Note that this series depends on others.

It is based on top of Steve's branch available at:
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git unwind/main

It depends on my Binutils series "[PATCH v3 00/11] s390: Support to
generate .sframe in assembler and linker":
https://inbox.sourceware.org/binutils/20250627110849.1198336-1-jremus@linux.ibm.com/
Note that my latest v4 of that series is already based on SFrame V2
format changes (i.e. SFRAME_F_FDE_FUNC_START_PCREL), that require
changes to the generic unwind user sframe implementation.

Josh's and Steve's series depends on a Glibc patch from Josh, that adds
support for the prctls introduced in the Kernel:
https://lore.kernel.org/all/20250122023517.lmztuocecdjqzfhc@jpoimboe/
Note that Josh's Glibc patch needs to be adjusted for the updated prctl
numbers from "[PATCH v8 12/12] unwind_user/sframe: Add prctl() interface
for registering .sframe sections":
https://lore.kernel.org/all/20250708021200.397301537@kernel.org/


Overview:

Patch 1 adds and rewords a few comments to Josh's and Steve's user
unwind framework.

Patch 2 aligns asm/dwarf.h to x86 asm/dwarf2.h.

Patch 3 replicates Josh's x86 patch "x86/asm: Avoid emitting DWARF
CFI for non-VDSO" for s390.

Patch 4 replicates Josh's patch "x86/vdso: Enable sframe generation
in VDSO" for s390.  It enables generation of SFrame stack trace
information (.sframe section) for the vDSO if the assembler supports it.
Note that this depends on a new config option CONFIG_AS_SFRAME that is
introduced by a separate series by Josh/Steven, from which I have
included the required patches as PREREQ.

Patch 5 changes the build of the vDSO on s390 to keep the function
symbols for stack tracing purposes.  Note that Josh does this in his
patch "x86/vdso: Enable sframe generation in VDSO", by chaning objcopy
option -S to -g.

Patches 6-9 enable Josh's generic unwind user (sframe) frameworks to
support the following s390 particularities:

- Patch 6 adds support for architectures that define their CFA as SP at
  callsite + offset.

- Patch 7 adds support support for architectures that do not necessarily
  save the RA on the stack (or in another register) in the topmost
  frame (e.g. in the prologue or in lead functions).

- Patch 8 adds support for architectures that save RA/FP in other
  registers.

- Patch 9 adds support for architectures that store the CFA offset
  from CFA base register (e.g. SP or FP) in SFrame encoded.  For
  instance on s390 the CFA offset is stored adjusted by -160 and
  then scaled down by 8 to enable and improve the use of signed 8-bit
  SFrame offsets (i.e. CFA, RA, and FP offset).

Patch 10 introduces frame_pointer() and user_return_address() in
ptrace on s390.  Both are prerequisites for the subsequent patch.

Patch 11 adds support for unwinding of user space using SFrame on
s390.  It leverages the extensions of the generic unwind user
framework from patches 6-9.

Patch 12 introduces unwinding of user space using back chain to the
unwind user framework.

Patch 13 adds support for unwinding of user space using back chain on
s390.

Patches 14-15 are pre-requisite patches from Josh's and Steve's
series "[PATCH v6 0/6] x86/vdso: VDSO updates and fixes for sframes":
https://lore.kernel.org/all/20250425023750.669174660@goodmis.org/
They introduce the config option CONFIG_AS_SFRAME required by patch 4.

Patch 16 is a WIP fixup for user unwind sframe on s390 to use macros
instead of magic numbers that I would like to get some feedback on,
whether that would be the correct approach.

Initially I had a patch on top that uses the unwind user framework in
stack trace on s390 in arch_stack_walk_user_common(), now that it can
unwind user space using back chain.  But a recent change changed
macro for_each_user_frame() private, so that it can no longer be used.
Note that this would still not enable stack traces of user space to be
generated.  The reason is that the stack tracer does not allow for page
faults, causing the unwind user framework attempt to unwind using SFrame
to fail and fallback to unwind using back chain, which usually also
fails, as user space is not built with back chain (see motivation).


Limitations:

Unwinding of user space using back chain cannot - by design - restore
the FP.  Therefore unwiding of subsequent frames using e.g. SFrame may
fail, if the FP is the CFA base register.

Thanks and regards,
Jens

Jens Remus (14):
  fixup! unwind_user: Add frame pointer support
  s390: asm/dwarf.h should only be included in assembly files
  s390/vdso: Avoid emitting DWARF CFI for non-vDSO
  s390/vdso: Enable SFrame generation in vDSO
  s390/vdso: Keep function symbols in vDSO
  unwind_user: Enable archs that define CFA = SP_callsite + offset
  unwind_user: Enable archs that do not necessarily save RA
  unwind_user: Enable archs that save RA/FP in other registers
  unwind_user/sframe: Enable archs with encoded SFrame CFA offsets
  s390/ptrace: Enable HAVE_USER_RA_REG
  s390/unwind_user/sframe: Enable HAVE_UNWIND_USER_SFRAME
  unwind_user/backchain: Introduce back chain user space unwinding
  s390/unwind_user/backchain: Enable HAVE_UNWIND_USER_BACKCHAIN
  WIP: fixup! s390/unwind_user/sframe: Enable HAVE_UNWIND_USER_SFRAME

Josh Poimboeuf (2):
  PREREQ: x86/asm: Avoid emitting DWARF CFI for non-VDSO
  PREREQ: x86/vdso: Enable sframe generation in VDSO

 arch/Kconfig                                  |  21 +++
 arch/s390/Kconfig                             |   4 +
 arch/s390/include/asm/dwarf.h                 |  53 +++++---
 arch/s390/include/asm/ptrace.h                |  25 +++-
 arch/s390/include/asm/unwind_user.h           |  83 ++++++++++++
 arch/s390/include/asm/unwind_user_backchain.h | 127 ++++++++++++++++++
 arch/s390/include/asm/unwind_user_sframe.h    |  37 +++++
 arch/s390/kernel/vdso64/Makefile              |   9 +-
 arch/s390/kernel/vdso64/vdso64.lds.S          |   5 +
 arch/x86/entry/vdso/Makefile                  |  10 +-
 arch/x86/entry/vdso/vdso-layout.lds.S         |   3 +
 arch/x86/include/asm/dwarf2.h                 |  54 +++++---
 arch/x86/include/asm/unwind_user.h            |  26 +++-
 include/asm-generic/Kbuild                    |   1 +
 include/asm-generic/unwind_user.h             |  20 +++
 include/asm-generic/unwind_user_sframe.h      |  65 +++++++++
 include/linux/ptrace.h                        |   8 ++
 include/linux/sframe.h                        |   4 +-
 include/linux/unwind_user_backchain.h         |  17 +++
 include/linux/unwind_user_types.h             |  21 ++-
 kernel/unwind/Makefile                        |   1 +
 kernel/unwind/sframe.c                        |  28 ++--
 kernel/unwind/sframe.h                        |  16 +++
 kernel/unwind/user.c                          | 101 +++++++++++---
 kernel/unwind/user_backchain.c                |  13 ++
 25 files changed, 671 insertions(+), 81 deletions(-)
 create mode 100644 arch/s390/include/asm/unwind_user.h
 create mode 100644 arch/s390/include/asm/unwind_user_backchain.h
 create mode 100644 arch/s390/include/asm/unwind_user_sframe.h
 create mode 100644 include/asm-generic/unwind_user_sframe.h
 create mode 100644 include/linux/unwind_user_backchain.h
 create mode 100644 kernel/unwind/user_backchain.c

-- 
2.48.1


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [RFC PATCH v1 01/16] fixup! unwind_user: Add frame pointer support
  2025-07-10 16:35 [RFC PATCH v1 00/16] s390: SFrame user space unwinding Jens Remus
@ 2025-07-10 16:35 ` Jens Remus
  2025-07-10 16:35 ` [RFC PATCH v1 02/16] s390: asm/dwarf.h should only be included in assembly files Jens Remus
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Jens Remus @ 2025-07-10 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt
  Cc: Jens Remus, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Andrii Nakryiko, Indu Bhagat, Jose E. Marchesi,
	Beau Belgrave, Linus Torvalds, Andrew Morton, Jens Axboe,
	Florian Weimer, Sam James

---
 kernel/unwind/user.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/unwind/user.c b/kernel/unwind/user.c
index 6e7ca9f1293a..d0181c636c6b 100644
--- a/kernel/unwind/user.c
+++ b/kernel/unwind/user.c
@@ -74,6 +74,7 @@ static int unwind_user_next(struct unwind_user_state *state)
 		goto done;
 	}
 
+	/* Get the Canonical Frame Address (CFA) */
 	if (frame->use_fp) {
 		if (state->fp < state->sp)
 			goto done;
@@ -81,11 +82,9 @@ static int unwind_user_next(struct unwind_user_state *state)
 	} else {
 		cfa = state->sp;
 	}
-
-	/* Get the Canonical Frame Address (CFA) */
 	cfa += frame->cfa_off;
 
-	/* stack going in wrong direction? */
+	/* Make sure that stack is not going in wrong direction */
 	if (cfa <= state->sp)
 		goto done;
 
@@ -94,10 +93,11 @@ static int unwind_user_next(struct unwind_user_state *state)
 	if ((cfa + frame->ra_off) & ((1 << shift) - 1))
 		goto done;
 
-	/* Find the Return Address (RA) */
+	/* Get the Return Address (RA) */
 	if (unwind_get_user_long(ra, cfa + frame->ra_off, state))
 		goto done;
 
+	/* Get the Frame Pointer (FP) */
 	if (frame->fp_off && unwind_get_user_long(fp, cfa + frame->fp_off, state))
 		goto done;
 
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC PATCH v1 02/16] s390: asm/dwarf.h should only be included in assembly files
  2025-07-10 16:35 [RFC PATCH v1 00/16] s390: SFrame user space unwinding Jens Remus
  2025-07-10 16:35 ` [RFC PATCH v1 01/16] fixup! unwind_user: Add frame pointer support Jens Remus
@ 2025-07-10 16:35 ` Jens Remus
  2025-07-10 16:35 ` [RFC PATCH v1 03/16] s390/vdso: Avoid emitting DWARF CFI for non-vDSO Jens Remus
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Jens Remus @ 2025-07-10 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt
  Cc: Jens Remus, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Andrii Nakryiko, Indu Bhagat, Jose E. Marchesi,
	Beau Belgrave, Linus Torvalds, Andrew Morton, Jens Axboe,
	Florian Weimer, Sam James

Align to x86 and add a compile-time check that asm/dwarf.h is only
included in pure assembly files.

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---
 arch/s390/include/asm/dwarf.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/s390/include/asm/dwarf.h b/arch/s390/include/asm/dwarf.h
index 390906b8e386..fdf51f66a9ed 100644
--- a/arch/s390/include/asm/dwarf.h
+++ b/arch/s390/include/asm/dwarf.h
@@ -2,7 +2,9 @@
 #ifndef _ASM_S390_DWARF_H
 #define _ASM_S390_DWARF_H
 
-#ifdef __ASSEMBLY__
+#ifndef __ASSEMBLY__
+#warning "asm/dwarf.h should be only included in pure assembly files"
+#endif
 
 #define CFI_STARTPROC		.cfi_startproc
 #define CFI_ENDPROC		.cfi_endproc
@@ -33,6 +35,4 @@
 	.cfi_sections .eh_frame, .debug_frame
 #endif
 
-#endif	/* __ASSEMBLY__ */
-
 #endif	/* _ASM_S390_DWARF_H */
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC PATCH v1 03/16] s390/vdso: Avoid emitting DWARF CFI for non-vDSO
  2025-07-10 16:35 [RFC PATCH v1 00/16] s390: SFrame user space unwinding Jens Remus
  2025-07-10 16:35 ` [RFC PATCH v1 01/16] fixup! unwind_user: Add frame pointer support Jens Remus
  2025-07-10 16:35 ` [RFC PATCH v1 02/16] s390: asm/dwarf.h should only be included in assembly files Jens Remus
@ 2025-07-10 16:35 ` Jens Remus
  2025-07-10 16:35 ` [RFC PATCH v1 04/16] s390/vdso: Enable SFrame generation in vDSO Jens Remus
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Jens Remus @ 2025-07-10 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt
  Cc: Jens Remus, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Andrii Nakryiko, Indu Bhagat, Jose E. Marchesi,
	Beau Belgrave, Linus Torvalds, Andrew Morton, Jens Axboe,
	Florian Weimer, Sam James

This replicates Josh's x86 commit TODO ("x86/asm: Avoid emitting DWARF
CFI for non-VDSO") for s390.  It also aligns asm/dwarf.h to x86
asm/dwarf2.h.

It was decided years ago that .cfi_* annotations aren't maintainable in
the kernel.  For the kernel proper, ensure the CFI_* macros don't do
anything.

On the other hand the vDSO library *does* use them, so user space can
unwind through it.

Make sure these macros only work for vDSO.  They aren't actually being
used outside of vDSO anyway, so there's no functional change.

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---

Notes (jremus):
    Link to latest x86 patch:
    https://lore.kernel.org/all/20250425024022.477374378@goodmis.org/

 arch/s390/include/asm/dwarf.h | 45 ++++++++++++++++++++++-------------
 1 file changed, 29 insertions(+), 16 deletions(-)

diff --git a/arch/s390/include/asm/dwarf.h b/arch/s390/include/asm/dwarf.h
index fdf51f66a9ed..5052ee6a40f7 100644
--- a/arch/s390/include/asm/dwarf.h
+++ b/arch/s390/include/asm/dwarf.h
@@ -6,6 +6,18 @@
 #warning "asm/dwarf.h should be only included in pure assembly files"
 #endif
 
+.macro nocfi args:vararg
+.endm
+
+#ifdef BUILD_VDSO
+
+	/*
+	 * For the vDSO, emit both runtime unwind information and debug
+	 * symbols for the .dbg file.
+	 */
+
+	.cfi_sections .eh_frame, .debug_frame
+
 #define CFI_STARTPROC		.cfi_startproc
 #define CFI_ENDPROC		.cfi_endproc
 #define CFI_DEF_CFA_OFFSET	.cfi_def_cfa_offset
@@ -16,23 +28,24 @@
 #ifdef CONFIG_AS_CFI_VAL_OFFSET
 #define CFI_VAL_OFFSET		.cfi_val_offset
 #else
-#define CFI_VAL_OFFSET		#
+#define CFI_VAL_OFFSET		nocfi
 #endif
 
-#ifndef BUILD_VDSO
-	/*
-	 * Emit CFI data in .debug_frame sections and not in .eh_frame
-	 * sections.  The .eh_frame CFI is used for runtime unwind
-	 * information that is not being used.  Hence, vmlinux.lds.S
-	 * can discard the .eh_frame sections.
-	 */
-	.cfi_sections .debug_frame
-#else
-	/*
-	 * For vDSO, emit CFI data in both, .eh_frame and .debug_frame
-	 * sections.
-	 */
-	.cfi_sections .eh_frame, .debug_frame
-#endif
+#else /* !BUILD_VDSO */
+
+/*
+ * On s390, these macros aren't used outside vDSO.  As well they shouldn't be:
+ * they're fragile and very difficult to maintain.
+ */
+
+#define CFI_STARTPROC		nocfi
+#define CFI_ENDPROC		nocfi
+#define CFI_DEF_CFA_OFFSET	nocfi
+#define CFI_ADJUST_CFA_OFFSET	nocfi
+#define CFI_RESTORE		nocfi
+#define CFI_REL_OFFSET		nocfi
+#define CFI_VAL_OFFSET		nocfi
+
+#endif /* !BUILD_VDSO */
 
 #endif	/* _ASM_S390_DWARF_H */
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC PATCH v1 04/16] s390/vdso: Enable SFrame generation in vDSO
  2025-07-10 16:35 [RFC PATCH v1 00/16] s390: SFrame user space unwinding Jens Remus
                   ` (2 preceding siblings ...)
  2025-07-10 16:35 ` [RFC PATCH v1 03/16] s390/vdso: Avoid emitting DWARF CFI for non-vDSO Jens Remus
@ 2025-07-10 16:35 ` Jens Remus
  2025-07-10 16:35 ` [RFC PATCH v1 05/16] s390/vdso: Keep function symbols " Jens Remus
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Jens Remus @ 2025-07-10 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt
  Cc: Jens Remus, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Andrii Nakryiko, Indu Bhagat, Jose E. Marchesi,
	Beau Belgrave, Linus Torvalds, Andrew Morton, Jens Axboe,
	Florian Weimer, Sam James

This replicates Josh's commit TODO ("x86/vdso: Enable sframe generation
in VDSO") for s390.

Enable SFrame generation in the vDSO library so kernel and user space
can unwind through it.

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---

Notes (jremus):
    Link to latest x86 patch:
    https://lore.kernel.org/all/20250425024023.173709192@goodmis.org/

 arch/s390/include/asm/dwarf.h        | 4 ++++
 arch/s390/kernel/vdso64/Makefile     | 7 ++++++-
 arch/s390/kernel/vdso64/vdso64.lds.S | 5 +++++
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/dwarf.h b/arch/s390/include/asm/dwarf.h
index 5052ee6a40f7..aeb54e716d5a 100644
--- a/arch/s390/include/asm/dwarf.h
+++ b/arch/s390/include/asm/dwarf.h
@@ -16,7 +16,11 @@
 	 * symbols for the .dbg file.
 	 */
 
+#ifdef CONFIG_AS_SFRAME
+	.cfi_sections .eh_frame, .debug_frame, .sframe
+#else
 	.cfi_sections .eh_frame, .debug_frame
+#endif
 
 #define CFI_STARTPROC		.cfi_startproc
 #define CFI_ENDPROC		.cfi_endproc
diff --git a/arch/s390/kernel/vdso64/Makefile b/arch/s390/kernel/vdso64/Makefile
index d8f0df742809..e96156b9c4df 100644
--- a/arch/s390/kernel/vdso64/Makefile
+++ b/arch/s390/kernel/vdso64/Makefile
@@ -20,7 +20,11 @@ targets := $(obj-vdso64) $(obj-cvdso64) vdso64.so vdso64.so.dbg
 obj-vdso64 := $(addprefix $(obj)/, $(obj-vdso64))
 obj-cvdso64 := $(addprefix $(obj)/, $(obj-cvdso64))
 
-KBUILD_AFLAGS += -DBUILD_VDSO
+ifeq ($(CONFIG_AS_SFRAME),y)
+	SFRAME_CFLAGS := -Wa,--gsframe
+endif
+
+KBUILD_AFLAGS += -DBUILD_VDSO $(SFRAME_CFLAGS)
 KBUILD_CFLAGS += -DBUILD_VDSO -DDISABLE_BRANCH_PROFILING
 
 KBUILD_AFLAGS_64 := $(filter-out -m64,$(KBUILD_AFLAGS))
@@ -32,6 +36,7 @@ KBUILD_CFLAGS_64 := $(filter-out -mno-pic-data-is-text-relative,$(KBUILD_CFLAGS_
 KBUILD_CFLAGS_64 := $(filter-out -munaligned-symbols,$(KBUILD_CFLAGS_64))
 KBUILD_CFLAGS_64 := $(filter-out -fno-asynchronous-unwind-tables,$(KBUILD_CFLAGS_64))
 KBUILD_CFLAGS_64 += -m64 -fPIC -fno-common -fno-builtin -fasynchronous-unwind-tables
+KBUILD_CFLAGS_64 += $(SFRAME_CFLAGS)
 ldflags-y := -shared -soname=linux-vdso64.so.1 \
 	     --hash-style=both --build-id=sha1 -T
 
diff --git a/arch/s390/kernel/vdso64/vdso64.lds.S b/arch/s390/kernel/vdso64/vdso64.lds.S
index e4f6551ae898..7acecb0d9b7e 100644
--- a/arch/s390/kernel/vdso64/vdso64.lds.S
+++ b/arch/s390/kernel/vdso64/vdso64.lds.S
@@ -50,6 +50,9 @@ SECTIONS
 
 	.eh_frame_hdr	: { *(.eh_frame_hdr) }		:text	:eh_frame_hdr
 	.eh_frame	: { KEEP (*(.eh_frame)) }	:text
+
+	.sframe		: { *(.sframe) }		:text	:sframe
+
 	.gcc_except_table : { *(.gcc_except_table .gcc_except_table.*) }
 
 	.rela.dyn ALIGN(8) : { *(.rela.dyn) }
@@ -114,6 +117,7 @@ SECTIONS
  * Very old versions of ld do not recognize this name token; use the constant.
  */
 #define PT_GNU_EH_FRAME	0x6474e550
+#define PT_GNU_SFRAME	0x6474e554
 
 /*
  * We must supply the ELF program headers explicitly to get just one
@@ -125,6 +129,7 @@ PHDRS
 	dynamic		PT_DYNAMIC FLAGS(4);		/* PF_R */
 	note		PT_NOTE FLAGS(4);		/* PF_R */
 	eh_frame_hdr	PT_GNU_EH_FRAME;
+	sframe		PT_GNU_SFRAME;
 }
 
 /*
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC PATCH v1 05/16] s390/vdso: Keep function symbols in vDSO
  2025-07-10 16:35 [RFC PATCH v1 00/16] s390: SFrame user space unwinding Jens Remus
                   ` (3 preceding siblings ...)
  2025-07-10 16:35 ` [RFC PATCH v1 04/16] s390/vdso: Enable SFrame generation in vDSO Jens Remus
@ 2025-07-10 16:35 ` Jens Remus
  2025-07-10 16:35 ` [RFC PATCH v1 06/16] unwind_user: Enable archs that define CFA = SP_callsite + offset Jens Remus
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Jens Remus @ 2025-07-10 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt
  Cc: Jens Remus, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Andrii Nakryiko, Indu Bhagat, Jose E. Marchesi,
	Beau Belgrave, Linus Torvalds, Andrew Morton, Jens Axboe,
	Florian Weimer, Sam James

Keep all function symbols in the vDSO .symtab for stack trace purposes.
This enables perf to lookup these function symbols in addition to those
already exported in vDSO .dynsym.

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---

Notes (jremus):
    Alternatively use objcopy option -g instead of -S (and the -w -K
    filters), as Josh did in "x86/vdso: Enable sframe generation in VDSO":
    https://lore.kernel.org/all/20250425024023.173709192@goodmis.org/
    
    Note that this change does not cause the vDSO build-id to change.
    perf record may therefore not dump an updated copy of the vDSO to
    ~/.debug/[vdso]/<build-id>/vdso, so that perf script may use a
    stale copy without .symtab. Resolve by deleting ~/.debug/.

 arch/s390/kernel/vdso64/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/s390/kernel/vdso64/Makefile b/arch/s390/kernel/vdso64/Makefile
index e96156b9c4df..067753352697 100644
--- a/arch/s390/kernel/vdso64/Makefile
+++ b/arch/s390/kernel/vdso64/Makefile
@@ -58,7 +58,7 @@ $(obj)/vdso64.so.dbg: $(obj)/vdso64.lds $(obj-vdso64) $(obj-cvdso64) FORCE
 	$(call if_changed,vdso_and_check)
 
 # strip rule for the .so file
-$(obj)/%.so: OBJCOPYFLAGS := -S
+$(obj)/%.so: OBJCOPYFLAGS := -S -w -K "__arch_*" -K "__cvdso_*" -K "__s390_vdso_*"
 $(obj)/%.so: $(obj)/%.so.dbg FORCE
 	$(call if_changed,objcopy)
 
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC PATCH v1 06/16] unwind_user: Enable archs that define CFA = SP_callsite + offset
  2025-07-10 16:35 [RFC PATCH v1 00/16] s390: SFrame user space unwinding Jens Remus
                   ` (4 preceding siblings ...)
  2025-07-10 16:35 ` [RFC PATCH v1 05/16] s390/vdso: Keep function symbols " Jens Remus
@ 2025-07-10 16:35 ` Jens Remus
  2025-07-16 21:32   ` Josh Poimboeuf
  2025-07-10 16:35 ` [RFC PATCH v1 07/16] unwind_user: Enable archs that do not necessarily save RA Jens Remus
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 42+ messages in thread
From: Jens Remus @ 2025-07-10 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt
  Cc: Jens Remus, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Andrii Nakryiko, Indu Bhagat, Jose E. Marchesi,
	Beau Belgrave, Linus Torvalds, Andrew Morton, Jens Axboe,
	Florian Weimer, Sam James

Most architectures define their CFA as the value of the stack pointer
(SP) at the call site in the previous frame, as suggested by the DWARF
standard:

  CFA = <SP at call site>

Enable unwinding of user space for architectures, such as s390, which
define their CFA as the value of the SP at the call site in the previous
frame with an offset:

  CFA = <SP at call site> + offset

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---

Notes (jremus):
    Alternatively the conditional definition of generic_sframe_sp_val_off()
    could be moved into kernel/unwind/sframe.c.

 arch/x86/include/asm/unwind_user.h       |  2 ++
 include/asm-generic/Kbuild               |  1 +
 include/asm-generic/unwind_user_sframe.h | 30 ++++++++++++++++++++++++
 include/linux/unwind_user_types.h        |  1 +
 kernel/unwind/sframe.c                   |  2 ++
 kernel/unwind/user.c                     |  8 ++++---
 6 files changed, 41 insertions(+), 3 deletions(-)
 create mode 100644 include/asm-generic/unwind_user_sframe.h

diff --git a/arch/x86/include/asm/unwind_user.h b/arch/x86/include/asm/unwind_user.h
index 19634a73612d..c2881840adf4 100644
--- a/arch/x86/include/asm/unwind_user.h
+++ b/arch/x86/include/asm/unwind_user.h
@@ -8,6 +8,7 @@
 	.cfa_off	= (s32)sizeof(long) *  2,				\
 	.ra_off		= (s32)sizeof(long) * -1,				\
 	.fp_off		= (s32)sizeof(long) * -2,				\
+	.sp_val_off	= (s32)0,						\
 	.use_fp		= true,
 
 #ifdef CONFIG_IA32_EMULATION
@@ -16,6 +17,7 @@
 	.cfa_off	= (s32)sizeof(u32)  *  2,				\
 	.ra_off		= (s32)sizeof(u32)  * -1,				\
 	.fp_off		= (s32)sizeof(u32)  * -2,				\
+	.sp_val_off	= (s32)0,						\
 	.use_fp		= true,
 
 #define in_compat_mode(regs) !user_64bit_mode(regs)
diff --git a/include/asm-generic/Kbuild b/include/asm-generic/Kbuild
index b797a2434396..64a15cde776c 100644
--- a/include/asm-generic/Kbuild
+++ b/include/asm-generic/Kbuild
@@ -60,6 +60,7 @@ mandatory-y += topology.h
 mandatory-y += trace_clock.h
 mandatory-y += uaccess.h
 mandatory-y += unwind_user.h
+mandatory-y += unwind_user_sframe.h
 mandatory-y += unwind_user_types.h
 mandatory-y += vermagic.h
 mandatory-y += vga.h
diff --git a/include/asm-generic/unwind_user_sframe.h b/include/asm-generic/unwind_user_sframe.h
new file mode 100644
index 000000000000..6c87a7f29861
--- /dev/null
+++ b/include/asm-generic/unwind_user_sframe.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_UNWIND_USER_SFRAME_H
+#define _ASM_GENERIC_UNWIND_USER_SFRAME_H
+
+#include <linux/types.h>
+
+/**
+ * generic_sframe_sp_val_off - Get generic SP value offset from CFA.
+ *
+ * During unwinding the stack pointer (SP) at call site can be derived
+ * from the Canonical Frame Address (CFA) as follows:
+ *
+ *   SP = CFA + SP_val_off
+ *
+ * Most architectures define the CFA as value of SP at call site, as
+ * suggested by DWARF. For those architectures the SP value offset
+ * from CFA is 0, so that the SP at call site computes as:
+ *
+ *   SP = CFA
+ *
+ * Returns the generic SP value offset from CFA of 0.
+ */
+static inline s32 generic_sframe_sp_val_off(void)
+{
+	return 0;
+}
+
+#define sframe_sp_val_off generic_sframe_sp_val_off
+
+#endif /* _ASM_GENERIC_UNWIND_USER_SFRAME_H */
diff --git a/include/linux/unwind_user_types.h b/include/linux/unwind_user_types.h
index 4d50476e950e..8050a3237a03 100644
--- a/include/linux/unwind_user_types.h
+++ b/include/linux/unwind_user_types.h
@@ -25,6 +25,7 @@ struct unwind_user_frame {
 	s32 cfa_off;
 	s32 ra_off;
 	s32 fp_off;
+	s32 sp_val_off;
 	bool use_fp;
 };
 
diff --git a/kernel/unwind/sframe.c b/kernel/unwind/sframe.c
index 6159f072bdb6..acbf791e713b 100644
--- a/kernel/unwind/sframe.c
+++ b/kernel/unwind/sframe.c
@@ -12,6 +12,7 @@
 #include <linux/mm.h>
 #include <linux/string_helpers.h>
 #include <linux/sframe.h>
+#include <asm/unwind_user_sframe.h>
 #include <linux/unwind_user_types.h>
 
 #include "sframe.h"
@@ -313,6 +314,7 @@ static __always_inline int __find_fre(struct sframe_section *sec,
 	frame->ra_off  = fre->ra_off;
 	frame->fp_off  = fre->fp_off;
 	frame->use_fp  = SFRAME_FRE_CFA_BASE_REG_ID(fre->info) == SFRAME_BASE_REG_FP;
+	frame->sp_val_off = sframe_sp_val_off();
 
 	return 0;
 }
diff --git a/kernel/unwind/user.c b/kernel/unwind/user.c
index d0181c636c6b..45c8c6932ba6 100644
--- a/kernel/unwind/user.c
+++ b/kernel/unwind/user.c
@@ -52,7 +52,7 @@ static int unwind_user_next(struct unwind_user_state *state)
 {
 	struct unwind_user_frame *frame;
 	struct unwind_user_frame _frame;
-	unsigned long cfa = 0, fp, ra = 0;
+	unsigned long cfa = 0, sp, fp, ra = 0;
 	unsigned int shift;
 
 	if (state->done)
@@ -84,8 +84,10 @@ static int unwind_user_next(struct unwind_user_state *state)
 	}
 	cfa += frame->cfa_off;
 
+	/* Get the Stack Pointer (SP) */
+	sp = cfa + frame->sp_val_off;
 	/* Make sure that stack is not going in wrong direction */
-	if (cfa <= state->sp)
+	if (sp <= state->sp)
 		goto done;
 
 	/* Make sure that the address is word aligned */
@@ -102,7 +104,7 @@ static int unwind_user_next(struct unwind_user_state *state)
 		goto done;
 
 	state->ip = ra;
-	state->sp = cfa;
+	state->sp = sp;
 	if (frame->fp_off)
 		state->fp = fp;
 
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC PATCH v1 07/16] unwind_user: Enable archs that do not necessarily save RA
  2025-07-10 16:35 [RFC PATCH v1 00/16] s390: SFrame user space unwinding Jens Remus
                   ` (5 preceding siblings ...)
  2025-07-10 16:35 ` [RFC PATCH v1 06/16] unwind_user: Enable archs that define CFA = SP_callsite + offset Jens Remus
@ 2025-07-10 16:35 ` Jens Remus
  2025-07-16 23:01   ` Josh Poimboeuf
  2025-07-10 16:35 ` [RFC PATCH v1 08/16] unwind_user: Enable archs that save RA/FP in other registers Jens Remus
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 42+ messages in thread
From: Jens Remus @ 2025-07-10 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt
  Cc: Jens Remus, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Andrii Nakryiko, Indu Bhagat, Jose E. Marchesi,
	Beau Belgrave, Linus Torvalds, Andrew Morton, Jens Axboe,
	Florian Weimer, Sam James

Not all architectures have the return address (RA) in user space saved
on the stack on function entry, as x86-64 does due to its CALL
instruction pushing the RA onto the stack.  Architectures/ABIs, such as
s390, also do not necessarily enforce to save the RA in user space on
the stack in the function prologue or even at all, for instance in leaf
functions.

Treat a RA offset from CFA of zero as indication that the RA is not
saved on the stack.  In that case obtain the RA from the RA register.
Allow the SP to be unchanged in the topmost frame, for architectures
where SP at function entry == SP at call site.

Note that treating a RA offset from CFA of zero as indication that
the RA is not saved on the stack additionally allows for architectures,
such as s390, where the frame pointer (FP) may be saved without the RA
being saved as well.  Provided that such architectures represent this
in SFrame by encoding the "missing" RA offset using a padding RA offset
with a value of zero.

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---
 arch/Kconfig                      |  5 +++++
 include/linux/ptrace.h            |  8 ++++++++
 include/linux/sframe.h            |  4 ++--
 include/linux/unwind_user_types.h |  1 +
 kernel/unwind/sframe.c            | 21 +++++++++++---------
 kernel/unwind/user.c              | 32 ++++++++++++++++++++++---------
 6 files changed, 51 insertions(+), 20 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 86eec85cb898..367eaf7e62e0 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -450,6 +450,11 @@ config HAVE_UNWIND_USER_SFRAME
 	bool
 	select UNWIND_USER
 
+config HAVE_USER_RA_REG
+	bool
+	help
+	  The arch passes the return address (RA) in user space in a register.
+
 config SFRAME_VALIDATION
 	bool "Enable .sframe section debugging"
 	depends on HAVE_UNWIND_USER_SFRAME
diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 90507d4afcd6..a245c8586673 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -397,6 +397,14 @@ static inline void user_single_step_report(struct pt_regs *regs)
 #define exception_ip(x) instruction_pointer(x)
 #endif
 
+#ifndef user_return_address
+static inline unsigned long user_return_address(const struct pt_regs *regs)
+{
+	WARN_ON_ONCE(1);
+	return 0;
+}
+#endif
+
 extern int task_current_syscall(struct task_struct *target, struct syscall_info *info);
 
 extern void sigaction_compat_abi(struct k_sigaction *act, struct k_sigaction *oact);
diff --git a/include/linux/sframe.h b/include/linux/sframe.h
index b79c5ec09229..e3c6414f1a17 100644
--- a/include/linux/sframe.h
+++ b/include/linux/sframe.h
@@ -33,7 +33,7 @@ extern void sframe_free_mm(struct mm_struct *mm);
 extern int sframe_add_section(unsigned long sframe_start, unsigned long sframe_end,
 			      unsigned long text_start, unsigned long text_end);
 extern int sframe_remove_section(unsigned long sframe_addr);
-extern int sframe_find(unsigned long ip, struct unwind_user_frame *frame);
+extern int sframe_find(unsigned long ip, struct unwind_user_frame *frame, bool topmost);
 
 static inline bool current_has_sframe(void)
 {
@@ -52,7 +52,7 @@ static inline int sframe_add_section(unsigned long sframe_start, unsigned long s
 	return -ENOSYS;
 }
 static inline int sframe_remove_section(unsigned long sframe_addr) { return -ENOSYS; }
-static inline int sframe_find(unsigned long ip, struct unwind_user_frame *frame) { return -ENOSYS; }
+static inline int sframe_find(unsigned long ip, struct unwind_user_frame *frame, bool topmost) { return -ENOSYS; }
 static inline bool current_has_sframe(void) { return false; }
 
 #endif /* CONFIG_HAVE_UNWIND_USER_SFRAME */
diff --git a/include/linux/unwind_user_types.h b/include/linux/unwind_user_types.h
index 8050a3237a03..adef01698bb3 100644
--- a/include/linux/unwind_user_types.h
+++ b/include/linux/unwind_user_types.h
@@ -35,6 +35,7 @@ struct unwind_user_state {
 	unsigned long fp;
 	struct arch_unwind_user_state arch;
 	enum unwind_user_type type;
+	bool topmost;
 	bool done;
 };
 
diff --git a/kernel/unwind/sframe.c b/kernel/unwind/sframe.c
index acbf791e713b..5bfaf06e6cd2 100644
--- a/kernel/unwind/sframe.c
+++ b/kernel/unwind/sframe.c
@@ -222,12 +222,8 @@ static __always_inline int __read_fre(struct sframe_section *sec,
 	offset_count--;
 
 	ra_off = sec->ra_off;
-	if (!ra_off) {
-		if (!offset_count--) {
-			dbg_sec_uaccess("zero offset_count, can't find ra_off\n");
-			return -EFAULT;
-		}
-
+	if (!ra_off && offset_count) {
+		offset_count--;
 		UNSAFE_GET_USER_INC(ra_off, cur, offset_size, Efault);
 	}
 
@@ -257,7 +253,8 @@ static __always_inline int __read_fre(struct sframe_section *sec,
 
 static __always_inline int __find_fre(struct sframe_section *sec,
 				      struct sframe_fde *fde, unsigned long ip,
-				      struct unwind_user_frame *frame)
+				      struct unwind_user_frame *frame,
+				      bool topmost)
 {
 	unsigned char fde_type = SFRAME_FUNC_FDE_TYPE(fde->info);
 	struct sframe_fre *fre, *prev_fre = NULL;
@@ -310,6 +307,12 @@ static __always_inline int __find_fre(struct sframe_section *sec,
 		return -EINVAL;
 	fre = prev_fre;
 
+	if ((!IS_ENABLED(CONFIG_HAVE_USER_RA_REG) || !topmost) && !fre->ra_off) {
+		dbg_sec_uaccess("fde addr 0x%x: zero ra_off\n",
+				fde->start_addr);
+		return -EINVAL;
+	}
+
 	frame->cfa_off = fre->cfa_off;
 	frame->ra_off  = fre->ra_off;
 	frame->fp_off  = fre->fp_off;
@@ -319,7 +322,7 @@ static __always_inline int __find_fre(struct sframe_section *sec,
 	return 0;
 }
 
-int sframe_find(unsigned long ip, struct unwind_user_frame *frame)
+int sframe_find(unsigned long ip, struct unwind_user_frame *frame, bool topmost)
 {
 	struct mm_struct *mm = current->mm;
 	struct sframe_section *sec;
@@ -343,7 +346,7 @@ int sframe_find(unsigned long ip, struct unwind_user_frame *frame)
 	if (ret)
 		goto end;
 
-	ret = __find_fre(sec, &fde, ip, frame);
+	ret = __find_fre(sec, &fde, ip, frame, topmost);
 end:
 	user_read_access_end();
 
diff --git a/kernel/unwind/user.c b/kernel/unwind/user.c
index 45c8c6932ba6..03a6da36192f 100644
--- a/kernel/unwind/user.c
+++ b/kernel/unwind/user.c
@@ -3,6 +3,7 @@
 * Generic interfaces for unwinding user space
 */
 #include <linux/kernel.h>
+#include <linux/ptrace.h>
 #include <linux/sched.h>
 #include <linux/sched/task_stack.h>
 #include <linux/unwind_user.h>
@@ -53,6 +54,7 @@ static int unwind_user_next(struct unwind_user_state *state)
 	struct unwind_user_frame *frame;
 	struct unwind_user_frame _frame;
 	unsigned long cfa = 0, sp, fp, ra = 0;
+	bool topmost = state->topmost;
 	unsigned int shift;
 
 	if (state->done)
@@ -63,7 +65,7 @@ static int unwind_user_next(struct unwind_user_state *state)
 	} else if (sframe_state(state)) {
 		/* sframe expects the frame to be local storage */
 		frame = &_frame;
-		if (sframe_find(state->ip, frame)) {
+		if (sframe_find(state->ip, frame, topmost)) {
 			if (!IS_ENABLED(CONFIG_HAVE_UNWIND_USER_FP))
 				goto done;
 			frame = &fp_frame;
@@ -86,18 +88,28 @@ static int unwind_user_next(struct unwind_user_state *state)
 
 	/* Get the Stack Pointer (SP) */
 	sp = cfa + frame->sp_val_off;
-	/* Make sure that stack is not going in wrong direction */
-	if (sp <= state->sp)
+	/*
+	 * Make sure that stack is not going in wrong direction.  Allow SP
+	 * to be unchanged for the topmost frame, by subtracting topmost,
+	 * which is either 0 or 1.
+	 */
+	if (sp <= state->sp - topmost)
 		goto done;
 
-	/* Make sure that the address is word aligned */
-	shift = sizeof(long) == 4 || compat_fp_state(state) ? 2 : 3;
-	if ((cfa + frame->ra_off) & ((1 << shift) - 1))
-		goto done;
 
 	/* Get the Return Address (RA) */
-	if (unwind_get_user_long(ra, cfa + frame->ra_off, state))
-		goto done;
+	if (frame->ra_off) {
+		/* Make sure that the address is word aligned */
+		shift = sizeof(long) == 4 || compat_fp_state(state) ? 2 : 3;
+		if ((cfa + frame->ra_off) & ((1 << shift) - 1))
+			goto done;
+		if (unwind_get_user_long(ra, cfa + frame->ra_off, state))
+			goto done;
+	} else {
+		if (!IS_ENABLED(CONFIG_HAVE_USER_RA_REG) || !topmost)
+			goto done;
+		ra = user_return_address(task_pt_regs(current));
+	}
 
 	/* Get the Frame Pointer (FP) */
 	if (frame->fp_off && unwind_get_user_long(fp, cfa + frame->fp_off, state))
@@ -110,6 +122,7 @@ static int unwind_user_next(struct unwind_user_state *state)
 
 	arch_unwind_user_next(state);
 
+	state->topmost = false;
 	return 0;
 
 done:
@@ -140,6 +153,7 @@ static int unwind_user_start(struct unwind_user_state *state)
 	state->ip = instruction_pointer(regs);
 	state->sp = user_stack_pointer(regs);
 	state->fp = frame_pointer(regs);
+	state->topmost = true;
 
 	arch_unwind_user_init(state, regs);
 
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC PATCH v1 08/16] unwind_user: Enable archs that save RA/FP in other registers
  2025-07-10 16:35 [RFC PATCH v1 00/16] s390: SFrame user space unwinding Jens Remus
                   ` (6 preceding siblings ...)
  2025-07-10 16:35 ` [RFC PATCH v1 07/16] unwind_user: Enable archs that do not necessarily save RA Jens Remus
@ 2025-07-10 16:35 ` Jens Remus
  2025-07-17  2:01   ` Josh Poimboeuf
  2025-07-10 16:35 ` [RFC PATCH v1 09/16] unwind_user/sframe: Enable archs with encoded SFrame CFA offsets Jens Remus
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 42+ messages in thread
From: Jens Remus @ 2025-07-10 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt
  Cc: Jens Remus, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Andrii Nakryiko, Indu Bhagat, Jose E. Marchesi,
	Beau Belgrave, Linus Torvalds, Andrew Morton, Jens Axboe,
	Florian Weimer, Sam James

Enable unwinding of user space for architectures, such as s390, that
save the return address (RA) and/or frame pointer (FP) in other
registers.  This is only valid in the topmost frame, for instance when
in a leaf function.

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---
 arch/Kconfig                             |  7 ++++
 arch/x86/include/asm/unwind_user.h       | 24 +++++++++---
 include/asm-generic/unwind_user.h        | 20 ++++++++++
 include/asm-generic/unwind_user_sframe.h | 24 ++++++++++++
 include/linux/unwind_user_types.h        | 18 ++++++++-
 kernel/unwind/sframe.c                   |  4 +-
 kernel/unwind/user.c                     | 47 ++++++++++++++++++++----
 7 files changed, 126 insertions(+), 18 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 367eaf7e62e0..9e28dffe42cb 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -455,6 +455,13 @@ config HAVE_USER_RA_REG
 	help
 	  The arch passes the return address (RA) in user space in a register.
 
+config HAVE_UNWIND_USER_LOC_REG
+	bool
+	help
+	  The arch potentially saves the return address (RA) and/or frame
+	  pointer (FP) register values in user space in other registers, when
+	  in the topmost frame (e.g. in leaf function).
+
 config SFRAME_VALIDATION
 	bool "Enable .sframe section debugging"
 	depends on HAVE_UNWIND_USER_SFRAME
diff --git a/arch/x86/include/asm/unwind_user.h b/arch/x86/include/asm/unwind_user.h
index c2881840adf4..925d208aa39d 100644
--- a/arch/x86/include/asm/unwind_user.h
+++ b/arch/x86/include/asm/unwind_user.h
@@ -5,18 +5,30 @@
 #include <linux/unwind_user_types.h>
 
 #define ARCH_INIT_USER_FP_FRAME							\
-	.cfa_off	= (s32)sizeof(long) *  2,				\
-	.ra_off		= (s32)sizeof(long) * -1,				\
-	.fp_off		= (s32)sizeof(long) * -2,				\
+	.cfa_off	= (s32)sizeof(long) * 2,				\
+	.ra		= {							\
+		.loc = UNWIND_USER_LOC_STACK,					\
+		.frame_off = (s32)sizeof(long) * -1,				\
+	},									\
+	.fp		= {							\
+		.loc = UNWIND_USER_LOC_STACK,					\
+		.frame_off = (s32)sizeof(long) * -2,				\
+	},									\
 	.sp_val_off	= (s32)0,						\
 	.use_fp		= true,
 
 #ifdef CONFIG_IA32_EMULATION
 
 #define ARCH_INIT_USER_COMPAT_FP_FRAME						\
-	.cfa_off	= (s32)sizeof(u32)  *  2,				\
-	.ra_off		= (s32)sizeof(u32)  * -1,				\
-	.fp_off		= (s32)sizeof(u32)  * -2,				\
+	.cfa_off	= (s32)sizeof(u32) * 2,					\
+	.ra		= {							\
+		.loc = UNWIND_USER_LOC_STACK,					\
+		.frame_off = (s32)sizeof(u32) * -1,				\
+	},									\
+	.fp		= {							\
+		.loc = UNWIND_USER_LOC_STACK,					\
+		.frame_off = (s32)sizeof(u32) * -2,				\
+	},									\
 	.sp_val_off	= (s32)0,						\
 	.use_fp		= true,
 
diff --git a/include/asm-generic/unwind_user.h b/include/asm-generic/unwind_user.h
index b8882b909944..3891b7cfe3b8 100644
--- a/include/asm-generic/unwind_user.h
+++ b/include/asm-generic/unwind_user.h
@@ -2,4 +2,24 @@
 #ifndef _ASM_GENERIC_UNWIND_USER_H
 #define _ASM_GENERIC_UNWIND_USER_H
 
+#include <asm/unwind_user_types.h>
+
+#ifndef unwind_user_get_reg
+
+/**
+ * generic_unwind_user_get_reg - Get register value.
+ * @val: Register value.
+ * @regnum: DWARF register number to obtain the value from.
+ *
+ * Returns zero if successful. Otherwise -EINVAL.
+ */
+static inline int generic_unwind_user_get_reg(unsigned long *val, int regnum)
+{
+	return -EINVAL;
+}
+
+#define unwind_user_get_reg generic_unwind_user_get_reg
+
+#endif /* !unwind_user_get_reg */
+
 #endif /* _ASM_GENERIC_UNWIND_USER_H */
diff --git a/include/asm-generic/unwind_user_sframe.h b/include/asm-generic/unwind_user_sframe.h
index 6c87a7f29861..8cef3e0857b6 100644
--- a/include/asm-generic/unwind_user_sframe.h
+++ b/include/asm-generic/unwind_user_sframe.h
@@ -2,8 +2,31 @@
 #ifndef _ASM_GENERIC_UNWIND_USER_SFRAME_H
 #define _ASM_GENERIC_UNWIND_USER_SFRAME_H
 
+#include <linux/unwind_user_types.h>
 #include <linux/types.h>
 
+/**
+ * generic_sframe_set_frame_reginfo - Populate info to unwind FP/RA register
+ * from SFrame offset.
+ * @reginfo: Unwind info for FP/RA register.
+ * @offset: SFrame offset value.
+ *
+ * A non-zero offset value denotes a stack offset from CFA and indicates
+ * that the register is saved on the stack. A zero offset value indicates
+ * that the register is not saved.
+ */
+static inline void generic_sframe_set_frame_reginfo(
+	struct unwind_user_reginfo *reginfo,
+	s32 offset)
+{
+	if (offset) {
+		reginfo->loc = UNWIND_USER_LOC_STACK;
+		reginfo->frame_off = offset;
+	} else {
+		reginfo->loc = UNWIND_USER_LOC_NONE;
+	}
+}
+
 /**
  * generic_sframe_sp_val_off - Get generic SP value offset from CFA.
  *
@@ -25,6 +48,7 @@ static inline s32 generic_sframe_sp_val_off(void)
 	return 0;
 }
 
+#define sframe_set_frame_reginfo generic_sframe_set_frame_reginfo
 #define sframe_sp_val_off generic_sframe_sp_val_off
 
 #endif /* _ASM_GENERIC_UNWIND_USER_SFRAME_H */
diff --git a/include/linux/unwind_user_types.h b/include/linux/unwind_user_types.h
index adef01698bb3..57fd16e314cf 100644
--- a/include/linux/unwind_user_types.h
+++ b/include/linux/unwind_user_types.h
@@ -21,10 +21,24 @@ struct unwind_stacktrace {
 	unsigned long	*entries;
 };
 
+enum unwind_user_loc {
+	UNWIND_USER_LOC_NONE,
+	UNWIND_USER_LOC_STACK,
+	UNWIND_USER_LOC_REG,
+};
+
+struct unwind_user_reginfo {
+	enum unwind_user_loc loc;
+	union {
+		s32 frame_off;
+		int regnum;
+	};
+};
+
 struct unwind_user_frame {
 	s32 cfa_off;
-	s32 ra_off;
-	s32 fp_off;
+	struct unwind_user_reginfo ra;
+	struct unwind_user_reginfo fp;
 	s32 sp_val_off;
 	bool use_fp;
 };
diff --git a/kernel/unwind/sframe.c b/kernel/unwind/sframe.c
index 5bfaf06e6cd2..43ef3a8c4c26 100644
--- a/kernel/unwind/sframe.c
+++ b/kernel/unwind/sframe.c
@@ -314,8 +314,8 @@ static __always_inline int __find_fre(struct sframe_section *sec,
 	}
 
 	frame->cfa_off = fre->cfa_off;
-	frame->ra_off  = fre->ra_off;
-	frame->fp_off  = fre->fp_off;
+	sframe_set_frame_reginfo(&frame->ra, fre->ra_off);
+	sframe_set_frame_reginfo(&frame->fp, fre->fp_off);
 	frame->use_fp  = SFRAME_FRE_CFA_BASE_REG_ID(fre->info) == SFRAME_BASE_REG_FP;
 	frame->sp_val_off = sframe_sp_val_off();
 
diff --git a/kernel/unwind/user.c b/kernel/unwind/user.c
index 03a6da36192f..ee00d39d2a8e 100644
--- a/kernel/unwind/user.c
+++ b/kernel/unwind/user.c
@@ -98,26 +98,57 @@ static int unwind_user_next(struct unwind_user_state *state)
 
 
 	/* Get the Return Address (RA) */
-	if (frame->ra_off) {
+	switch (frame->ra.loc) {
+	case UNWIND_USER_LOC_NONE:
+		if (!IS_ENABLED(CONFIG_HAVE_USER_RA_REG) || !topmost)
+			goto done;
+		ra = user_return_address(task_pt_regs(current));
+		break;
+	case UNWIND_USER_LOC_STACK:
+		if (!frame->ra.frame_off)
+			goto done;
 		/* Make sure that the address is word aligned */
 		shift = sizeof(long) == 4 || compat_fp_state(state) ? 2 : 3;
-		if ((cfa + frame->ra_off) & ((1 << shift) - 1))
+		if ((cfa + frame->ra.frame_off) & ((1 << shift) - 1))
 			goto done;
-		if (unwind_get_user_long(ra, cfa + frame->ra_off, state))
+		if (unwind_get_user_long(ra, cfa + frame->ra.frame_off, state))
 			goto done;
-	} else {
-		if (!IS_ENABLED(CONFIG_HAVE_USER_RA_REG) || !topmost)
+		break;
+	case UNWIND_USER_LOC_REG:
+		if (!IS_ENABLED(CONFIG_HAVE_UNWIND_USER_LOC_REG) || !topmost)
 			goto done;
-		ra = user_return_address(task_pt_regs(current));
+		if (unwind_user_get_reg(&ra, frame->ra.regnum))
+			goto done;
+		break;
+	default:
+		WARN_ON_ONCE(1);
+		goto done;
 	}
 
 	/* Get the Frame Pointer (FP) */
-	if (frame->fp_off && unwind_get_user_long(fp, cfa + frame->fp_off, state))
+	switch (frame->fp.loc) {
+	case UNWIND_USER_LOC_NONE:
+		break;
+	case UNWIND_USER_LOC_STACK:
+		if (!frame->fp.frame_off)
+			goto done;
+		if (unwind_get_user_long(fp, cfa + frame->fp.frame_off, state))
+			goto done;
+		break;
+	case UNWIND_USER_LOC_REG:
+		if (!IS_ENABLED(CONFIG_HAVE_UNWIND_USER_LOC_REG) || !topmost)
+			goto done;
+		if (unwind_user_get_reg(&fp, frame->fp.regnum))
+			goto done;
+		break;
+	default:
+		WARN_ON_ONCE(1);
 		goto done;
+	}
 
 	state->ip = ra;
 	state->sp = sp;
-	if (frame->fp_off)
+	if (frame->fp.loc != UNWIND_USER_LOC_NONE)
 		state->fp = fp;
 
 	arch_unwind_user_next(state);
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC PATCH v1 09/16] unwind_user/sframe: Enable archs with encoded SFrame CFA offsets
  2025-07-10 16:35 [RFC PATCH v1 00/16] s390: SFrame user space unwinding Jens Remus
                   ` (7 preceding siblings ...)
  2025-07-10 16:35 ` [RFC PATCH v1 08/16] unwind_user: Enable archs that save RA/FP in other registers Jens Remus
@ 2025-07-10 16:35 ` Jens Remus
  2025-07-10 16:35 ` [RFC PATCH v1 10/16] s390/ptrace: Enable HAVE_USER_RA_REG Jens Remus
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Jens Remus @ 2025-07-10 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt
  Cc: Jens Remus, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Andrii Nakryiko, Indu Bhagat, Jose E. Marchesi,
	Beau Belgrave, Linus Torvalds, Andrew Morton, Jens Axboe,
	Florian Weimer, Sam James

Enable architectures, such as s390, which store SFrame CFA offset values
encoded, to e.g. make (better) use of unsigned 8-bit SFrame offsets.

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---
 include/asm-generic/unwind_user_sframe.h | 11 +++++++++++
 kernel/unwind/sframe.c                   |  1 +
 2 files changed, 12 insertions(+)

diff --git a/include/asm-generic/unwind_user_sframe.h b/include/asm-generic/unwind_user_sframe.h
index 8cef3e0857b6..cea0410a259a 100644
--- a/include/asm-generic/unwind_user_sframe.h
+++ b/include/asm-generic/unwind_user_sframe.h
@@ -5,6 +5,16 @@
 #include <linux/unwind_user_types.h>
 #include <linux/types.h>
 
+/**
+ * generic_sframe_cfa_offset_decode - Decode SFrame CFA offset.
+ *
+ * Returns the decoded SFrame CFA offset value.
+ */
+static inline s32 generic_sframe_cfa_offset_decode(s32 offset)
+{
+	return offset;
+}
+
 /**
  * generic_sframe_set_frame_reginfo - Populate info to unwind FP/RA register
  * from SFrame offset.
@@ -48,6 +58,7 @@ static inline s32 generic_sframe_sp_val_off(void)
 	return 0;
 }
 
+#define sframe_cfa_offset_decode generic_sframe_cfa_offset_decode
 #define sframe_set_frame_reginfo generic_sframe_set_frame_reginfo
 #define sframe_sp_val_off generic_sframe_sp_val_off
 
diff --git a/kernel/unwind/sframe.c b/kernel/unwind/sframe.c
index 43ef3a8c4c26..e8658401a286 100644
--- a/kernel/unwind/sframe.c
+++ b/kernel/unwind/sframe.c
@@ -220,6 +220,7 @@ static __always_inline int __read_fre(struct sframe_section *sec,
 
 	UNSAFE_GET_USER_INC(cfa_off, cur, offset_size, Efault);
 	offset_count--;
+	cfa_off = sframe_cfa_offset_decode(cfa_off);
 
 	ra_off = sec->ra_off;
 	if (!ra_off && offset_count) {
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC PATCH v1 10/16] s390/ptrace: Enable HAVE_USER_RA_REG
  2025-07-10 16:35 [RFC PATCH v1 00/16] s390: SFrame user space unwinding Jens Remus
                   ` (8 preceding siblings ...)
  2025-07-10 16:35 ` [RFC PATCH v1 09/16] unwind_user/sframe: Enable archs with encoded SFrame CFA offsets Jens Remus
@ 2025-07-10 16:35 ` Jens Remus
  2025-07-10 16:35 ` [RFC PATCH v1 11/16] s390/unwind_user/sframe: Enable HAVE_UNWIND_USER_SFRAME Jens Remus
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Jens Remus @ 2025-07-10 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt
  Cc: Jens Remus, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Andrii Nakryiko, Indu Bhagat, Jose E. Marchesi,
	Beau Belgrave, Linus Torvalds, Andrew Morton, Jens Axboe,
	Florian Weimer, Sam James

Provide user_return_address().  Additionally provide frame_pointer().

On s390 register 11 is the preferred frame pointer (FP) register
and register 14 is the return address (RA) register in user space.

While at it convert instruction_pointer() and user_stack_pointer()
from macros to inline functions, to align their definition with
x86-64 and AArch64.

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---
 arch/s390/Kconfig              |  1 +
 arch/s390/include/asm/ptrace.h | 25 ++++++++++++++++++++++---
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 0c16dc443e2f..f4ea52c1f0ba 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -239,6 +239,7 @@ config S390
 	select HAVE_SETUP_PER_CPU_AREA
 	select HAVE_SOFTIRQ_ON_OWN_STACK
 	select HAVE_SYSCALL_TRACEPOINTS
+	select HAVE_USER_RA_REG
 	select HAVE_VIRT_CPU_ACCOUNTING
 	select HAVE_VIRT_CPU_ACCOUNTING_IDLE
 	select HOTPLUG_SMT
diff --git a/arch/s390/include/asm/ptrace.h b/arch/s390/include/asm/ptrace.h
index 0905fa99a31e..71d81280e5eb 100644
--- a/arch/s390/include/asm/ptrace.h
+++ b/arch/s390/include/asm/ptrace.h
@@ -212,8 +212,6 @@ void update_cr_regs(struct task_struct *task);
 #define arch_has_block_step()	(1)
 
 #define user_mode(regs) (((regs)->psw.mask & PSW_MASK_PSTATE) != 0)
-#define instruction_pointer(regs) ((regs)->psw.addr)
-#define user_stack_pointer(regs)((regs)->gprs[15])
 #define profile_pc(regs) instruction_pointer(regs)
 
 static inline long regs_return_value(struct pt_regs *regs)
@@ -230,11 +228,32 @@ static inline void instruction_pointer_set(struct pt_regs *regs,
 int regs_query_register_offset(const char *name);
 const char *regs_query_register_name(unsigned int offset);
 
-static __always_inline unsigned long kernel_stack_pointer(struct pt_regs *regs)
+static __always_inline unsigned long kernel_stack_pointer(const struct pt_regs *regs)
 {
 	return regs->gprs[15];
 }
 
+static __always_inline unsigned long instruction_pointer(const struct pt_regs *regs)
+{
+	return regs->psw.addr;
+}
+
+static __always_inline unsigned long frame_pointer(const struct pt_regs *regs)
+{
+	return regs->gprs[11];
+}
+
+static __always_inline unsigned long user_stack_pointer(const struct pt_regs *regs)
+{
+	return regs->gprs[15];
+}
+
+static __always_inline unsigned long user_return_address(const struct pt_regs *regs)
+{
+	return regs->gprs[14];
+}
+#define user_return_address user_return_address
+
 static __always_inline unsigned long regs_get_register(struct pt_regs *regs, unsigned int offset)
 {
 	if (offset >= NUM_GPRS)
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC PATCH v1 11/16] s390/unwind_user/sframe: Enable HAVE_UNWIND_USER_SFRAME
  2025-07-10 16:35 [RFC PATCH v1 00/16] s390: SFrame user space unwinding Jens Remus
                   ` (9 preceding siblings ...)
  2025-07-10 16:35 ` [RFC PATCH v1 10/16] s390/ptrace: Enable HAVE_USER_RA_REG Jens Remus
@ 2025-07-10 16:35 ` Jens Remus
  2025-08-01 12:53   ` Heiko Carstens
  2025-07-10 16:35 ` [RFC PATCH v1 12/16] unwind_user/backchain: Introduce back chain user space unwinding Jens Remus
                   ` (4 subsequent siblings)
  15 siblings, 1 reply; 42+ messages in thread
From: Jens Remus @ 2025-07-10 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt
  Cc: Jens Remus, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Andrii Nakryiko, Indu Bhagat, Jose E. Marchesi,
	Beau Belgrave, Linus Torvalds, Andrew Morton, Jens Axboe,
	Florian Weimer, Sam James

Add s390 support for unwinding of user space using SFrame.  This
leverages the previous commits to address the following s390
particularities:

- The CFA is defined as the value of the stack pointer (SP) at call
  site in the previous frame + 160.  Therefore the SP unwinds as
  SP = CFA - 160.  Therefore use a SP value offset from CFA of -160.

- The return address (RA) is not saved on the stack at function entry.
  It is also not saved in the function prologue, when in leaf functions.
  Therefore the RA does not necessarily need to be unwound in the first
  unwinding step for the topmost frame.

- The frame pointer (FP) and/or return address (RA) may be saved in
  other registers when in leaf functions.  GCC effectively uses
  floating-point registers (FPR) for this purpose.  Therefore DWARF
  register numbers may be encoded in the SFrame FP/RA offsets.

- To make use of the signed 8-bit SFrame offset size and effectively
  reduce the .sframe section size the SFrame CFA offset values are
  encoded as (CFA - 160) / 8.  This is because the lowest CFA offset
  value on s390 is by definition +160 (= value at function entry),
  which does not fit into a signed 8-bit SFrame offset.  Therefore
  the CFA offset values are stored adjusted by -160.  Additionally
  they are scaled by the s390-specific DWARF data scaling factor of 8.
  The s390x ELF ABI [1] guarantees that the CFA offset values are
  always aligned on an 8-byte boundary.

[1]: s390x ELF ABI,
     https://github.com/IBM/s390x-abi/releases

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---
 arch/s390/Kconfig                          |  2 +
 arch/s390/include/asm/unwind_user.h        | 83 ++++++++++++++++++++++
 arch/s390/include/asm/unwind_user_sframe.h | 37 ++++++++++
 3 files changed, 122 insertions(+)
 create mode 100644 arch/s390/include/asm/unwind_user.h
 create mode 100644 arch/s390/include/asm/unwind_user_sframe.h

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index f4ea52c1f0ba..8b29a8f0f9c3 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -239,6 +239,8 @@ config S390
 	select HAVE_SETUP_PER_CPU_AREA
 	select HAVE_SOFTIRQ_ON_OWN_STACK
 	select HAVE_SYSCALL_TRACEPOINTS
+	select HAVE_UNWIND_USER_LOC_REG
+	select HAVE_UNWIND_USER_SFRAME
 	select HAVE_USER_RA_REG
 	select HAVE_VIRT_CPU_ACCOUNTING
 	select HAVE_VIRT_CPU_ACCOUNTING_IDLE
diff --git a/arch/s390/include/asm/unwind_user.h b/arch/s390/include/asm/unwind_user.h
new file mode 100644
index 000000000000..daae1545e203
--- /dev/null
+++ b/arch/s390/include/asm/unwind_user.h
@@ -0,0 +1,83 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_S390_UNWIND_USER_H
+#define _ASM_S390_UNWIND_USER_H
+
+#include <linux/sched/task_stack.h>
+#include <linux/types.h>
+#include <asm/fpu-insn.h>
+
+static inline void __s390_get_dwarf_fpr(unsigned long *val, int regnum)
+{
+	switch (regnum) {
+	case 16:
+		fpu_std(0, (freg_t *)val);
+		break;
+	case 17:
+		fpu_std(2, (freg_t *)val);
+		break;
+	case 18:
+		fpu_std(4, (freg_t *)val);
+		break;
+	case 19:
+		fpu_std(6, (freg_t *)val);
+		break;
+	case 20:
+		fpu_std(1, (freg_t *)val);
+		break;
+	case 21:
+		fpu_std(3, (freg_t *)val);
+		break;
+	case 22:
+		fpu_std(5, (freg_t *)val);
+		break;
+	case 23:
+		fpu_std(7, (freg_t *)val);
+		break;
+	case 24:
+		fpu_std(8, (freg_t *)val);
+		break;
+	case 25:
+		fpu_std(10, (freg_t *)val);
+		break;
+	case 26:
+		fpu_std(12, (freg_t *)val);
+		break;
+	case 27:
+		fpu_std(14, (freg_t *)val);
+		break;
+	case 28:
+		fpu_std(9, (freg_t *)val);
+		break;
+	case 29:
+		fpu_std(11, (freg_t *)val);
+		break;
+	case 30:
+		fpu_std(13, (freg_t *)val);
+		break;
+	case 31:
+		fpu_std(15, (freg_t *)val);
+		break;
+	default:
+		*val = 0;
+	}
+}
+
+static inline int s390_unwind_user_get_reg(unsigned long *val, int regnum)
+{
+	if (0 <= regnum && regnum <= 15) {
+		struct pt_regs *regs = task_pt_regs(current);
+		*val = regs->gprs[regnum];
+	} else if (16 <= regnum && regnum <= 31) {
+		__s390_get_dwarf_fpr(val, regnum);
+	} else {
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+#define unwind_user_get_reg s390_unwind_user_get_reg
+
+#include <asm-generic/unwind_user.h>
+
+#endif /* _ASM_S390_UNWIND_USER_H */
diff --git a/arch/s390/include/asm/unwind_user_sframe.h b/arch/s390/include/asm/unwind_user_sframe.h
new file mode 100644
index 000000000000..2216e6921fd8
--- /dev/null
+++ b/arch/s390/include/asm/unwind_user_sframe.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_S390_UNWIND_USER_SFRAME_H
+#define _ASM_S390_UNWIND_USER_SFRAME_H
+
+#include <linux/unwind_user.h>
+#include <linux/types.h>
+
+static inline s32 arch_sframe_cfa_offset_decode(s32 offset)
+{
+	return (offset << 3) + 160;
+}
+
+static inline void arch_sframe_set_frame_reginfo(
+	struct unwind_user_reginfo *reginfo,
+	s32 offset)
+{
+	if (offset & 1) {
+		reginfo->loc = UNWIND_USER_LOC_REG;
+		reginfo->regnum = offset >> 1;
+	} else if (offset) {
+		reginfo->loc = UNWIND_USER_LOC_STACK;
+		reginfo->frame_off = offset;
+	} else {
+		reginfo->loc = UNWIND_USER_LOC_NONE;
+	}
+}
+
+static inline s32 arch_sframe_sp_val_off(void)
+{
+	return -160;
+}
+
+#define sframe_cfa_offset_decode arch_sframe_cfa_offset_decode
+#define sframe_set_frame_reginfo arch_sframe_set_frame_reginfo
+#define sframe_sp_val_off arch_sframe_sp_val_off
+
+#endif /* _ASM_S390_UNWIND_USER_SFRAME_H */
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC PATCH v1 12/16] unwind_user/backchain: Introduce back chain user space unwinding
  2025-07-10 16:35 [RFC PATCH v1 00/16] s390: SFrame user space unwinding Jens Remus
                   ` (10 preceding siblings ...)
  2025-07-10 16:35 ` [RFC PATCH v1 11/16] s390/unwind_user/sframe: Enable HAVE_UNWIND_USER_SFRAME Jens Remus
@ 2025-07-10 16:35 ` Jens Remus
  2025-07-17  2:06   ` Josh Poimboeuf
  2025-07-10 16:35 ` [RFC PATCH v1 13/16] s390/unwind_user/backchain: Enable HAVE_UNWIND_USER_BACKCHAIN Jens Remus
                   ` (3 subsequent siblings)
  15 siblings, 1 reply; 42+ messages in thread
From: Jens Remus @ 2025-07-10 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt
  Cc: Jens Remus, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Andrii Nakryiko, Indu Bhagat, Jose E. Marchesi,
	Beau Belgrave, Linus Torvalds, Andrew Morton, Jens Axboe,
	Florian Weimer, Sam James

Add support for unwinding of user space using back chain to the
unwind user interface.  Use it as secondary fallback for unwinding
using SFrame, if that fails and the primary fallback using frame
pointer is not available.

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---
 arch/Kconfig                          |  6 ++++++
 include/linux/unwind_user_backchain.h | 17 +++++++++++++++++
 include/linux/unwind_user_types.h     |  1 +
 kernel/unwind/Makefile                |  1 +
 kernel/unwind/user.c                  | 24 +++++++++++++++++++++---
 kernel/unwind/user_backchain.c        | 13 +++++++++++++
 6 files changed, 59 insertions(+), 3 deletions(-)
 create mode 100644 include/linux/unwind_user_backchain.h
 create mode 100644 kernel/unwind/user_backchain.c

diff --git a/arch/Kconfig b/arch/Kconfig
index 9e28dffe42cb..4fe16ad6f053 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -438,6 +438,12 @@ config HAVE_HARDLOCKUP_DETECTOR_ARCH
 config UNWIND_USER
 	bool
 
+config HAVE_UNWIND_USER_BACKCHAIN
+	bool
+	select UNWIND_USER
+	help
+	  The arch supports unwinding of user space using back chain.
+
 config HAVE_UNWIND_USER_FP
 	bool
 	select UNWIND_USER
diff --git a/include/linux/unwind_user_backchain.h b/include/linux/unwind_user_backchain.h
new file mode 100644
index 000000000000..daae74c97c54
--- /dev/null
+++ b/include/linux/unwind_user_backchain.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_UNWIND_USER_BACKCHAIN_H
+#define _LINUX_UNWIND_USER_BACKCHAIN_H
+
+struct unwind_user_state;
+
+#ifdef CONFIG_HAVE_UNWIND_USER_BACKCHAIN
+
+extern int unwind_user_backchain_next(struct unwind_user_state *state);
+
+#else /* !CONFIG_HAVE_UNWIND_USER_BACKCHAIN */
+
+static inline int unwind_user_backchain_next(struct unwind_user_state *state) { return -EINVAL; }
+
+#endif /* !CONFIG_HAVE_UNWIND_USER_BACKCHAIN */
+
+#endif /* _LINUX_UNWIND_USER_BACKCHAIN_H */
diff --git a/include/linux/unwind_user_types.h b/include/linux/unwind_user_types.h
index 57fd16e314cf..41b1bc082cb1 100644
--- a/include/linux/unwind_user_types.h
+++ b/include/linux/unwind_user_types.h
@@ -14,6 +14,7 @@ enum unwind_user_type {
 	UNWIND_USER_TYPE_FP,
 	UNWIND_USER_TYPE_COMPAT_FP,
 	UNWIND_USER_TYPE_SFRAME,
+	UNWIND_USER_TYPE_BACKCHAIN,
 };
 
 struct unwind_stacktrace {
diff --git a/kernel/unwind/Makefile b/kernel/unwind/Makefile
index 146038165865..38cef261abcb 100644
--- a/kernel/unwind/Makefile
+++ b/kernel/unwind/Makefile
@@ -1,2 +1,3 @@
  obj-$(CONFIG_UNWIND_USER)		+= user.o deferred.o
+ obj-$(CONFIG_HAVE_UNWIND_USER_BACKCHAIN)	+= user_backchain.o
  obj-$(CONFIG_HAVE_UNWIND_USER_SFRAME)	+= sframe.o
diff --git a/kernel/unwind/user.c b/kernel/unwind/user.c
index ee00d39d2a8e..3c3f75bc146b 100644
--- a/kernel/unwind/user.c
+++ b/kernel/unwind/user.c
@@ -7,6 +7,7 @@
 #include <linux/sched.h>
 #include <linux/sched/task_stack.h>
 #include <linux/unwind_user.h>
+#include <linux/unwind_user_backchain.h>
 #include <linux/uaccess.h>
 #include <linux/sframe.h>
 
@@ -39,6 +40,12 @@ static inline bool sframe_state(struct unwind_user_state *state)
 	       state->type == UNWIND_USER_TYPE_SFRAME;
 }
 
+static inline bool backchain_state(struct unwind_user_state *state)
+{
+	return IS_ENABLED(CONFIG_HAVE_UNWIND_USER_BACKCHAIN) &&
+	       state->type == UNWIND_USER_TYPE_BACKCHAIN;
+}
+
 #define unwind_get_user_long(to, from, state)				\
 ({									\
 	int __ret;							\
@@ -66,12 +73,20 @@ static int unwind_user_next(struct unwind_user_state *state)
 		/* sframe expects the frame to be local storage */
 		frame = &_frame;
 		if (sframe_find(state->ip, frame, topmost)) {
-			if (!IS_ENABLED(CONFIG_HAVE_UNWIND_USER_FP))
-				goto done;
-			frame = &fp_frame;
+			if (IS_ENABLED(CONFIG_HAVE_UNWIND_USER_FP)) {
+				frame = &fp_frame;
+			} else if (IS_ENABLED(CONFIG_HAVE_UNWIND_USER_BACKCHAIN)) {
+				if (unwind_user_backchain_next(state))
+					goto done;
+				goto done_backchain;
+			}
 		}
 	} else if (fp_state(state)) {
 		frame = &fp_frame;
+	} else if (backchain_state(state)) {
+		if (unwind_user_backchain_next(state))
+			goto done;
+		goto done_backchain;
 	} else {
 		goto done;
 	}
@@ -153,6 +168,7 @@ static int unwind_user_next(struct unwind_user_state *state)
 
 	arch_unwind_user_next(state);
 
+done_backchain:
 	state->topmost = false;
 	return 0;
 
@@ -178,6 +194,8 @@ static int unwind_user_start(struct unwind_user_state *state)
 		state->type = UNWIND_USER_TYPE_SFRAME;
 	else if (IS_ENABLED(CONFIG_HAVE_UNWIND_USER_FP))
 		state->type = UNWIND_USER_TYPE_FP;
+	else if (IS_ENABLED(CONFIG_HAVE_UNWIND_USER_BACKCHAIN))
+		state->type = UNWIND_USER_TYPE_BACKCHAIN;
 	else
 		state->type = UNWIND_USER_TYPE_NONE;
 
diff --git a/kernel/unwind/user_backchain.c b/kernel/unwind/user_backchain.c
new file mode 100644
index 000000000000..5b60a3d4f34f
--- /dev/null
+++ b/kernel/unwind/user_backchain.c
@@ -0,0 +1,13 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define pr_fmt(fmt)	"backchain: " fmt
+
+#include <linux/sched.h>
+#include <linux/unwind_user.h>
+#include <linux/unwind_user_backchain.h>
+#include <asm/unwind_user_backchain.h>
+
+int unwind_user_backchain_next(struct unwind_user_state *state)
+{
+	return arch_unwind_user_backchain_next(state);
+}
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC PATCH v1 13/16] s390/unwind_user/backchain: Enable HAVE_UNWIND_USER_BACKCHAIN
  2025-07-10 16:35 [RFC PATCH v1 00/16] s390: SFrame user space unwinding Jens Remus
                   ` (11 preceding siblings ...)
  2025-07-10 16:35 ` [RFC PATCH v1 12/16] unwind_user/backchain: Introduce back chain user space unwinding Jens Remus
@ 2025-07-10 16:35 ` Jens Remus
  2025-07-10 16:35 ` [RFC PATCH v1 14/16] PREREQ: x86/asm: Avoid emitting DWARF CFI for non-VDSO Jens Remus
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 42+ messages in thread
From: Jens Remus @ 2025-07-10 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt
  Cc: Jens Remus, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Andrii Nakryiko, Indu Bhagat, Jose E. Marchesi,
	Beau Belgrave, Linus Torvalds, Andrew Morton, Jens Axboe,
	Florian Weimer, Sam James

Enable unwinding of user space using back chain on s390.  Based on
arch_stack_walk_user_common() in arch/s390/kernel/stacktrace.c.

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---
 arch/s390/Kconfig                             |   1 +
 arch/s390/include/asm/unwind_user_backchain.h | 127 ++++++++++++++++++
 2 files changed, 128 insertions(+)
 create mode 100644 arch/s390/include/asm/unwind_user_backchain.h

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 8b29a8f0f9c3..49f231123040 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -239,6 +239,7 @@ config S390
 	select HAVE_SETUP_PER_CPU_AREA
 	select HAVE_SOFTIRQ_ON_OWN_STACK
 	select HAVE_SYSCALL_TRACEPOINTS
+	select HAVE_UNWIND_USER_BACKCHAIN
 	select HAVE_UNWIND_USER_LOC_REG
 	select HAVE_UNWIND_USER_SFRAME
 	select HAVE_USER_RA_REG
diff --git a/arch/s390/include/asm/unwind_user_backchain.h b/arch/s390/include/asm/unwind_user_backchain.h
new file mode 100644
index 000000000000..ceb56b9d8411
--- /dev/null
+++ b/arch/s390/include/asm/unwind_user_backchain.h
@@ -0,0 +1,127 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_S390_UNWIND_USER_BACKCHAIN_H
+#define _ASM_S390_UNWIND_USER_BACKCHAIN_H
+
+#include <linux/security.h>
+#ifndef ASM_OFFSETS_C
+#include <asm/asm-offsets.h>
+#endif
+
+struct stack_frame_user {
+	unsigned long backchain;
+	unsigned long unused;
+	/* Argument register save area. */
+	unsigned long r2;
+	unsigned long r3;
+	unsigned long r4;
+	unsigned long r5;
+	unsigned long r6;
+	/* Other register save area. */
+	unsigned long r7;
+	unsigned long r8;
+	unsigned long r9;
+	unsigned long r10;
+	unsigned long r11;
+	unsigned long r12;
+	unsigned long r13;
+	unsigned long r14;
+	unsigned long r15;
+	/* FP argument register save area. */
+	unsigned long f0;
+	unsigned long f2;
+	unsigned long f4;
+	unsigned long f6;
+};
+
+struct stack_frame_vdso_wrapper {
+	struct stack_frame_user sf;
+	unsigned long return_address;
+};
+
+/**
+ * ip_invalid - Perform some basic checks whether an instruction pointer (IP)
+ * taken from an unreliable source is invalid
+ * @ip: The instruction pointer to be validated
+ *
+ * returns whether the instruction pointer is invalid
+ */
+static inline bool ip_invalid(unsigned long ip)
+{
+	if (ip & 1)
+		return true;
+	if (ip < mmap_min_addr)
+		return true;
+	if (ip >= current->mm->context.asce_limit)
+		return true;
+	return false;
+}
+
+/**
+ * ip_within_vdso - Check whether an instruction pointer (IP) is within vDSO
+ * @ip: The instruction pointer
+ *
+ * returns whether the instruction pointer is within vDSO
+ */
+static inline bool ip_within_vdso(unsigned long ip)
+{
+	return in_range(ip, current->mm->context.vdso_base, vdso_text_size());
+}
+
+/**
+ * arch_unwind_user_backchain_next - Unwind one frame using backchain
+ * @state: The unwind user state
+ *
+ * returns zero when successful, otherwise -EINVAL.
+ */
+static inline int arch_unwind_user_backchain_next(struct unwind_user_state *state)
+{
+	struct stack_frame_user __user *sf;
+	unsigned long sp, ra;
+
+	sf = (void __user *)state->sp;
+	if (__get_user(sp, (unsigned long __user *)&sf->backchain))
+		return -EINVAL;
+
+	/*
+	 * vDSO entry code on s390 has a non-standard stack frame layout.
+	 * See vDSO user wrapper code for details.
+	 */
+	if (!sp && ip_within_vdso(state->ip)) {
+		struct stack_frame_vdso_wrapper *sf_vdso = (void __user *)sf;
+
+		if (__get_user(ra, (unsigned long __user *)&sf_vdso->return_address))
+			return -EINVAL;
+		sf = (void __user *)((unsigned long)sf + STACK_FRAME_VDSO_OVERHEAD);
+		if (__get_user(sp, (unsigned long __user *)&sf->backchain))
+			return -EINVAL;
+	} else {
+		sf = (void __user *)sp;
+		if (__get_user(ra, (unsigned long __user *)&sf->r14))
+			return -EINVAL;
+	}
+
+	/* ABI requires SP to be 8-byte aligned. */
+	if (sp & 0x7)
+		return -EINVAL;
+
+	/*
+	 * If the IP is invalid and this is the topmost frame,
+	 * assume the RA register has not been saved yet.
+	 */
+	if (ip_invalid(ra)) {
+		if (!state->topmost || !IS_ENABLED(CONFIG_HAVE_USER_RA_REG))
+			return -EINVAL;
+
+		ra = user_return_address(task_pt_regs(current));
+		if (ip_invalid(ra))
+			return -EINVAL;
+	}
+
+	state->sp = sp;
+	state->ip = ra;
+	state->fp = 0;
+
+	return 0;
+}
+
+#endif /* _ASM_S390_UNWIND_USER_BACKCHAIN_H */
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC PATCH v1 14/16] PREREQ: x86/asm: Avoid emitting DWARF CFI for non-VDSO
  2025-07-10 16:35 [RFC PATCH v1 00/16] s390: SFrame user space unwinding Jens Remus
                   ` (12 preceding siblings ...)
  2025-07-10 16:35 ` [RFC PATCH v1 13/16] s390/unwind_user/backchain: Enable HAVE_UNWIND_USER_BACKCHAIN Jens Remus
@ 2025-07-10 16:35 ` Jens Remus
  2025-07-10 16:35 ` [RFC PATCH v1 15/16] PREREQ: x86/vdso: Enable sframe generation in VDSO Jens Remus
  2025-07-10 16:35 ` [RFC PATCH v1 16/16] WIP: fixup! s390/unwind_user/sframe: Enable HAVE_UNWIND_USER_SFRAME Jens Remus
  15 siblings, 0 replies; 42+ messages in thread
From: Jens Remus @ 2025-07-10 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt
  Cc: Jens Remus, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Andrii Nakryiko, Indu Bhagat, Jose E. Marchesi,
	Beau Belgrave, Linus Torvalds, Andrew Morton, Jens Axboe,
	Florian Weimer, Sam James, Steven Rostedt (Google)

From: Josh Poimboeuf <jpoimboe@kernel.org>

It was decided years ago that .cfi_* annotations aren't maintainable in
the kernel.  They were replaced by objtool unwind hints.  For the kernel
proper, ensure the CFI_* macros don't do anything.

On the other hand the VDSO library *does* use them, so user space can
unwind through it.

Make sure these macros only work for VDSO.  They aren't actually being
used outside of VDSO anyway, so there's no functional change.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---

Notes (jremus):
    Link: https://lore.kernel.org/all/20250425024022.477374378@goodmis.org/
    
    This is only a prerequisite for the subsequent prerequisite patch
    "x86/vdso: Enable sframe generation in VDSO" to apply cleanly.

 arch/x86/include/asm/dwarf2.h | 51 ++++++++++++++++++++++++-----------
 1 file changed, 35 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/dwarf2.h b/arch/x86/include/asm/dwarf2.h
index 302e11b15da8..65d958ef1178 100644
--- a/arch/x86/include/asm/dwarf2.h
+++ b/arch/x86/include/asm/dwarf2.h
@@ -6,6 +6,15 @@
 #warning "asm/dwarf2.h should be only included in pure assembly files"
 #endif
 
+#ifdef BUILD_VDSO
+
+	/*
+	 * For the vDSO, emit both runtime unwind information and debug
+	 * symbols for the .dbg file.
+	 */
+
+	.cfi_sections .eh_frame, .debug_frame
+
 #define CFI_STARTPROC		.cfi_startproc
 #define CFI_ENDPROC		.cfi_endproc
 #define CFI_DEF_CFA		.cfi_def_cfa
@@ -21,21 +30,31 @@
 #define CFI_UNDEFINED		.cfi_undefined
 #define CFI_ESCAPE		.cfi_escape
 
-#ifndef BUILD_VDSO
-	/*
-	 * Emit CFI data in .debug_frame sections, not .eh_frame sections.
-	 * The latter we currently just discard since we don't do DWARF
-	 * unwinding at runtime.  So only the offline DWARF information is
-	 * useful to anyone.  Note we should not use this directive if we
-	 * ever decide to enable DWARF unwinding at runtime.
-	 */
-	.cfi_sections .debug_frame
-#else
-	 /*
-	  * For the vDSO, emit both runtime unwind information and debug
-	  * symbols for the .dbg file.
-	  */
-	.cfi_sections .eh_frame, .debug_frame
-#endif
+#else /* !BUILD_VDSO */
+
+/*
+ * On x86, these macros aren't used outside VDSO.  As well they shouldn't be:
+ * they're fragile and very difficult to maintain.
+ */
+
+.macro nocfi args:vararg
+.endm
+
+#define CFI_STARTPROC		nocfi
+#define CFI_ENDPROC		nocfi
+#define CFI_DEF_CFA		nocfi
+#define CFI_DEF_CFA_REGISTER	nocfi
+#define CFI_DEF_CFA_OFFSET	nocfi
+#define CFI_ADJUST_CFA_OFFSET	nocfi
+#define CFI_OFFSET		nocfi
+#define CFI_REL_OFFSET		nocfi
+#define CFI_REGISTER		nocfi
+#define CFI_RESTORE		nocfi
+#define CFI_REMEMBER_STATE	nocfi
+#define CFI_RESTORE_STATE	nocfi
+#define CFI_UNDEFINED		nocfi
+#define CFI_ESCAPE		nocfi
+
+#endif /* !BUILD_VDSO */
 
 #endif /* _ASM_X86_DWARF2_H */
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC PATCH v1 15/16] PREREQ: x86/vdso: Enable sframe generation in VDSO
  2025-07-10 16:35 [RFC PATCH v1 00/16] s390: SFrame user space unwinding Jens Remus
                   ` (13 preceding siblings ...)
  2025-07-10 16:35 ` [RFC PATCH v1 14/16] PREREQ: x86/asm: Avoid emitting DWARF CFI for non-VDSO Jens Remus
@ 2025-07-10 16:35 ` Jens Remus
  2025-07-10 16:35 ` [RFC PATCH v1 16/16] WIP: fixup! s390/unwind_user/sframe: Enable HAVE_UNWIND_USER_SFRAME Jens Remus
  15 siblings, 0 replies; 42+ messages in thread
From: Jens Remus @ 2025-07-10 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt
  Cc: Jens Remus, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Andrii Nakryiko, Indu Bhagat, Jose E. Marchesi,
	Beau Belgrave, Linus Torvalds, Andrew Morton, Jens Axboe,
	Florian Weimer, Sam James, Steven Rostedt (Google)

From: Josh Poimboeuf <jpoimboe@kernel.org>

Enable sframe generation in the VDSO library so kernel and user space
can unwind through it.

Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---

Notes (jremus):
    Link: https://lore.kernel.org/all/20250425024023.173709192@goodmis.org/
    
    This is a prerequisite for patch "s390/vdso: Enable SFrame generation in
    vDSO" to actually generate SFrame in vDSO, as it introduces config
    option CONFIG_AS_SFRAME.

 arch/Kconfig                          |  3 +++
 arch/x86/entry/vdso/Makefile          | 10 +++++++---
 arch/x86/entry/vdso/vdso-layout.lds.S |  3 +++
 arch/x86/include/asm/dwarf2.h         |  5 ++++-
 4 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 4fe16ad6f053..1c3daccd9072 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -435,6 +435,9 @@ config HAVE_HARDLOCKUP_DETECTOR_ARCH
 	  It uses the same command line parameters, and sysctl interface,
 	  as the generic hardlockup detectors.
 
+config AS_SFRAME
+	def_bool $(as-instr,.cfi_sections .sframe\n.cfi_startproc\n.cfi_endproc)
+
 config UNWIND_USER
 	bool
 
diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index 54d3e9774d62..2dc0e0ca19a7 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -47,13 +47,17 @@ quiet_cmd_vdso2c = VDSO2C  $@
 $(obj)/vdso-image-%.c: $(obj)/vdso%.so.dbg $(obj)/vdso%.so $(obj)/vdso2c FORCE
 	$(call if_changed,vdso2c)
 
+ifeq ($(CONFIG_AS_SFRAME),y)
+  SFRAME_CFLAGS := -Wa,-gsframe
+endif
+
 #
 # Don't omit frame pointers for ease of userspace debugging, but do
 # optimize sibling calls.
 #
 CFL := $(PROFILING) -mcmodel=small -fPIC -O2 -fasynchronous-unwind-tables -m64 \
        $(filter -g%,$(KBUILD_CFLAGS)) -fno-stack-protector \
-       -fno-omit-frame-pointer -foptimize-sibling-calls \
+       -fno-omit-frame-pointer $(SFRAME_CFLAGS) -foptimize-sibling-calls \
        -DDISABLE_BRANCH_PROFILING -DBUILD_VDSO
 
 ifdef CONFIG_MITIGATION_RETPOLINE
@@ -63,7 +67,7 @@ endif
 endif
 
 $(vobjs): KBUILD_CFLAGS := $(filter-out $(PADDING_CFLAGS) $(CC_FLAGS_LTO) $(CC_FLAGS_CFI) $(RANDSTRUCT_CFLAGS) $(GCC_PLUGINS_CFLAGS) $(RETPOLINE_CFLAGS),$(KBUILD_CFLAGS)) $(CFL)
-$(vobjs): KBUILD_AFLAGS += -DBUILD_VDSO
+$(vobjs): KBUILD_AFLAGS += -DBUILD_VDSO $(SFRAME_CFLAGS)
 
 #
 # vDSO code runs in userspace and -pg doesn't help with profiling anyway.
@@ -104,7 +108,7 @@ $(obj)/%-x32.o: $(obj)/%.o FORCE
 
 targets += vdsox32.lds $(vobjx32s-y)
 
-$(obj)/%.so: OBJCOPYFLAGS := -S --remove-section __ex_table
+$(obj)/%.so: OBJCOPYFLAGS := -g --remove-section __ex_table
 $(obj)/%.so: $(obj)/%.so.dbg FORCE
 	$(call if_changed,objcopy)
 
diff --git a/arch/x86/entry/vdso/vdso-layout.lds.S b/arch/x86/entry/vdso/vdso-layout.lds.S
index ec1ac191a057..c57cfddfc37d 100644
--- a/arch/x86/entry/vdso/vdso-layout.lds.S
+++ b/arch/x86/entry/vdso/vdso-layout.lds.S
@@ -59,6 +59,7 @@ SECTIONS
 	.eh_frame_hdr	: { *(.eh_frame_hdr) }		:text	:eh_frame_hdr
 	.eh_frame	: { KEEP (*(.eh_frame)) }	:text
 
+	.sframe		: { *(.sframe) }		:text	:sframe
 
 	/*
 	 * Text is well-separated from actual data: there's plenty of
@@ -87,6 +88,7 @@ SECTIONS
  * Very old versions of ld do not recognize this name token; use the constant.
  */
 #define PT_GNU_EH_FRAME	0x6474e550
+#define PT_GNU_SFRAME	0x6474e554
 
 /*
  * We must supply the ELF program headers explicitly to get just one
@@ -98,4 +100,5 @@ PHDRS
 	dynamic		PT_DYNAMIC	FLAGS(4);		/* PF_R */
 	note		PT_NOTE		FLAGS(4);		/* PF_R */
 	eh_frame_hdr	PT_GNU_EH_FRAME;
+	sframe		PT_GNU_SFRAME;
 }
diff --git a/arch/x86/include/asm/dwarf2.h b/arch/x86/include/asm/dwarf2.h
index 65d958ef1178..ce294e6c9017 100644
--- a/arch/x86/include/asm/dwarf2.h
+++ b/arch/x86/include/asm/dwarf2.h
@@ -12,8 +12,11 @@
 	 * For the vDSO, emit both runtime unwind information and debug
 	 * symbols for the .dbg file.
 	 */
-
+#if defined(__x86_64__) && defined(CONFIG_AS_SFRAME)
+	.cfi_sections .eh_frame, .debug_frame, .sframe
+#else
 	.cfi_sections .eh_frame, .debug_frame
+#endif
 
 #define CFI_STARTPROC		.cfi_startproc
 #define CFI_ENDPROC		.cfi_endproc
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFC PATCH v1 16/16] WIP: fixup! s390/unwind_user/sframe: Enable HAVE_UNWIND_USER_SFRAME
  2025-07-10 16:35 [RFC PATCH v1 00/16] s390: SFrame user space unwinding Jens Remus
                   ` (14 preceding siblings ...)
  2025-07-10 16:35 ` [RFC PATCH v1 15/16] PREREQ: x86/vdso: Enable sframe generation in VDSO Jens Remus
@ 2025-07-10 16:35 ` Jens Remus
  15 siblings, 0 replies; 42+ messages in thread
From: Jens Remus @ 2025-07-10 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt
  Cc: Jens Remus, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Josh Poimboeuf,
	Peter Zijlstra, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Thomas Gleixner, Andrii Nakryiko, Indu Bhagat, Jose E. Marchesi,
	Beau Belgrave, Linus Torvalds, Andrew Morton, Jens Axboe,
	Florian Weimer, Sam James

Add s390-specific SFrame format definitions.  Note that SFRAME_ABI_*
(and thus SFRAME_ABI_S390_ENDIAN_BIG) is currently unused.

Include <asm/unwind_user_sframe.h> after "sframe.h" to make those
s390-specific definitions available to architecture-specific unwind
user sframe code, particularly the s390-specific one.

Use the s390-specific definitions in the s390-specific unwind user
sframe code to get rid of all the magic numbers.

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---

Notes (jremus):
    Alternatively the s390-specific definitions could also be added to the
    s390-specific unwind user sframe header.  The current implementation
    follows Binutils approach to have all SFrame format definitions in one
    central header file.

 arch/s390/include/asm/unwind_user_sframe.h |  8 ++++----
 kernel/unwind/sframe.c                     |  2 +-
 kernel/unwind/sframe.h                     | 16 ++++++++++++++++
 3 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/arch/s390/include/asm/unwind_user_sframe.h b/arch/s390/include/asm/unwind_user_sframe.h
index 2216e6921fd8..e5139cc2ba5a 100644
--- a/arch/s390/include/asm/unwind_user_sframe.h
+++ b/arch/s390/include/asm/unwind_user_sframe.h
@@ -7,16 +7,16 @@
 
 static inline s32 arch_sframe_cfa_offset_decode(s32 offset)
 {
-	return (offset << 3) + 160;
+	return SFRAME_V2_S390X_CFA_OFFSET_DECODE(offset);
 }
 
 static inline void arch_sframe_set_frame_reginfo(
 	struct unwind_user_reginfo *reginfo,
 	s32 offset)
 {
-	if (offset & 1) {
+	if (SFRAME_V2_S390X_OFFSET_IS_REGNUM(offset)) {
 		reginfo->loc = UNWIND_USER_LOC_REG;
-		reginfo->regnum = offset >> 1;
+		reginfo->regnum = SFRAME_V2_S390X_OFFSET_DECODE_REGNUM(offset);
 	} else if (offset) {
 		reginfo->loc = UNWIND_USER_LOC_STACK;
 		reginfo->frame_off = offset;
@@ -27,7 +27,7 @@ static inline void arch_sframe_set_frame_reginfo(
 
 static inline s32 arch_sframe_sp_val_off(void)
 {
-	return -160;
+	return SFRAME_S390X_SP_VAL_OFFSET;
 }
 
 #define sframe_cfa_offset_decode arch_sframe_cfa_offset_decode
diff --git a/kernel/unwind/sframe.c b/kernel/unwind/sframe.c
index e8658401a286..cd82de310c58 100644
--- a/kernel/unwind/sframe.c
+++ b/kernel/unwind/sframe.c
@@ -12,11 +12,11 @@
 #include <linux/mm.h>
 #include <linux/string_helpers.h>
 #include <linux/sframe.h>
-#include <asm/unwind_user_sframe.h>
 #include <linux/unwind_user_types.h>
 
 #include "sframe.h"
 #include "sframe_debug.h"
+#include <asm/unwind_user_sframe.h>
 
 struct sframe_fre {
 	unsigned int	size;
diff --git a/kernel/unwind/sframe.h b/kernel/unwind/sframe.h
index e9bfccfaf5b4..3e60b6e30b51 100644
--- a/kernel/unwind/sframe.h
+++ b/kernel/unwind/sframe.h
@@ -17,6 +17,7 @@
 #define SFRAME_ABI_AARCH64_ENDIAN_BIG		1
 #define SFRAME_ABI_AARCH64_ENDIAN_LITTLE	2
 #define SFRAME_ABI_AMD64_ENDIAN_LITTLE		3
+#define SFRAME_ABI_S390X_ENDIAN_BIG		4	/* s390 64-bit (s390x) */
 
 #define SFRAME_FDE_TYPE_PCINC			0
 #define SFRAME_FDE_TYPE_PCMASK			1
@@ -68,4 +69,19 @@ struct sframe_fde {
 #define SFRAME_FRE_OFFSET_SIZE(data)		((data >> 5) & 0x3)
 #define SFRAME_FRE_MANGLED_RA_P(data)		((data >> 7) & 0x1)
 
+/* s390 64-bit (s390x) */
+
+#define SFRAME_S390X_SP_VAL_OFFSET			(-160)
+
+#define SFRAME_S390X_CFA_OFFSET_ADJUSTMENT		SFRAME_S390X_SP_VAL_OFFSET
+#define SFRAME_S390X_CFA_OFFSET_ALIGNMENT_FACTOR	8
+#define SFRAME_V2_S390X_CFA_OFFSET_DECODE(offset) \
+  (((offset) * SFRAME_S390X_CFA_OFFSET_ALIGNMENT_FACTOR) \
+   - SFRAME_S390X_CFA_OFFSET_ADJUSTMENT)
+
+#define SFRAME_V2_S390X_OFFSET_IS_REGNUM(offset) \
+  ((offset) & 1)
+#define SFRAME_V2_S390X_OFFSET_DECODE_REGNUM(offset) \
+  ((offset) >> 1)
+
 #endif /* _SFRAME_H */
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 06/16] unwind_user: Enable archs that define CFA = SP_callsite + offset
  2025-07-10 16:35 ` [RFC PATCH v1 06/16] unwind_user: Enable archs that define CFA = SP_callsite + offset Jens Remus
@ 2025-07-16 21:32   ` Josh Poimboeuf
  2025-07-17  9:27     ` Jens Remus
  0 siblings, 1 reply; 42+ messages in thread
From: Josh Poimboeuf @ 2025-07-16 21:32 UTC (permalink / raw)
  To: Jens Remus
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On Thu, Jul 10, 2025 at 06:35:12PM +0200, Jens Remus wrote:
> Most architectures define their CFA as the value of the stack pointer
> (SP) at the call site in the previous frame, as suggested by the DWARF
> standard:
> 
>   CFA = <SP at call site>
> 
> Enable unwinding of user space for architectures, such as s390, which
> define their CFA as the value of the SP at the call site in the previous
> frame with an offset:
> 
>   CFA = <SP at call site> + offset

This is a bit confusing, as the comment and code define it as

    SP = CFA + offset

Should the commit log be updated to match that?

> +++ b/arch/x86/include/asm/unwind_user.h
> @@ -8,6 +8,7 @@
>  	.cfa_off	= (s32)sizeof(long) *  2,				\
>  	.ra_off		= (s32)sizeof(long) * -1,				\
>  	.fp_off		= (s32)sizeof(long) * -2,				\
> +	.sp_val_off	= (s32)0,						\

IIUC, this is similar to ra_off and fp_off in that its an offset from
the CFA.  Can we call it "sp_off"?

-- 
Josh

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 07/16] unwind_user: Enable archs that do not necessarily save RA
  2025-07-10 16:35 ` [RFC PATCH v1 07/16] unwind_user: Enable archs that do not necessarily save RA Jens Remus
@ 2025-07-16 23:01   ` Josh Poimboeuf
  2025-07-17 11:09     ` Jens Remus
  0 siblings, 1 reply; 42+ messages in thread
From: Josh Poimboeuf @ 2025-07-16 23:01 UTC (permalink / raw)
  To: Jens Remus
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On Thu, Jul 10, 2025 at 06:35:13PM +0200, Jens Remus wrote:
> +++ b/arch/Kconfig
> @@ -450,6 +450,11 @@ config HAVE_UNWIND_USER_SFRAME
>  	bool
>  	select UNWIND_USER
>  
> +config HAVE_USER_RA_REG
> +	bool
> +	help
> +	  The arch passes the return address (RA) in user space in a register.

How about "HAVE_UNWIND_USER_RA_REG" so it matches the existing
namespace?

> @@ -310,6 +307,12 @@ static __always_inline int __find_fre(struct sframe_section *sec,
>  		return -EINVAL;
>  	fre = prev_fre;
>  
> +	if ((!IS_ENABLED(CONFIG_HAVE_USER_RA_REG) || !topmost) && !fre->ra_off) {
> +		dbg_sec_uaccess("fde addr 0x%x: zero ra_off\n",
> +				fde->start_addr);
> +		return -EINVAL;
> +	}

The topmost frame doesn't necessarily (or even likely) come from before
the prologue, or from a leaf function, so this check would miss the case
where a non-leaf function wrongly has !ra_off after its prologue.

Which in reality is probably fine, as there are other guardrails in
place to catch such bad sframe data.

But then do we think the !ra_off check is still worth the effort?  It
would be simpler to just always assume !ra_off is valid for the
CONFIG_HAVE_USER_RA_REG case.

I think I prefer the simplicity of removing the check, as the error
would be rare, and corrupt sframe would be caught in other ways.

> @@ -86,18 +88,28 @@ static int unwind_user_next(struct unwind_user_state *state)
>  
>  	/* Get the Stack Pointer (SP) */
>  	sp = cfa + frame->sp_val_off;
> -	/* Make sure that stack is not going in wrong direction */
> -	if (sp <= state->sp)
> +	/*
> +	 * Make sure that stack is not going in wrong direction.  Allow SP
> +	 * to be unchanged for the topmost frame, by subtracting topmost,
> +	 * which is either 0 or 1.
> +	 */
> +	if (sp <= state->sp - topmost)
>  		goto done;
>  
> -	/* Make sure that the address is word aligned */
> -	shift = sizeof(long) == 4 || compat_fp_state(state) ? 2 : 3;
> -	if ((cfa + frame->ra_off) & ((1 << shift) - 1))
> -		goto done;
>  
>  	/* Get the Return Address (RA) */
> -	if (unwind_get_user_long(ra, cfa + frame->ra_off, state))
> -		goto done;
> +	if (frame->ra_off) {
> +		/* Make sure that the address is word aligned */
> +		shift = sizeof(long) == 4 || compat_fp_state(state) ? 2 : 3;
> +		if ((cfa + frame->ra_off) & ((1 << shift) - 1))
> +			goto done;
> +		if (unwind_get_user_long(ra, cfa + frame->ra_off, state))
> +			goto done;
> +	} else {
> +		if (!IS_ENABLED(CONFIG_HAVE_USER_RA_REG) || !topmost)
> +			goto done;

I think this check is redundant with the one in __find_fre()?

-- 
Josh

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 08/16] unwind_user: Enable archs that save RA/FP in other registers
  2025-07-10 16:35 ` [RFC PATCH v1 08/16] unwind_user: Enable archs that save RA/FP in other registers Jens Remus
@ 2025-07-17  2:01   ` Josh Poimboeuf
  2025-07-17  2:50     ` Josh Poimboeuf
                       ` (2 more replies)
  0 siblings, 3 replies; 42+ messages in thread
From: Josh Poimboeuf @ 2025-07-17  2:01 UTC (permalink / raw)
  To: Jens Remus
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On Thu, Jul 10, 2025 at 06:35:14PM +0200, Jens Remus wrote:
> +#ifndef unwind_user_get_reg
> +
> +/**
> + * generic_unwind_user_get_reg - Get register value.
> + * @val: Register value.
> + * @regnum: DWARF register number to obtain the value from.
> + *
> + * Returns zero if successful. Otherwise -EINVAL.
> + */
> +static inline int generic_unwind_user_get_reg(unsigned long *val, int regnum)
> +{
> +	return -EINVAL;
> +}
> +
> +#define unwind_user_get_reg generic_unwind_user_get_reg
> +
> +#endif /* !unwind_user_get_reg */

I believe the traditional way to do this is to give the function the
same name as the define:

#ifndef unwind_user_get_reg
static inline int unwind_user_get_reg(unsigned long *val, int regnum)
{
	return -EINVAL;
}
#define unwind_user_get_reg unwind_user_get_reg
#endif

> +/**
> + * generic_sframe_set_frame_reginfo - Populate info to unwind FP/RA register
> + * from SFrame offset.
> + * @reginfo: Unwind info for FP/RA register.
> + * @offset: SFrame offset value.
> + *
> + * A non-zero offset value denotes a stack offset from CFA and indicates
> + * that the register is saved on the stack. A zero offset value indicates
> + * that the register is not saved.
> + */
> +static inline void generic_sframe_set_frame_reginfo(
> +	struct unwind_user_reginfo *reginfo,
> +	s32 offset)
> +{
> +	if (offset) {
> +		reginfo->loc = UNWIND_USER_LOC_STACK;
> +		reginfo->frame_off = offset;
> +	} else {
> +		reginfo->loc = UNWIND_USER_LOC_NONE;
> +	}
> +}

This just inits the reginfo struct, can we call it sframe_init_reginfo()?

Also the function comment seems completely superfluous as the function
is completely obvious.

Also the signature should match kernel style, something like:

static inline void
sframe_init_reginfo(struct unwind_user_reginfo *reginfo, s32 offset)

> @@ -98,26 +98,57 @@ static int unwind_user_next(struct unwind_user_state *state)
>  
>  
>  	/* Get the Return Address (RA) */
> -	if (frame->ra_off) {
> +	switch (frame->ra.loc) {
> +	case UNWIND_USER_LOC_NONE:
> +		if (!IS_ENABLED(CONFIG_HAVE_USER_RA_REG) || !topmost)
> +			goto done;
> +		ra = user_return_address(task_pt_regs(current));
> +		break;
> +	case UNWIND_USER_LOC_STACK:
> +		if (!frame->ra.frame_off)
> +			goto done;
>  		/* Make sure that the address is word aligned */
>  		shift = sizeof(long) == 4 || compat_fp_state(state) ? 2 : 3;
> -		if ((cfa + frame->ra_off) & ((1 << shift) - 1))
> +		if ((cfa + frame->ra.frame_off) & ((1 << shift) - 1))
>  			goto done;
> -		if (unwind_get_user_long(ra, cfa + frame->ra_off, state))
> +		if (unwind_get_user_long(ra, cfa + frame->ra.frame_off, state))
>  			goto done;
> -	} else {
> -		if (!IS_ENABLED(CONFIG_HAVE_USER_RA_REG) || !topmost)
> +		break;
> +	case UNWIND_USER_LOC_REG:
> +		if (!IS_ENABLED(CONFIG_HAVE_UNWIND_USER_LOC_REG) || !topmost)
>  			goto done;
> -		ra = user_return_address(task_pt_regs(current));
> +		if (unwind_user_get_reg(&ra, frame->ra.regnum))
> +			goto done;
> +		break;
> +	default:
> +		WARN_ON_ONCE(1);
> +		goto done;

The default case will never happen, can we make it a BUG()?

>  	}
>  
>  	/* Get the Frame Pointer (FP) */
> -	if (frame->fp_off && unwind_get_user_long(fp, cfa + frame->fp_off, state))
> +	switch (frame->fp.loc) {
> +	case UNWIND_USER_LOC_NONE:
> +		break;

The UNWIND_USER_LOC_NONE behavior is different here compared to above.
Do we also need UNWIND_USER_LOC_PT_REGS?

> +	case UNWIND_USER_LOC_STACK:
> +		if (!frame->fp.frame_off)
> +			goto done;
> +		if (unwind_get_user_long(fp, cfa + frame->fp.frame_off, state))
> +			goto done;
> +		break;
> +	case UNWIND_USER_LOC_REG:
> +		if (!IS_ENABLED(CONFIG_HAVE_UNWIND_USER_LOC_REG) || !topmost)
> +			goto done;

The topmost checking is *really* getting cumbersome, I do hope we can
get rid of that.

> +		if (unwind_user_get_reg(&fp, frame->fp.regnum))
> +			goto done;
> +		break;
> +	default:
> +		WARN_ON_ONCE(1);
>  		goto done;
> +	}

BUG(1) here as well.

>  	state->ip = ra;
>  	state->sp = sp;
> -	if (frame->fp_off)
> +	if (frame->fp.loc != UNWIND_USER_LOC_NONE)
>  		state->fp = fp;

Instead of the extra conditional here, can fp be initialized to zero?
Either at the top of the function or in the RA switch statement?

-- 
Josh

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 12/16] unwind_user/backchain: Introduce back chain user space unwinding
  2025-07-10 16:35 ` [RFC PATCH v1 12/16] unwind_user/backchain: Introduce back chain user space unwinding Jens Remus
@ 2025-07-17  2:06   ` Josh Poimboeuf
  2025-07-17 12:20     ` Jens Remus
  0 siblings, 1 reply; 42+ messages in thread
From: Josh Poimboeuf @ 2025-07-17  2:06 UTC (permalink / raw)
  To: Jens Remus
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On Thu, Jul 10, 2025 at 06:35:18PM +0200, Jens Remus wrote:
> @@ -66,12 +73,20 @@ static int unwind_user_next(struct unwind_user_state *state)
>  		/* sframe expects the frame to be local storage */
>  		frame = &_frame;
>  		if (sframe_find(state->ip, frame, topmost)) {
> -			if (!IS_ENABLED(CONFIG_HAVE_UNWIND_USER_FP))
> -				goto done;
> -			frame = &fp_frame;
> +			if (IS_ENABLED(CONFIG_HAVE_UNWIND_USER_FP)) {
> +				frame = &fp_frame;
> +			} else if (IS_ENABLED(CONFIG_HAVE_UNWIND_USER_BACKCHAIN)) {
> +				if (unwind_user_backchain_next(state))
> +					goto done;
> +				goto done_backchain;
> +			}
>  		}
>  	} else if (fp_state(state)) {
>  		frame = &fp_frame;
> +	} else if (backchain_state(state)) {
> +		if (unwind_user_backchain_next(state))
> +			goto done;
> +		goto done_backchain;
>  	} else {
>  		goto done;
>  	}
> @@ -153,6 +168,7 @@ static int unwind_user_next(struct unwind_user_state *state)
>  
>  	arch_unwind_user_next(state);
>  
> +done_backchain:
>  	state->topmost = false;
>  	return 0;

This feels very grafted on, is there not some way to make it more
generic, i.e., to just work with CONFIG_HAVE_UNWIND_USER_FP?

Also, if distros aren't even compiling with -mbackchain, I wonder if we
can just not do this altogether :-)

-- 
Josh

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 08/16] unwind_user: Enable archs that save RA/FP in other registers
  2025-07-17  2:01   ` Josh Poimboeuf
@ 2025-07-17  2:50     ` Josh Poimboeuf
  2025-07-17 12:07       ` Jens Remus
  2025-07-17  3:57     ` Steven Rostedt
  2025-07-17 11:28     ` Jens Remus
  2 siblings, 1 reply; 42+ messages in thread
From: Josh Poimboeuf @ 2025-07-17  2:50 UTC (permalink / raw)
  To: Jens Remus
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On Wed, Jul 16, 2025 at 07:01:09PM -0700, Josh Poimboeuf wrote:
> >  	state->ip = ra;
> >  	state->sp = sp;
> > -	if (frame->fp_off)
> > +	if (frame->fp.loc != UNWIND_USER_LOC_NONE)
> >  		state->fp = fp;
> 
> Instead of the extra conditional here, can fp be initialized to zero?
> Either at the top of the function or in the RA switch statement?

So it's been a while since I looked at the original code, but I actually
think there's a bug here.

There's a subtlety in the original code:

	if (frame->fp_off && unwind_get_user_long(fp, cfa + frame->fp_off, state))
		goto done;

	state->ip = ra;
	state->sp = cfa;
	if (frame->fp_off)
		state->fp = fp;

	arch_unwind_user_next(state);

Note that unlike !frame->ra_off, !frame->fp_off does NOT end the unwind.
That only means the FP offset is unknown for the current frame.  Which
is a perfectly valid condition, e.g. if the function doesn't have frame
pointers or if it's before the prologue.

In that case, the unwind should continue, and state->fp's existing value
should be preserved, as it might already have a valid value from a
previous frame.

So the following is wrong:

	case UNWIND_USER_LOC_STACK:
		if (!frame->fp.frame_off)
			goto done;
		if (unwind_get_user_long(fp, cfa + frame->fp.frame_off, state))
			goto done;
		break;

Instead of having !fp.frame_off stopping the unwind, it should just
break out of the switch statement and keep going.

And that means the following is also wrong:

	state->ip = ra;
	state->sp = sp;
	if (frame->fp.loc != UNWIND_USER_LOC_NONE)
		state->fp = fp;

because state->fp needs to preserved for the STACK+!fp.frame_off case.

So, something like this?

	bool preserve_fp = false;
	...

	/* Get the Frame Pointer (FP) */
	switch (frame->fp.loc) {
	case UNWIND_USER_LOC_NONE:
		preserve_fp = true;
		break;
	case UNWIND_USER_LOC_STACK:
		if (!frame->fp.frame_off) {
			preserve_fp = true;
			break;
		}
	...

	state->ip = ra;
	state->sp = sp;
	if (!preserve_fp)
		state->fp = fp;

BTW, I would suggest renaming "frame_off" to "offset",
"frame->fp.offset" reads better and is more compact.

-- 
Josh

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 08/16] unwind_user: Enable archs that save RA/FP in other registers
  2025-07-17  2:01   ` Josh Poimboeuf
  2025-07-17  2:50     ` Josh Poimboeuf
@ 2025-07-17  3:57     ` Steven Rostedt
  2025-07-17  7:24       ` Josh Poimboeuf
  2025-07-17 11:28     ` Jens Remus
  2 siblings, 1 reply; 42+ messages in thread
From: Steven Rostedt @ 2025-07-17  3:57 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jens Remus, linux-kernel, linux-trace-kernel, bpf, x86,
	Steven Rostedt, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Andrii Nakryiko,
	Indu Bhagat, Jose E. Marchesi, Beau Belgrave, Linus Torvalds,
	Andrew Morton, Jens Axboe, Florian Weimer, Sam James

On Wed, 16 Jul 2025 19:01:06 -0700
Josh Poimboeuf <jpoimboe@kernel.org> wrote:

> > +		if (unwind_user_get_reg(&ra, frame->ra.regnum))
> > +			goto done;
> > +		break;
> > +	default:
> > +		WARN_ON_ONCE(1);
> > +		goto done;  
> 
> The default case will never happen, can we make it a BUG()?

Is this really serious enough to crash the system? WARN_ON_ONCE() *is* for
things that will never happen.

The only time I ever use BUG() is if it's too dangerous to continue (like a
function graph trampoline that gets corrupted and has no place to return
to). In general, usage of BUG() should be avoided.

-- Steve

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 08/16] unwind_user: Enable archs that save RA/FP in other registers
  2025-07-17  3:57     ` Steven Rostedt
@ 2025-07-17  7:24       ` Josh Poimboeuf
  2025-07-17 12:05         ` Steven Rostedt
  0 siblings, 1 reply; 42+ messages in thread
From: Josh Poimboeuf @ 2025-07-17  7:24 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Jens Remus, linux-kernel, linux-trace-kernel, bpf, x86,
	Steven Rostedt, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Andrii Nakryiko,
	Indu Bhagat, Jose E. Marchesi, Beau Belgrave, Linus Torvalds,
	Andrew Morton, Jens Axboe, Florian Weimer, Sam James

On Wed, Jul 16, 2025 at 11:57:51PM -0400, Steven Rostedt wrote:
> On Wed, 16 Jul 2025 19:01:06 -0700
> Josh Poimboeuf <jpoimboe@kernel.org> wrote:
> 
> > > +		if (unwind_user_get_reg(&ra, frame->ra.regnum))
> > > +			goto done;
> > > +		break;
> > > +	default:
> > > +		WARN_ON_ONCE(1);
> > > +		goto done;  
> > 
> > The default case will never happen, can we make it a BUG()?
> 
> Is this really serious enough to crash the system? WARN_ON_ONCE() *is* for
> things that will never happen.
> 
> The only time I ever use BUG() is if it's too dangerous to continue (like a
> function graph trampoline that gets corrupted and has no place to return
> to). In general, usage of BUG() should be avoided.

This is an unreachable code path, but __builtin_unreachable() is crap
due to undefined behavior.  IMO, BUG() for unreachable paths is cleaner
than WARN_ON_ONCE(), but it doesn't matter much either way I suppose.

-- 
Josh

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 06/16] unwind_user: Enable archs that define CFA = SP_callsite + offset
  2025-07-16 21:32   ` Josh Poimboeuf
@ 2025-07-17  9:27     ` Jens Remus
  2025-07-18  4:51       ` Josh Poimboeuf
  0 siblings, 1 reply; 42+ messages in thread
From: Jens Remus @ 2025-07-17  9:27 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On 16.07.2025 23:32, Josh Poimboeuf wrote:
> On Thu, Jul 10, 2025 at 06:35:12PM +0200, Jens Remus wrote:
>> Most architectures define their CFA as the value of the stack pointer
>> (SP) at the call site in the previous frame, as suggested by the DWARF
>> standard:
>>
>>   CFA = <SP at call site>
>>
>> Enable unwinding of user space for architectures, such as s390, which
>> define their CFA as the value of the SP at the call site in the previous
>> frame with an offset:
>>
>>   CFA = <SP at call site> + offset
> 
> This is a bit confusing, as the comment and code define it as
> 
>     SP = CFA + offset
> 
> Should the commit log be updated to match that?

I agree that the commit message is confusing. Would it help if I replace
it with the following:

Most architectures define their CFA as the value of the stack pointer
(SP) at the call site in the previous frame, as suggested by the DWARF
standard.  Therefore the SP at call site can be unwound using an
implicitly assumed value offset from CFA rule with an offset of zero:

  .cfi_val_offset <SP>, 0

As a result the SP at call site computes as follows:

  SP = CFA

Enable unwinding of user space for architectures, such as s390, which
define their CFA as the value of the SP at the call site in the previous
frame with an offset.  Do so by enabling architectures to override the
default SP value offset from CFA of zero with an architecture-specific
one:

  .cfi_val_offset <SP>, offset
  
So that the SP at call site computes as follows:

  SP = CFA + offset

>> +++ b/arch/x86/include/asm/unwind_user.h
>> @@ -8,6 +8,7 @@
>>  	.cfa_off	= (s32)sizeof(long) *  2,				\
>>  	.ra_off		= (s32)sizeof(long) * -1,				\
>>  	.fp_off		= (s32)sizeof(long) * -2,				\
>> +	.sp_val_off	= (s32)0,						\
> 
> IIUC, this is similar to ra_off and fp_off in that its an offset from
> the CFA.  Can we call it "sp_off"?

My intent was to use the terminology from DWARF CFI (i.e. "offset(N)"
and "val_offset(N)") and the related assembler CFI directives:

  .cfi_offset register, offset:  Previous value of register is saved at
                                 offset from CFA.

  .cfi_val_offset register, offset:  Previous value of register is
                                     CFA + offset. 

Regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
+49-7031-16-1128 Office
jremus@de.ibm.com

IBM

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Böblingen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 07/16] unwind_user: Enable archs that do not necessarily save RA
  2025-07-16 23:01   ` Josh Poimboeuf
@ 2025-07-17 11:09     ` Jens Remus
  2025-07-18  8:28       ` Jens Remus
  0 siblings, 1 reply; 42+ messages in thread
From: Jens Remus @ 2025-07-17 11:09 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On 17.07.2025 01:01, Josh Poimboeuf wrote:
> On Thu, Jul 10, 2025 at 06:35:13PM +0200, Jens Remus wrote:
>> +++ b/arch/Kconfig
>> @@ -450,6 +450,11 @@ config HAVE_UNWIND_USER_SFRAME
>>  	bool
>>  	select UNWIND_USER
>>  
>> +config HAVE_USER_RA_REG
>> +	bool
>> +	help
>> +	  The arch passes the return address (RA) in user space in a register.
> 
> How about "HAVE_UNWIND_USER_RA_REG" so it matches the existing
> namespace?

Ok.  I am open to any improvements.

>> @@ -310,6 +307,12 @@ static __always_inline int __find_fre(struct sframe_section *sec,
>>  		return -EINVAL;
>>  	fre = prev_fre;
>>  
>> +	if ((!IS_ENABLED(CONFIG_HAVE_USER_RA_REG) || !topmost) && !fre->ra_off) {
>> +		dbg_sec_uaccess("fde addr 0x%x: zero ra_off\n",
>> +				fde->start_addr);
>> +		return -EINVAL;
>> +	}
> 
> The topmost frame doesn't necessarily (or even likely) come from before
> the prologue, or from a leaf function, so this check would miss the case
> where a non-leaf function wrongly has !ra_off after its prologue.
>
> Which in reality is probably fine, as there are other guardrails in
> place to catch such bad sframe data.

On s390 the compiler may intermingle prologue instructions with function
body instructions.  As a result !ra_off is valid as long as the RA
register value from function entry has not been destroyed.  Furthermore
the compiler may decide to save RA only if it is actually about to
destroy the RA register value (e.g. due to a function call).

Note that it is valid to restore the RA register value anywhere in a
function and represent that in SFrame.  Following is an example from
Glibc libc.so psignal() on s390x.  The RA rule changes back to "u"
(= undefined; !ra_off) multiple times, whenever the RA register has
its value from function entry again.

  func idx [589]: pc = 0x6c530, size = 250 bytes <psignal>
  STARTPC         CFA       FP        RA
  000000000006c530  sp+160    u         u
  000000000006c536  sp+160    c-72      c-48
  000000000006c53c  sp+328    c-72      c-48
  000000000006c5b4  sp+160    u         u
  000000000006c5b6  sp+328    c-72      c-48
  000000000006c60c  sp+160    u         u
  000000000006c60e  sp+328    c-72      c-48

> But then do we think the !ra_off check is still worth the effort?  It
> would be simpler to just always assume !ra_off is valid for the
> CONFIG_HAVE_USER_RA_REG case.

It is only valid in the topmost frame.  Otherwise something is wrong.

> I think I prefer the simplicity of removing the check, as the error
> would be rare, and corrupt sframe would be caught in other ways.

But I am fine with removing it in user unwind sframe (above), as
user unwind performs the same check (see below).

>> @@ -86,18 +88,28 @@ static int unwind_user_next(struct unwind_user_state *state)
>>  
>>  	/* Get the Stack Pointer (SP) */
>>  	sp = cfa + frame->sp_val_off;
>> -	/* Make sure that stack is not going in wrong direction */
>> -	if (sp <= state->sp)
>> +	/*
>> +	 * Make sure that stack is not going in wrong direction.  Allow SP
>> +	 * to be unchanged for the topmost frame, by subtracting topmost,
>> +	 * which is either 0 or 1.
>> +	 */
>> +	if (sp <= state->sp - topmost)
>>  		goto done;
>>  
>> -	/* Make sure that the address is word aligned */
>> -	shift = sizeof(long) == 4 || compat_fp_state(state) ? 2 : 3;
>> -	if ((cfa + frame->ra_off) & ((1 << shift) - 1))
>> -		goto done;
>>  
>>  	/* Get the Return Address (RA) */
>> -	if (unwind_get_user_long(ra, cfa + frame->ra_off, state))
>> -		goto done;
>> +	if (frame->ra_off) {
>> +		/* Make sure that the address is word aligned */
>> +		shift = sizeof(long) == 4 || compat_fp_state(state) ? 2 : 3;
>> +		if ((cfa + frame->ra_off) & ((1 << shift) - 1))
>> +			goto done;
>> +		if (unwind_get_user_long(ra, cfa + frame->ra_off, state))
>> +			goto done;
>> +	} else {
>> +		if (!IS_ENABLED(CONFIG_HAVE_USER_RA_REG) || !topmost)
>> +			goto done;
> 
> I think this check is redundant with the one in __find_fre()?

Correct.  This one needs to stay, as it is at the topmost level (user
unwind checks the unwind info obtained from user unwind sframe or any
other method).

Regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
+49-7031-16-1128 Office
jremus@de.ibm.com

IBM

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Böblingen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 08/16] unwind_user: Enable archs that save RA/FP in other registers
  2025-07-17  2:01   ` Josh Poimboeuf
  2025-07-17  2:50     ` Josh Poimboeuf
  2025-07-17  3:57     ` Steven Rostedt
@ 2025-07-17 11:28     ` Jens Remus
  2025-07-17 12:10       ` Steven Rostedt
  2025-07-18  4:51       ` Josh Poimboeuf
  2 siblings, 2 replies; 42+ messages in thread
From: Jens Remus @ 2025-07-17 11:28 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On 17.07.2025 04:01, Josh Poimboeuf wrote:
> On Thu, Jul 10, 2025 at 06:35:14PM +0200, Jens Remus wrote:
>> +#ifndef unwind_user_get_reg
>> +
>> +/**
>> + * generic_unwind_user_get_reg - Get register value.
>> + * @val: Register value.
>> + * @regnum: DWARF register number to obtain the value from.
>> + *
>> + * Returns zero if successful. Otherwise -EINVAL.
>> + */
>> +static inline int generic_unwind_user_get_reg(unsigned long *val, int regnum)
>> +{
>> +	return -EINVAL;
>> +}
>> +
>> +#define unwind_user_get_reg generic_unwind_user_get_reg
>> +
>> +#endif /* !unwind_user_get_reg */
> 
> I believe the traditional way to do this is to give the function the
> same name as the define:
> 
> #ifndef unwind_user_get_reg
> static inline int unwind_user_get_reg(unsigned long *val, int regnum)
> {
> 	return -EINVAL;
> }
> #define unwind_user_get_reg unwind_user_get_reg
> #endif

Thanks!  I will use use suggestion.

>> +/**
>> + * generic_sframe_set_frame_reginfo - Populate info to unwind FP/RA register
>> + * from SFrame offset.
>> + * @reginfo: Unwind info for FP/RA register.
>> + * @offset: SFrame offset value.
>> + *
>> + * A non-zero offset value denotes a stack offset from CFA and indicates
>> + * that the register is saved on the stack. A zero offset value indicates
>> + * that the register is not saved.
>> + */
>> +static inline void generic_sframe_set_frame_reginfo(
>> +	struct unwind_user_reginfo *reginfo,
>> +	s32 offset)
>> +{
>> +	if (offset) {
>> +		reginfo->loc = UNWIND_USER_LOC_STACK;
>> +		reginfo->frame_off = offset;
>> +	} else {
>> +		reginfo->loc = UNWIND_USER_LOC_NONE;
>> +	}
>> +}
> 
> This just inits the reginfo struct, can we call it sframe_init_reginfo()?
> 
> Also the function comment seems completely superfluous as the function
> is completely obvious.
> 
> Also the signature should match kernel style, something like:
> 
> static inline void
> sframe_init_reginfo(struct unwind_user_reginfo *reginfo, s32 offset)

Ditto.

>> @@ -98,26 +98,57 @@ static int unwind_user_next(struct unwind_user_state *state)
>>  
>>  
>>  	/* Get the Return Address (RA) */
>> -	if (frame->ra_off) {
>> +	switch (frame->ra.loc) {
>> +	case UNWIND_USER_LOC_NONE:
>> +		if (!IS_ENABLED(CONFIG_HAVE_USER_RA_REG) || !topmost)
>> +			goto done;
>> +		ra = user_return_address(task_pt_regs(current));
>> +		break;
>> +	case UNWIND_USER_LOC_STACK:
>> +		if (!frame->ra.frame_off)
>> +			goto done;
>>  		/* Make sure that the address is word aligned */
>>  		shift = sizeof(long) == 4 || compat_fp_state(state) ? 2 : 3;
>> -		if ((cfa + frame->ra_off) & ((1 << shift) - 1))
>> +		if ((cfa + frame->ra.frame_off) & ((1 << shift) - 1))
>>  			goto done;
>> -		if (unwind_get_user_long(ra, cfa + frame->ra_off, state))
>> +		if (unwind_get_user_long(ra, cfa + frame->ra.frame_off, state))
>>  			goto done;
>> -	} else {
>> -		if (!IS_ENABLED(CONFIG_HAVE_USER_RA_REG) || !topmost)
>> +		break;
>> +	case UNWIND_USER_LOC_REG:
>> +		if (!IS_ENABLED(CONFIG_HAVE_UNWIND_USER_LOC_REG) || !topmost)
>>  			goto done;
>> -		ra = user_return_address(task_pt_regs(current));
>> +		if (unwind_user_get_reg(&ra, frame->ra.regnum))
>> +			goto done;
>> +		break;
>> +	default:
>> +		WARN_ON_ONCE(1);
>> +		goto done;
> 
> The default case will never happen, can we make it a BUG()?

Whatever Steve and you agree on.  I am new to Kernel development.

>>  	}
>>  
>>  	/* Get the Frame Pointer (FP) */
>> -	if (frame->fp_off && unwind_get_user_long(fp, cfa + frame->fp_off, state))
>> +	switch (frame->fp.loc) {
>> +	case UNWIND_USER_LOC_NONE:
>> +		break;
> 
> The UNWIND_USER_LOC_NONE behavior is different here compared to above.

See my comments below.

> Do we also need UNWIND_USER_LOC_PT_REGS?

Sorry, I cannot follow.  Do you suggest to rename UNWIND_USER_LOC_REG to
UNWIND_USER_LOC_PT_REGS?

>> +	case UNWIND_USER_LOC_STACK:
>> +		if (!frame->fp.frame_off)
>> +			goto done;
>> +		if (unwind_get_user_long(fp, cfa + frame->fp.frame_off, state))
>> +			goto done;
>> +		break;
>> +	case UNWIND_USER_LOC_REG:
>> +		if (!IS_ENABLED(CONFIG_HAVE_UNWIND_USER_LOC_REG) || !topmost)
>> +			goto done;
> 
> The topmost checking is *really* getting cumbersome, I do hope we can
> get rid of that.

Restoring from arbitrary registers is only valid in the topmost frame,
as their values (i.e. task_pt_regs(current)) are only available there.
For other frames only SP, FP, and RA register values are available.

I think this test makes sense.  Is this test really that expensive?

>> +		if (unwind_user_get_reg(&fp, frame->fp.regnum))
>> +			goto done;
>> +		break;
>> +	default:
>> +		WARN_ON_ONCE(1);
>>  		goto done;
>> +	}
> 
> BUG(1) here as well.

Same as for other WARN_ON_ONCE() vs. BUG().

>>  	state->ip = ra;
>>  	state->sp = sp;
>> -	if (frame->fp_off)
>> +	if (frame->fp.loc != UNWIND_USER_LOC_NONE)
>>  		state->fp = fp;
> 
> Instead of the extra conditional here, can fp be initialized to zero?
> Either at the top of the function or in the RA switch statement?

No.  But fp could be initialized to state->fp, so that when it is not
saved and thus not restored it keeps it previous value.

Regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
+49-7031-16-1128 Office
jremus@de.ibm.com

IBM

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Böblingen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 08/16] unwind_user: Enable archs that save RA/FP in other registers
  2025-07-17  7:24       ` Josh Poimboeuf
@ 2025-07-17 12:05         ` Steven Rostedt
  0 siblings, 0 replies; 42+ messages in thread
From: Steven Rostedt @ 2025-07-17 12:05 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jens Remus, linux-kernel, linux-trace-kernel, bpf, x86,
	Steven Rostedt, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Andrii Nakryiko,
	Indu Bhagat, Jose E. Marchesi, Beau Belgrave, Linus Torvalds,
	Andrew Morton, Jens Axboe, Florian Weimer, Sam James

On Thu, 17 Jul 2025 00:24:47 -0700
Josh Poimboeuf <jpoimboe@kernel.org> wrote:

> > The only time I ever use BUG() is if it's too dangerous to continue (like a
> > function graph trampoline that gets corrupted and has no place to return
> > to). In general, usage of BUG() should be avoided.  
> 
> This is an unreachable code path, but __builtin_unreachable() is crap
> due to undefined behavior.  IMO, BUG() for unreachable paths is cleaner
> than WARN_ON_ONCE(), but it doesn't matter much either way I suppose.

Linus has stated that BUG() should be avoided too. If the code changes in
the future and this suddenly becomes a reachable path, would you rather
have a WARN or a BUG? If you don't have your system set up properly,
the BUG may not even show you why your system crashed.

-- Steve

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 08/16] unwind_user: Enable archs that save RA/FP in other registers
  2025-07-17  2:50     ` Josh Poimboeuf
@ 2025-07-17 12:07       ` Jens Remus
  2025-07-18  4:52         ` Josh Poimboeuf
  0 siblings, 1 reply; 42+ messages in thread
From: Jens Remus @ 2025-07-17 12:07 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On 17.07.2025 04:50, Josh Poimboeuf wrote:
> On Wed, Jul 16, 2025 at 07:01:09PM -0700, Josh Poimboeuf wrote:
>>>  	state->ip = ra;
>>>  	state->sp = sp;
>>> -	if (frame->fp_off)
>>> +	if (frame->fp.loc != UNWIND_USER_LOC_NONE)
>>>  		state->fp = fp;
>>
>> Instead of the extra conditional here, can fp be initialized to zero?
>> Either at the top of the function or in the RA switch statement?

No, but fp could be initialized to state->fp, so that it retains its
value.

> So it's been a while since I looked at the original code, but I actually
> think there's a bug here.
> 
> There's a subtlety in the original code:
> 
> 	if (frame->fp_off && unwind_get_user_long(fp, cfa + frame->fp_off, state))
> 		goto done;
> 
> 	state->ip = ra;
> 	state->sp = cfa;
> 	if (frame->fp_off)
> 		state->fp = fp;
> 
> 	arch_unwind_user_next(state);
> 
> Note that unlike !frame->ra_off, !frame->fp_off does NOT end the unwind.
> That only means the FP offset is unknown for the current frame.  Which
> is a perfectly valid condition, e.g. if the function doesn't have frame
> pointers or if it's before the prologue.
> 
> In that case, the unwind should continue, and state->fp's existing value
> should be preserved, as it might already have a valid value from a
> previous frame.

I fully agree with all of the above.

> So the following is wrong:
> 
> 	case UNWIND_USER_LOC_STACK:
> 		if (!frame->fp.frame_off)
> 			goto done;
> 		if (unwind_get_user_long(fp, cfa + frame->fp.frame_off, state))
> 			goto done;
> 		break;
> 
> Instead of having !fp.frame_off stopping the unwind, it should just
> break out of the switch statement and keep going.

If frame->fp.loc is UNWIND_USER_LOC_STACK then frame->fp.frame_off must
have a value != 0.  At least if we keep the original semantic.

We can omit this check, if we assume all producers of frame behave
correctly.  For instance user unwind sframe would not produce that
(see below).  Could it somehow be made a debug-config-only check?

> And that means the following is also wrong:
> 
> 	state->ip = ra;
> 	state->sp = sp;
> 	if (frame->fp.loc != UNWIND_USER_LOC_NONE)
> 		state->fp = fp;
> 
> because state->fp needs to preserved for the STACK+!fp.frame_off case.

unwind user sframe will not produce that:

static inline void
sframe_init_reginfo(struct unwind_user_reginfo *reginfo, s32 offset)
{
	if (offset) {
		reginfo->loc = UNWIND_USER_LOC_STACK;
		reginfo->frame_off = offset;
	} else {
		reginfo->loc = UNWIND_USER_LOC_NONE;
	}
}

> So, something like this?
> 
> 	bool preserve_fp = false;
> 	...
> 
> 	/* Get the Frame Pointer (FP) */
> 	switch (frame->fp.loc) {
> 	case UNWIND_USER_LOC_NONE:
> 		preserve_fp = true;
> 		break;
> 	case UNWIND_USER_LOC_STACK:
> 		if (!frame->fp.frame_off) {
> 			preserve_fp = true;
> 			break;
> 		}
> 	...
> 
> 	state->ip = ra;
> 	state->sp = sp;
> 	if (!preserve_fp)
> 		state->fp = fp;

I don't think all of this is necessary.

frame->fp.loc == UNWIND_USER_LOC_NONE already indicates the state->fp
retains it's previous value.

> BTW, I would suggest renaming "frame_off" to "offset",
> "frame->fp.offset" reads better and is more compact.

Makes sense.  I'll change that.

Regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
+49-7031-16-1128 Office
jremus@de.ibm.com

IBM

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Böblingen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 08/16] unwind_user: Enable archs that save RA/FP in other registers
  2025-07-17 11:28     ` Jens Remus
@ 2025-07-17 12:10       ` Steven Rostedt
  2025-07-18  4:51       ` Josh Poimboeuf
  1 sibling, 0 replies; 42+ messages in thread
From: Steven Rostedt @ 2025-07-17 12:10 UTC (permalink / raw)
  To: Jens Remus
  Cc: Josh Poimboeuf, linux-kernel, linux-trace-kernel, bpf, x86,
	Steven Rostedt, Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich,
	Masami Hiramatsu, Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Andrii Nakryiko,
	Indu Bhagat, Jose E. Marchesi, Beau Belgrave, Linus Torvalds,
	Andrew Morton, Jens Axboe, Florian Weimer, Sam James

On Thu, 17 Jul 2025 13:28:25 +0200
Jens Remus <jremus@linux.ibm.com> wrote:

> >> +	default:
> >> +		WARN_ON_ONCE(1);
> >> +		goto done;  
> > 
> > The default case will never happen, can we make it a BUG()?  
> 
> Whatever Steve and you agree on.  I am new to Kernel development.

Keep the WARN_ON(). Linus has yelled at people for using BUG() when a
WARN_ON() would do.

WARN_ON() is meant for things that should never happen. BUG() is reserved
for places in the kernel that is dangerous to continue.

It's even documented:

  https://docs.kernel.org/process/deprecated.html#bug-and-bug-on

-- Steve

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 12/16] unwind_user/backchain: Introduce back chain user space unwinding
  2025-07-17  2:06   ` Josh Poimboeuf
@ 2025-07-17 12:20     ` Jens Remus
  2025-07-18  5:19       ` Josh Poimboeuf
  0 siblings, 1 reply; 42+ messages in thread
From: Jens Remus @ 2025-07-17 12:20 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On 17.07.2025 04:06, Josh Poimboeuf wrote:
> On Thu, Jul 10, 2025 at 06:35:18PM +0200, Jens Remus wrote:
>> @@ -66,12 +73,20 @@ static int unwind_user_next(struct unwind_user_state *state)
>>  		/* sframe expects the frame to be local storage */
>>  		frame = &_frame;
>>  		if (sframe_find(state->ip, frame, topmost)) {
>> -			if (!IS_ENABLED(CONFIG_HAVE_UNWIND_USER_FP))
>> -				goto done;
>> -			frame = &fp_frame;
>> +			if (IS_ENABLED(CONFIG_HAVE_UNWIND_USER_FP)) {
>> +				frame = &fp_frame;
>> +			} else if (IS_ENABLED(CONFIG_HAVE_UNWIND_USER_BACKCHAIN)) {
>> +				if (unwind_user_backchain_next(state))
>> +					goto done;
>> +				goto done_backchain;
>> +			}
>>  		}
>>  	} else if (fp_state(state)) {
>>  		frame = &fp_frame;
>> +	} else if (backchain_state(state)) {
>> +		if (unwind_user_backchain_next(state))
>> +			goto done;
>> +		goto done_backchain;
>>  	} else {
>>  		goto done;
>>  	}
>> @@ -153,6 +168,7 @@ static int unwind_user_next(struct unwind_user_state *state)
>>  
>>  	arch_unwind_user_next(state);
>>  
>> +done_backchain:
>>  	state->topmost = false;
>>  	return 0;
> 
> This feels very grafted on, is there not some way to make it more
> generic, i.e., to just work with CONFIG_HAVE_UNWIND_USER_FP?

I agree.  It could probably be made to compute the cfa_off and ra.offset
or ra.regnum.  Let me explore that, provided there would be any acceptance
for unwind user backchain at all. Note that Power is using backchain as
well, so they may want to build on that as well.

> Also, if distros aren't even compiling with -mbackchain, I wonder if we
> can just not do this altogether :-)

My original intent was to use unwind user's for_each_user_frame() to
replace the exiting stack tracing logic in arch_stack_walk_user_common()
in arch/s390/kernel/stacktrace.c, which currently supports backchain.
Given that for_each_user_frame() was made private in the latest unwind
user series version hinders me.  The use was also low, because the
currentl arch_stack_walk_user_common() implementation does not support
page faults, so that the attempt to use unwind user sframe would always
fail and fallback to unwind user backchain.  My hope was that somebody
with more Kernel skills could give me a few hints at how it could be
made to support deferred unwind. :-)

Regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
+49-7031-16-1128 Office
jremus@de.ibm.com

IBM

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Böblingen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 06/16] unwind_user: Enable archs that define CFA = SP_callsite + offset
  2025-07-17  9:27     ` Jens Remus
@ 2025-07-18  4:51       ` Josh Poimboeuf
  0 siblings, 0 replies; 42+ messages in thread
From: Josh Poimboeuf @ 2025-07-18  4:51 UTC (permalink / raw)
  To: Jens Remus
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On Thu, Jul 17, 2025 at 11:27:45AM +0200, Jens Remus wrote:
> On 16.07.2025 23:32, Josh Poimboeuf wrote:
> > On Thu, Jul 10, 2025 at 06:35:12PM +0200, Jens Remus wrote:
> >> Most architectures define their CFA as the value of the stack pointer
> >> (SP) at the call site in the previous frame, as suggested by the DWARF
> >> standard:
> >>
> >>   CFA = <SP at call site>
> >>
> >> Enable unwinding of user space for architectures, such as s390, which
> >> define their CFA as the value of the SP at the call site in the previous
> >> frame with an offset:
> >>
> >>   CFA = <SP at call site> + offset
> > 
> > This is a bit confusing, as the comment and code define it as
> > 
> >     SP = CFA + offset
> > 
> > Should the commit log be updated to match that?
> 
> I agree that the commit message is confusing. Would it help if I replace
> it with the following:
> 
> Most architectures define their CFA as the value of the stack pointer
> (SP) at the call site in the previous frame, as suggested by the DWARF
> standard.  Therefore the SP at call site can be unwound using an
> implicitly assumed value offset from CFA rule with an offset of zero:
> 
>   .cfi_val_offset <SP>, 0
> 
> As a result the SP at call site computes as follows:
> 
>   SP = CFA
> 
> Enable unwinding of user space for architectures, such as s390, which
> define their CFA as the value of the SP at the call site in the previous
> frame with an offset.  Do so by enabling architectures to override the
> default SP value offset from CFA of zero with an architecture-specific
> one:
> 
>   .cfi_val_offset <SP>, offset
>   
> So that the SP at call site computes as follows:
> 
>   SP = CFA + offset

Looks good to me, thanks!

> >> +++ b/arch/x86/include/asm/unwind_user.h
> >> @@ -8,6 +8,7 @@
> >>  	.cfa_off	= (s32)sizeof(long) *  2,				\
> >>  	.ra_off		= (s32)sizeof(long) * -1,				\
> >>  	.fp_off		= (s32)sizeof(long) * -2,				\
> >> +	.sp_val_off	= (s32)0,						\
> > 
> > IIUC, this is similar to ra_off and fp_off in that its an offset from
> > the CFA.  Can we call it "sp_off"?
> 
> My intent was to use the terminology from DWARF CFI (i.e. "offset(N)"
> and "val_offset(N)") and the related assembler CFI directives:
> 
>   .cfi_offset register, offset:  Previous value of register is saved at
>                                  offset from CFA.
> 
>   .cfi_val_offset register, offset:  Previous value of register is
>                                      CFA + offset. 

The distinction between "cfi_offset" and "cfi_val_offset" is confusing,
unless one already happens to know CFI syntax (not likely for us kernel
developers).

We don't need to match the DWARF CFI directive naming.  Let's instead
optimize for readability.

I think "sp_off" is fine here, its semantics are similar to the existing
cfa_off field.

The semantics of ra_off and fp_off are different, but those are getting
removed in favor of nested structs in a later patch anyway.

-- 
Josh

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 08/16] unwind_user: Enable archs that save RA/FP in other registers
  2025-07-17 11:28     ` Jens Remus
  2025-07-17 12:10       ` Steven Rostedt
@ 2025-07-18  4:51       ` Josh Poimboeuf
  1 sibling, 0 replies; 42+ messages in thread
From: Josh Poimboeuf @ 2025-07-18  4:51 UTC (permalink / raw)
  To: Jens Remus
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On Thu, Jul 17, 2025 at 01:28:25PM +0200, Jens Remus wrote:
> >>  	}
> >>  
> >>  	/* Get the Frame Pointer (FP) */
> >> -	if (frame->fp_off && unwind_get_user_long(fp, cfa + frame->fp_off, state))
> >> +	switch (frame->fp.loc) {
> >> +	case UNWIND_USER_LOC_NONE:
> >> +		break;
> > 
> > The UNWIND_USER_LOC_NONE behavior is different here compared to above.
> 
> See my comments below.
> 
> > Do we also need UNWIND_USER_LOC_PT_REGS?
> 
> Sorry, I cannot follow.  Do you suggest to rename UNWIND_USER_LOC_REG to
> UNWIND_USER_LOC_PT_REGS?

I think I completely misunderstood the meaning of UNWIND_USER_LOC_NONE.
Never mind :-)

> >> +	case UNWIND_USER_LOC_STACK:
> >> +		if (!frame->fp.frame_off)
> >> +			goto done;
> >> +		if (unwind_get_user_long(fp, cfa + frame->fp.frame_off, state))
> >> +			goto done;
> >> +		break;
> >> +	case UNWIND_USER_LOC_REG:
> >> +		if (!IS_ENABLED(CONFIG_HAVE_UNWIND_USER_LOC_REG) || !topmost)
> >> +			goto done;
> > 
> > The topmost checking is *really* getting cumbersome, I do hope we can
> > get rid of that.
> 
> Restoring from arbitrary registers is only valid in the topmost frame,
> as their values (i.e. task_pt_regs(current)) are only available there.
> For other frames only SP, FP, and RA register values are available.
> 
> I think this test makes sense.  Is this test really that expensive?

ra_off=0 (UNWIND_USER_LOC_NONE) on a !topmost frame should never happen
unless the sframe entry is bad.  But 0 is *far* from the only potential
bad RA offset value.  In the absolute worst case of a 4 byte offset,
there are 4+ billion other possible bad values that can still go
undetected.

So I question the usefulness of those !topmost tests.  And they do add
complexity to the code.

-- 
Josh

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 08/16] unwind_user: Enable archs that save RA/FP in other registers
  2025-07-17 12:07       ` Jens Remus
@ 2025-07-18  4:52         ` Josh Poimboeuf
  0 siblings, 0 replies; 42+ messages in thread
From: Josh Poimboeuf @ 2025-07-18  4:52 UTC (permalink / raw)
  To: Jens Remus
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On Thu, Jul 17, 2025 at 02:07:05PM +0200, Jens Remus wrote:
> On 17.07.2025 04:50, Josh Poimboeuf wrote:
> > So the following is wrong:
> > 
> > 	case UNWIND_USER_LOC_STACK:
> > 		if (!frame->fp.frame_off)
> > 			goto done;
> > 		if (unwind_get_user_long(fp, cfa + frame->fp.frame_off, state))
> > 			goto done;
> > 		break;
> > 
> > Instead of having !fp.frame_off stopping the unwind, it should just
> > break out of the switch statement and keep going.
> 
> If frame->fp.loc is UNWIND_USER_LOC_STACK then frame->fp.frame_off must
> have a value != 0.  At least if we keep the original semantic.
> 
> We can omit this check, if we assume all producers of frame behave
> correctly.  For instance user unwind sframe would not produce that
> (see below).  Could it somehow be made a debug-config-only check?

Ah... the !frame->fp.frame_off check for the UNWIND_USER_LOC_STACK case
completely threw me for a loop.  I was confusing that with
UNWIND_USER_LOC_NONE.  So never mind.

And yes, I think that check has no use and can be removed.

-- 
Josh

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 12/16] unwind_user/backchain: Introduce back chain user space unwinding
  2025-07-17 12:20     ` Jens Remus
@ 2025-07-18  5:19       ` Josh Poimboeuf
  2025-08-01 12:36         ` Heiko Carstens
  0 siblings, 1 reply; 42+ messages in thread
From: Josh Poimboeuf @ 2025-07-18  5:19 UTC (permalink / raw)
  To: Jens Remus
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On Thu, Jul 17, 2025 at 02:20:12PM +0200, Jens Remus wrote:
> >> +done_backchain:
> >>  	state->topmost = false;
> >>  	return 0;
> > 
> > This feels very grafted on, is there not some way to make it more
> > generic, i.e., to just work with CONFIG_HAVE_UNWIND_USER_FP?
> 
> I agree.  It could probably be made to compute the cfa_off and ra.offset
> or ra.regnum.  Let me explore that, provided there would be any acceptance
> for unwind user backchain at all. Note that Power is using backchain as
> well, so they may want to build on that as well.
> 
> > Also, if distros aren't even compiling with -mbackchain, I wonder if we
> > can just not do this altogether :-)
> 
> My original intent was to use unwind user's for_each_user_frame() to
> replace the exiting stack tracing logic in arch_stack_walk_user_common()
> in arch/s390/kernel/stacktrace.c, which currently supports backchain.
> Given that for_each_user_frame() was made private in the latest unwind
> user series version hinders me.  The use was also low, because the
> currentl arch_stack_walk_user_common() implementation does not support
> page faults, so that the attempt to use unwind user sframe would always
> fail and fallback to unwind user backchain.  My hope was that somebody
> with more Kernel skills could give me a few hints at how it could be
> made to support deferred unwind. :-)

I believe stack_trace_save_user() is only used by ftrace, and that will
no longer be needed once ftrace starts using unwind_user.

Maybe Heiko knows if that backchain user stacktrace code has any users?

If distros aren't building with -mbackchain, maybe backchain support can
be considered obsoleted by sframe, and we can get away with not
implementing it.

-- 
Josh

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 07/16] unwind_user: Enable archs that do not necessarily save RA
  2025-07-17 11:09     ` Jens Remus
@ 2025-07-18  8:28       ` Jens Remus
  2025-07-18 16:59         ` Josh Poimboeuf
  0 siblings, 1 reply; 42+ messages in thread
From: Jens Remus @ 2025-07-18  8:28 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On 17.07.2025 13:09, Jens Remus wrote:
> On 17.07.2025 01:01, Josh Poimboeuf wrote:
>> On Thu, Jul 10, 2025 at 06:35:13PM +0200, Jens Remus wrote:
>>> +++ b/arch/Kconfig
>>> @@ -450,6 +450,11 @@ config HAVE_UNWIND_USER_SFRAME
>>>  	bool
>>>  	select UNWIND_USER
>>>  
>>> +config HAVE_USER_RA_REG
>>> +	bool
>>> +	help
>>> +	  The arch passes the return address (RA) in user space in a register.
>>
>> How about "HAVE_UNWIND_USER_RA_REG" so it matches the existing
>> namespace?
> 
> Ok.  I am open to any improvements.

Thinking about this again I realized that the config option actually
serves two purposes:

1. Enable code (e.g. unwind user) to determine the presence of the new
   user_return_address().  That is where I derived the name from.
2. Enable unwind user (sframe) to behave differently, if an architecture
   has/uses a RA register (unlike x86, which solely uses the stack).

I think the primary notion is that an architecture has/uses a register
for the return address and thus provides user_return_address().  What
consumers such as unwind user do with that info is secondary.

Thoughts?

Thanks and regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
+49-7031-16-1128 Office
jremus@de.ibm.com

IBM

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Böblingen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 07/16] unwind_user: Enable archs that do not necessarily save RA
  2025-07-18  8:28       ` Jens Remus
@ 2025-07-18 16:59         ` Josh Poimboeuf
  2025-07-21 14:25           ` Jens Remus
  0 siblings, 1 reply; 42+ messages in thread
From: Josh Poimboeuf @ 2025-07-18 16:59 UTC (permalink / raw)
  To: Jens Remus
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On Fri, Jul 18, 2025 at 10:28:32AM +0200, Jens Remus wrote:
> On 17.07.2025 13:09, Jens Remus wrote:
> > On 17.07.2025 01:01, Josh Poimboeuf wrote:
> >> On Thu, Jul 10, 2025 at 06:35:13PM +0200, Jens Remus wrote:
> >>> +++ b/arch/Kconfig
> >>> @@ -450,6 +450,11 @@ config HAVE_UNWIND_USER_SFRAME
> >>>  	bool
> >>>  	select UNWIND_USER
> >>>  
> >>> +config HAVE_USER_RA_REG
> >>> +	bool
> >>> +	help
> >>> +	  The arch passes the return address (RA) in user space in a register.
> >>
> >> How about "HAVE_UNWIND_USER_RA_REG" so it matches the existing
> >> namespace?
> > 
> > Ok.  I am open to any improvements.
> 
> Thinking about this again I realized that the config option actually
> serves two purposes:
> 
> 1. Enable code (e.g. unwind user) to determine the presence of the new
>    user_return_address().  That is where I derived the name from.
> 2. Enable unwind user (sframe) to behave differently, if an architecture
>    has/uses a RA register (unlike x86, which solely uses the stack).

The sframe CONFIG_HAVE_USER_RA_REG check is redundant with the
unwind_user one, no?  I'm thinking it's better for sframe to just decode
the entry as it is, and then let unwind_user validate things.

> I think the primary notion is that an architecture has/uses a register
> for the return address and thus provides user_return_address().  What
> consumers such as unwind user do with that info is secondary.
> 
> Thoughts?

user_return_address() only has the single user, and is not all that
generically useful anyway (e.g., it warns on x86), so let's keep it
encapsulated in include/linux/unwind_user.h and give it the
"unwind_user" prefix.

Also, "RA_REG" is a bit ambiguous, it sounds almost like that other
option which spills RA to another register.  Conceptually, it's a link
register, so can we rename that to CONFIG_HAVE_UNWIND_USER_LINK_REG and
unwind_user_get_link_reg() or so?

Similarly, CONFIG_HAVE_UNWIND_USER_LOC_REG isn't that descriptive, how
about CONFIG_HAVE_UNWIND_USER_LINK_REG_SPILL?

Also we can get rid of the '#define func_name func_name' things and just
guard those functions with their corresponding CONFIG options in
inclide/linux/unwind_user.h.

Also those two functions should have similar naming and prototypes.

For example, in include/linux/unwind_user.h:

#ifndef CONFIG_HAVE_UNWIND_USER_LINK_REG
int unwind_user_get_link_reg(unsigned long *val)
{
	WARN_ON_ONCE(1);
	return -EINVAL;
}
#endif

#ifndef CONFIG_HAVE_UNWIND_USER_LINK_REG_SPILL
int unwind_user_get_reg(unsigned long *val, unsigned int regnum)
{
	WARN_ON_ONCE(1);
	return -EINVAL;
}
#endif

Then the code can be simplified (assuming no topmost checks):

	/* Get the Return Address (RA) */
	switch (frame->ra.loc) {
	case UNWIND_USER_LOC_NONE:
		if (unwind_user_get_link_reg(&ra))
			goto done;
		break;
	...
	case UNWIND_USER_LOC_REG:
		if (unwind_user_get_reg(&ra, frame->ra.regnum))
			goto done;
		break;
	...

-- 
Josh

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 07/16] unwind_user: Enable archs that do not necessarily save RA
  2025-07-18 16:59         ` Josh Poimboeuf
@ 2025-07-21 14:25           ` Jens Remus
  0 siblings, 0 replies; 42+ messages in thread
From: Jens Remus @ 2025-07-21 14:25 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Heiko Carstens, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

Hello Josh,

thank you very much for your valuable feedback!

On 18.07.2025 18:59, Josh Poimboeuf wrote:
> On Fri, Jul 18, 2025 at 10:28:32AM +0200, Jens Remus wrote:
>> On 17.07.2025 13:09, Jens Remus wrote:
>>> On 17.07.2025 01:01, Josh Poimboeuf wrote:
>>>> On Thu, Jul 10, 2025 at 06:35:13PM +0200, Jens Remus wrote:
>>>>> +++ b/arch/Kconfig
>>>>> @@ -450,6 +450,11 @@ config HAVE_UNWIND_USER_SFRAME
>>>>>  	bool
>>>>>  	select UNWIND_USER
>>>>>  
>>>>> +config HAVE_USER_RA_REG
>>>>> +	bool
>>>>> +	help
>>>>> +	  The arch passes the return address (RA) in user space in a register.
>>>>
>>>> How about "HAVE_UNWIND_USER_RA_REG" so it matches the existing
>>>> namespace?
>>>
>>> Ok.  I am open to any improvements.
>>
>> Thinking about this again I realized that the config option actually
>> serves two purposes:
>>
>> 1. Enable code (e.g. unwind user) to determine the presence of the new
>>    user_return_address().  That is where I derived the name from.
>> 2. Enable unwind user (sframe) to behave differently, if an architecture
>>    has/uses a RA register (unlike x86, which solely uses the stack).
> 
> The sframe CONFIG_HAVE_USER_RA_REG check is redundant with the
> unwind_user one, no?  I'm thinking it's better for sframe to just decode
> the entry as it is, and then let unwind_user validate things.

Ok.  Makes sense.

>> I think the primary notion is that an architecture has/uses a register
>> for the return address and thus provides user_return_address().  What
>> consumers such as unwind user do with that info is secondary.
>>
>> Thoughts?
> 
> user_return_address() only has the single user, and is not all that
> generically useful anyway (e.g., it warns on x86), so let's keep it
> encapsulated in include/linux/unwind_user.h and give it the
> "unwind_user" prefix.

Ok.  I was hesitating to add it to ptrace.h.  Given it falls into the
same category as user_stack_pointer() and frame_pointer() I was
astonished it did not yet exist.  But given unwind user would be the
single user I agree that it is better to do as you suggest.

> Also, "RA_REG" is a bit ambiguous, it sounds almost like that other
> option which spills RA to another register.  Conceptually, it's a link
> register, so can we rename that to CONFIG_HAVE_UNWIND_USER_LINK_REG and
> unwind_user_get_link_reg() or so?

The terminology in DWARF and SFrame is return address (RA) register.
AArch64 uses link register (LR). s390 uses RA register. x86 uses return
address.  While I am open to use your suggestion, I wonder what others
would prefer.

In fact the whole option is not required, if using '#define fname fname'
instead (that you suggest not to down below).  user_unwind.c would then
contain the following as default for architectures that do not define
their custom implementation (or link_reg instead of ra_reg):

#ifndef unwind_user_get_ra_reg
int unwind_user_get_ra_reg(unsigned long *val)
{
	WARN_ON_ONCE(1);
	return -EINVAL;
}
#define unwind_user_get_ra_reg unwind_user_get_ra_reg
#endif

The commit subject should better be changed to one of the following
(with the commit message reworded accordingly):

unwind_user: Enable archs that have a RA register
  or
unwind_user: Enable archs that have a link register

> Similarly, CONFIG_HAVE_UNWIND_USER_LOC_REG isn't that descriptive, how
> about CONFIG_HAVE_UNWIND_USER_LINK_REG_SPILL?

Are you perhaps mixing up two of the new config options introduced by
this and the subsequent patch?

HAVE_USER_RA_REG (introduced by this patch): Indicates the architecture
has/uses a RA/LR register.  I would be fine to rename this to
"CONFIG_HAVE_UNWIND_USER_LINK_REG" (as you suggest), provided "link
register" is the terminology we can all agree on, although DWARF and
SFrame use "return address register".  Otherwise
"CONFIG_HAVE_UNWIND_USER_RA_REG" or even omit, if there is no need for
the option at all.

CONFIG_HAVE_UNWIND_USER_LOC_REG (introduced by the subsequent patch):
Indicates that the architecture may save registers in other registers.
Those support UNWIND_USER_LOC_REG (location register) in addition to
UNWIND_USER_LOC_STACK (location stack).  Note that this applies to
both FP and RA.

> Also we can get rid of the '#define func_name func_name' things and just
> guard those functions with their corresponding CONFIG options in
> inclide/linux/unwind_user.h.
> 
> Also those two functions should have similar naming and prototypes.
> 
> For example, in include/linux/unwind_user.h:
> 
> #ifndef CONFIG_HAVE_UNWIND_USER_LINK_REG
> int unwind_user_get_link_reg(unsigned long *val)
> {
> 	WARN_ON_ONCE(1);
> 	return -EINVAL;
> }
> #endif
> 
> #ifndef CONFIG_HAVE_UNWIND_USER_LINK_REG_SPILL
> int unwind_user_get_reg(unsigned long *val, unsigned int regnum)
> {
> 	WARN_ON_ONCE(1);
> 	return -EINVAL;
> }
> #endif

Is a config option actually required (see above)?  The reason I added
them was to perform checks, that you suggest to omit.  Your below code
suggestion would no longer required the config options at all.  What
is preferable?  The config option would ensure a build error if an
architecture enables the option without providing the arch-specific
implementations.

> Then the code can be simplified (assuming no topmost checks):
> 
> 	/* Get the Return Address (RA) */
> 	switch (frame->ra.loc) {
> 	case UNWIND_USER_LOC_NONE:
> 		if (unwind_user_get_link_reg(&ra))
> 			goto done;
> 		break;
> 	...
> 	case UNWIND_USER_LOC_REG:
> 		if (unwind_user_get_reg(&ra, frame->ra.regnum))
> 			goto done;
> 		break;
> 	...

Thanks and regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
+49-7031-16-1128 Office
jremus@de.ibm.com

IBM

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Böblingen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 12/16] unwind_user/backchain: Introduce back chain user space unwinding
  2025-07-18  5:19       ` Josh Poimboeuf
@ 2025-08-01 12:36         ` Heiko Carstens
  2025-08-01 15:49           ` Jens Remus
  0 siblings, 1 reply; 42+ messages in thread
From: Heiko Carstens @ 2025-08-01 12:36 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Jens Remus, linux-kernel, linux-trace-kernel, bpf, x86,
	Steven Rostedt, Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On Thu, Jul 17, 2025 at 10:19:54PM -0700, Josh Poimboeuf wrote:
> On Thu, Jul 17, 2025 at 02:20:12PM +0200, Jens Remus wrote:
> > > Also, if distros aren't even compiling with -mbackchain, I wonder if we
> > > can just not do this altogether :-)
> > 
> > My original intent was to use unwind user's for_each_user_frame() to
> > replace the exiting stack tracing logic in arch_stack_walk_user_common()
> > in arch/s390/kernel/stacktrace.c, which currently supports backchain.
> > Given that for_each_user_frame() was made private in the latest unwind
> > user series version hinders me.  The use was also low, because the
> > currentl arch_stack_walk_user_common() implementation does not support
> > page faults, so that the attempt to use unwind user sframe would always
> > fail and fallback to unwind user backchain.  My hope was that somebody
> > with more Kernel skills could give me a few hints at how it could be
> > made to support deferred unwind. :-)
> 
> I believe stack_trace_save_user() is only used by ftrace, and that will
> no longer be needed once ftrace starts using unwind_user.
> 
> Maybe Heiko knows if that backchain user stacktrace code has any users?
> 
> If distros aren't building with -mbackchain, maybe backchain support can
> be considered obsoleted by sframe, and we can get away with not
> implementing it.

I guess that's a valid option. I know only of some special cases where
users compile everything on their own with -mbackchain to make this
work on a per-case basis. It shouldn't cause to much pain for them to
switch to sframe, as soon as that is available.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 11/16] s390/unwind_user/sframe: Enable HAVE_UNWIND_USER_SFRAME
  2025-07-10 16:35 ` [RFC PATCH v1 11/16] s390/unwind_user/sframe: Enable HAVE_UNWIND_USER_SFRAME Jens Remus
@ 2025-08-01 12:53   ` Heiko Carstens
  2025-08-01 15:46     ` Jens Remus
  0 siblings, 1 reply; 42+ messages in thread
From: Heiko Carstens @ 2025-08-01 12:53 UTC (permalink / raw)
  To: Jens Remus
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Josh Poimboeuf, Peter Zijlstra, Ingo Molnar,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Andrii Nakryiko,
	Indu Bhagat, Jose E. Marchesi, Beau Belgrave, Linus Torvalds,
	Andrew Morton, Jens Axboe, Florian Weimer, Sam James

On Thu, Jul 10, 2025 at 06:35:17PM +0200, Jens Remus wrote:
> Add s390 support for unwinding of user space using SFrame.  This
> leverages the previous commits to address the following s390
> particularities:
> 
> - The CFA is defined as the value of the stack pointer (SP) at call
>   site in the previous frame + 160.  Therefore the SP unwinds as
>   SP = CFA - 160.  Therefore use a SP value offset from CFA of -160.
> 
> - The return address (RA) is not saved on the stack at function entry.
>   It is also not saved in the function prologue, when in leaf functions.
>   Therefore the RA does not necessarily need to be unwound in the first
>   unwinding step for the topmost frame.
> 
> - The frame pointer (FP) and/or return address (RA) may be saved in
>   other registers when in leaf functions.  GCC effectively uses
>   floating-point registers (FPR) for this purpose.  Therefore DWARF
>   register numbers may be encoded in the SFrame FP/RA offsets.

...

> +static inline void __s390_get_dwarf_fpr(unsigned long *val, int regnum)
> +{
> +	switch (regnum) {
> +	case 16:
> +		fpu_std(0, (freg_t *)val);
> +		break;

...

> +static inline int s390_unwind_user_get_reg(unsigned long *val, int regnum)
> +{
> +	if (0 <= regnum && regnum <= 15) {
> +		struct pt_regs *regs = task_pt_regs(current);
> +		*val = regs->gprs[regnum];
> +	} else if (16 <= regnum && regnum <= 31) {
> +		__s390_get_dwarf_fpr(val, regnum);

This won't work with other potential in-kernel fpu users. User space fpr
contents may have been written to the current task's fpu save area and fprs
may have been clobbered by in-kernel users; so you need to get register
contents from the correct location. See arch/s390/include/asm/fpu.h.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 11/16] s390/unwind_user/sframe: Enable HAVE_UNWIND_USER_SFRAME
  2025-08-01 12:53   ` Heiko Carstens
@ 2025-08-01 15:46     ` Jens Remus
  0 siblings, 0 replies; 42+ messages in thread
From: Jens Remus @ 2025-08-01 15:46 UTC (permalink / raw)
  To: Heiko Carstens
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Josh Poimboeuf, Peter Zijlstra, Ingo Molnar,
	Jiri Olsa, Namhyung Kim, Thomas Gleixner, Andrii Nakryiko,
	Indu Bhagat, Jose E. Marchesi, Beau Belgrave, Linus Torvalds,
	Andrew Morton, Jens Axboe, Florian Weimer, Sam James

On 8/1/2025 2:53 PM, Heiko Carstens wrote:
> On Thu, Jul 10, 2025 at 06:35:17PM +0200, Jens Remus wrote:

>> +static inline void __s390_get_dwarf_fpr(unsigned long *val, int regnum)
>> +{
>> +	switch (regnum) {
>> +	case 16:
>> +		fpu_std(0, (freg_t *)val);
>> +		break;
> 
> ...
> 
>> +static inline int s390_unwind_user_get_reg(unsigned long *val, int regnum)
>> +{
>> +	if (0 <= regnum && regnum <= 15) {
>> +		struct pt_regs *regs = task_pt_regs(current);
>> +		*val = regs->gprs[regnum];
>> +	} else if (16 <= regnum && regnum <= 31) {
>> +		__s390_get_dwarf_fpr(val, regnum);
> 
> This won't work with other potential in-kernel fpu users. User space fpr
> contents may have been written to the current task's fpu save area and fprs
> may have been clobbered by in-kernel users; so you need to get register
> contents from the correct location. See arch/s390/include/asm/fpu.h.

Thanks!  Will implement all the review feedback and send a RFC V2 once I
am back from vacation.  Will be away from keyboard for a few weeks.

Regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
+49-7031-16-1128 Office
jremus@de.ibm.com

IBM

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Böblingen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFC PATCH v1 12/16] unwind_user/backchain: Introduce back chain user space unwinding
  2025-08-01 12:36         ` Heiko Carstens
@ 2025-08-01 15:49           ` Jens Remus
  0 siblings, 0 replies; 42+ messages in thread
From: Jens Remus @ 2025-08-01 15:49 UTC (permalink / raw)
  To: Heiko Carstens, Josh Poimboeuf
  Cc: linux-kernel, linux-trace-kernel, bpf, x86, Steven Rostedt,
	Vasily Gorbik, Ilya Leoshkevich, Masami Hiramatsu,
	Mathieu Desnoyers, Peter Zijlstra, Ingo Molnar, Jiri Olsa,
	Namhyung Kim, Thomas Gleixner, Andrii Nakryiko, Indu Bhagat,
	Jose E. Marchesi, Beau Belgrave, Linus Torvalds, Andrew Morton,
	Jens Axboe, Florian Weimer, Sam James

On 8/1/2025 2:36 PM, Heiko Carstens wrote:
> On Thu, Jul 17, 2025 at 10:19:54PM -0700, Josh Poimboeuf wrote:
>> On Thu, Jul 17, 2025 at 02:20:12PM +0200, Jens Remus wrote:

>> I believe stack_trace_save_user() is only used by ftrace, and that will
>> no longer be needed once ftrace starts using unwind_user.
>>
>> Maybe Heiko knows if that backchain user stacktrace code has any users?
>>
>> If distros aren't building with -mbackchain, maybe backchain support can
>> be considered obsoleted by sframe, and we can get away with not
>> implementing it.
> 
> I guess that's a valid option. I know only of some special cases where
> users compile everything on their own with -mbackchain to make this
> work on a per-case basis. It shouldn't cause to much pain for them to
> switch to sframe, as soon as that is available.

Let me have another stab at a cleaner implementation.  Shouldn't be too
much effort.  We can then decide whether to leave it out or not.

Will be away from keyboard for a few weeks.

Regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
+49-7031-16-1128 Office
jremus@de.ibm.com

IBM

IBM Deutschland Research & Development GmbH; Vorsitzender des Aufsichtsrats: Wolfgang Wendt; Geschäftsführung: David Faller; Sitz der Gesellschaft: Böblingen; Registergericht: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/


^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2025-08-01 15:49 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-10 16:35 [RFC PATCH v1 00/16] s390: SFrame user space unwinding Jens Remus
2025-07-10 16:35 ` [RFC PATCH v1 01/16] fixup! unwind_user: Add frame pointer support Jens Remus
2025-07-10 16:35 ` [RFC PATCH v1 02/16] s390: asm/dwarf.h should only be included in assembly files Jens Remus
2025-07-10 16:35 ` [RFC PATCH v1 03/16] s390/vdso: Avoid emitting DWARF CFI for non-vDSO Jens Remus
2025-07-10 16:35 ` [RFC PATCH v1 04/16] s390/vdso: Enable SFrame generation in vDSO Jens Remus
2025-07-10 16:35 ` [RFC PATCH v1 05/16] s390/vdso: Keep function symbols " Jens Remus
2025-07-10 16:35 ` [RFC PATCH v1 06/16] unwind_user: Enable archs that define CFA = SP_callsite + offset Jens Remus
2025-07-16 21:32   ` Josh Poimboeuf
2025-07-17  9:27     ` Jens Remus
2025-07-18  4:51       ` Josh Poimboeuf
2025-07-10 16:35 ` [RFC PATCH v1 07/16] unwind_user: Enable archs that do not necessarily save RA Jens Remus
2025-07-16 23:01   ` Josh Poimboeuf
2025-07-17 11:09     ` Jens Remus
2025-07-18  8:28       ` Jens Remus
2025-07-18 16:59         ` Josh Poimboeuf
2025-07-21 14:25           ` Jens Remus
2025-07-10 16:35 ` [RFC PATCH v1 08/16] unwind_user: Enable archs that save RA/FP in other registers Jens Remus
2025-07-17  2:01   ` Josh Poimboeuf
2025-07-17  2:50     ` Josh Poimboeuf
2025-07-17 12:07       ` Jens Remus
2025-07-18  4:52         ` Josh Poimboeuf
2025-07-17  3:57     ` Steven Rostedt
2025-07-17  7:24       ` Josh Poimboeuf
2025-07-17 12:05         ` Steven Rostedt
2025-07-17 11:28     ` Jens Remus
2025-07-17 12:10       ` Steven Rostedt
2025-07-18  4:51       ` Josh Poimboeuf
2025-07-10 16:35 ` [RFC PATCH v1 09/16] unwind_user/sframe: Enable archs with encoded SFrame CFA offsets Jens Remus
2025-07-10 16:35 ` [RFC PATCH v1 10/16] s390/ptrace: Enable HAVE_USER_RA_REG Jens Remus
2025-07-10 16:35 ` [RFC PATCH v1 11/16] s390/unwind_user/sframe: Enable HAVE_UNWIND_USER_SFRAME Jens Remus
2025-08-01 12:53   ` Heiko Carstens
2025-08-01 15:46     ` Jens Remus
2025-07-10 16:35 ` [RFC PATCH v1 12/16] unwind_user/backchain: Introduce back chain user space unwinding Jens Remus
2025-07-17  2:06   ` Josh Poimboeuf
2025-07-17 12:20     ` Jens Remus
2025-07-18  5:19       ` Josh Poimboeuf
2025-08-01 12:36         ` Heiko Carstens
2025-08-01 15:49           ` Jens Remus
2025-07-10 16:35 ` [RFC PATCH v1 13/16] s390/unwind_user/backchain: Enable HAVE_UNWIND_USER_BACKCHAIN Jens Remus
2025-07-10 16:35 ` [RFC PATCH v1 14/16] PREREQ: x86/asm: Avoid emitting DWARF CFI for non-VDSO Jens Remus
2025-07-10 16:35 ` [RFC PATCH v1 15/16] PREREQ: x86/vdso: Enable sframe generation in VDSO Jens Remus
2025-07-10 16:35 ` [RFC PATCH v1 16/16] WIP: fixup! s390/unwind_user/sframe: Enable HAVE_UNWIND_USER_SFRAME Jens Remus

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).