Linux-ARM-Kernel Archive on lore.kernel.org


* [PATCH RFC 2/4] clk: rockchip: pll: use round-nearest in determine_rate
From: Alexey Charkov @ 2026-04-17 15:11 UTC (permalink / raw)
  To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
	Michael Turquette, Stephen Boyd
  Cc: Pavel Zhovner, Sebastian Reichel, Andy Yan, devicetree,
	linux-arm-kernel, linux-rockchip, linux-kernel, linux-clk,
	Alexey Charkov
In-Reply-To: <20260417-rk3576-dclk-v1-0-26a9d0dcb2de@flipper.net>

rockchip_pll_determine_rate() walks the rate table in descending order
and picks the first entry <= the requested rate. This floor-rounding
interacts poorly with consumers that use CLK_SET_RATE_PARENT: a divider
iterating candidates asks the PLL for rate*div, and a tiny undershoot
causes the PLL to snap to a much lower entry.

For example, requesting 1991.04 MHz (248.88 MHz * 8) causes the PLL to
return 1968 MHz instead of 1992 MHz — a 24 MHz table gap that produces
a 1.2% pixel clock error when divided back down.

Change to round-to-nearest: for each table entry compute the absolute
distance from the request, and pick the entry with the smallest delta.
The CCF's divider and composite logic handle over/undershoot preferences
via their own ROUND_CLOSEST flags.
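
To illustrate the difference outside the kernel, here is a minimal
user-space sketch. The three-entry table and the helper names are
hypothetical and are not taken from clk-pll.c; only the request value
comes from the example above.

```c
#include <stddef.h>

/* Hypothetical rate table in Hz, descending like the driver's tables. */
static const unsigned long rates[] = {
	1992000000UL, 1968000000UL, 1944000000UL,
};
#define N_RATES (sizeof(rates) / sizeof(rates[0]))

/* Absolute distance without relying on signed casts. */
static unsigned long dist(unsigned long a, unsigned long b)
{
	return a > b ? a - b : b - a;
}

/* Old behaviour: first entry <= request, else the smallest entry. */
static unsigned long round_floor(unsigned long req)
{
	size_t i;

	for (i = 0; i < N_RATES; i++)
		if (req >= rates[i])
			return rates[i];
	return rates[N_RATES - 1];
}

/* New behaviour: entry with the smallest absolute distance. */
static unsigned long round_nearest(unsigned long req)
{
	unsigned long best = rates[0];
	size_t i;

	for (i = 1; i < N_RATES; i++)
		if (dist(req, rates[i]) < dist(req, best))
			best = rates[i];
	return best;
}
```

With a request of 1991040000 Hz (248.88 MHz * 8), round_floor() returns
1968000000 while round_nearest() returns 1992000000.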

Signed-off-by: Alexey Charkov <alchark@flipper.net>
---
 drivers/clk/rockchip/clk-pll.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/clk/rockchip/clk-pll.c b/drivers/clk/rockchip/clk-pll.c
index 6b853800cb6b..c142f2c4fd99 100644
--- a/drivers/clk/rockchip/clk-pll.c
+++ b/drivers/clk/rockchip/clk-pll.c
@@ -66,19 +66,19 @@ static int rockchip_pll_determine_rate(struct clk_hw *hw,
 {
 	struct rockchip_clk_pll *pll = to_rockchip_clk_pll(hw);
 	const struct rockchip_pll_rate_table *rate_table = pll->rate_table;
+	unsigned long best = 0;
 	int i;

-	/* Assuming rate_table is in descending order */
 	for (i = 0; i < pll->rate_count; i++) {
-		if (req->rate >= rate_table[i].rate) {
-			req->rate = rate_table[i].rate;
-
-			return 0;
-		}
+		if (abs((long)req->rate - (long)rate_table[i].rate) <
+		    abs((long)req->rate - (long)best))
+			best = rate_table[i].rate;
 	}

-	/* return minimum supported value */
-	req->rate = rate_table[i - 1].rate;
+	if (best)
+		req->rate = best;
+	else
+		req->rate = rate_table[pll->rate_count - 1].rate;

 	return 0;
 }

-- 
2.52.0


* [PATCH RFC 1/4] arm64: dts: rockchip: rk3576: assign dclk_vp1_src to VPLL
From: Alexey Charkov @ 2026-04-17 15:11 UTC (permalink / raw)
  To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
	Michael Turquette, Stephen Boyd
  Cc: Pavel Zhovner, Sebastian Reichel, Andy Yan, devicetree,
	linux-arm-kernel, linux-rockchip, linux-kernel, linux-clk,
	Alexey Charkov
In-Reply-To: <20260417-rk3576-dclk-v1-0-26a9d0dcb2de@flipper.net>

Reparent dclk_vp1_src from GPLL to VPLL at the SoC level. VPLL is a
programmable PLL with no other consumers, allowing the CRU to synthesize
accurate pixel clocks for VP1's output with arbitrary display modes.

Signed-off-by: Alexey Charkov <alchark@flipper.net>
---
 arch/arm64/boot/dts/rockchip/rk3576.dtsi | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3576.dtsi b/arch/arm64/boot/dts/rockchip/rk3576.dtsi
index e12a2a0cfb89..2b05900c6c1c 100644
--- a/arch/arm64/boot/dts/rockchip/rk3576.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3576.dtsi
@@ -1338,6 +1338,8 @@ vop: vop@27d00000 {
 				      "dclk_vp1",
 				      "dclk_vp2",
 				      "pll_hdmiphy0";
+			assigned-clocks = <&cru DCLK_VP1_SRC>;
+			assigned-clock-parents = <&cru PLL_VPLL>;
 			iommus = <&vop_mmu>;
 			power-domains = <&power RK3576_PD_VOP>;
 			rockchip,grf = <&sys_grf>;

-- 
2.52.0




* [PATCH RFC 0/4] arm64: rockchip: The hunt for exact pixel clocks on RK3576
From: Alexey Charkov @ 2026-04-17 15:11 UTC (permalink / raw)
  To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
	Michael Turquette, Stephen Boyd
  Cc: Pavel Zhovner, Sebastian Reichel, Andy Yan, devicetree,
	linux-arm-kernel, linux-rockchip, linux-kernel, linux-clk,
	Alexey Charkov

Dear all,

I need the help of the collective wisdom of the community.

The problem I'm trying to solve is reliably obtaining the exact pixel
clock for arbitrary display modes supported by the RK3576 SoC.

Rockchip RK3576 has three display output processors VP0~VP2, each
supporting different ranges of display modes, roughly as follows:
- VP0: 4K 120Hz
- VP1: 2.5k 60Hz
- VP2: 1080p 60Hz

Each one obviously needs a pixel clock. The required frequencies for the
pixel clocks vary greatly depending on the display mode, and need to be
matched within a tight tolerance, or else many displays will refuse to
work. E.g. the preferred (maximum) display mode out of VP1 is particularly
awkward, because it requires a pixel clock of 248.88 MHz, which cannot
be obtained using integer dividers from its default clock source (GPLL
at 1188 MHz). The nearest approximation is 237.6 MHz (1188/5, about 4.5%
low), which is well outside the tolerance of e.g. the DP specification,
resulting in a blank screen on most displays by default.
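
The arithmetic can be checked with a small user-space sketch. The helper
names and the divider limit of 16 are assumptions made for illustration;
they are not taken from the RK3576 clock driver.

```c
/* Absolute difference of two rates in Hz. */
static unsigned long rate_err(unsigned long a, unsigned long b)
{
	return a > b ? a - b : b - a;
}

/*
 * Return the integer divider in 1..max_div whose output
 * parent_hz / div lies closest to target_hz.
 */
static unsigned int nearest_int_div(unsigned long parent_hz,
				    unsigned long target_hz,
				    unsigned int max_div)
{
	unsigned int div, best = 1;

	for (div = 2; div <= max_div; div++)
		if (rate_err(parent_hz / div, target_hz) <
		    rate_err(parent_hz / best, target_hz))
			best = div;
	return best;
}

/* Relative error of out_hz versus target_hz in parts per thousand. */
static unsigned long err_permille(unsigned long out_hz,
				  unsigned long target_hz)
{
	unsigned long long diff = rate_err(out_hz, target_hz);

	return (unsigned long)(diff * 1000ULL / target_hz);
}
```

For a 1188000000 Hz parent and a 248880000 Hz target, nearest_int_div()
picks 5, giving 237600000 Hz, and err_permille() reports 45, i.e. roughly
4.5% off.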

The clock sources are of course configurable, in particular there are muxes
connected to each VP for selecting the source of the pixel clock:
- Each VP can take the clock either from the (single!) HDMI PHY or from
  its dedicated dclk_vpX_src mux
- The dclk_vpX_src mux can select the clock from a number of system PLLs
  (GPLL, CPLL, VPLL, BPLL, LPLL)

While the system PLLs can be configured to output a wide range of
frequencies, they are shared between many system components. E.g. on the
current mainline kernel on one of my RK3576 boards I've got the following:
GPLL: 1188 MHz, enable count 20
CPLL: 1000 MHz, enable count 17
VPLL: 594 MHz, enable count 0 (yaay!)
BPLL, LPLL: 816 MHz, enable count 0 (but these last ones don't have
            predividers, so are less flexible)

So ultimately there is exactly one free fractional PLL (VPLL) which can be
used to generate arbitrary pixel clocks, but we have up to three consumers
trying to drive different display modes from it (e.g. HDMI on VP0, DP on
VP1 and MIPI DSI on VP2). We also want to be able to adjust the PLL output
frequency on the fly to satisfy the requirements of the selected display
mode.

And this is where I'm stuck. Trying to satisfy the requirements of up to
three consumers while changing the PLL frequency on the fly sounds like
a poorly tractable mathematical problem (is it 3-SAT?). We can take the
HDMI output out of the equation, because it can be driven from the HDMI
PHY (which is capable of arbitrary rates) instead of the mux, but that
makes the decision of which dclk source to use for a VP block dependent on
which downstream consumer is connected to it (HDMI vs. something else).
Even then we somehow need two devices to cooperate in picking a PLL
frequency that satisfies the requirements of both of them, and change to it
without display corruption. I'm not even sure if the CCF has mechanisms
for that?..

What follows is a brief set of patches which illustrate a partial solution
for the case of "I just need 2.5k60Hz on VP1 via DP and don't care about
the rest". It switches the VP1 unconditionally to use VPLL as the source
for its dclk mux, allows changing the VPLL frequency on the fly, and also
changes the frequency calculation logic to allow for nearest-match
frequencies which are not necessarily rounded down. These are not meant
to be merged as-is, as I see the following issues:
- The flag allowing the PLL to change rate is in the clock driver, while
  the reparenting to an unused PLL is in the device tree. If these go out
  of sync, we might end up trying to change the frequency of a PLL which
  is used by other consumers (I presume that could be dangerous)
- If VP0 happens to be driving DP output, it won't be able to produce the
  2560x1440@60Hz mode for the same reasons as VP1 - then it must also be
  reparented to VPLL and allowed to change its frequency on the fly

It does bring me from a state of "always blank screen on DP output until
the mode is switched to something magically working" to a state of
"most monitors work at the default preferred mode" though.

It is tempting to just reparent both VP0 and VP1 to VPLL and allow both of
them to change its frequency, while leaving VP2 on the default (fixed)
GPLL and relying on the fact that 148.5 MHz (the required frequency for
its maximum supported mode of 1920x1080@60Hz) is conveniently 1188/8 MHz -
just what GPLL can provide. Then also force whichever VP is driving HDMI
output to use the HDMI PHY as its clock source. But we still have the
problem of DT vs. driver coordination, and I'm not sure how to define
the policy for "if you've got HDMI connected, you must use the HDMI PHY
clock for the respective VP, whichever VP that is".

I would very much appreciate any thoughts on how to approach this.

Signed-off-by: Alexey Charkov <alchark@flipper.net>
---
Alexey Charkov (4):
      arm64: dts: rockchip: rk3576: assign dclk_vp1_src to VPLL
      clk: rockchip: pll: use round-nearest in determine_rate
      clk: rockchip: rk3576: allow dclk_vp1_src to propagate rate to parent PLL
      clk: rockchip: rk3576: add ROUND_CLOSEST to dclk_vp1_src divider

 arch/arm64/boot/dts/rockchip/rk3576.dtsi |  2 ++
 drivers/clk/rockchip/clk-pll.c           | 16 ++++++++--------
 drivers/clk/rockchip/clk-rk3576.c        |  4 ++--
 3 files changed, 12 insertions(+), 10 deletions(-)
---
base-commit: c7275b05bc428c7373d97aa2da02d3a7fa6b9f66
change-id: 20260417-rk3576-dclk-4c95bbb67581

Best regards,
-- 
Alexey Charkov <alchark@flipper.net>


* Re: [PATCH RFC 1/2] arm64: vdso: Prepare for robust futex unlock support
From: Florian Weimer @ 2026-04-17 15:08 UTC (permalink / raw)
  To: André Almeida
  Cc: Catalin Marinas, Will Deacon, Thomas Gleixner, Mark Rutland,
	Mathieu Desnoyers, Sebastian Andrzej Siewior, Carlos O'Donell,
	Peter Zijlstra, Rich Felker, Torvald Riegel, Darren Hart,
	Ingo Molnar, Davidlohr Bueso, Arnd Bergmann, Liam R . Howlett,
	Uros Bizjak, Thomas Weißschuh, linux-arm-kernel,
	linux-kernel, linux-arch, kernel-dev
In-Reply-To: <20260417-tonyk-robust_arm-v1-1-03aa64e2ff1a@igalia.com>

* André Almeida:

> There will be a VDSO function to unlock non-contended robust futexes in
> user space. The unlock sequence is racy vs. clearing the list_pending_op
> pointer in the task's robust list head. To plug this race the kernel needs
> to know the critical section window so it can clear the pointer when the
> task is interrupted within that race window. The window is determined by
> labels in the inline assembly.
>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
> RFC: Those symbols can't be found by the linker after patch 2/2, it fails with:
>
> ld: arch/arm64/kernel/vdso.o: in function `vdso_futex_robust_unlock_update_ips':
> arch/arm64/kernel/vdso.c:72:(.text+0x200): undefined reference to `__futex_list64_try_unlock_cs_success'
> ld: arch/arm64/kernel/vdso.o: relocation R_AARCH64_ADR_PREL_PG_HI21 against symbol `__futex_list64_try_unlock_cs_success' which may bind externally can not be used when making a shared object; recompile with -fPIC
> arch/arm64/kernel/vdso.c:72:(.text+0x200): dangerous relocation: unsupported relocation

I think your GLOBLS definition adds a 64 suffix.  That shouldn't be
necessary on AArch64.  It's not reflected in the references, so you end
up with an undefined symbol error.

Thanks,
Florian




* [PATCH v1 4/4] arm64/unwind_user/sframe: Enable sframe unwinding on arm64
From: Jens Remus @ 2026-04-17 15:08 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Steven Rostedt, Josh Poimboeuf,
	Indu Bhagat, Peter Zijlstra, Dylan Hatch, Weinan Liu
  Cc: Jens Remus, linux-arm-kernel, linux-kernel, Heiko Carstens,
	Ilya Leoshkevich
In-Reply-To: <20260417150827.1183376-1-jremus@linux.ibm.com>

Add arm64 support for unwinding of user space using SFrame.

This leverages the unwind user (sframe) support for s390 which
enables architectures that pass the return address in a register,
may not necessarily save the return address on the stack (for
instance in leaf functions), and have SP at call site equal
SP at entry.

For this purpose provide arm64-specific unwind_user_get_ra_reg() and
unwind_user_get_reg() implementations, which return the value of the
link register (LR) or an arbitrary register in the topmost user space
frame.  Define the arm64 SP and FP DWARF register numbers.

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---

Notes (jremus):
    Note:  An arm64 implementation of unwind_user_get_reg() is strictly
    only needed if SFrame V3 flexible FDEs were generated for aarch64,
    which is currently not the case with GNU Binutils 2.46.

 arch/arm64/Kconfig                          |  1 +
 arch/arm64/include/asm/unwind_user.h        | 23 +++++++++++++++++++++
 arch/arm64/include/asm/unwind_user_sframe.h |  8 +++++++
 3 files changed, 32 insertions(+)
 create mode 100644 arch/arm64/include/asm/unwind_user_sframe.h

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 994fd5162a1d..641a3a5fe5c9 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -254,6 +254,7 @@ config ARM64
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_UNWIND_USER_FP
+	select HAVE_UNWIND_USER_SFRAME
 	select HAVE_KPROBES
 	select HAVE_KRETPROBES
 	select HAVE_GENERIC_VDSO
diff --git a/arch/arm64/include/asm/unwind_user.h b/arch/arm64/include/asm/unwind_user.h
index 0641d4d97b0f..3c7fd8c4ba5b 100644
--- a/arch/arm64/include/asm/unwind_user.h
+++ b/arch/arm64/include/asm/unwind_user.h
@@ -4,6 +4,7 @@
 
 #include <linux/sched/task_stack.h>
 #include <linux/types.h>
+#include <asm/insn.h>
 
 #ifdef CONFIG_UNWIND_USER
 
@@ -16,6 +17,28 @@ static inline int unwind_user_word_size(struct pt_regs *regs)
 	return sizeof(long);
 }
 
+static inline int unwind_user_get_ra_reg(unsigned long *val)
+{
+	struct pt_regs *regs = task_pt_regs(current);
+	*val = regs->regs[AARCH64_INSN_REG_LR];
+	return 0;
+}
+#define unwind_user_get_ra_reg unwind_user_get_ra_reg
+
+static inline int unwind_user_get_reg(unsigned long *val, unsigned int regnum)
+{
+	const struct pt_regs *regs = task_pt_regs(current);
+
+	if (regnum <= 30)
+		/* DWARF register numbers 0..30 (x0..x30) */
+		*val = regs->regs[regnum];
+	else
+		return -EINVAL;
+
+	return 0;
+}
+#define unwind_user_get_reg unwind_user_get_reg
+
 #endif /* CONFIG_UNWIND_USER */
 
 #ifdef CONFIG_HAVE_UNWIND_USER_FP
diff --git a/arch/arm64/include/asm/unwind_user_sframe.h b/arch/arm64/include/asm/unwind_user_sframe.h
new file mode 100644
index 000000000000..65c0a6b6c835
--- /dev/null
+++ b/arch/arm64/include/asm/unwind_user_sframe.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_ARM64_UNWIND_USER_SFRAME_H
+#define _ASM_ARM64_UNWIND_USER_SFRAME_H
+
+#define SFRAME_REG_SP	31
+#define SFRAME_REG_FP	29
+
+#endif /* _ASM_ARM64_UNWIND_USER_SFRAME_H */
-- 
2.51.0




* [PATCH v1 3/4] arm64/vdso: Enable SFrame generation in vDSO
From: Jens Remus @ 2026-04-17 15:08 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Steven Rostedt, Josh Poimboeuf,
	Indu Bhagat, Peter Zijlstra, Dylan Hatch, Weinan Liu
  Cc: Jens Remus, linux-arm-kernel, linux-kernel, Heiko Carstens,
	Ilya Leoshkevich
In-Reply-To: <20260417150827.1183376-1-jremus@linux.ibm.com>

This replicates Josh's x86 patch "x86/vdso: Enable sframe generation
in VDSO" [1] for arm64.

Enable .sframe generation in the vDSO library so kernel and user space
can unwind through it.  Keep all function symbols in the vDSO .symtab
for stack trace purposes.  This enables perf to look up these function
symbols in addition to those already exported in vDSO .dynsym.

Starting with binutils 2.46 both GNU assembler and GNU linker
exclusively support generating and merging .sframe in SFrame V3 format.
For vDSO, only if supported by the assembler, generate .sframe, collect
it, mark it as KEEP, and generate a GNU_SFRAME program table entry.
Otherwise explicitly discard any .sframe.

[1]: x86/vdso: Enable sframe generation in VDSO,
     https://lore.kernel.org/all/20260211141357.271402-7-jremus@linux.ibm.com/

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---

Notes (jremus):
    @Dylan:  Adding -Wa,--gsframe-3 to the VDSO CC_FLAGS_ADD_VDSO (and
    AS_FLAGS_ADD_VDSO) may clash with your patch [1] that adds the same
    flag to CC_FLAGS_REMOVE_VDSO.  Any idea how to resolve this?
    
    [1]: [PATCH v3 2/8] arm64, unwind: build kernel with sframe V3 info,
         https://lore.kernel.org/all/20260406185000.1378082-3-dylanbhatch@google.com/

 arch/arm64/kernel/vdso/Makefile   | 14 ++++++++++++--
 arch/arm64/kernel/vdso/vdso.lds.S | 21 +++++++++++++++++++++
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile
index 7dec05dd33b7..1f2f01673397 100644
--- a/arch/arm64/kernel/vdso/Makefile
+++ b/arch/arm64/kernel/vdso/Makefile
@@ -15,6 +15,10 @@ obj-vdso := vgettimeofday.o note.o sigreturn.o vgetrandom.o vgetrandom-chacha.o
 targets := $(obj-vdso) vdso.so vdso.so.dbg
 obj-vdso := $(addprefix $(obj)/, $(obj-vdso))
 
+ifeq ($(CONFIG_AS_SFRAME3),y)
+  SFRAME_CFLAGS := -Wa,--gsframe-3
+endif
+
 btildflags-$(CONFIG_ARM64_BTI_KERNEL) += -z force-bti
 
 # -Bsymbolic has been added for consistency with arm, the compat vDSO and
@@ -41,7 +45,9 @@ CC_FLAGS_REMOVE_VDSO := $(CC_FLAGS_FTRACE) -Os $(CC_FLAGS_SCS) \
 			$(CC_FLAGS_LTO) $(CC_FLAGS_CFI) \
 			-Wmissing-prototypes -Wmissing-declarations
 
-CC_FLAGS_ADD_VDSO := -O2 -mcmodel=tiny -fasynchronous-unwind-tables
+CC_FLAGS_ADD_VDSO := -O2 -mcmodel=tiny -fasynchronous-unwind-tables $(SFRAME_CFLAGS)
+
+AS_FLAGS_ADD_VDSO := $(SFRAME_CFLAGS)
 
 CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_REMOVE_VDSO)
 CFLAGS_REMOVE_vgetrandom.o = $(CC_FLAGS_REMOVE_VDSO)
@@ -49,6 +55,10 @@ CFLAGS_REMOVE_vgetrandom.o = $(CC_FLAGS_REMOVE_VDSO)
 CFLAGS_vgettimeofday.o = $(CC_FLAGS_ADD_VDSO)
 CFLAGS_vgetrandom.o = $(CC_FLAGS_ADD_VDSO)
 
+AFLAGS_sigreturn.o = $(AS_FLAGS_ADD_VDSO)
+
+AFLAGS_vgetrandom-chacha.o = $(AS_FLAGS_ADD_VDSO)
+
 ifneq ($(c-gettimeofday-y),)
   CFLAGS_vgettimeofday.o += -include $(c-gettimeofday-y)
 endif
@@ -65,7 +75,7 @@ $(obj)/vdso.so.dbg: $(obj)/vdso.lds $(obj-vdso) FORCE
 	$(call if_changed,vdsold_and_vdso_check)
 
 # Strip rule for the .so file
-$(obj)/%.so: OBJCOPYFLAGS := -S
+$(obj)/%.so: OBJCOPYFLAGS := -g
 $(obj)/%.so: $(obj)/%.so.dbg FORCE
 	$(call if_changed,objcopy)
 
diff --git a/arch/arm64/kernel/vdso/vdso.lds.S b/arch/arm64/kernel/vdso/vdso.lds.S
index 52314be29191..527e107ca4b5 100644
--- a/arch/arm64/kernel/vdso/vdso.lds.S
+++ b/arch/arm64/kernel/vdso/vdso.lds.S
@@ -15,6 +15,8 @@
 #include <asm-generic/vmlinux.lds.h>
 #include <vdso/datapage.h>
 
+#define KEEP_SFRAME	IS_ENABLED(CONFIG_AS_SFRAME)
+
 OUTPUT_FORMAT("elf64-littleaarch64", "elf64-bigaarch64", "elf64-littleaarch64")
 OUTPUT_ARCH(aarch64)
 
@@ -68,6 +70,13 @@ SECTIONS
 		*(.igot .igot.plt)
 	}						:text
 
+#if KEEP_SFRAME
+	.sframe		: {
+		KEEP (*(.sframe))
+		*(.sframe.*)
+	}						:text	:sframe
+#endif
+
 	_end = .;
 	PROVIDE(end = .);
 
@@ -78,9 +87,18 @@ SECTIONS
 		*(.data .data.* .gnu.linkonce.d.* .sdata*)
 		*(.bss .sbss .dynbss .dynsbss)
 		*(.eh_frame .eh_frame_hdr)
+#if !KEEP_SFRAME
+		*(.sframe)
+		*(.sframe.*)
+#endif
 	}
 }
 
+/*
+ * Very old versions of ld do not recognize this name token; use the constant.
+ */
+#define PT_GNU_SFRAME	0x6474e554
+
 /*
  * We must supply the ELF program headers explicitly to get just one
  * PT_LOAD segment, and set the flags explicitly to make segments read-only.
@@ -90,6 +108,9 @@ PHDRS
 	text		PT_LOAD		FLAGS(5) FILEHDR PHDRS; /* PF_R|PF_X */
 	dynamic		PT_DYNAMIC	FLAGS(4);		/* PF_R */
 	note		PT_NOTE		FLAGS(4);		/* PF_R */
+#if KEEP_SFRAME
+	sframe		PT_GNU_SFRAME	FLAGS(4);		/* PF_R */
+#endif
 }
 
 /*
-- 
2.51.0




* [PATCH v1 0/4] arm64: SFrame user space unwinding
From: Jens Remus @ 2026-04-17 15:08 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Steven Rostedt, Josh Poimboeuf,
	Indu Bhagat, Peter Zijlstra, Dylan Hatch, Weinan Liu
  Cc: Jens Remus, linux-arm-kernel, linux-kernel, Heiko Carstens,
	Ilya Leoshkevich

This series adds arm64 support for unwinding of user space using SFrame V3.
It is based on Josh's, Steven's, and my work.


Prerequisites:

This series applies on top of the latest unwind user sframe series
"[PATCH v13 00/18] unwind_deferred: Implement sframe handling":
https://lore.kernel.org/all/20260127150554.2760964-1-jremus@linux.ibm.com/

For which Steven Rostedt kindly maintains a branch:

  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git  sframe/core

Like the above series, it requires binutils 2.46 to build
executables and libraries (e.g. vDSO) with SFrame V3 on aarch64
(using the assembler option --gsframe-3).

The unwind user sframe series depends on a Glibc patch from Josh that
adds support for the prctls introduced in the kernel:
https://lore.kernel.org/all/20250122023517.lmztuocecdjqzfhc@jpoimboe/
Note that Josh's Glibc patch needs to be adjusted for the updated prctl
numbers from "[PATCH v13 18/18] unwind_user/sframe: Add prctl() interface
for registering .sframe sections":
https://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git/diff/include/uapi/linux/prctl.h?h=sframe/core


Overview:

Patch 1 enables deferred FP-based unwinding of user space on arm64.
This is used by unwind user as fallback if SFrame is not available.

Patch 2 adds an unsafe_copy_from_user() implementation for arm64.
This is needed by unwind user sframe to access .sframe sections.

Patch 3 enables .sframe generation in vDSO on arm64.

Patch 4 enables deferred SFrame-based unwinding of user space on arm64.


Usage:

perf tools already support the deferred unwinding infrastructure by
using option "--call-graph fp,defer" (name subject to change):

  $ perf record -F 999 --call-graph fp,defer /path/to/executable
  $ perf script


Limitations:

Support for PAC is not yet implemented.  Note that SFrame V3 already
provides the required information though:

  SFRAME_V3_AARCH64_FDE_PAUTH_KEY(fde_info)
  SFRAME_V3_AARCH64_FRE_MANGLED_RA_P(fre_info)


Thanks and regards,
Jens

Jens Remus (4):
  arm64/unwind_user/fp: Enable HAVE_UNWIND_USER_FP
  arm64/uaccess: Add unsafe_copy_from_user() implementation
  arm64/vdso: Enable SFrame generation in vDSO
  arm64/unwind_user/sframe: Enable sframe unwinding on arm64

 arch/arm64/Kconfig                          |  2 +
 arch/arm64/include/asm/uaccess.h            | 39 +++++++++----
 arch/arm64/include/asm/unwind_user.h        | 65 +++++++++++++++++++++
 arch/arm64/include/asm/unwind_user_sframe.h |  8 +++
 arch/arm64/kernel/vdso/Makefile             | 14 ++++-
 arch/arm64/kernel/vdso/vdso.lds.S           | 21 +++++++
 6 files changed, 137 insertions(+), 12 deletions(-)
 create mode 100644 arch/arm64/include/asm/unwind_user.h
 create mode 100644 arch/arm64/include/asm/unwind_user_sframe.h

-- 
2.51.0




* [PATCH v1 1/4] arm64/unwind_user/fp: Enable HAVE_UNWIND_USER_FP
From: Jens Remus @ 2026-04-17 15:08 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Steven Rostedt, Josh Poimboeuf,
	Indu Bhagat, Peter Zijlstra, Dylan Hatch, Weinan Liu
  Cc: Jens Remus, linux-arm-kernel, linux-kernel, Heiko Carstens,
	Ilya Leoshkevich
In-Reply-To: <20260417150827.1183376-1-jremus@linux.ibm.com>

Add arm64 support for unwinding of user space using frame pointer (FP).

For this purpose enable the config option HAVE_UNWIND_USER_FP and
provide an arm64-specific ARCH_INIT_USER_FP_FRAME definition (specifying
the CFA offset from FP and the FP and RA offsets from CFA).  Unlike x86,
as there is no means to determine whether the user space IP in the
topmost frame is at function entry, rely on the common definition of
unwind_user_at_function_start(), which always returns false, and the
common dummy definition of ARCH_INIT_USER_FP_ENTRY_FRAME.

For unwind user in general provide an arm64-specific implementation
of unwind_user_word_size().
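
As a rough illustration of what these offsets mean, the following
user-space sketch performs one unwind step over a simulated stack. The
struct, the function name, and the array standing in for user memory are
hypothetical; only the offsets (CFA = FP + 2*ws, RA at CFA - ws, FP at
CFA - 2*ws) follow the description above.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_frame {
	uint64_t cfa;	/* canonical frame address */
	uint64_t fp;	/* caller's frame pointer */
	uint64_t ra;	/* return address */
};

/*
 * mem[] simulates stack memory starting at address base and holding
 * `words` 64-bit words. Returns 0 on success, -1 if fp does not point
 * at a complete frame record inside mem[].
 */
static int sketch_unwind_step(const uint64_t *mem, uint64_t base,
			      size_t words, uint64_t fp,
			      struct sketch_frame *out)
{
	const uint64_t ws = sizeof(uint64_t);
	uint64_t idx;

	if (fp < base || (fp - base) % ws)
		return -1;
	idx = (fp - base) / ws;
	if (idx + 1 >= words)
		return -1;

	out->cfa = fp + 2 * ws;		/* CFA = FP + 2*ws */
	out->fp = mem[idx];		/* *(CFA - 2*ws) */
	out->ra = mem[idx + 1];		/* *(CFA - 1*ws) */
	return 0;
}
```

Calling the helper repeatedly with the returned fp walks the chain of
frame records until the bounds check fails.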

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---
 arch/arm64/Kconfig                   |  1 +
 arch/arm64/include/asm/unwind_user.h | 42 ++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)
 create mode 100644 arch/arm64/include/asm/unwind_user.h

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 38dba5f7e4d2..994fd5162a1d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -253,6 +253,7 @@ config ARM64
 	select HAVE_RUST if RUSTC_SUPPORTS_ARM64
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
+	select HAVE_UNWIND_USER_FP
 	select HAVE_KPROBES
 	select HAVE_KRETPROBES
 	select HAVE_GENERIC_VDSO
diff --git a/arch/arm64/include/asm/unwind_user.h b/arch/arm64/include/asm/unwind_user.h
new file mode 100644
index 000000000000..0641d4d97b0f
--- /dev/null
+++ b/arch/arm64/include/asm/unwind_user.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_ARM64_UNWIND_USER_H
+#define _ASM_ARM64_UNWIND_USER_H
+
+#include <linux/sched/task_stack.h>
+#include <linux/types.h>
+
+#ifdef CONFIG_UNWIND_USER
+
+static inline int unwind_user_word_size(struct pt_regs *regs)
+{
+#ifdef CONFIG_COMPAT
+	if (compat_user_mode(regs))
+		return sizeof(int);
+#endif
+	return sizeof(long);
+}
+
+#endif /* CONFIG_UNWIND_USER */
+
+#ifdef CONFIG_HAVE_UNWIND_USER_FP
+
+#define ARCH_INIT_USER_FP_FRAME(ws)					\
+	.cfa		=  {						\
+		.rule		= UNWIND_USER_CFA_RULE_FP_OFFSET,	\
+		.offset		= 2*(ws),				\
+			},						\
+	.ra		= {						\
+		.rule		= UNWIND_USER_RULE_CFA_OFFSET_DEREF,	\
+		.offset		= -1*(ws),				\
+			},						\
+	.fp		= {						\
+		.rule		= UNWIND_USER_RULE_CFA_OFFSET_DEREF,	\
+		.offset		= -2*(ws),				\
+			},						\
+	.outermost	= false,
+
+#endif /* CONFIG_HAVE_UNWIND_USER_FP */
+
+#include <asm-generic/unwind_user.h>
+
+#endif /* _ASM_ARM64_UNWIND_USER_H */
-- 
2.51.0




* [PATCH v1 2/4] arm64/uaccess: Add unsafe_copy_from_user() implementation
From: Jens Remus @ 2026-04-17 15:08 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Steven Rostedt, Josh Poimboeuf,
	Indu Bhagat, Peter Zijlstra, Dylan Hatch, Weinan Liu
  Cc: Jens Remus, linux-arm-kernel, linux-kernel, Heiko Carstens,
	Ilya Leoshkevich
In-Reply-To: <20260417150827.1183376-1-jremus@linux.ibm.com>

This replicates Josh's x86 patch "x86/uaccess:
Add unsafe_copy_from_user() implementation" [1] for arm64.

Add an arm64 implementation of unsafe_copy_from_user() similar to the
existing unsafe_copy_to_user().

For this purpose rename the unsafe_copy_loop() helper to
unsafe_copy_to_user_loop() and introduce an unsafe_copy_from_user_loop()
helper.

While at it rename the unsafe_copy_to_user() local variables
__ucu_{dst|src|len} to __{dst|src|len} and change their pointer
type to void * to align to the x86 patch.
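
The chunking strategy shared by both helpers (copy with the widest type
while enough bytes remain, then fall through to narrower types) can be
sketched in user space as follows. This is only an analogue: memcpy()
stands in for unsafe_get_user()/unsafe_put_user(), there is no fault
label, and the names sketch_copy_loop and sketch_chunked_copy are made up
for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy type-sized chunks while they fit, advancing dst/src/len. */
#define sketch_copy_loop(dst, src, len, type)		\
	while ((len) >= sizeof(type)) {			\
		memcpy((dst), (src), sizeof(type));	\
		(dst) += sizeof(type);			\
		(src) += sizeof(type);			\
		(len) -= sizeof(type);			\
	}

static void sketch_chunked_copy(void *to, const void *from, size_t n)
{
	unsigned char *dst = to;
	const unsigned char *src = from;
	size_t len = n;

	/* Widest first; each loop leaves fewer than sizeof(type) bytes. */
	sketch_copy_loop(dst, src, len, uint64_t);
	sketch_copy_loop(dst, src, len, uint32_t);
	sketch_copy_loop(dst, src, len, uint16_t);
	sketch_copy_loop(dst, src, len, uint8_t);
}
```

A 15-byte copy, for instance, is split into one 8-byte, one 4-byte, one
2-byte, and one 1-byte access.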

[1]: x86/uaccess: Add unsafe_copy_from_user() implementation,
     https://lore.kernel.org/all/20251119132323.1281768-4-jremus@linux.ibm.com/

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---
 arch/arm64/include/asm/uaccess.h | 39 ++++++++++++++++++++++++--------
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 9810106a3f66..37d7d16b86a9 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -437,7 +437,7 @@ static inline void user_access_restore(unsigned long enabled) { }
  * We want the unsafe accessors to always be inlined and use
  * the error labels - thus the macro games.
  */
-#define unsafe_copy_loop(dst, src, len, type, label)				\
+#define unsafe_copy_to_user_loop(dst, src, len, type, label)			\
 	while (len >= sizeof(type)) {						\
 		unsafe_put_user(*(type *)(src),(type __user *)(dst),label);	\
 		dst += sizeof(type);						\
@@ -445,15 +445,34 @@ static inline void user_access_restore(unsigned long enabled) { }
 		len -= sizeof(type);						\
 	}
 
-#define unsafe_copy_to_user(_dst,_src,_len,label)			\
-do {									\
-	char __user *__ucu_dst = (_dst);				\
-	const char *__ucu_src = (_src);					\
-	size_t __ucu_len = (_len);					\
-	unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u64, label);	\
-	unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u32, label);	\
-	unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u16, label);	\
-	unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u8, label);	\
+#define unsafe_copy_to_user(_dst, _src, _len, label)				\
+do {										\
+	void __user *__dst = (_dst);						\
+	const void *__src = (_src);						\
+	size_t __len = (_len);							\
+	unsafe_copy_to_user_loop(__dst, __src, __len, u64, label);		\
+	unsafe_copy_to_user_loop(__dst, __src, __len, u32, label);		\
+	unsafe_copy_to_user_loop(__dst, __src, __len, u16, label);		\
+	unsafe_copy_to_user_loop(__dst, __src, __len, u8,  label);		\
+} while (0)
+
+#define unsafe_copy_from_user_loop(dst, src, len, type, label)			\
+	while (len >= sizeof(type)) {						\
+		unsafe_get_user(*(type *)(dst), (type __user *)(src), label);	\
+		dst += sizeof(type);						\
+		src += sizeof(type);						\
+		len -= sizeof(type);						\
+	}
+
+#define unsafe_copy_from_user(_dst, _src, _len, label)				\
+do {										\
+	void *__dst = (_dst);							\
+	void __user *__src = (_src);						\
+	size_t __len = (_len);							\
+	unsafe_copy_from_user_loop(__dst, __src, __len, u64, label);		\
+	unsafe_copy_from_user_loop(__dst, __src, __len, u32, label);		\
+	unsafe_copy_from_user_loop(__dst, __src, __len, u16, label);		\
+	unsafe_copy_from_user_loop(__dst, __src, __len, u8,  label);		\
 } while (0)
 
 #define INLINE_COPY_TO_USER
-- 
2.51.0




* [PATCH RFC 2/2] arm64: vdso: Implement __vdso_futex_robust_try_unlock()
From: André Almeida @ 2026-04-17 14:56 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Thomas Gleixner, Mark Rutland,
	Mathieu Desnoyers, Sebastian Andrzej Siewior, Carlos O'Donell,
	Peter Zijlstra, Florian Weimer, Rich Felker, Torvald Riegel,
	Darren Hart, Ingo Molnar, Davidlohr Bueso, Arnd Bergmann,
	Liam R . Howlett, Uros Bizjak, Thomas Weißschuh
  Cc: linux-arm-kernel, linux-kernel, linux-arch, kernel-dev, LKML,
	André Almeida
In-Reply-To: <20260417-tonyk-robust_arm-v1-0-03aa64e2ff1a@igalia.com>

Based on the x86 implementation, implement the vDSO function for unlocking
a robust futex correctly.

Commit xxxxxxxxxxxx ("x86/vdso: Implement __vdso_futex_robust_try_unlock()") has
the full explanation about why this mechanism is needed.

The unlock assembly sequence for arm64 is:

	__futex_list64_try_unlock_cs_start:
		ldxr	x3, [x0] // Load the value at *futex
		cmp	x1, x3   // Compare with TID
		b.ne	__futex_list64_try_unlock_cs_end
		stlxr	w1, xzr, [x0] // Try to clear *futex
		cbnz	w1, __futex_list64_try_unlock_cs_start
	__futex_list64_try_unlock_cs_success:
		str	xzr, [x2] // After clearing *futex, clear *op_pending
	__futex_list64_try_unlock_cs_end:

The decision whether the pointer should be cleared depends on the
zero condition flag (Z):

	return (regs->user_regs.pstate & PSR_Z_BIT) ?
		(void __user *) regs->user_regs.regs[2] : NULL;

If the Z flag is set, the comparison succeeded and the kernel should
clear op_pending (if userspace didn't manage to), whose address is
stored in x2.

Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
RFC:
 - Should I duplicate the explanation found in the x86 commit or can I just
 point to it?
 - Only LL/SC for now but I can add LSE later if this looks good
 - In the objdump I see that op_pending is stored in x2. But how stable is this,
 how can I write it in a way that's always x2?
---
 arch/arm64/Kconfig                                 |  1 +
 arch/arm64/include/asm/futex_robust.h              | 35 +++++++++++++
 arch/arm64/kernel/vdso/Makefile                    |  9 +++-
 arch/arm64/kernel/vdso/vdso.lds.S                  |  4 ++
 .../kernel/vdso/vfutex_robust_list_try_unlock.c    | 59 ++++++++++++++++++++++
 5 files changed, 107 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 427151a9db7f..e10cb97a51c7 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -249,6 +249,7 @@ config ARM64
 	select HAVE_RELIABLE_STACKTRACE
 	select HAVE_POSIX_CPU_TIMERS_TASK_WORK
 	select HAVE_FUNCTION_ARG_ACCESS_API
+	select HAVE_FUTEX_ROBUST_UNLOCK
 	select MMU_GATHER_RCU_TABLE_FREE
 	select HAVE_RSEQ
 	select HAVE_RUST if RUSTC_SUPPORTS_ARM64
diff --git a/arch/arm64/include/asm/futex_robust.h b/arch/arm64/include/asm/futex_robust.h
new file mode 100644
index 000000000000..f2b7a2b15cb5
--- /dev/null
+++ b/arch/arm64/include/asm/futex_robust.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_ARM64_FUTEX_ROBUST_H
+#define _ASM_ARM64_FUTEX_ROBUST_H
+
+#include <asm/ptrace.h>
+
+static __always_inline void __user *arm64_futex_robust_unlock_get_pop(struct pt_regs *regs)
+{
+	/*
+	 * RFC: According to the objdump below, x2 is the address of
+	 * op_pending. How stable is this?
+
+	 <__futex_list64_try_unlock_cs_start>:
+		ldxr	x3, [x0]
+		cmp	x1, x3
+		b.ne	d7c <__futex_list64_try_unlock_cs_end>  // b.any
+		stlxr	w1, xzr, [x0]
+		cbnz	w1, d64 <__futex_list64_try_unlock_cs_start>
+
+	<__futex_list64_try_unlock_cs_success>:
+		str	xzr, [x2]
+
+	<__futex_list64_try_unlock_cs_end>:
+		mov	w0, w3
+		ret
+	*/
+
+	return (regs->user_regs.pstate & PSR_Z_BIT) ?
+		(void __user *) regs->user_regs.regs[2] : NULL;
+}
+
+#define arch_futex_robust_unlock_get_pop(regs)	\
+	arm64_futex_robust_unlock_get_pop(regs)
+
+#endif /* _ASM_ARM64_FUTEX_ROBUST_H */
diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile
index 7dec05dd33b7..a65893d8100e 100644
--- a/arch/arm64/kernel/vdso/Makefile
+++ b/arch/arm64/kernel/vdso/Makefile
@@ -9,7 +9,8 @@
 # Include the generic Makefile to check the built vdso.
 include $(srctree)/lib/vdso/Makefile.include
 
-obj-vdso := vgettimeofday.o note.o sigreturn.o vgetrandom.o vgetrandom-chacha.o
+obj-vdso := vgettimeofday.o note.o sigreturn.o vgetrandom.o vgetrandom-chacha.o \
+	    vfutex_robust_list_try_unlock.o
 
 # Build rules
 targets := $(obj-vdso) vdso.so vdso.so.dbg
@@ -45,9 +46,11 @@ CC_FLAGS_ADD_VDSO := -O2 -mcmodel=tiny -fasynchronous-unwind-tables
 
 CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_REMOVE_VDSO)
 CFLAGS_REMOVE_vgetrandom.o = $(CC_FLAGS_REMOVE_VDSO)
+CFLAGS_REMOVE_vfutex_robust_list_try_unlock.o = $(CC_FLAGS_REMOVE_VDSO)
 
 CFLAGS_vgettimeofday.o = $(CC_FLAGS_ADD_VDSO)
 CFLAGS_vgetrandom.o = $(CC_FLAGS_ADD_VDSO)
+CFLAGS_vfutex_robust_list_try_unlock.o = $(CC_FLAGS_ADD_VDSO)
 
 ifneq ($(c-gettimeofday-y),)
   CFLAGS_vgettimeofday.o += -include $(c-gettimeofday-y)
@@ -57,6 +60,10 @@ ifneq ($(c-getrandom-y),)
   CFLAGS_vgetrandom.o += -include $(c-getrandom-y)
 endif
 
+ifneq ($(c-vfutex_robust_list_try_unlock-y),)
+  CFLAGS_vfutex_robust_list_try_unlock.o += -include $(c-vfutex_robust_list_try_unlock-y)
+endif
+
 targets += vdso.lds
 CPPFLAGS_vdso.lds += -P -C -U$(ARCH)
 
diff --git a/arch/arm64/kernel/vdso/vdso.lds.S b/arch/arm64/kernel/vdso/vdso.lds.S
index 52314be29191..33ce58516580 100644
--- a/arch/arm64/kernel/vdso/vdso.lds.S
+++ b/arch/arm64/kernel/vdso/vdso.lds.S
@@ -104,6 +104,10 @@ VERSION
 		__kernel_clock_gettime;
 		__kernel_clock_getres;
 		__kernel_getrandom;
+		__vdso_futex_robust_list64_try_unlock;
+#ifdef CONFIG_COMPAT
+		__vdso_futex_robust_list32_try_unlock;
+#endif
 	local: *;
 	};
 }
diff --git a/arch/arm64/kernel/vdso/vfutex_robust_list_try_unlock.c b/arch/arm64/kernel/vdso/vfutex_robust_list_try_unlock.c
new file mode 100644
index 000000000000..a9089d3cacfc
--- /dev/null
+++ b/arch/arm64/kernel/vdso/vfutex_robust_list_try_unlock.c
@@ -0,0 +1,59 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <vdso/futex.h>
+#include <linux/stringify.h>
+
+#define LABEL(name, sz) __stringify(__futex_list##sz##_try_unlock_cs_##name)
+
+#define GLOBLS(sz) ".globl " LABEL(start, sz) ", " LABEL(success, sz) ", " LABEL(end, sz) "\n"
+
+__u32 __vdso_futex_robust_list64_try_unlock(__u32 *lock, __u32 tid, __u64 *pop)
+{
+	__u32 val, result;
+
+	asm volatile (
+		GLOBLS(64)
+		"	prfm pstl1strm, %[lock]			\n"
+		LABEL(start, 64)":				\n"
+		"	ldxr %[val], %[lock]			\n"
+		"	cmp %[tid], %[val]			\n"
+		"	bne " LABEL(end, 64)"			\n"
+		"	stlxr %w[result], xzr, %[lock]		\n"
+		"	cbnz %w[result], " LABEL(start, 64)"	\n"
+		LABEL(success, 64)":				\n"
+		"	str xzr, %[pop]				\n"
+		LABEL(end, 64)":				\n"
+
+		: [val] "=&r" (val), [result] "=r" (result)
+		: [tid] "r" (tid), [lock] "Q" (*lock), [pop] "Q" (*pop)
+		: "memory"
+	);
+
+	return val;
+}
+
+#ifdef CONFIG_COMPAT
+__u32 __vdso_futex_robust_list32_try_unlock(__u32 *lock, __u32 tid, __u32 *pop)
+{
+	__u32 val, result;
+
+	asm volatile (
+		GLOBLS(32)
+		"	prfm pstl1strm, %[lock]			\n"
+		LABEL(start, 32)":				\n"
+		"	ldxr %w[val], %[lock]			\n"
+		"	cmp %w[tid], %w[val]			\n"
+		"	bne " LABEL(end, 32)"			\n"
+		"	stlxr %w[result], wzr, %w[lock]		\n"
+		"	cbnz %w[result], " LABEL(start, 32)"	\n"
+		LABEL(success, 32)":				\n"
+		"	str wzr, %w[pop]			\n"
+		LABEL(end, 32)":				\n"
+
+		: [val] "=&r" (val), [result] "=r" (result)
+		: [tid] "r" (tid), [lock] "Q" (*lock), [pop] "Q" (*pop)
+		: "memory"
+	);
+
+	return val;
+}
+#endif

-- 
2.53.0




* [PATCH RFC 0/2] arm64: vdso: Implement __vdso_futex_robust_try_unlock()
From: André Almeida @ 2026-04-17 14:56 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Thomas Gleixner, Mark Rutland,
	Mathieu Desnoyers, Sebastian Andrzej Siewior, Carlos O'Donell,
	Peter Zijlstra, Florian Weimer, Rich Felker, Torvald Riegel,
	Darren Hart, Ingo Molnar, Davidlohr Bueso, Arnd Bergmann,
	Liam R . Howlett, Uros Bizjak, Thomas Weißschuh
  Cc: linux-arm-kernel, linux-kernel, linux-arch, kernel-dev, LKML,
	André Almeida

Hi folks,

This is my take on implementing the new vDSO function for unlocking a robust
futex on arm64. If you don't know what that is, Thomas wrote a good summary,
including the motivation for this work and the x86 implementation:

   https://lore.kernel.org/lkml/878qb89g7b.ffs@tglx/

There are some loose ends in my patchset, so I'm sending it as an RFC to ask
some questions:

 - I haven't managed to expose the assembly labels correctly: the linker can't
 find them and the build fails, more info in patch 1/2
 - If the process is interrupted between the labels, we need to check the
 condition flags and clear the op_pending address taken from the register.
 Using objdump I see that the op_pending address is held in x2, but I suspect
 that this isn't stable, so I need to figure out how to make sure that the
 address will always be held in the same register.
 - So far I have implemented only the LL/SC version to make review easier, but I
 can do the LSE version as well.

This patchset works fine with the tests proposed at
https://lore.kernel.org/lkml/20260330120118.012924430@kernel.org/ (but of course
without the labels the complete mechanism doesn't work properly).

---
André Almeida (2):
      arm64: vdso: Prepare for robust futex unlock support
      arm64: vdso: Implement __vdso_futex_robust_try_unlock()

 arch/arm64/Kconfig                                 |  1 +
 arch/arm64/include/asm/futex_robust.h              | 35 +++++++++++++
 arch/arm64/include/asm/vdso.h                      |  4 ++
 arch/arm64/kernel/vdso.c                           | 29 +++++++++++
 arch/arm64/kernel/vdso/Makefile                    |  9 +++-
 arch/arm64/kernel/vdso/vdso.lds.S                  |  4 ++
 .../kernel/vdso/vfutex_robust_list_try_unlock.c    | 59 ++++++++++++++++++++++
 7 files changed, 140 insertions(+), 1 deletion(-)
---
base-commit: 0e8896e9899b607bb168c1cce340596b8c2e3e2b
change-id: 20260416-tonyk-robust_arm-54ff77d2c4e4

Best regards,
--  
André Almeida <andrealmeid@igalia.com>




* [PATCH RFC 1/2] arm64: vdso: Prepare for robust futex unlock support
From: André Almeida @ 2026-04-17 14:56 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Thomas Gleixner, Mark Rutland,
	Mathieu Desnoyers, Sebastian Andrzej Siewior, Carlos O'Donell,
	Peter Zijlstra, Florian Weimer, Rich Felker, Torvald Riegel,
	Darren Hart, Ingo Molnar, Davidlohr Bueso, Arnd Bergmann,
	Liam R . Howlett, Uros Bizjak, Thomas Weißschuh
  Cc: linux-arm-kernel, linux-kernel, linux-arch, kernel-dev, LKML,
	André Almeida
In-Reply-To: <20260417-tonyk-robust_arm-v1-0-03aa64e2ff1a@igalia.com>

There will be a VDSO function to unlock non-contended robust futexes in
user space. The unlock sequence is racy vs. clearing the list_pending_op
pointer in the task's robust list head. To plug this race the kernel needs
to know the critical section window so it can clear the pointer when the
task is interrupted within that race window. The window is determined by
labels in the inline assembly.
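
To make the window concrete, here is a minimal sketch of the check the
kernel side needs. It is illustrative only: struct cs_window and both
helpers are hypothetical names and not the generic futex API, and treating
the range as half-open (success label inclusive, end label exclusive) is my
assumption about how the labels are meant to be used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one critical section in the mapped vDSO. */
struct cs_window {
	uintptr_t start;	/* address of the ..._cs_success label */
	uintptr_t end;		/* address of the ..._cs_end label */
};

/* Turn label offsets inside the vDSO image into user addresses. */
static struct cs_window cs_window_init(uintptr_t vdso_base,
				       uintptr_t success_off,
				       uintptr_t end_off)
{
	assert(success_off < end_off);
	return (struct cs_window){ vdso_base + success_off,
				   vdso_base + end_off };
}

/* True if the interrupted user IP still has the pointer store ahead of it. */
static bool ip_in_cs_window(const struct cs_window *w, uintptr_t ip)
{
	return ip >= w->start && ip < w->end;
}
```

With a single store between the two labels the window is one instruction
wide, so an IP equal to the end label means the store has already executed.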

Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
RFC: Those symbols can't be found by the linker after patch 2/2, it fails with:

ld: arch/arm64/kernel/vdso.o: in function `vdso_futex_robust_unlock_update_ips':
arch/arm64/kernel/vdso.c:72:(.text+0x200): undefined reference to `__futex_list64_try_unlock_cs_success'
ld: arch/arm64/kernel/vdso.o: relocation R_AARCH64_ADR_PREL_PG_HI21 against symbol `__futex_list64_try_unlock_cs_success' which may bind externally can not be used when making a shared object; recompile with -fPIC
arch/arm64/kernel/vdso.c:72:(.text+0x200): dangerous relocation: unsupported relocation
---
 arch/arm64/include/asm/vdso.h |  4 ++++
 arch/arm64/kernel/vdso.c      | 29 +++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/arch/arm64/include/asm/vdso.h b/arch/arm64/include/asm/vdso.h
index 232b46969088..182fde1df3dd 100644
--- a/arch/arm64/include/asm/vdso.h
+++ b/arch/arm64/include/asm/vdso.h
@@ -18,6 +18,10 @@
 
 extern char vdso_start[], vdso_end[];
 extern char vdso32_start[], vdso32_end[];
+extern char __futex_list64_try_unlock_cs_success[], __futex_list64_try_unlock_cs_end[];
+#ifdef CONFIG_COMPAT
+extern char __futex_list32_try_unlock_cs_success[], __futex_list32_try_unlock_cs_end[];
+#endif
 
 #endif /* !__ASSEMBLER__ */
 
diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
index 592dd8668de4..42a82e73a774 100644
--- a/arch/arm64/kernel/vdso.c
+++ b/arch/arm64/kernel/vdso.c
@@ -11,6 +11,7 @@
 #include <linux/clocksource.h>
 #include <linux/elf.h>
 #include <linux/err.h>
+#include <linux/futex.h>
 #include <linux/errno.h>
 #include <linux/gfp.h>
 #include <linux/kernel.h>
@@ -57,6 +58,32 @@ static struct vdso_abi_info vdso_info[] __ro_after_init = {
 #endif /* CONFIG_COMPAT_VDSO */
 };
 
+#ifdef CONFIG_FUTEX_ROBUST_UNLOCK
+static void vdso_futex_robust_unlock_update_ips(enum vdso_abi abi, struct mm_struct *mm)
+{
+	unsigned long vdso = (unsigned long) mm->context.vdso;
+	struct futex_mm_data *fd = &mm->futex;
+
+	/*
+	 * RFC: won't compile due to undefined reference to `__futex_list64_try_unlock_cs_...`
+
+	if (abi == VDSO_ABI_AA64) {
+		futex_set_vdso_cs_range(fd, 0, vdso, (uintptr_t) __futex_list64_try_unlock_cs_success,
+					(uintptr_t) __futex_list64_try_unlock_cs_end, false);
+	}
+
+#ifdef CONFIG_COMPAT
+	if (abi == VDSO_ABI_AA32) {
+		futex_set_vdso_cs_range(fd, 1, vdso, (uintptr_t) __futex_list32_try_unlock_cs_success,
+					(uintptr_t) __futex_list32_try_unlock_cs_end, true);
+	}
+#endif
+	*/
+}
+#else
+static inline void vdso_futex_robust_unlock_update_ips(enum vdso_abi abi, struct mm_struct *mm) { }
+#endif /* CONFIG_FUTEX_ROBUST_UNLOCK */
+
 static int vdso_mremap(const struct vm_special_mapping *sm,
 		struct vm_area_struct *new_vma)
 {
@@ -134,6 +161,8 @@ static int __setup_additional_pages(enum vdso_abi abi,
 	if (IS_ERR(ret))
 		goto up_fail;
 
+	vdso_futex_robust_unlock_update_ips(abi, mm);
+
 	return 0;
 
 up_fail:

-- 
2.53.0




* Re: [PATCH v2] raid6: arm64: add SVE optimized implementation for syndrome generation
From: Ard Biesheuvel @ 2026-04-17 14:43 UTC (permalink / raw)
  To: Robin Murphy, Demian Shulhan
  Cc: Christoph Hellwig, Mark Rutland, Song Liu, Yu Kuai, Will Deacon,
	Catalin Marinas, Mark Brown, linux-arm-kernel, Li Nan, linux-raid,
	linux-kernel
In-Reply-To: <8db4defe-8b5e-4cc3-880b-72d46510b034@arm.com>

On Thu, 16 Apr 2026, at 18:26, Robin Murphy wrote:
> On 16/04/2026 3:59 pm, Demian Shulhan wrote:
>> Hi Ard!
...
>>> OK, so the takeaway here is that SVE is only worth the hassle if the
>>> vector length is at least 256 bits. This is not entirely surprising,
>>> but given that Graviton4 went back to 128 bit vectors from 256, I
>>> wonder what the future expectation is here.
>>
>> I agree. The results from the SnapRAID tests are not as impressive as
>> I hoped, and the fact that Neoverse-V2 went back to 128-bit is a red
>> flag. It suggests that wide SVE registers might not be a priority in
>> future architecture versions.
>
> If you look at the Neoverse V1 software optimisation guide[1], the SVE
> instructions generally have half the throughput of their ASIMD
> equivalents (i.e. presumably the vector pipes are still only 128 bits
> wide and SVE is just using them in pairs), so indeed the total
> instruction count is largely meaningless - IPC might be somewhat more
> relevant, but I'd say the only performance number that's really
> meaningful is the end-to-end MB/s measure of how fast the function
> implementation as a whole can process data.

On arm64, kernel mode NEON is mostly used to gain access to AES and SHA
instructions, and only to a lesser degree to speed up ordinary
arithmetic, and so XOR is somewhat of an outlier here.

Given that Neoverse V1 apparently already carves up ordinary arithmetic
performed on 256-bit vectors and operates on 128 bits at a time, I am
rather skeptical that we're likely to see any SVE implementations of the
crypto extensions soon that are meaningfully faster, given that these
are presumably much costlier to implement in terms of gate count, and
therefore likely to be split up even on SVE implementations that can
perform ordinary arithmetic on 256+ bit vectors in a single cycle. Note
that even the arm64 SIMD accelerated CRC implementations rely heavily on
64x64->128 polynomial multiplication.

IOW, before we consider kernel mode SVE, I'd like to see some benchmarks
for other algorithms too.

> It's probably also worth checking whether the current NEON routines
> themselves are actually optimal for modern big CPUs - things have
> moved on quite a bit since Cortex-A57 (whose ASIMD performance could
> also be described as "esoteric" at the best of times...)
>

Some of those crypto routines could definitely be made faster, but it
highly depends on the context whether that actually helps: for instance,
there was a proposal a while ago to incorporate the AES-GCM code from
the OpenSSL project (authored by ARM) but at the time, it slightly
regressed the ~1500 byte case and only gave a substantial improvement
for much larger block sizes, which aren't that common in the kernel for
this particular algorithm.

IOW, any contributions that improve the existing code (or outright
replace it with something faster, for all I care) are highly
appreciated, but they should be motivated by benchmarks that reflect
the use cases that we actually consider important for the algorithm
in question.


* Re: [PATCH v6 01/30] mm: Introduce kpkeys
From: David Hildenbrand (Arm) @ 2026-04-17 14:37 UTC (permalink / raw)
  To: Kevin Brodsky, linux-hardening
  Cc: linux-kernel, Andrew Morton, Andy Lutomirski, Catalin Marinas,
	Dave Hansen, Ira Weiny, Jann Horn, Jeff Xu, Joey Gouly, Kees Cook,
	Linus Walleij, Lorenzo Stoakes, Marc Zyngier, Mark Brown,
	Matthew Wilcox, Maxwell Bland, Mike Rapoport (IBM),
	Peter Zijlstra, Pierre Langlois, Quentin Perret, Rick Edgecombe,
	Ryan Roberts, Thomas Gleixner, Vlastimil Babka, Will Deacon,
	Yang Shi, Yeoreum Yun, linux-arm-kernel, linux-mm, x86
In-Reply-To: <20260227175518.3728055-2-kevin.brodsky@arm.com>

On 2/27/26 18:54, Kevin Brodsky wrote:
> kpkeys is a simple framework to enable the use of protection keys
> (pkeys) to harden the kernel itself. This patch introduces the basic
> API in <linux/kpkeys.h>: a couple of functions to set and restore
> the pkey register and macros to define guard objects.
> 
> kpkeys introduces a new concept on top of pkeys: the kpkeys level.
> Each level is associated to a set of permissions for the pkeys
> managed by the kpkeys framework. kpkeys_set_level(lvl) sets those
> permissions according to lvl, and returns the original pkey
> register, to be later restored by kpkeys_restore_pkey_reg(). To
> start with, only KPKEYS_LVL_DEFAULT is available, which is meant
> to grant RW access to KPKEYS_PKEY_DEFAULT (i.e. all memory since
> this is the only available pkey for now).
> 
> Because each architecture implementing pkeys uses a different
> representation for the pkey register, and may reserve certain pkeys
> for specific uses, support for kpkeys must be explicitly indicated
> by selecting ARCH_HAS_KPKEYS and defining the following functions in
> <asm/kpkeys.h>, in addition to the macros provided in
> <asm-generic/kpkeys.h>:
> 
> - arch_kpkeys_set_level()
> - arch_kpkeys_restore_pkey_reg()
> - arch_kpkeys_enabled()

Another thing: why not simply drop the "arch_" stuff from these helpers?

-- 
Cheers,

David



* [PATCH bpf-next v2] arm32, bpf: Reject BPF-to-BPF calls and callbacks in the JIT
From: Puranjay Mohan @ 2026-04-17 14:33 UTC (permalink / raw)
  To: bpf, linux-arm-kernel
  Cc: Puranjay Mohan, Jonas Rebmann, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Kumar Kartikeya Dwivedi, Song Liu, Russell King,
	kernel

The ARM32 BPF JIT does not support BPF-to-BPF function calls
(BPF_PSEUDO_CALL) or callbacks (BPF_PSEUDO_FUNC), but it does
not reject them either.

When a program with subprograms is loaded (e.g. libxdp's XDP
dispatcher uses __noinline__ subprograms, or any program using
callbacks like bpf_loop or bpf_for_each_map_elem), the verifier
invokes bpf_jit_subprogs() which calls bpf_int_jit_compile()
for each subprogram.

For BPF_PSEUDO_CALL, since ARM32 does not reject it, the JIT
silently emits code using the wrong address computation:

    func = __bpf_call_base + imm

where imm is a pc-relative subprogram offset, producing a bogus
function pointer.

For BPF_PSEUDO_FUNC, the ldimm64 handler ignores src_reg and
loads the immediate as a normal 64-bit value without error.

In both cases, build_body() reports success and a JIT image is
allocated. ARM32 lacks the jit_data/extra_pass mechanism needed
for the second JIT pass in bpf_jit_subprogs(). On the second
pass, bpf_int_jit_compile() performs a full fresh compilation,
allocating a new JIT binary and overwriting prog->bpf_func. The
first allocation is never freed. bpf_jit_subprogs() then detects
the function pointer changed and aborts with -ENOTSUPP, but the
original JIT binary has already been leaked. Each program
load/unload cycle leaks one JIT binary allocation, as reported
by kmemleak:

    unreferenced object 0xbf0a1000 (size 4096):
      backtrace:
        bpf_jit_binary_alloc+0x64/0xfc
        bpf_int_jit_compile+0x14c/0x348
        bpf_jit_subprogs+0x4fc/0xa60

Fix this by rejecting both BPF_PSEUDO_CALL in the BPF_CALL
handler and BPF_PSEUDO_FUNC in the BPF_LD_IMM64 handler, falling
through to the existing 'notyet' path. This causes build_body()
to fail before any JIT binary is allocated, so
bpf_int_jit_compile() returns the original program unjitted.
bpf_jit_subprogs() then sees !prog->jited and cleanly falls
back to the interpreter with no leak.
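
The shape of the fix can be modelled in a few lines of plain C. This is a
simplified sketch and not the JIT itself: struct insn_model and
build_insn_model() are hypothetical, the -1 return stands in for the
'notyet' path, and the two BPF_PSEUDO_* values are copied from the UAPI
header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values mirror include/uapi/linux/bpf.h. */
#define BPF_PSEUDO_CALL	1
#define BPF_PSEUDO_FUNC	4

enum insn_kind { INSN_CALL, INSN_LD_IMM64 };

/* Hypothetical, stripped-down view of a BPF instruction. */
struct insn_model {
	enum insn_kind kind;
	uint8_t src_reg;
	int32_t imm;
};

/* Returns 0 if code would be emitted, -1 for the 'notyet' bail-out. */
static int build_insn_model(const struct insn_model *insn)
{
	assert(insn != NULL);

	switch (insn->kind) {
	case INSN_LD_IMM64:
		/* A callback address is not a plain 64-bit immediate. */
		if (insn->src_reg == BPF_PSEUDO_FUNC)
			return -1;
		return 0;
	case INSN_CALL:
		/* imm is a pc-relative subprog offset, not a helper offset. */
		if (insn->src_reg == BPF_PSEUDO_CALL)
			return -1;
		return 0;
	}

	return -1;
}
```

Helper calls (src_reg == 0) and ordinary 64-bit immediates still take the
normal emit path in this model; only the two pseudo forms are refused.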

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs")
Reported-by: Jonas Rebmann <jre@pengutronix.de>
Closes: https://lore.kernel.org/bpf/b63e9174-7a3d-4e22-8294-16df07a4af89@pengutronix.de
Tested-by: Jonas Rebmann <jre@pengutronix.de>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
---

Changelog:
v1: https://lore.kernel.org/all/20260417103004.3552500-1-puranjay@kernel.org/
Changes in v2:
- Add Acked-by: Daniel Borkmann <daniel@iogearbox.net>
- Reject BPF_PSEUDO_FUNC in the BPF_LD | BPF_IMM | BPF_DW handler
- Move code below declarations

---
 arch/arm/net/bpf_jit_32.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index deeb8f292454..a900aa973885 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -1852,6 +1852,9 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 	{
 		u64 val = (u32)imm | (u64)insn[1].imm << 32;

+		if (insn->src_reg == BPF_PSEUDO_FUNC)
+			goto notyet;
+
 		emit_a32_mov_i64(dst, val, ctx);

 		return 1;
@@ -2055,6 +2058,9 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 		const s8 *r5 = bpf2a32[BPF_REG_5];
 		const u32 func = (u32)__bpf_call_base + (u32)imm;

+		if (insn->src_reg == BPF_PSEUDO_CALL)
+			goto notyet;
+
 		emit_a32_mov_r64(true, r0, r1, ctx);
 		emit_a32_mov_r64(true, r1, r2, ctx);
 		emit_push_r64(r5, ctx);

base-commit: 1f5ffc672165ff851063a5fd044b727ab2517ae3
-- 
2.52.0


* Re: [PATCH] KVM: arm64: pkvm: Adopt MARKER() to define host hypercall ranges
From: Marc Zyngier @ 2026-04-17 14:23 UTC (permalink / raw)
  To: kvmarm, linux-arm-kernel, Marc Zyngier
  Cc: Will Deacon, Vincent Donnefort, Fuad Tabba, Joey Gouly,
	Suzuki K Poulose, Oliver Upton, Zenghui Yu
In-Reply-To: <20260414160528.2218858-1-maz@kernel.org>

On Tue, 14 Apr 2026 17:05:28 +0100, Marc Zyngier wrote:
> The EL2 code defines ranges of host hypercalls that are either
> enabled at boot-time only, used by [nh]VHE KVM, or reserved to pKVM.
> 
> The way these ranges are delineated is error prone, as the enum symbols
> defining the limits are expressed in terms of actual function symbols.
> This means that should a new function be added, special care must be
> taken to also update the limit symbol.
> 
> [...]

Applied to fixes, thanks!

[1/1] KVM: arm64: pkvm: Adopt MARKER() to define host hypercall ranges
      commit: 9b72acd3f770517d3fd4bd7193fd60f3a81d1f69

Cheers,

	M.
-- 
Without deviation from the norm, progress is not possible.





* Re: [PATCH] KVM: arm64: Re-allow hyp tracing HVCs for [nh]VHE
From: Marc Zyngier @ 2026-04-17 14:23 UTC (permalink / raw)
  To: joey.gouly, suzuki.poulose, yuzenghui, catalin.marinas, will,
	Oliver Upton, Vincent Donnefort
  Cc: linux-arm-kernel, kvmarm, kernel-team
In-Reply-To: <20260414100231.1859687-1-vdonnefort@google.com>

On Tue, 14 Apr 2026 11:02:31 +0100, Vincent Donnefort wrote:
> The introduction of __KVM_HOST_SMCCC_FUNC_MAX_NO_PKVM excluded hyp
> tracing HVCs from the common [nh]VHE/pKVM list. Re-allow them.
> 
> 

Applied to fixes, thanks!

[1/1] KVM: arm64: Re-allow hyp tracing HVCs for [nh]VHE
      commit: ccab51d69b1478b549ad0bbb38f556ab3bfb47ab

Cheers,

	M.
-- 
Without deviation from the norm, progress is not possible.





* Re: [PATCH bpf-next] bpf, arm32: Reject BPF_PSEUDO_CALL in the JIT
From: Puranjay Mohan @ 2026-04-17 14:16 UTC (permalink / raw)
  To: bot+bpf-ci
  Cc: bpf, linux-arm-kernel, jre, ast, daniel, andrii, martin.lau,
	eddyz87, memxor, song, linux, kernel, martin.lau, yonghong.song,
	clm, ihor.solodrai
In-Reply-To: <0e7bf7fc588d9c45349a7b5b7b7a86504bf39069d9b7a043b53138604547f267@mail.kernel.org>

On Fri, Apr 17, 2026 at 12:17 PM <bot+bpf-ci@kernel.org> wrote:
>
> > Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs")
>
> Is this the correct Fixes tag? That commit only modified x64 JIT code
> and BPF core infrastructure for BPF-to-BPF calls - it never touched
> arch/arm/net/bpf_jit_32.c.
>
> The actual bug was introduced by commit 39c13c204bb1 ("arm: eBPF JIT
> compiler") from August 2017, which added the ARM32 eBPF JIT with
> BPF_CALL handling but failed to reject BPF_PSEUDO_CALL. The code has
> been incorrectly computing function addresses as 'func = __bpf_call_base
> + imm' for BPF_PSEUDO_CALL since the ARM32 JIT was first introduced.
>
> Suggested: Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")

I knew Claude would say this!

39c13c204bb1 ("arm: eBPF JIT compiler") came before BPF_PSEUDO_CALL
was introduced, so the bug was introduced when jit_subprogs() was
added.
The Fixes tag in the patch is correct!



* Re: [PATCH] iommu/arm-smmu-qcom: Fix fastrpc compatible string in ACTLR client match table
From: Shawn Guo @ 2026-04-17 14:10 UTC (permalink / raw)
  To: bibek.patro
  Cc: Rob Clark, Will Deacon, Robin Murphy, Joerg Roedel,
	Dmitry Baryshkov, iommu, linux-arm-msm, linux-arm-kernel,
	linux-kernel, srinivas.kandagatla
In-Reply-To: <20260408130825.3268733-1-bibek.patro@oss.qualcomm.com>

On Wed, Apr 08, 2026 at 06:38:25PM +0530, bibek.patro@oss.qualcomm.com wrote:
> From: Bibek Kumar Patro <bibek.patro@oss.qualcomm.com>
> 
> The qcom_smmu_actlr_client_of_match table contained "qcom,fastrpc" as
> the compatible string for applying ACTLR prefetch settings to FastRPC
> devices. However, "qcom,fastrpc" is the compatible string for the parent
> rpmsg channel node, which is not an IOMMU client — it carries no
> "iommus" property in the device tree and is never attached to an SMMU
> context bank.
> 
> The actual IOMMU clients are the compute context bank (CB) child nodes,
> which use the compatible string "qcom,fastrpc-compute-cb". These nodes
> carry the "iommus" property and are probed by fastrpc_cb_driver via
> fastrpc_cb_probe(), which sets up the DMA mask and IOMMU mappings for
> each FastRPC session. The device tree structure is:
> 
>   fastrpc {
>       compatible = "qcom,fastrpc";        /* rpmsg channel, no iommus */
>       ...
>       compute-cb@3 {
>           compatible = "qcom,fastrpc-compute-cb";
>           iommus = <&apps_smmu 0x1823 0x0>;  /* actual IOMMU client */
>       };
>   };
> 
> Since qcom_smmu_set_actlr_dev() calls of_match_device() against the
> device being attached to the SMMU context bank, the "qcom,fastrpc"
> entry was never matching any device. As a result, the ACTLR prefetch
> settings (PREFETCH_DEEP | CPRE | CMTLB) were silently never applied
> for FastRPC compute context banks.
> 
> Fix this by replacing "qcom,fastrpc" with "qcom,fastrpc-compute-cb"
> in the match table so that the ACTLR settings are correctly applied
> to the compute CB devices that are the true IOMMU clients.
> 
> Assisted-by: Anthropic:claude-4-6-sonnet
> Fixes: 3e35c3e725de ("iommu/arm-smmu: Add ACTLR data and support for qcom_smmu_500")
> Signed-off-by: Bibek Kumar Patro <bibek.patro@oss.qualcomm.com>

Reviewed-by: Shawn Guo <shengchao.guo@oss.qualcomm.com>



* Re: [PATCH v3 6/8] arm64/module, sframe: Add sframe support for modules.
From: Jens Remus @ 2026-04-17 14:07 UTC (permalink / raw)
  To: Dylan Hatch, Roman Gushchin, Weinan Liu, Will Deacon,
	Josh Poimboeuf, Indu Bhagat, Peter Zijlstra, Steven Rostedt,
	Catalin Marinas, Jiri Kosina
  Cc: Mark Rutland, Prasanna Kumar T S M, Puranjay Mohan, Song Liu,
	joe.lawrence, linux-toolchains, linux-kernel, live-patching,
	linux-arm-kernel
In-Reply-To: <20260406185000.1378082-7-dylanbhatch@google.com>

On 4/6/2026 8:49 PM, Dylan Hatch wrote:
> Add sframe table to mod_arch_specific and support sframe PC lookups when
> an .sframe section can be found on incoming modules.
> 
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
> Signed-off-by: Weinan Liu <wnliu@google.com>

Reviewed-by: Jens Remus <jremus@linux.ibm.com>

> ---
>  arch/arm64/include/asm/module.h |  6 +++++
>  arch/arm64/kernel/module.c      |  8 +++++++
>  include/linux/sframe.h          |  2 ++
>  kernel/unwind/sframe.c          | 39 +++++++++++++++++++++++++++++++--
>  4 files changed, 53 insertions(+), 2 deletions(-)

Regards,
Jens
-- 
Jens Remus
Linux on Z Development (D3303)
jremus@de.ibm.com / jremus@linux.ibm.com

IBM Deutschland Research & Development GmbH; Chairman of the Supervisory Board: Wolfgang Wendt; Management: David Faller; Registered office: Ehningen; Court of registration: Amtsgericht Stuttgart, HRB 243294
IBM Data Privacy Statement: https://www.ibm.com/privacy/




* Re: [PATCH v7 0/3] Mediatek MT8189 JPEG support
From: Nicolas Dufresne @ 2026-04-17 13:30 UTC (permalink / raw)
  To: Jianhua Lin, mchehab, robh, krzk+dt, conor+dt, matthias.bgg,
	angelogioacchino.delregno
  Cc: devicetree, linux-kernel, linux-media, linux-arm-kernel,
	linux-mediatek, Project_Global_Chrome_Upstream_Group, sirius.wang,
	vince-wl.liu, jh.hsu
In-Reply-To: <20260417100519.1043-1-jianhua.lin@mediatek.com>

Hi,

On Friday, 17 April 2026 at 18:05 +0800, Jianhua Lin wrote:
> This series is based on tag: next-20260410, linux-next/master

What dependencies justify not basing this on media-committers/next as usual?
It's fine to say you tested against linux-next of course, and if it only
works there, it would be really nice to explain why.

Nicolas

> 
> Changes compared with v6:
> - Patches 1/3 (dt-bindings: decoder):
>   update the existing `allOf` condition for mediatek,mt8189-jpgdec to
>   make the 'mediatek,larb' property strictly required for MT8189 SoC.
> - Patches 2/3 (dt-bindings: encoder):
>   Add an `allOf` condition to enforce that the `mediatek,larb` property
>   is strictly required when the compatible string contains
>   mediatek,mt8189-jpgenc.
> 
> Changes compared with v5:
> - Patches 1/3 (dt-bindings: decoder):
>   - Drop top-level minItems/maxItems for clock-names per Krzysztof's
>     review.
>   - Refine allOf block to strictly enforce clock constraints.
> 
> Changes compared with v4:
> - Refines the device tree bindings for JPEG decoder and encoder.
>   - Patches 1/3 (dt-bindings: decoder):
>     Moved the standalone compatible string mediatek,mt8189-jpgdec
>     into the first oneOf entry along with mt2701 and mt8173, as
>     suggested by Rob Herring. This correctly groups all independent
>     ICs and removes the redundant items wrapper.
>   - Patches 2/3 (dt-bindings: encoder):
>     Applied the same logic suggested by Rob Herring to the encoder
>     binding. Restructured the compatible property to clearly
>     distinguish between the standalone IC (mediatek,mt8189-jpgenc)
>     and the ICs that must fallback to mediatek,mtk-jpgenc.
> 
> Changes compared with v3:
> - The v4 is resending the cover-letter, because the v3 cover-letter was
>   not sent successfully.
> 
> Changes compared with v2:
> - Dropped the dts patch (arm64: dts: mt8188: update JPEG encoder/decoder
>   compatible) as it belongs to a different tree/series.
> - Patches 1/3 (dt-bindings: decoder):
>   - Changed the MT8189 compatible to be a standalone `const` instead of
>     an `enum`.
>   - Added an `allOf` block with conditional checks to enforce the single
>     clock ("jpgdec") requirement for MT8189, while preserving the
>     two-clock requirement for older SoCs.
>   - Updated commit message to reflect the schema structure changes and
>     hardware differences.
> - Patches 2/3 (dt-bindings: encoder):
>   - Changed the MT8189 compatible to be a standalone `const` instead of
>     an `enum` inside the `items` list, as it does not fallback to
>     "mediatek,mtk-jpgenc" due to 34-bit IOVA requirements.
>   - Updated commit message to explain the standalone compatible design.
> - Patches 3/3 (media: mediatek: jpeg):
>   - Refined commit message for better clarity regarding 34-bit IOVA and
>     single clock configuration.
> 
> Changes compared with v1:
> - Patches 1/4:
>   - Updating commit message
> - Patches 2/4, 3/4: 
>   - Updating commit message
>   - Adjusted property descriptions according to hardware requirements
>   - Improved formatting for better readability and consistency
> - Patches 4/4:
>   - Updating commit message
> 
> Jianhua Lin (3):
>   dt-bindings: media: mediatek-jpeg-decoder: add MT8189 compatible
>     string
>   dt-bindings: media: mediatek-jpeg-encoder: add MT8189 compatible
>     string
>   media: mediatek: jpeg: add compatible for MT8189 SoC
> 
>  .../bindings/media/mediatek-jpeg-decoder.yaml | 48 +++++++++++++++----
>  .../bindings/media/mediatek-jpeg-encoder.yaml | 29 ++++++++---
>  .../platform/mediatek/jpeg/mtk_jpeg_core.c    | 44 +++++++++++++++++
>  3 files changed, 107 insertions(+), 14 deletions(-)


^ permalink raw reply

* Re: [PATCH v2 1/7] dt-bindings: thermal: Add Google GS101 TMU
From: Tudor Ambarus @ 2026-04-17 13:28 UTC (permalink / raw)
  To: Alexey Klimov, Rafael J. Wysocki, Daniel Lezcano, Zhang Rui,
	Lukasz Luba, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Krzysztof Kozlowski, Alim Akhtar, Bartlomiej Zolnierkiewicz,
	Kees Cook, Gustavo A. R. Silva, Peter Griffin, André Draszik
  Cc: willmcvicker, jyescas, shin.son, linux-samsung-soc, linux-kernel,
	linux-pm, devicetree, linux-arm-kernel, linux-hardening
In-Reply-To: <DGUJIFLIOK7Y.1Q4PZQU3MOWTT@linaro.org>



On 3/5/26 5:48 AM, Alexey Klimov wrote:
> Hi Tudor,
> 
> On Mon Jan 19, 2026 at 12:08 PM GMT, Tudor Ambarus wrote:
>> Document the Thermal Management Unit (TMU) found on the Google GS101 SoC.
>>
>> The GS101 TMU utilizes a hybrid control model shared between the
>> Application Processor (AP) and the ACPM (Alive Clock and Power Manager)
>> firmware.
>>
>> While the TMU is a standard memory-mapped IP block, on this platform
> 
> this ^^
> 

okay

cut

> Is it Google TMU hardware block or Exynos/Samsung TMU block?
> 
> My understanding at this point is that ACPM interface, ACPM protocols, etc
> appeared on Samsung SoCs before gs101 (maybe even before initial SCMI
> prototyping). It looks like ACPM firmware, communication via mailboxes,
> TMU channel, dealing with TMU behind ACPM, etc are actually a standard
> Samsung Exynos architectural feature, rather than a Google-specific
> implementation. I can't say though what was the first chipset where it
> was implemented.

autov920 and exynos850 can also use the hybrid ACPM TMU approach.
I'll generalize the description.
> 
> Given that this is a Samsung design that predates the gs101, would it
> make sense to use more generic name for this binding to reflect that
> it is Exynos-derived? That would save us from generalizing things later

The name has to match the compatible. We can rename it when other Samsung
compatibles are added.

Cheers,
ta


^ permalink raw reply

* [PATCH v2 3/3] dt-bindings: reserved-memory: Change maintainer for BPMP SHMEM
From: Thierry Reding @ 2026-04-17 13:15 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Aaro Koskinen, Geert Uytterhoeven, linux-tegra, linux-arm-kernel,
	linux-pm, linux-omap, linux-m68k, devicetree, linux-kernel
In-Reply-To: <20260417131549.3154534-1-thierry.reding@kernel.org>

From: Thierry Reding <treding@nvidia.com>

Peter sadly passed away a while ago, so change the maintainers for BPMP
SHMEM to Jon and myself.

Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 .../bindings/reserved-memory/nvidia,tegra264-bpmp-shmem.yaml   | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/reserved-memory/nvidia,tegra264-bpmp-shmem.yaml b/Documentation/devicetree/bindings/reserved-memory/nvidia,tegra264-bpmp-shmem.yaml
index 4380f622f9a9..6efadc5f8078 100644
--- a/Documentation/devicetree/bindings/reserved-memory/nvidia,tegra264-bpmp-shmem.yaml
+++ b/Documentation/devicetree/bindings/reserved-memory/nvidia,tegra264-bpmp-shmem.yaml
@@ -7,7 +7,8 @@ $schema: http://devicetree.org/meta-schemas/core.yaml#
 title: Tegra CPU-NS - BPMP IPC reserved memory
 
 maintainers:
-  - Peter De Schrijver <pdeschrijver@nvidia.com>
+  - Thierry Reding <thierry.reding@kernel.org>
+  - Jonathan Hunter <jonathanh@nvidia.com>
 
 description: |
   Define a memory region used for communication between CPU-NS and BPMP.
-- 
2.52.0



^ permalink raw reply related

* [PATCH v2 2/3] Documentation: ABI: Take over as contact for sysfs-driver-tegra-fuse
From: Thierry Reding @ 2026-04-17 13:15 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Aaro Koskinen, Geert Uytterhoeven, linux-tegra, linux-arm-kernel,
	linux-pm, linux-omap, linux-m68k, devicetree, linux-kernel
In-Reply-To: <20260417131549.3154534-1-thierry.reding@kernel.org>

From: Thierry Reding <treding@nvidia.com>

Peter sadly passed away a while ago, so I'll be taking over as contact
for this ABI documentation.

Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 Documentation/ABI/testing/sysfs-driver-tegra-fuse | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-tegra-fuse b/Documentation/ABI/testing/sysfs-driver-tegra-fuse
index b8936fad2ccf..47d5513100f6 100644
--- a/Documentation/ABI/testing/sysfs-driver-tegra-fuse
+++ b/Documentation/ABI/testing/sysfs-driver-tegra-fuse
@@ -1,6 +1,6 @@
 What:		/sys/devices/*/<our-device>/fuse
 Date:		February 2014
-Contact:	Peter De Schrijver <pdeschrijver@nvidia.com>
+Contact:	Thierry Reding <thierry.reding@kernel.org>
 Description:	read-only access to the efuses on Tegra20, Tegra30, Tegra114
 		and Tegra124 SoC's from NVIDIA. The efuses contain write once
 		data programmed at the factory. The data is laid out in 32bit
-- 
2.52.0



^ permalink raw reply related

* [PATCH v2 1/3] MAINTAINERS: Move Peter De Schrijver to CREDITS
From: Thierry Reding @ 2026-04-17 13:15 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Aaro Koskinen, Geert Uytterhoeven, linux-tegra, linux-arm-kernel,
	linux-pm, linux-omap, linux-m68k, devicetree, linux-kernel,
	Paul Walmsley

From: Thierry Reding <treding@nvidia.com>

Peter sadly passed away a while back. Paul did a much better job at
finding the right words to mourn this loss than I ever could, so I will
leave this link here:

  https://lore.kernel.org/lkml/alpine.DEB.2.21.999.2407240345480.11116@utopia.booyaka.com/T/#u

Co-developed-by: Paul Walmsley <pjw@kernel.org>
Co-developed-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Co-developed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
Changes in v2:
- add more missing entries

 CREDITS     | 10 ++++++++++
 MAINTAINERS |  1 -
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/CREDITS b/CREDITS
index 885fb05d8816..afd1f70b41cf 100644
--- a/CREDITS
+++ b/CREDITS
@@ -3645,7 +3645,17 @@ D: Macintosh IDE Driver
 
 N: Peter De Schrijver
 E: stud11@cc4.kuleuven.ac.be
+E: p2@mind.be
+E: peter.de-schrijver@nokia.com
+E: pdeschrijver@nvidia.com
+E: p2@psychaos.be
+D: Apollo Domain workstations
+D: Ariadne and Hydra Amiga Ethernet drivers
+D: IBM PS/2, Microchannel, and Token Ring support
 D: Mitsumi CD-ROM driver patches March version
+D: TWL4030 power management and audio codec driver
+D: OMAP power management
+D: NVIDIA Tegra clock and BPMP drivers, among many other things
 S: Molenbaan 29
 S: B2240 Zandhoven
 S: Belgium
diff --git a/MAINTAINERS b/MAINTAINERS
index ef978bfca514..ffe20d770249 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -26145,7 +26145,6 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux.git
 N:	[^a-z]tegra
 
 TEGRA CLOCK DRIVER
-M:	Peter De Schrijver <pdeschrijver@nvidia.com>
 M:	Prashant Gaikwad <pgaikwad@nvidia.com>
 S:	Supported
 F:	drivers/clk/tegra/
-- 
2.52.0



^ permalink raw reply related
