* [PATCH 08/18] arm64: fpsimd: Use assembler for baseline SME instructions
From: Mark Rutland @ 2026-05-21 13:25 UTC
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
We currently support assemblers which do not support SME instructions,
and have macros to manually encode SME instructions. This was
necessary historically as SME support was developed before assembler
support was widely available, but things have changed:
* All currently supported versions of LLVM support baseline SME
instructions. Building the kernel requires LLVM 15+, while LLVM 13+
supports SME.
* GNU binutils has supported baseline SME instructions since 2.38, which
was released on 09 February 2022. Toolchains using this or later are
widely available. For example Debian 12 (released on 10 June 2023)
provides binutils 2.40. Toolchains provided by kernel.org provide
binutils 2.38+ since the GCC 12.1.0 release (released between 06 May
2022 and 17 August 2022).
* For various reasons, SME support was marked as BROKEN, and re-enabled
in v6.16 (released on 27 July 2025). The earliest supported LTS kernel
with SME support is v6.18.y; v6.18 was tagged on 30 November 2025, and
contemporary toolchains (GCC 15.2 and binutils 2.45) supported
baseline SME instructions.
* Any distribution which intends to support SME will presumably have a
toolchain that supports baseline SME instructions such that userspace
can be built.
Considering the above, there's no practical benefit to allowing SME to
be built when the toolchain doesn't support baseline SME instructions.
Make CONFIG_ARM64_SME depend on assembler support for SME, and remove
the manual encoding of SME instructions. The various _sme_<insn> macros
are kept for now, and will be cleaned up in subsequent patches.
A couple of SME2 instructions require a more recent toolchain, and are
left as-is for now. I've looked through releases of binutils and LLVM to
find when support was added, and noted this in a comment.
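
For reference, the manual encoding that the removed _sme_rdsvl macro
performed can be reproduced in C. This is only an illustrative sketch: the
function name encode_rdsvl() is hypothetical and not part of the kernel; the
base opcode, field shifts and range checks are taken from the removed macro.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the removed _sme_rdsvl macro:
 *   RDSVL X<nx>, #<imm>
 * The name and signature are assumptions made for illustration.
 */
uint32_t encode_rdsvl(unsigned int nx, int imm)
{
	/* Same limits as _check_general_reg and _check_num in the macro */
	assert(nx <= 30);
	assert(imm >= -0x20 && imm <= 0x1f);

	/* Base opcode | Rd in bits [4:0] | 6-bit immediate in bits [10:5] */
	return 0x04bf5800u | nx | (((uint32_t)imm & 0x3f) << 5);
}
```

With the assembler doing this work, the range checks and bit packing above
no longer need to be maintained by hand in fpsimdmacros.h.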
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/Kconfig | 5 ++++
arch/arm64/include/asm/fpsimdmacros.h | 38 +++++++++++----------------
2 files changed, 20 insertions(+), 23 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fe60738e5943b..378e50fef247a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2247,10 +2247,15 @@ config ARM64_SVE
booting the kernel. If unsure and you are not observing these
symptoms, you should assume that it is safe to say Y.
+config AS_HAS_SME
+ # Supported by LLVM 13+ and binutils 2.38+
+ def_bool $(as-instr,.arch_extension sme)
+
config ARM64_SME
bool "ARM Scalable Matrix Extension support"
default y
depends on ARM64_SVE
+ depends on AS_HAS_SME
help
The Scalable Matrix Extension (SME) is an extension to the AArch64
execution state which utilises a substantial subset of the SVE
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index 1122eea6daacf..d0bdbbf2d44ad 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -148,46 +148,38 @@
pfalse p\np\().b
.endm
-/* SME instruction encodings for non-SME-capable assemblers */
-/* (pre binutils 2.38/LLVM 13) */
+/* Deprecated macros for SME instructions */
/* RDSVL X\nx, #\imm */
.macro _sme_rdsvl nx, imm
- _check_general_reg \nx
- _check_num (\imm), -0x20, 0x1f
- .inst 0x04bf5800 \
- | (\nx) \
- | (((\imm) & 0x3f) << 5)
+ .arch_extension sme
+ rdsvl x\nx, #\imm
.endm
/*
* STR (vector from ZA array):
- * STR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL]
+ * STR ZA[W\nw, #\offset], [X\nxbase, #\offset, MUL VL]
*/
.macro _sme_str_zav nw, nxbase, offset=0
- _sme_check_wv \nw
- _check_general_reg \nxbase
- _check_num (\offset), -0x100, 0xff
- .inst 0xe1200000 \
- | (((\nw) & 3) << 13) \
- | ((\nxbase) << 5) \
- | ((\offset) & 7)
+ .arch_extension sme
+ str za[w\nw, #\offset], [x\nxbase, #\offset, MUL VL]
.endm
/*
* LDR (vector to ZA array):
- * LDR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL]
+ * LDR ZA[w\nw, #\offset], [X\nxbase, #\offset, MUL VL]
*/
.macro _sme_ldr_zav nw, nxbase, offset=0
- _sme_check_wv \nw
- _check_general_reg \nxbase
- _check_num (\offset), -0x100, 0xff
- .inst 0xe1000000 \
- | (((\nw) & 3) << 13) \
- | ((\nxbase) << 5) \
- | ((\offset) & 7)
+ .arch_extension sme
+ ldr za[w\nw, #\offset], [x\nxbase, #\offset, MUL VL]
.endm
+/*
+ * SME2 instruction encodings for older assemblers.
+ * Supported by binutils 2.41+.
+ * Supported by LLVM 16+
+ */
+
/*
* LDR (ZT0)
*
--
2.30.2
* [PATCH 09/18] arm64: fpsimd: Move sve_get_vl() and sme_get_vl() inline
From: Mark Rutland @ 2026-05-21 13:25 UTC
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
The sve_get_vl() and sme_get_vl() functions are wrappers for the RDVL
and RDSVL instructions respectively. There's no need for those to be
out-of-line.
Replace the out-of-line assembly functions with equivalent inline
functions.
The _sve_rdvl assembly macro is unused, and so it is removed. The
_sme_rdsvl assembly macro is still used elsewhere, and so is kept for
now.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/fpsimd.h | 31 +++++++++++++++++++++++++--
arch/arm64/include/asm/fpsimdmacros.h | 6 ------
arch/arm64/kernel/entry-fpsimd.S | 10 ---------
3 files changed, 29 insertions(+), 18 deletions(-)
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 8efa3c0402a7a..36cf528e64971 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -22,6 +22,9 @@
#include <linux/stddef.h>
#include <linux/types.h>
+#define __SVE_PREAMBLE ".arch_extension sve\n"
+#define __SME_PREAMBLE ".arch_extension sme\n"
+
/* Masks for extracting the FPSR and FPCR from the FPSCR */
#define VFP_FPSCR_STAT_MASK 0xf800009f
#define VFP_FPSCR_CTRL_MASK 0x07f79f00
@@ -141,11 +144,23 @@ static inline void *thread_zt_state(struct thread_struct *thread)
return thread->sme_state + ZA_SIG_REGS_SIZE(sme_vq);
}
+static inline unsigned int sve_get_vl(void)
+{
+ unsigned int vl;
+
+ asm volatile(
+ __SVE_PREAMBLE
+ " rdvl %x[vl], #1\n"
+ : [vl] "=r" (vl)
+ );
+
+ return vl;
+}
+
extern void sve_save_state(void *state, u32 *pfpsr, int save_ffr);
extern void sve_load_state(void const *state, u32 const *pfpsr,
int restore_ffr);
extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
-extern unsigned int sve_get_vl(void);
extern void sme_save_state(void *state, int zt);
extern void sme_load_state(void const *state, int zt);
@@ -400,8 +415,20 @@ static inline int sme_max_virtualisable_vl(void)
return vec_max_virtualisable_vl(ARM64_VEC_SME);
}
+static inline unsigned int sme_get_vl(void)
+{
+ unsigned int vl;
+
+ asm volatile(
+ __SME_PREAMBLE
+ " rdsvl %x[vl], #1\n"
+ : [vl] "=r" (vl)
+ );
+
+ return vl;
+}
+
extern void sme_alloc(struct task_struct *task, bool flush);
-extern unsigned int sme_get_vl(void);
extern int sme_set_current_vl(unsigned long arg);
extern int sme_get_current_vl(void);
extern void sme_suspend_exit(void);
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index d0bdbbf2d44ad..d75c9d4c9989b 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -125,12 +125,6 @@
ldr p\np, [x\nxbase, #\offset, MUL VL]
.endm
-/* RDVL X\nx, #\imm */
-.macro _sve_rdvl nx, imm
- .arch_extension sve
- rdvl x\nx, #\imm
-.endm
-
/* RDFFR (unpredicated): RDFFR P\np.B */
.macro _sve_rdffr np
.arch_extension sve
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 88c555745b584..7f2d31dff8c17 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -57,11 +57,6 @@ SYM_FUNC_START(sve_load_state)
ret
SYM_FUNC_END(sve_load_state)
-SYM_FUNC_START(sve_get_vl)
- _sve_rdvl 0, 1
- ret
-SYM_FUNC_END(sve_get_vl)
-
/*
* Zero all SVE registers but the first 128-bits of each vector
*
@@ -84,11 +79,6 @@ SYM_FUNC_END(sve_flush_live)
#ifdef CONFIG_ARM64_SME
-SYM_FUNC_START(sme_get_vl)
- _sme_rdsvl 0, 1
- ret
-SYM_FUNC_END(sme_get_vl)
-
/*
* Save the ZA and ZT state
*
--
2.30.2
* [PATCH 06/18] arm64: fpsimd: Remove sve_set_vq() and sme_set_vq()
From: Mark Rutland @ 2026-05-21 13:25 UTC
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
The sve_set_vq() and sme_set_vq() assembly functions (and the
sve_load_vq and sme_load_vq macros they use) are open-coded forms of
sysreg_clear_set*(). There's no need for these to be implemented
out-of-line in assembly, and the 'vq_minus_1' argument is unusual and
confusing.
Use sysreg_clear_set_s() directly, where the necessary 'vq - 1' encoding
is more obviously part of encoding the register value.
For now, sve_flush_live() is left with the unusual vq_minus_1 argument.
This will be addressed in subsequent patches.
There should be no functional change as a result of this patch.
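
The read-modify-write pattern described above can be sketched in plain C.
This is a userspace illustration only: the fake_* names, the global variable
standing in for ZCR_EL1, and the 4-bit LEN mask are assumptions made for the
example, not kernel interfaces.

```c
#include <stdint.h>

/* Illustrative stand-in for a system register; not a kernel interface. */
uint64_t fake_zcr_el1;
unsigned int fake_zcr_writes;

/* Assumption for this sketch: LEN occupies the low 4 bits. */
#define FAKE_ZCR_LEN_MASK 0xfULL

/* Clear and set bits, skipping the write when the value is unchanged. */
void fake_sysreg_clear_set(uint64_t clear, uint64_t set)
{
	uint64_t old = fake_zcr_el1;
	uint64_t val = (old & ~clear) | set;

	if (val != old) {
		fake_zcr_el1 = val;
		fake_zcr_writes++;
	}
}

/* LEN holds (vq - 1), where vq is the vector length in 128-bit units. */
void fake_set_sve_vl(unsigned int vl_bytes)
{
	uint64_t vq = vl_bytes / 16;

	fake_sysreg_clear_set(FAKE_ZCR_LEN_MASK, vq - 1);
}
```

Passing 'vq - 1' at the call site keeps the register encoding next to the
register write, which is the readability point made above.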
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/fpsimd.h | 2 --
arch/arm64/include/asm/fpsimdmacros.h | 22 ----------------------
arch/arm64/kernel/entry-fpsimd.S | 10 ----------
arch/arm64/kernel/fpsimd.c | 24 +++++++++++++-----------
4 files changed, 13 insertions(+), 45 deletions(-)
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index d9d00b45ab115..8efa3c0402a7a 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -146,8 +146,6 @@ extern void sve_load_state(void const *state, u32 const *pfpsr,
int restore_ffr);
extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
extern unsigned int sve_get_vl(void);
-extern void sve_set_vq(unsigned long vq_minus_1);
-extern void sme_set_vq(unsigned long vq_minus_1);
extern void sme_save_state(void *state, int zt);
extern void sme_load_state(void const *state, int zt);
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index cda81d009c9bd..adf33d2da40c3 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -265,28 +265,6 @@
.purgem _for__body
.endm
-/* Update ZCR_EL1.LEN with the new VQ */
-.macro sve_load_vq xvqminus1, xtmp, xtmp2
- mrs_s \xtmp, SYS_ZCR_EL1
- bic \xtmp2, \xtmp, ZCR_ELx_LEN_MASK
- orr \xtmp2, \xtmp2, \xvqminus1
- cmp \xtmp2, \xtmp
- b.eq 921f
- msr_s SYS_ZCR_EL1, \xtmp2 //self-synchronising
-921:
-.endm
-
-/* Update SMCR_EL1.LEN with the new VQ */
-.macro sme_load_vq xvqminus1, xtmp, xtmp2
- mrs_s \xtmp, SYS_SMCR_EL1
- bic \xtmp2, \xtmp, SMCR_ELx_LEN_MASK
- orr \xtmp2, \xtmp2, \xvqminus1
- cmp \xtmp2, \xtmp
- b.eq 921f
- msr_s SYS_SMCR_EL1, \xtmp2 //self-synchronising
-921:
-.endm
-
/* Preserve the first 128-bits of Znz and zero the rest. */
.macro _sve_flush_z nz
_sve_check_zreg \nz
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 6325db1a2179c..88c555745b584 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -62,11 +62,6 @@ SYM_FUNC_START(sve_get_vl)
ret
SYM_FUNC_END(sve_get_vl)
-SYM_FUNC_START(sve_set_vq)
- sve_load_vq x0, x1, x2
- ret
-SYM_FUNC_END(sve_set_vq)
-
/*
* Zero all SVE registers but the first 128-bits of each vector
*
@@ -94,11 +89,6 @@ SYM_FUNC_START(sme_get_vl)
ret
SYM_FUNC_END(sme_get_vl)
-SYM_FUNC_START(sme_set_vq)
- sme_load_vq x0, x1, x2
- ret
-SYM_FUNC_END(sme_set_vq)
-
/*
* Save the ZA and ZT state
*
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index a8395cb303344..2578c2372c89e 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -377,8 +377,10 @@ static void task_fpsimd_load(void)
if (!thread_sm_enabled(&current->thread))
WARN_ON_ONCE(!test_and_set_thread_flag(TIF_SVE));
- if (test_thread_flag(TIF_SVE))
- sve_set_vq(sve_vq_from_vl(task_get_sve_vl(current)) - 1);
+ if (test_thread_flag(TIF_SVE)) {
+ unsigned long vq = sve_vq_from_vl(task_get_sve_vl(current));
+ sysreg_clear_set_s(SYS_ZCR_EL1, ZCR_ELx_LEN, vq - 1);
+ }
restore_sve_regs = true;
restore_ffr = true;
@@ -403,8 +405,10 @@ static void task_fpsimd_load(void)
unsigned long sme_vl = task_get_sme_vl(current);
/* Ensure VL is set up for restoring data */
- if (test_thread_flag(TIF_SME))
- sme_set_vq(sve_vq_from_vl(sme_vl) - 1);
+ if (test_thread_flag(TIF_SME)) {
+ unsigned long vq = sve_vq_from_vl(sme_vl);
+ sysreg_clear_set_s(SYS_SMCR_EL1, SMCR_ELx_LEN, vq - 1);
+ }
write_sysreg_s(current->thread.svcr, SYS_SVCR);
@@ -1332,10 +1336,9 @@ void do_sve_acc(unsigned long esr, struct pt_regs *regs)
* any effective streaming mode SVE state.
*/
if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
- unsigned long vq_minus_one =
- sve_vq_from_vl(task_get_sve_vl(current)) - 1;
- sve_set_vq(vq_minus_one);
- sve_flush_live(true, vq_minus_one);
+ unsigned long vq = sve_vq_from_vl(task_get_sve_vl(current));
+ sysreg_clear_set_s(SYS_ZCR_EL1, ZCR_ELx_LEN, vq - 1);
+ sve_flush_live(true, vq - 1);
fpsimd_bind_task_to_cpu();
} else {
fpsimd_to_sve(current);
@@ -1465,9 +1468,8 @@ void do_sme_acc(unsigned long esr, struct pt_regs *regs)
WARN_ON(1);
if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
- unsigned long vq_minus_one =
- sve_vq_from_vl(task_get_sme_vl(current)) - 1;
- sme_set_vq(vq_minus_one);
+ unsigned long vq = sve_vq_from_vl(task_get_sme_vl(current));
+ sysreg_clear_set_s(SYS_SMCR_EL1, SMCR_ELx_LEN, vq - 1);
fpsimd_bind_task_to_cpu();
} else {
--
2.30.2
* [PATCH 07/18] arm64: fpsimd: Use assembler for SVE instructions
From: Mark Rutland @ 2026-05-21 13:25 UTC
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
Historically we supported assemblers which could not assemble SVE
instructions. We dropped support for such assemblers in commit:
118c40b7b503 ("kbuild: require gcc-8 and binutils-2.30")
Since that commit, all supported assemblers (binutils and LLVM) are
capable of assembling SVE instructions, and there's no need for us to
manually encode SVE instructions.
Rely on the assembler to encode SVE instructions, and remove the manual
encoding. The various _sve_<insn> macros are kept for now, and will be
cleaned up in subsequent patches.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/fpsimdmacros.h | 64 +++++++--------------------
1 file changed, 16 insertions(+), 48 deletions(-)
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index adf33d2da40c3..1122eea6daacf 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -99,85 +99,53 @@
.endif
.endm
-/* SVE instruction encodings for non-SVE-capable assemblers */
-/* (pre binutils 2.28, all kernel capable clang versions support SVE) */
+/* Deprecated macros for SVE instructions */
/* STR (vector): STR Z\nz, [X\nxbase, #\offset, MUL VL] */
.macro _sve_str_v nz, nxbase, offset=0
- _sve_check_zreg \nz
- _check_general_reg \nxbase
- _check_num (\offset), -0x100, 0xff
- .inst 0xe5804000 \
- | (\nz) \
- | ((\nxbase) << 5) \
- | (((\offset) & 7) << 10) \
- | (((\offset) & 0x1f8) << 13)
+ .arch_extension sve
+ str z\nz, [X\nxbase, #\offset, MUL VL]
.endm
/* LDR (vector): LDR Z\nz, [X\nxbase, #\offset, MUL VL] */
.macro _sve_ldr_v nz, nxbase, offset=0
- _sve_check_zreg \nz
- _check_general_reg \nxbase
- _check_num (\offset), -0x100, 0xff
- .inst 0x85804000 \
- | (\nz) \
- | ((\nxbase) << 5) \
- | (((\offset) & 7) << 10) \
- | (((\offset) & 0x1f8) << 13)
+ .arch_extension sve
+ ldr z\nz, [X\nxbase, #\offset, MUL VL]
.endm
/* STR (predicate): STR P\np, [X\nxbase, #\offset, MUL VL] */
.macro _sve_str_p np, nxbase, offset=0
- _sve_check_preg \np
- _check_general_reg \nxbase
- _check_num (\offset), -0x100, 0xff
- .inst 0xe5800000 \
- | (\np) \
- | ((\nxbase) << 5) \
- | (((\offset) & 7) << 10) \
- | (((\offset) & 0x1f8) << 13)
+ .arch_extension sve
+ str p\np, [X\nxbase, #\offset, MUL VL]
.endm
/* LDR (predicate): LDR P\np, [X\nxbase, #\offset, MUL VL] */
.macro _sve_ldr_p np, nxbase, offset=0
- _sve_check_preg \np
- _check_general_reg \nxbase
- _check_num (\offset), -0x100, 0xff
- .inst 0x85800000 \
- | (\np) \
- | ((\nxbase) << 5) \
- | (((\offset) & 7) << 10) \
- | (((\offset) & 0x1f8) << 13)
+ .arch_extension sve
+ ldr p\np, [x\nxbase, #\offset, MUL VL]
.endm
/* RDVL X\nx, #\imm */
.macro _sve_rdvl nx, imm
- _check_general_reg \nx
- _check_num (\imm), -0x20, 0x1f
- .inst 0x04bf5000 \
- | (\nx) \
- | (((\imm) & 0x3f) << 5)
+ .arch_extension sve
+ rdvl x\nx, #\imm
.endm
/* RDFFR (unpredicated): RDFFR P\np.B */
.macro _sve_rdffr np
- _sve_check_preg \np
- .inst 0x2519f000 \
- | (\np)
+ .arch_extension sve
+ rdffr p\np\().b
.endm
/* WRFFR P\np.B */
.macro _sve_wrffr np
- _sve_check_preg \np
- .inst 0x25289000 \
- | ((\np) << 5)
+ wrffr p\np\().b
.endm
/* PFALSE P\np.B */
.macro _sve_pfalse np
- _sve_check_preg \np
- .inst 0x2518e400 \
- | (\np)
+ .arch_extension sve
+ pfalse p\np\().b
.endm
/* SME instruction encodings for non-SME-capable assemblers */
--
2.30.2
* [PATCH 03/18] KVM: arm64: pkvm: Save host FPMR in host cpu context
From: Mark Rutland @ 2026-05-21 13:25 UTC
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
Protected KVM stores most of the host's system register state in
kvm_host_data::host_ctxt, which is an instance of struct
kvm_cpu_context. As kvm_cpu_context::sys_regs[] has a slot for FPMR, we
can store the host's FPMR there.
Do so, and remove kvm_host_data::fpmr.
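
The idea of reusing an existing per-register slot instead of a dedicated
field can be sketched as below. All demo_* names are hypothetical, and the
accessor is a simplified stand-in written for this example; it is not the
kernel's ctxt_sys_reg() implementation.

```c
#include <stdint.h>

/* Hypothetical register indices; the real enum has many more entries. */
enum demo_sysreg { DEMO_ZCR_EL1, DEMO_FPMR, DEMO_NR_SYS_REGS };

struct demo_cpu_context {
	uint64_t sys_regs[DEMO_NR_SYS_REGS];
};

/* Simplified accessor: expands to an lvalue for the register's slot. */
#define demo_ctxt_sys_reg(ctxt, reg) ((ctxt)->sys_regs[DEMO_##reg])

/* Save a value read from hardware into the context's FPMR slot. */
void demo_save_fpmr(struct demo_cpu_context *ctxt, uint64_t hw_fpmr)
{
	demo_ctxt_sys_reg(ctxt, FPMR) = hw_fpmr;
}

/* Return the value that would be written back to hardware. */
uint64_t demo_restore_fpmr(const struct demo_cpu_context *ctxt)
{
	return demo_ctxt_sys_reg(ctxt, FPMR);
}
```

Because the accessor yields an lvalue, the same expression serves for both
the save and the restore path.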
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/kvm_host.h | 3 ---
arch/arm64/kvm/hyp/include/hyp/switch.h | 6 ++++--
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 5 +++--
3 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 65eead8362e0b..42b1c4764a4bf 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -775,9 +775,6 @@ struct kvm_host_data {
*/
struct cpu_sve_state *sve_state;
- /* Used by pKVM only. */
- u64 fpmr;
-
/* Ownership of the FP regs */
enum {
FP_STATE_FREE,
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 98b2976837b11..cc4d011a2b380 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -554,6 +554,8 @@ static inline void fpsimd_lazy_switch_to_host(struct kvm_vcpu *vcpu)
static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)
{
+ struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
+
/*
* Non-protected kvm relies on the host restoring its sve state.
* Protected kvm restores the host's sve state as not to reveal that
@@ -562,11 +564,11 @@ static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)
if (system_supports_sve()) {
__hyp_sve_save_host();
} else {
- __fpsimd_save_state(host_data_ptr(host_ctxt.fp_regs));
+ __fpsimd_save_state(&hctxt->fp_regs);
}
if (kvm_has_fpmr(kern_hyp_va(vcpu->kvm)))
- *host_data_ptr(fpmr) = read_sysreg_s(SYS_FPMR);
+ ctxt_sys_reg(hctxt, FPMR) = read_sysreg_s(SYS_FPMR);
}
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 06db299c37a89..db60f770060e5 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -66,6 +66,7 @@ static void fpsimd_sve_flush(void)
static void fpsimd_sve_sync(struct kvm_vcpu *vcpu)
{
+ struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
bool has_fpmr;
if (!guest_owns_fp_regs())
@@ -89,10 +90,10 @@ static void fpsimd_sve_sync(struct kvm_vcpu *vcpu)
if (system_supports_sve())
__hyp_sve_restore_host();
else
- __fpsimd_restore_state(host_data_ptr(host_ctxt.fp_regs));
+ __fpsimd_restore_state(&hctxt->fp_regs);
if (has_fpmr)
- write_sysreg_s(*host_data_ptr(fpmr), SYS_FPMR);
+ write_sysreg_s(ctxt_sys_reg(hctxt, FPMR), SYS_FPMR);
*host_data_ptr(fp_owner) = FP_STATE_HOST_OWNED;
}
--
2.30.2
* [PATCH 05/18] arm64: fpsimd: Fold sve_init_regs() into do_sve_acc()
From: Mark Rutland @ 2026-05-21 13:25 UTC
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
For historical reasons, do_sve_acc() is structurally different from
do_sme_acc(), and the logic to convert the task from FPSIMD to SVE is
out-of-line in sve_init_regs(). We only use sve_init_regs() within
do_sve_acc(), so it's not necessary for this to be a separate function.
Fold sve_init_regs() into do_sve_acc(), and simplify the associated
comments. This makes do_sve_acc() structurally similar to do_sme_acc(),
making it easier to see similarities and differences.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/kernel/fpsimd.c | 48 ++++++++++++++------------------------
1 file changed, 17 insertions(+), 31 deletions(-)
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 60a45d600b460..a8395cb303344 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1293,31 +1293,6 @@ void sme_suspend_exit(void)
#endif /* CONFIG_ARM64_SME */
-static void sve_init_regs(void)
-{
- /*
- * Convert the FPSIMD state to SVE, zeroing all the state that
- * is not shared with FPSIMD. If (as is likely) the current
- * state is live in the registers then do this there and
- * update our metadata for the current task including
- * disabling the trap, otherwise update our in-memory copy.
- * We are guaranteed to not be in streaming mode, we can only
- * take a SVE trap when not in streaming mode and we can't be
- * in streaming mode when taking a SME trap.
- */
- if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
- unsigned long vq_minus_one =
- sve_vq_from_vl(task_get_sve_vl(current)) - 1;
- sve_set_vq(vq_minus_one);
- sve_flush_live(true, vq_minus_one);
- fpsimd_bind_task_to_cpu();
- } else {
- fpsimd_to_sve(current);
- current->thread.fp_type = FP_STATE_SVE;
- fpsimd_flush_task_state(current);
- }
-}
-
/*
* Trapped SVE access
*
@@ -1349,13 +1324,24 @@ void do_sve_acc(unsigned long esr, struct pt_regs *regs)
WARN_ON(1); /* SVE access shouldn't have trapped */
/*
- * Even if the task can have used streaming mode we can only
- * generate SVE access traps in normal SVE mode and
- * transitioning out of streaming mode may discard any
- * streaming mode state. Always clear the high bits to avoid
- * any potential errors tracking what is properly initialised.
+ * Convert the FPSIMD state to SVE. Stale SVE state can be present in
+ * registers or memory, so we must zero all state that is not shared
+ * with FPSIMD.
+ *
+ * SVE traps cannot be taken from streaming mode, so there cannot be
+ * any effective streaming mode SVE state.
*/
- sve_init_regs();
+ if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
+ unsigned long vq_minus_one =
+ sve_vq_from_vl(task_get_sve_vl(current)) - 1;
+ sve_set_vq(vq_minus_one);
+ sve_flush_live(true, vq_minus_one);
+ fpsimd_bind_task_to_cpu();
+ } else {
+ fpsimd_to_sve(current);
+ current->thread.fp_type = FP_STATE_SVE;
+ fpsimd_flush_task_state(current);
+ }
put_cpu_fpsimd_context();
}
--
2.30.2
* [PATCH 04/18] KVM: arm64: pkvm: Remove struct cpu_sve_state
From: Mark Rutland @ 2026-05-21 13:25 UTC
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
There's no need for struct cpu_sve_state. Code would be simpler and more
robust without it, and removing it will simplify further cleanups (e.g.
adding an opaque type for the sve register state).
Protected KVM stores most of the host's system register state in
kvm_host_data::host_ctxt, which is an instance of struct
kvm_cpu_context. As kvm_cpu_context::sys_regs[] has a slot for ZCR_EL1,
we can store the host's ZCR_EL1 there.
While kvm_cpu_context::sys_regs doesn't have slots for FPSR and FPCR,
these are usually expected to be stored in struct user_fpsimd_state.
For historical reasons, __sve_save_state() and __sve_restore_state()
expect a pointer to fpsr *within* struct user_fpsimd_state, assuming the
fpcr will immediately follow, as per the order within struct
user_fpsimd_state. We currently match this ordering in struct
cpu_sve_state, but it would be simpler and more robust to use struct
user_fpsimd_state directly.
After moving ZCR_EL1, FPSR, and FPCR out of struct cpu_sve_state, all
that's left is sve_regs, which can be represented as a pointer without
need for a container struct. This is kept as a pointer to u8 (matching
the array type), as this permits the compiler to catch unbalanced
referencing/dereferencing, which is not possible for pointers to void.
Apply the above changes, and remove cpu_sve_state.
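
The layout assumption described above (fpcr immediately following fpsr) can
be checked at compile time. The sketch below uses a hypothetical struct,
demo_fpsimd_state, laid out in the same order as struct user_fpsimd_state;
the struct and helper names are assumptions for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the field order in struct user_fpsimd_state. */
struct demo_fpsimd_state {
	uint64_t vregs[32][2];	/* 32 x 128-bit vector registers */
	uint32_t fpsr;
	uint32_t fpcr;
	uint32_t reserved[2];
};

/* A routine given only a pointer to fpsr relies on this ordering. */
_Static_assert(offsetof(struct demo_fpsimd_state, fpcr) ==
	       offsetof(struct demo_fpsimd_state, fpsr) + sizeof(uint32_t),
	       "fpcr must immediately follow fpsr");

/* Distance in bytes from fpsr to fpcr within the struct. */
size_t demo_fpcr_offset_from_fpsr(void)
{
	return offsetof(struct demo_fpsimd_state, fpcr) -
	       offsetof(struct demo_fpsimd_state, fpsr);
}
```

Using one shared struct means the ordering is defined in a single place,
rather than being duplicated in a second container type.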
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/kvm_host.h | 18 ++----------------
arch/arm64/include/asm/kvm_pkvm.h | 3 +--
arch/arm64/kvm/arm.c | 16 ++++++++--------
arch/arm64/kvm/hyp/include/hyp/switch.h | 9 +++++----
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 9 +++++----
arch/arm64/kvm/hyp/nvhe/setup.c | 4 ++--
6 files changed, 23 insertions(+), 36 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 42b1c4764a4bf..ae24617380b8f 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -732,20 +732,6 @@ struct kvm_cpu_context {
u64 *vncr_array;
};
-struct cpu_sve_state {
- __u64 zcr_el1;
-
- /*
- * Ordering is important since __sve_save_state/__sve_restore_state
- * relies on it.
- */
- __u32 fpsr;
- __u32 fpcr;
-
- /* Must be SVE_VQ_BYTES (128 bit) aligned. */
- __u8 sve_regs[];
-};
-
/*
* This structure is instantiated on a per-CPU basis, and contains
* data that is:
@@ -771,9 +757,9 @@ struct kvm_host_data {
/*
* Hyp VA.
- * sve_state is only used in pKVM and if system_supports_sve().
+ * sve_regs is only used in pKVM and if system_supports_sve().
*/
- struct cpu_sve_state *sve_state;
+ u8 *sve_regs;
/* Ownership of the FP regs */
enum {
diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h
index 2954b311128c7..74fedd9c5ff02 100644
--- a/arch/arm64/include/asm/kvm_pkvm.h
+++ b/arch/arm64/include/asm/kvm_pkvm.h
@@ -188,8 +188,7 @@ static inline size_t pkvm_host_sve_state_size(void)
if (!system_supports_sve())
return 0;
- return size_add(sizeof(struct cpu_sve_state),
- SVE_SIG_REGS_SIZE(sve_vq_from_vl(kvm_host_sve_max_vl)));
+ return SVE_SIG_REGS_SIZE(sve_vq_from_vl(kvm_host_sve_max_vl));
}
struct pkvm_mapping {
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 8bb2c7422cc8b..f9fc85a0344e1 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -2499,10 +2499,10 @@ static void __init teardown_hyp_mode(void)
continue;
if (free_sve) {
- struct cpu_sve_state *sve_state;
+ u8 *sve_regs;
- sve_state = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state;
- free_pages((unsigned long) sve_state, pkvm_host_sve_state_order());
+ sve_regs = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs;
+ free_pages((unsigned long) sve_regs, pkvm_host_sve_state_order());
}
free_pages(kvm_nvhe_sym(kvm_arm_hyp_percpu_base)[cpu], nvhe_percpu_order());
@@ -2627,7 +2627,7 @@ static int init_pkvm_host_sve_state(void)
if (!page)
return -ENOMEM;
- per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state = page_address(page);
+ per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs = page_address(page);
}
/*
@@ -2648,11 +2648,11 @@ static void finalize_init_hyp_mode(void)
if (system_supports_sve() && is_protected_kvm_enabled()) {
for_each_possible_cpu(cpu) {
- struct cpu_sve_state *sve_state;
+ u8 *sve_regs;
- sve_state = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state;
- per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state =
- kern_hyp_va(sve_state);
+ sve_regs = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs;
+ per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs =
+ kern_hyp_va(sve_regs);
}
}
}
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index cc4d011a2b380..6512dd3f75ae4 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -484,12 +484,13 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
static inline void __hyp_sve_save_host(void)
{
- struct cpu_sve_state *sve_state = *host_data_ptr(sve_state);
+ struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
+ u8 *sve_regs = *host_data_ptr(sve_regs);
- sve_state->zcr_el1 = read_sysreg_el1(SYS_ZCR);
+ ctxt_sys_reg(hctxt, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);
write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
- __sve_save_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
- &sve_state->fpsr,
+ __sve_save_state(sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
+ &hctxt->fp_regs.fpsr,
true);
}
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index db60f770060e5..04a6d2e0ea73f 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -41,7 +41,8 @@ static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu)
static void __hyp_sve_restore_host(void)
{
- struct cpu_sve_state *sve_state = *host_data_ptr(sve_state);
+ struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
+ u8 *sve_regs = *host_data_ptr(sve_regs);
/*
* On saving/restoring host sve state, always use the maximum VL for
@@ -53,10 +54,10 @@ static void __hyp_sve_restore_host(void)
* need to be revisited.
*/
write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
- __sve_restore_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
- &sve_state->fpsr,
+ __sve_restore_state(sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
+ &hctxt->fp_regs.fpsr,
true);
- write_sysreg_el1(sve_state->zcr_el1, SYS_ZCR);
+ write_sysreg_el1(ctxt_sys_reg(hctxt, ZCR_EL1), SYS_ZCR);
}
static void fpsimd_sve_flush(void)
diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index d461981616d90..cdaf53c833409 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -82,9 +82,9 @@ static int pkvm_create_host_sve_mappings(void)
for (i = 0; i < hyp_nr_cpus; i++) {
struct kvm_host_data *host_data = per_cpu_ptr(&kvm_host_data, i);
- struct cpu_sve_state *sve_state = host_data->sve_state;
+ u8 *sve_regs = host_data->sve_regs;
- start = kern_hyp_va(sve_state);
+ start = kern_hyp_va(sve_regs);
end = start + PAGE_ALIGN(pkvm_host_sve_state_size());
ret = pkvm_create_mappings(start, end, PAGE_HYP);
if (ret)
--
2.30.2
* [PATCH 02/18] KVM: arm64: Don't override FFR save/restore argument
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
The __sve_save_state() and __sve_restore_state() functions take a
parameter describing whether to save/restore the FFR, but both functions
silently override this with '1'. This has always been benign (and
callers have all passed 'true' since the parameter was introduced), but
clearly this is not intentional.
Historically, the functions always saved/restored the FFR, and there was
no parameter to control this.
In v5.16, the sve_save and sve_load assembly macros used by
__sve_save_state() and __sve_restore_state() were changed to make
saving/restoring FFR optional. The implementations of __sve_save_state()
and __sve_restore_state() were changed to pass '1' to their respective
macros, and the prototypes of __sve_save_state() and
__sve_restore_state() were unchanged. See commit:
9f5848665788 ("arm64/sve: Make access to FFR optional")
In v6.10, the prototypes of __sve_save_state() and __sve_restore_state()
were changed to add 'save_ffr' and 'restore_ffr' parameters
respectively, but the implementations were not changed to stop passing 1
to their respective macros. All callers were changed to pass 'true' to
__sve_save_state() and __sve_restore_state(). See commit:
45f4ea9bcfe9 ("KVM: arm64: Fix prototype for __sve_save_state/__sve_restore_state")
This is all benign, but clearly unintentional, and it gets in the way of
cleaning up the FPSIMD/SVE/SME code. Remove the unnecessary overriding.
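As an illustration only, the behaviour described above can be modelled in
plain user-space C. The names below (struct sve_model, model_sve_save_old(),
model_sve_save_new()) are hypothetical and are not part of the kernel; the
real routines are the assembly functions in hyp/fpsimd.S touched by this
patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for saved state: records whether FFR was saved. */
struct sve_model {
	bool ffr_saved;
};

/* Models the old routine: "mov x2, #1" discards the caller's argument. */
static void model_sve_save_old(struct sve_model *m, bool save_ffr)
{
	save_ffr = true;
	m->ffr_saved = save_ffr;
}

/* Models the routine after this patch: the caller's argument is honoured. */
static void model_sve_save_new(struct sve_model *m, bool save_ffr)
{
	m->ffr_saved = save_ffr;
}

/* Every existing caller passes 'true', so both variants agree for them. */
static void model_check_existing_callers(void)
{
	struct sve_model a = { false }, b = { false };

	model_sve_save_old(&a, true);
	model_sve_save_new(&b, true);
	assert(a.ffr_saved == b.ffr_saved);
}
```

The two variants only differ for a caller that passes 'false', which is why
the override has been benign so far.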
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/kvm/hyp/fpsimd.S | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/arm64/kvm/hyp/fpsimd.S b/arch/arm64/kvm/hyp/fpsimd.S
index e950875e31cee..6e16cbfc5df27 100644
--- a/arch/arm64/kvm/hyp/fpsimd.S
+++ b/arch/arm64/kvm/hyp/fpsimd.S
@@ -21,13 +21,11 @@ SYM_FUNC_START(__fpsimd_restore_state)
SYM_FUNC_END(__fpsimd_restore_state)
SYM_FUNC_START(__sve_restore_state)
- mov x2, #1
sve_load 0, x1, x2, 3
ret
SYM_FUNC_END(__sve_restore_state)
SYM_FUNC_START(__sve_save_state)
- mov x2, #1
sve_save 0, x1, x2, 3
ret
SYM_FUNC_END(__sve_save_state)
--
2.30.2
* [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
Hi.
This series cleans up low-level FPSIMD/SVE/SME state management code,
making it easier to maintain and extend (e.g. adding SME support to
KVM), and enabling better debugging (e.g. by making SVE/SME save/restore
visible to KASAN and KCSAN).
This is purely cleanup; there are NO bugs addressed by this series.
The series aims to do a few key things:
* Make it harder to mis-manage in-memory SVE state and SME state. These
are given opaque types (struct sve_state and struct sme_state), and
the (awkward) calling convention for saving/restoring SVE state is
simplified to take a pointer to the base of the state rather than a
pointer to the FFR within the state.
* Minimize duplication between KVM and the rest of the kernel. The
FPSIMD/SVE/SME routines are moved to inline assembly such that the
same helper functions can be used everywhere, without the need to wrap
assembly macros.
* Make the code easier to follow. Assembly sequences are minimized to
avoid address generation and control-flow that can be written more
clearly in C. Awkward assembly macros are removed where possible.
* Make it easier to debug state management. Explicit instrumentation is
added to the save/restore functions so that KASAN and KCSAN can detect
memory safety issues and concurrency issues.
This instrumentation is inhibited for nVHE hyp objects, and does not
adversely affect KVM. I've confirmed by looking at compiler flags
during the build, and disassembling the relevant object files.
* Remove unnecessary code. By relying on assembler support for SVE and
SME we can remove awkward assembly macros, making the code
significantly simpler and easier to read.
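For readers unfamiliar with the existing convention mentioned in the first
point: callers currently pass 'sve_regs + sve_ffr_offset(vl)' rather than the
base of the buffer. A rough user-space model of that arithmetic follows. It
assumes the in-memory layout is 32 Z registers of VL bytes each, then 16 P
registers of VL/8 bytes each, then the FFR. All model_*() and MODEL_* names
are hypothetical and are not the kernel's own helpers.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_SVE_VQ_BYTES	16	/* one quadword (128 bits) */
#define MODEL_SVE_NUM_ZREGS	32
#define MODEL_SVE_NUM_PREGS	16

/* Vector length in bytes -> number of quadwords. */
static unsigned int model_sve_vq_from_vl(unsigned int vl)
{
	assert(vl != 0 && vl % MODEL_SVE_VQ_BYTES == 0);
	return vl / MODEL_SVE_VQ_BYTES;
}

/* Byte offset of the FFR from the base of the (assumed) Z/P/FFR layout. */
static size_t model_sve_ffr_offset(unsigned int vl)
{
	unsigned int vq = model_sve_vq_from_vl(vl);
	size_t zreg_bytes = (size_t)vq * MODEL_SVE_VQ_BYTES;
	size_t preg_bytes = zreg_bytes / 8;

	return MODEL_SVE_NUM_ZREGS * zreg_bytes +
	       MODEL_SVE_NUM_PREGS * preg_bytes;
}

/* Old-style argument: a pointer into the middle of the state buffer. */
static unsigned char *model_ffr_pointer(unsigned char *base, unsigned int vl)
{
	return base + model_sve_ffr_offset(vl);
}
```

Passing the base pointer instead keeps this offset calculation in one place,
inside the save/restore helper, rather than at every call site.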
I've compile-tested this with a variety of toolchains:
* GCC 8.1.0 + binutils 2.30
* GCC 11.1.0 + binutils 2.36.1
* GCC 12.1.0 + binutils 2.38
* GCC 15.2.0 + binutils 2.45
* LLVM 15.0.7
* LLVM 21.1.8
I've boot-tested on an SVE+SME capable model, both with KASAN enabled
and without KASAN enabled. All the FPSIMD/SVE/SME kselftests passed in
both configurations, without any KASAN splats. Unfortunately, with KCSAN
enabled, some tests hit timeouts (without any KCSAN splat), which I
believe is simply due to the overhead of KCSAN rather than some adverse
functional effect.
I've boot-tested on an SVE+SME capable model, booting with KVM in each
of:
* VHE mode
* hVHE mode
* Protected mode
In each case I've boot-tested a v7.0 defconfig guest, both with SVE and
without SVE.
Mark.
Mark Rutland (18):
KVM: arm64: Don't include <asm/fpsimdmacros.h>
KVM: arm64: Don't override FFR save/restore argument
KVM: arm64: pkvm: Save host FPMR in host cpu context
KVM: arm64: pkvm: Remove struct cpu_sve_state
arm64: fpsimd: Fold sve_init_regs() into do_sve_acc()
arm64: fpsimd: Remove sve_set_vq() and sme_set_vq()
arm64: fpsimd: Use assembler for SVE instructions
arm64: fpsimd: Use assembler for baseline SME instructions
arm64: fpsimd: Move sve_get_vl() and sme_get_vl() inline
arm64: sysreg: Add FPCR and FPSR
arm64: fpsimd: Split FPSR/FPCR from SVE save/restore
arm64: fpsimd: Move fpsimd save/restore inline
arm64: fpsimd: Use opaque type for SVE state
arm64: fpsimd: Use opaque type for SME state
arm64: fpsimd: Move SVE save/restore inline
arm64: fpsimd: Move sve_flush_live() inline
arm64: fpsimd: Move SME save/restore inline
arm64: fpsimd: Remove <asm/fpsimdmacros.h>
arch/arm64/Kconfig | 5 +
arch/arm64/include/asm/fpsimd.h | 369 ++++++++++++++++++++++--
arch/arm64/include/asm/fpsimdmacros.h | 357 -----------------------
arch/arm64/include/asm/kvm_host.h | 27 +-
arch/arm64/include/asm/kvm_hyp.h | 5 -
arch/arm64/include/asm/kvm_pkvm.h | 3 +-
arch/arm64/include/asm/processor.h | 7 +-
arch/arm64/kernel/Makefile | 2 +-
arch/arm64/kernel/entry-common.c | 8 +-
arch/arm64/kernel/entry-fpsimd.S | 134 ---------
arch/arm64/kernel/fpsimd.c | 90 +++---
arch/arm64/kvm/arm.c | 16 +-
arch/arm64/kvm/guest.c | 4 +-
arch/arm64/kvm/hyp/entry.S | 1 -
arch/arm64/kvm/hyp/fpsimd.S | 33 ---
arch/arm64/kvm/hyp/include/hyp/switch.h | 23 +-
arch/arm64/kvm/hyp/nvhe/Makefile | 2 +-
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 20 +-
arch/arm64/kvm/hyp/nvhe/setup.c | 4 +-
arch/arm64/kvm/hyp/vhe/Makefile | 2 +-
arch/arm64/tools/sysreg | 45 +++
21 files changed, 480 insertions(+), 677 deletions(-)
delete mode 100644 arch/arm64/include/asm/fpsimdmacros.h
delete mode 100644 arch/arm64/kernel/entry-fpsimd.S
delete mode 100644 arch/arm64/kvm/hyp/fpsimd.S
--
2.30.2
* [PATCH 01/18] KVM: arm64: Don't include <asm/fpsimdmacros.h>
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
There's no need for hyp/entry.S to include <asm/fpsimdmacros.h>.
The fpsimd macros have never been used by code in hyp/entry.S, and were
instead used by code in hyp/fpsimd.S.
Remove the unnecessary include.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/kvm/hyp/entry.S | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index 11a10d8f5beb2..308100ed25de9 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -8,7 +8,6 @@
#include <asm/alternative.h>
#include <asm/assembler.h>
-#include <asm/fpsimdmacros.h>
#include <asm/kvm.h>
#include <asm/kvm_arm.h>
#include <asm/kvm_asm.h>
--
2.30.2
* Re: [PATCH v22 08/13] mfd: core: Add firmware-node support to MFD cells
From: Lee Jones @ 2026-05-21 13:24 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Shivendra Pratap, Sebastian Reichel, Mark Rutland,
Lorenzo Pieralisi, Rafael J. Wysocki, Daniel Lezcano,
Christian Loehle, Ulf Hansson, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Bjorn Andersson, Konrad Dybcio, Arnd Bergmann,
Souvik Chakravarty, Andy Yan, Matthias Brugger, John Stultz,
Moritz Fischer, Sudeep Holla, linux-pm, linux-kernel,
linux-arm-msm, linux-arm-kernel, devicetree, Florian Fainelli,
Krzysztof Kozlowski, Dmitry Baryshkov, Mukesh Ojha, Andre Draszik,
Greg Kroah-Hartman, Kathiravan Thirumoorthy, Srinivas Kandagatla,
Bartosz Golaszewski
In-Reply-To: <CAMRc=MfqaCjiALZyVBHQs=Taft1M9xmNTFvQHWPrd5PgcTfJDQ@mail.gmail.com>
On Thu, 21 May 2026, Bartosz Golaszewski wrote:
> On Thu, May 21, 2026 at 1:26 PM Lee Jones <lee@kernel.org> wrote:
> >
> > On Thu, 14 May 2026, Shivendra Pratap wrote:
> >
> > > MFD core has no way to register a child device using an explicit firmware
> > > node. This prevents drivers from registering child nodes when those nodes
> > > do not define a compatible string. One such example is the PSCI
> > > "reboot-mode" node, which omits a compatible string as it describes
> > > boot-states provided by the underlying firmware.
> > >
> > > Extend struct mfd_cell with a callback that allows drivers to provide an
> > > explicit firmware node. The node is added to the MFD child device during
> > > registration when none is assigned by device tree, ACPI, or software
> > > matching.
> > >
> > > Suggested-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
> > > Signed-off-by: Shivendra Pratap <shivendra.pratap@oss.qualcomm.com>
> > > ---
> > > drivers/mfd/mfd-core.c | 30 ++++++++++++++++++++++++++++++
> > > include/linux/mfd/core.h | 14 ++++++++++++++
> > > 2 files changed, 44 insertions(+)
> > >
> > > diff --git a/drivers/mfd/mfd-core.c b/drivers/mfd/mfd-core.c
> > > index 7aa32b90cf1eb7fa0a05bf3dc506e60a262c9850..cc2a2a924d6d3044e29a9f864b536ee325ed797b 100644
> > > --- a/drivers/mfd/mfd-core.c
> > > +++ b/drivers/mfd/mfd-core.c
> > > @@ -10,6 +10,7 @@
> > > #include <linux/kernel.h>
> > > #include <linux/platform_device.h>
> > > #include <linux/acpi.h>
> > > +#include <linux/fwnode.h>
> > > #include <linux/list.h>
> > > #include <linux/property.h>
> > > #include <linux/mfd/core.h>
> > > @@ -148,6 +149,11 @@ static int mfd_match_of_node_to_dev(struct platform_device *pdev,
> > > return 0;
> > > }
> > >
> > > +static void mfd_child_fwnode_put(void *data)
> > > +{
> > > + fwnode_handle_put(data);
> > > +}
> > > +
> > > static int mfd_add_device(struct device *parent, int id,
> > > const struct mfd_cell *cell,
> > > struct resource *mem_base,
> > > @@ -156,6 +162,7 @@ static int mfd_add_device(struct device *parent, int id,
> > > struct resource *res;
> > > struct platform_device *pdev;
> > > struct mfd_of_node_entry *of_entry, *tmp;
> > > + struct fwnode_handle *fwnode;
> > > bool disabled = false;
> > > int ret = -ENOMEM;
> > > int platform_id;
> > > @@ -224,6 +231,29 @@ static int mfd_add_device(struct device *parent, int id,
> > >
> > > mfd_acpi_add_device(cell, pdev);
> > >
> > > + if (!pdev->dev.fwnode && cell->get_child_fwnode) {
> > > + fwnode = cell->get_child_fwnode(parent);
> > > + if (fwnode) {
> > > + device_set_node(&pdev->dev, fwnode);
> > > +
> > > + /*
> > > + * platform_device_release() drops only of_node refs.
> > > + * Track non-OF fwnodes explicitly so they are put on
> > > + * all teardown paths.
> > > + */
> > > + if (!to_of_node(fwnode)) {
> > > + ret = devm_add_action(&pdev->dev,
> > > + mfd_child_fwnode_put,
> > > + fwnode);
> > > + if (ret) {
> > > + device_set_node(&pdev->dev, NULL);
> > > + fwnode_handle_put(fwnode);
> > > + goto fail_of_entry;
> > > + }
> > > + }
> > > + }
> > > + }
> >
> > mfd_add_device() is getting very busy now with support for all of these
> > different registration APIs. Suggest that we start breaking them out.
> >
> > > +
> > > if (cell->pdata_size) {
> > > ret = platform_device_add_data(pdev,
> > > cell->platform_data, cell->pdata_size);
> > > diff --git a/include/linux/mfd/core.h b/include/linux/mfd/core.h
> > > index faeea7abd688f223fb0b31cde0a9b69dfe2a61ff..abfc26c057d6ee46947ba2b6f2e99f420e74b127 100644
> > > --- a/include/linux/mfd/core.h
> > > +++ b/include/linux/mfd/core.h
> > > @@ -50,6 +50,7 @@
> > > #define MFD_DEP_LEVEL_HIGH 1
> > >
> > > struct irq_domain;
> > > +struct fwnode_handle;
> > > struct software_node;
> > >
> > > /* Matches ACPI PNP id, either _HID or _CID, or ACPI _ADR */
> > > @@ -80,6 +81,19 @@ struct mfd_cell {
> > >
> > > /* Software node for the device. */
> > > const struct software_node *swnode;
> > > + /*
> > > + * Callback to return an explicit firmware node.
> > > + * @parent: MFD parent device passed to mfd_add_devices().
> > > + *
> > > + * Called only if OF/ACPI matching did not assign a fwnode.
> > > + * Ownership of the returned reference is transferred to MFD core.
> > > + *
> > > + * Return a referenced fwnode or NULL if none is available.
> > > + *
> > > + * mfd_cell must be zero-initialized or get_child_fwnode must be NULL
> > > + * when unused.
> > > + */
> > > + struct fwnode_handle *(*get_child_fwnode)(struct device *parent);
> >
> > I'm very much against pointers to functions if they can be avoided. Why
> > does fwnode need this and none of the other APIs do?
> >
>
> I suggested it because of its flexibility. The alternative I had in
> mind is something like a new field in mfd_cell:
>
> const char *cell_node_name;
>
> Which - if set - would tell MFD to look up an fwnode that's a child of
> the parent device's node by name - as it may not have a compatible.
Remind me why the child device can't look up its own fwnode?
--
Lee Jones
* Re: [PATCH v2 1/3] KVM: arm64: Reset page order in pKVM hyp_pool_init
From: Vincent Donnefort @ 2026-05-21 13:21 UTC (permalink / raw)
To: Fuad Tabba
Cc: maz, oliver.upton, joey.gouly, suzuki.poulose, yuzenghui,
catalin.marinas, will, linux-arm-kernel, kvmarm, kernel-team,
qperret, Sashiko
In-Reply-To: <CA+EHjTxPoxjvMTZX5w+UyVgC=W3VUSDoOQ-tCDLfnae16SqoMQ@mail.gmail.com>
On Thu, May 21, 2026 at 02:07:36PM +0100, Fuad Tabba wrote:
> On Thu, 21 May 2026 at 11:22, Vincent Donnefort <vdonnefort@google.com> wrote:
> >
> > When a VM fails to initialise after its stage-2 hyp_pool has been
> > initialised, that stage-2 must be torn down entirely. This requires
> > resetting both the refcount and the order of its pages back to 0.
> >
> > Currently, reclaim_pgtable_pages() implicitly resets the page order by
> > allocating the entire pool with order-0 granularity. However, in the VM
> > initialisation error path, the addresses of the donated memory (the PGD)
> > are already known, making it unnecessary to iterate over all pages in
> > the pool.
> >
> > Since the vmemmap page order is a hyp_pool-specific field, leaving a
> > non-zero order on hyp_pool destruction is harmless until another pool
> > attempts to admit the page. Instead of resetting this field during
> > destruction, reset it during pool initialization in hyp_pool_init().
> > Note that pages added to the pool outside of the initial pool range
> > (e.g., via guest_s2_zalloc_page()) must still have their order managed
> > manually.
> >
> > While at it, add a WARN_ON() in the hyp_pool attach path to catch
> > unexpected page orders that exceed the pool's max_order.
> >
> > Fixes: 256b4668cd89 ("KVM: arm64: Introduce separate hypercalls for pKVM VM reservation and initialization")
> > Reported-by: Sashiko <sashiko-bot@kernel.org>
> > Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
> >
> > diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > index 25f04629014e..89eb20d4fee4 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > @@ -322,7 +322,6 @@ void reclaim_pgtable_pages(struct pkvm_hyp_vm *vm, struct kvm_hyp_memcache *mc)
> > while (addr) {
> > page = hyp_virt_to_page(addr);
> > page->refcount = 0;
> > - page->order = 0;
> > push_hyp_memcache(mc, addr, hyp_virt_to_phys);
> > WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(addr), 1));
> > addr = hyp_alloc_pages(&vm->pool, 0);
> > diff --git a/arch/arm64/kvm/hyp/nvhe/page_alloc.c b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
> > index a1eb27a1a747..c3b3dc5a8ea7 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/page_alloc.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
> > @@ -97,6 +97,8 @@ static void __hyp_attach_page(struct hyp_pool *pool,
> > u8 order = p->order;
> > struct hyp_page *buddy;
> >
> > + WARN_ON(p->order > pool->max_order);
> > +
>
> Could you add a brief comment? It took me a minute to figure out what this
> catches. IIUC it's not attach's own input, it's a stale p->order from way back
> when an external page was popped from a memcache (today only via
> guest_s2_zalloc_page()). Right?
I think it'd be self-explanatory if that was next to the page_add_to_list, but that
wouldn't protect the memset (that's really best-effort though).
How about?
/*
* A page with an order bigger than the pool's max is an 'external' page
* whose order hasn't been reset before being added to the pool.
*/
But now I am thinking I can do way better: we can easily identify external
pages, so I could just force the order to 0 in that case.
WDYS?
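A minimal user-space sketch of that idea, for illustration only. It assumes
an 'external' page is one whose physical address lies outside the pool's
[range_start, range_end) window. The struct and function names are
hypothetical and simplified, not the hyp allocator's own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical stand-ins for the pool and page descriptors. */
struct model_pool {
	uint64_t range_start;
	uint64_t range_end;
	uint8_t max_order;
};

struct model_page {
	uint64_t phys;
	uint8_t order;
};

/* Assumption: a page outside the pool's initial range is 'external'. */
static bool model_page_is_external(const struct model_pool *pool,
				   const struct model_page *p)
{
	return p->phys < pool->range_start || p->phys >= pool->range_end;
}

/* Returns the order the page would be attached with. */
static uint8_t model_attach_order(const struct model_pool *pool,
				  struct model_page *p)
{
	if (model_page_is_external(pool, p))
		p->order = 0;	/* discard any stale order */

	/* In-range pages had their order reset when the pool was set up. */
	assert(p->order <= pool->max_order);
	return p->order;
}
```

With the order forced for external pages, the remaining check only guards
pages inside the pool's own range.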
>
> With that.
>
> Reviewed-by: Fuad Tabba <tabba@google.com>
> Tested-by: Fuad Tabba <tabba@google.com>
>
> Cheers,
> /fuad
>
>
>
>
> > memset(hyp_page_to_virt(p), 0, PAGE_SIZE << p->order);
> >
> > /* Skip coalescing for 'external' pages being freed into the pool. */
> > @@ -237,8 +239,10 @@ int hyp_pool_init(struct hyp_pool *pool, u64 pfn, unsigned int nr_pages,
> >
> > /* Init the vmemmap portion */
> > p = hyp_phys_to_page(phys);
> > - for (i = 0; i < nr_pages; i++)
> > + for (i = 0; i < nr_pages; i++) {
> > hyp_set_page_refcounted(&p[i]);
> > + p[i].order = 0;
> > + }
> >
> > /* Attach the unused pages to the buddy tree */
> > for (i = reserved_pages; i < nr_pages; i++)
> > --
> > 2.54.0.746.g67dd491aae-goog
> >
* [GIT PULL] amlogic ARM64 DT updates for v7.2 take 1
From: Neil Armstrong @ 2026-05-21 13:19 UTC (permalink / raw)
To: soc, arm; +Cc: linux-amlogic, linux-arm-kernel
Hi,
Here are the Amlogic ARM64 DT changes for v7.2, containing improvements for the Khadas VIM4
and VIM1s SBCs, plus some additions for the Phicomm N1 and a couple of low priority fixes.
This PR is largely the same as `amlogic-arm64-dt-for-v7.1`, but I sent the fixes
separately as `amlogic-fixes-v7.1-rc` as discussed with Arnd, so this tag
`amlogic-arm64-dt-for-v7.2-v1` is based on top of `amlogic-fixes-v7.1-rc`.
Thanks,
Neil
The following changes since commit 174a0ef3b33434f475c87e66f37980e39b73805a:
arm64: dts: meson-gxl-p230: fix ethernet PHY interrupt number (2026-04-21 15:46:29 +0200)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux.git amlogic-arm64-dt-for-v7.2-v1
for you to fetch changes up to 43e6ece01ba673c3bd8dd1638bcd93827d254b3d:
arm64: dts: amlogic: t7: Add i2c pinctrl node (2026-05-18 16:23:29 +0200)
----------------------------------------------------------------
Amlogic ARM64 DT for v7.2 take 1:
- Khadas VIM4 (T7 SoC) features:
- Memory layout fixup
- GIC register range
- Model name fixup
- PWM, eMMC, SD card and SDIO support
- PWM LED
- I2C pinctrl node
- Khadas VIM1s Features
- Bluetooth
- PWM LED
- Power Key
- Function Key via SARADC
- RTC
- Remote control keymap
- Bluetooth node for Phicomm N1
----------------------------------------------------------------
Jian Hu (1):
arm64: dts: amlogic: t7: Add clock controller nodes
Jun Yan (1):
arm64: dts: amlogic: meson-gxl-s905d-phicomm-n1: add bluetooth node
Nick Xie (9):
arm64: dts: amlogic: meson-s4: add UART_A node
arm64: dts: amlogic: meson-s4-s905y4-khadas-vim1s: enable bluetooth
arm64: dts: amlogic: meson-s4-s905y4-khadas-vim1s: add PWM LED support
arm64: dts: amlogic: meson-s4-s905y4-khadas-vim1s: add POWER key support
arm64: dts: amlogic: meson-s4: add internal SARADC controller
arm64: dts: amlogic: meson-s4-s905y4-khadas-vim1s: add Function key support
arm64: dts: amlogic: meson-s4: add VRTC node
arm64: dts: amlogic: meson-s4-s905y4-khadas-vim1s: enable HYM8563 RTC
arm64: dts: amlogic: meson-s4-s905y4-khadas-vim1s: use rc-khadas keymap
Ronald Claveau (12):
arm64: dts: amlogic: t7: Add eMMC, SD card and SDIO pinctrl nodes
arm64: dts: amlogic: t7: Add PWM pinctrl nodes
arm64: dts: amlogic: t7: khadas-vim4: Add power regulators
arm64: dts: amlogic: t7: khadas-vim4: Remove invalid property
arm64: dts: amlogic: t7: Add MMC controller nodes
arm64: dts: amlogic: t7: Add PWM controller nodes
arm64: dts: amlogic: t7: khadas-vim4: Add SDIO power sequence and WiFi clock
arm64: dts: amlogic: t7: khadas-vim4: Add MMC nodes
arm64: dts: amlogic: t7: Fix missing required reset property
arm64: dts: amlogic: t7: khadas-vim4: reorder root node
arm64: dts: amlogic: t7: khadas-vim4: add PWM-driven status LED
arm64: dts: amlogic: t7: Add i2c pinctrl node
.../dts/amlogic/amlogic-t7-a311d2-khadas-vim4.dts | 215 +++++++++
arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi | 482 +++++++++++++++++++++
.../dts/amlogic/meson-gxl-s905d-phicomm-n1.dts | 15 +
.../dts/amlogic/meson-s4-s905y4-khadas-vim1s.dts | 81 ++++
arch/arm64/boot/dts/amlogic/meson-s4.dtsi | 45 ++
5 files changed, 838 insertions(+)
* [PATCH v6] stm: class: Add MIPI OST protocol support
From: Yingchao Deng @ 2026-05-21 13:14 UTC (permalink / raw)
To: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Jonathan Corbet, Shuah Khan, Alexander Shishkin, Alexandre Torgue
Cc: linux-kernel, linux-trace-kernel, linux-doc, linux-arm-kernel,
quic_yingdeng, tingwei.zhang, jinlong.mao, jie.gan,
yuanfang.zhang, Yingchao Deng
Add MIPI OST (Open System Trace) protocol support for stm to format the
traces. The OST Protocol abstracts the underlying layers from the sending
and receiving applications, thus removing dependencies on the connection
media and platform implementation.
An OST over STP packet consists of Header/Payload/End. The Header is designed to
include the information required by all OST packets. Information that is
not shared by all packets is left to the higher layer protocols. Thus, the
OST Protocol Header can be regarded as the first part of a complete OST
Packet Header, while a higher layer header can be regarded as an extension
designed for a specific purpose.
+--------+--------+--------+--------+
| start |version |entity |protocol|
+--------+--------+--------+--------+
| stm version | magic |
+-----------------------------------+
| cpu |
+-----------------------------------+
| timestamp |
| |
+-----------------------------------+
| tgid |
| |
+-----------------------------------+
| payload |
+-----------------------------------+
| ... | end |
+-----------------------------------+
The header contains the STARTSIMPLE/VERSION/ENTITY/PROTOCOL fields.
STARTSIMPLE is used to signal the beginning of a simplified OST protocol.
The Version field is a one byte, unsigned number identifying the version
of the OST Protocol. The Entity ID field is a one byte unsigned number
that identifies the source.
Entity ID values (0~239) are defined and controlled by the TS owner, and
shall be unique for the whole TS. The configfs entity attribute allows the
user to configure which Entity ID is associated with each policy node.
The Protocol ID field is a one byte unsigned number identifying the higher
layer protocol of the OST Packet, i.e. identifying the format of the data
after the OST Protocol Header. OST Control Protocol ID value represents
the common control protocol, the remaining Protocol ID values may be used
by any higher layer protocols capable of being transported by the OST
Protocol.
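As a rough illustration of the first header word described above, the sketch
below packs the four one-byte fields into a 32-bit value. The field offsets
and the STARTSIMPLE/version/entity constants mirror the definitions added in
p_ost.c by this patch; ost_pack_header(), ost_header_field() and
ost_header_example() are hypothetical helper names used only here, and the
sketch makes no claim about byte order on the wire.

```c
#include <assert.h>
#include <stdint.h>

/* Bit offsets of the four one-byte fields in the first header word. */
#define OST_FIELD_STARTSIMPLE	0
#define OST_FIELD_VERSION	8
#define OST_FIELD_ENTITY	16
#define OST_FIELD_PROTOCOL	24

#define OST_TOKEN_STARTSIMPLE	0x10
#define OST_VERSION_MIPI1	0x10
#define OST_ENTITY_FTRACE	0x01
#define OST_CONTROL_PROTOCOL	0x0

/* Hypothetical helper: build the first header word for a given entity. */
static uint32_t ost_pack_header(uint8_t entity, uint8_t protocol)
{
	return ((uint32_t)OST_TOKEN_STARTSIMPLE << OST_FIELD_STARTSIMPLE) |
	       ((uint32_t)OST_VERSION_MIPI1 << OST_FIELD_VERSION) |
	       ((uint32_t)entity << OST_FIELD_ENTITY) |
	       ((uint32_t)protocol << OST_FIELD_PROTOCOL);
}

/* Hypothetical helper: extract one byte-wide field at the given offset. */
static uint8_t ost_header_field(uint32_t hdr, unsigned int shift)
{
	return (uint8_t)((hdr >> shift) & 0xff);
}

/* Round-trip check for an ftrace packet using the control protocol. */
static void ost_header_example(void)
{
	uint32_t hdr = ost_pack_header(OST_ENTITY_FTRACE, OST_CONTROL_PROTOCOL);

	assert(ost_header_field(hdr, OST_FIELD_STARTSIMPLE) == OST_TOKEN_STARTSIMPLE);
	assert(ost_header_field(hdr, OST_FIELD_VERSION) == OST_VERSION_MIPI1);
	assert(ost_header_field(hdr, OST_FIELD_ENTITY) == OST_ENTITY_FTRACE);
	assert(ost_header_field(hdr, OST_FIELD_PROTOCOL) == OST_CONTROL_PROTOCOL);
}
```

A decoder on the receiving side would read the entity and protocol fields
back with the same offsets to decide how to interpret the payload.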
Co-developed-by: Tingwei Zhang <tingwei.zhang@oss.qualcomm.com>
Signed-off-by: Tingwei Zhang <tingwei.zhang@oss.qualcomm.com>
Co-developed-by: Yuanfang Zhang <yuanfang.zhang@oss.qualcomm.com>
Signed-off-by: Yuanfang Zhang <yuanfang.zhang@oss.qualcomm.com>
Co-developed-by: Jinlong Mao <jinlong.mao@oss.qualcomm.com>
Signed-off-by: Jinlong Mao <jinlong.mao@oss.qualcomm.com>
Signed-off-by: Yingchao Deng <yingchao.deng@oss.qualcomm.com>
---
Changes in v6:
1. Rebase on top of linux-next-20260518.
2. Fix Kconfig: 'default CONFIG_STM' -> 'default STM'.
3. Fix documentation grammar issues.
4. Add p_ost entry to Documentation/trace/index.rst.
5. Add missing priv_sz field to stm_protocol_driver registration.
6. Use kzalloc_obj() instead of kzalloc() in ost_output_open().
7. Add mutex protection in entity configfs store handler.
8. Keep the configfs entity attribute: entity ID values (0~239) are
defined and controlled by the TS owner and are deployment-specific.
stm_source_type only carries a small number of in-kernel source
classifications and cannot represent the full range of OST entity
assignments needed in practice. The configfs attribute allows each
policy node to declare its entity.
OST_ENTITY_TYPE_NONE is an enum sentinel (not entity ID 0) that causes
ost_write() to return -EINVAL when no entity is configured, preventing
emission of packets with an unintended entity field.
OST_ENTITY_DIAG (0xEE) is a TS-owner-defined value used by Qualcomm's
diagnostic framework as the standard entity identifier for diagnostic
trace sources.
Link to v5: https://lore.kernel.org/all/20260129-p_ost-v5-1-2b14fff39428@oss.qualcomm.com/
Changes in v5:
1. Add Co-developed-by tag.
2. Use yearless copyright for new file.
Link to v4: https://lore.kernel.org/all/20251024-p_ost-v4-1-3652a06fd055@oss.qualcomm.com/
Changes in v4:
1. Delete unused variable 'i'.
2. Fix build error: call to undeclared function 'task_tgid_nr'.
Link to v3: https://lore.kernel.org/all/20251022071834.1658684-1-yingchao.deng@oss.qualcomm.com/
Changes in v3:
1. Add more details about OST.
2. Delete 'entity_available' node, and 'entity' node will show available
and currently selected (shown in square brackets) entity.
3. Removed the usage of config_item->ci_group->cg_subsys->su_mutex.
Link to v2: https://lore.kernel.org/all/20230419141328.37472-1-quic_jinlmao@quicinc.com/
---
.../ABI/testing/configfs-stp-policy-p_ost | 9 +
Documentation/trace/index.rst | 1 +
Documentation/trace/p_ost.rst | 39 ++++
drivers/hwtracing/stm/Kconfig | 14 ++
drivers/hwtracing/stm/Makefile | 2 +
drivers/hwtracing/stm/p_ost.c | 241 +++++++++++++++++++++
6 files changed, 306 insertions(+)
diff --git a/Documentation/ABI/testing/configfs-stp-policy-p_ost b/Documentation/ABI/testing/configfs-stp-policy-p_ost
new file mode 100644
index 000000000000..8fb160b50c40
--- /dev/null
+++ b/Documentation/ABI/testing/configfs-stp-policy-p_ost
@@ -0,0 +1,9 @@
+What: /config/stp-policy/<device>:p_ost.<policy>/<node>/entity
+Date: May 2026
+KernelVersion: 7.1
+Description:
+ Set the entity ID which identifies the trace source in the
+ OST packet header. Entity ID values (0~239) are defined by
+ the TS owner. Currently supported values are ftrace, console
+ and diag. RW.
+
diff --git a/Documentation/trace/index.rst b/Documentation/trace/index.rst
index 5d9bf4694d5d..9cd1e0b5af6d 100644
--- a/Documentation/trace/index.rst
+++ b/Documentation/trace/index.rst
@@ -72,6 +72,7 @@ interactions and system performance.
intel_th
stm
sys-t
+ p_ost
coresight/index
rv/index
hisi-ptt
diff --git a/Documentation/trace/p_ost.rst b/Documentation/trace/p_ost.rst
new file mode 100644
index 000000000000..2b92e2229653
--- /dev/null
+++ b/Documentation/trace/p_ost.rst
@@ -0,0 +1,39 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================
+MIPI OST over STP
+===================
+
+The OST (Open System Trace) driver is used with STM class devices to
+generate a standardized trace stream. Trace sources can be identified
+by different entity IDs.
+
+CONFIG_STM_PROTO_OST is for p_ost driver enablement. Once this config
+is enabled, you can select the p_ost protocol by command below:
+
+# mkdir /sys/kernel/config/stp-policy/stm0:p_ost.policy
+
+The policy name format is extended like this:
+
+ <device_name>:<protocol_name>.<policy_name>
+
+With a coresight-stm device, it will look like "stm0:p_ost.policy".
+
+With the MIPI OST protocol driver, the attributes for each protocol node are:
+
+# mkdir /sys/kernel/config/stp-policy/stm0:p_ost.policy/default
+# ls /sys/kernel/config/stp-policy/stm0:p_ost.policy/default
+channels entity masters
+
+The entity here is the set of entities that p_ost supports. Currently
+p_ost supports ftrace, console and diag entities.
+
+Set entity:
+# echo 'ftrace' > /sys/kernel/config/stp-policy/stm0:p_ost.policy/default/entity
+
+Get available and currently selected (shown in square brackets) entity:
+# cat /sys/kernel/config/stp-policy/stm0:p_ost.policy/default/entity
+[ftrace] console diag
+
+See Documentation/ABI/testing/configfs-stp-policy-p_ost for more details.
+
diff --git a/drivers/hwtracing/stm/Kconfig b/drivers/hwtracing/stm/Kconfig
index cd7f0b0f3fbe..4c83da5d95a0 100644
--- a/drivers/hwtracing/stm/Kconfig
+++ b/drivers/hwtracing/stm/Kconfig
@@ -40,6 +40,20 @@ config STM_PROTO_SYS_T
If you don't know what this is, say N.
+config STM_PROTO_OST
+ tristate "MIPI OST STM framing protocol driver"
+ default STM
+ help
+ This is an implementation of MIPI OST protocol to be used
+ over the STP transport. In addition to the data payload, it
+ also carries additional metadata for entity, better
+ means of trace source identification, etc.
+
+ The receiving side must be able to decode this protocol in
+ addition to the MIPI STP, in order to extract the data.
+
+ If you don't know what this is, say N.
+
config STM_DUMMY
tristate "Dummy STM driver"
help
diff --git a/drivers/hwtracing/stm/Makefile b/drivers/hwtracing/stm/Makefile
index 1692fcd29277..d9c8615849b9 100644
--- a/drivers/hwtracing/stm/Makefile
+++ b/drivers/hwtracing/stm/Makefile
@@ -5,9 +5,11 @@ stm_core-y := core.o policy.o
obj-$(CONFIG_STM_PROTO_BASIC) += stm_p_basic.o
obj-$(CONFIG_STM_PROTO_SYS_T) += stm_p_sys-t.o
+obj-$(CONFIG_STM_PROTO_OST) += stm_p_ost.o
stm_p_basic-y := p_basic.o
stm_p_sys-t-y := p_sys-t.o
+stm_p_ost-y := p_ost.o
obj-$(CONFIG_STM_DUMMY) += dummy_stm.o
diff --git a/drivers/hwtracing/stm/p_ost.c b/drivers/hwtracing/stm/p_ost.c
new file mode 100644
index 000000000000..d2174872b761
--- /dev/null
+++ b/drivers/hwtracing/stm/p_ost.c
@@ -0,0 +1,241 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ *
+ * MIPI OST framing protocol for STM devices.
+ */
+
+#include <linux/pid.h>
+#include <linux/sched/clock.h>
+#include <linux/slab.h>
+#include <linux/stm.h>
+#include "stm.h"
+
+/*
+ * OST Base Protocol Header
+ *
+ * Position Bits Field Name
+ * 0 8 STARTSIMPLE
+ * 1 8 Version
+ * 2 8 Entity ID
+ * 3 8 Protocol ID
+ */
+#define OST_FIELD_STARTSIMPLE 0
+#define OST_FIELD_VERSION 8
+#define OST_FIELD_ENTITY 16
+#define OST_FIELD_PROTOCOL 24
+
+#define OST_TOKEN_STARTSIMPLE 0x10
+#define OST_VERSION_MIPI1 0x10
+
+/* entity id to identify the source */
+#define OST_ENTITY_FTRACE 0x01
+#define OST_ENTITY_CONSOLE 0x02
+#define OST_ENTITY_DIAG 0xEE
+
+#define OST_CONTROL_PROTOCOL 0x0
+
+#define DATA_HEADER ((OST_TOKEN_STARTSIMPLE << OST_FIELD_STARTSIMPLE) | \
+ (OST_VERSION_MIPI1 << OST_FIELD_VERSION) | \
+ (OST_CONTROL_PROTOCOL << OST_FIELD_PROTOCOL))
+
+#define STM_MAKE_VERSION(ma, mi) (((ma) << 8) | (mi))
+#define STM_HEADER_MAGIC (0x5953)
+
+enum ost_entity_type {
+ OST_ENTITY_TYPE_NONE,
+ OST_ENTITY_TYPE_FTRACE,
+ OST_ENTITY_TYPE_CONSOLE,
+ OST_ENTITY_TYPE_DIAG,
+};
+
+static const char * const str_ost_entity_type[] = {
+ [OST_ENTITY_TYPE_NONE] = "none",
+ [OST_ENTITY_TYPE_FTRACE] = "ftrace",
+ [OST_ENTITY_TYPE_CONSOLE] = "console",
+ [OST_ENTITY_TYPE_DIAG] = "diag",
+};
+
+static const u32 ost_entity_value[] = {
+ [OST_ENTITY_TYPE_NONE] = 0,
+ [OST_ENTITY_TYPE_FTRACE] = OST_ENTITY_FTRACE,
+ [OST_ENTITY_TYPE_CONSOLE] = OST_ENTITY_CONSOLE,
+ [OST_ENTITY_TYPE_DIAG] = OST_ENTITY_DIAG,
+};
+
+struct ost_policy_node {
+ enum ost_entity_type entity_type;
+};
+
+struct ost_output {
+ struct ost_policy_node node;
+};
+
+/* Set default entity type as none */
+static void ost_policy_node_init(void *priv)
+{
+ struct ost_policy_node *pn = priv;
+
+ pn->entity_type = OST_ENTITY_TYPE_NONE;
+}
+
+static int ost_output_open(void *priv, struct stm_output *output)
+{
+ struct ost_policy_node *pn = priv;
+ struct ost_output *opriv;
+
+ opriv = kzalloc_obj(*opriv, GFP_ATOMIC);
+ if (!opriv)
+ return -ENOMEM;
+
+ memcpy(&opriv->node, pn, sizeof(opriv->node));
+ output->pdrv_private = opriv;
+ return 0;
+}
+
+static void ost_output_close(struct stm_output *output)
+{
+ kfree(output->pdrv_private);
+}
+
+static ssize_t ost_t_policy_entity_show(struct config_item *item,
+ char *page)
+{
+ struct ost_policy_node *pn = to_pdrv_policy_node(item);
+ ssize_t sz = 0;
+ int i;
+
+ for (i = 1; i < ARRAY_SIZE(str_ost_entity_type); i++) {
+ if (i == pn->entity_type)
+ sz += sysfs_emit_at(page, sz, "[%s] ", str_ost_entity_type[i]);
+ else
+ sz += sysfs_emit_at(page, sz, "%s ", str_ost_entity_type[i]);
+ }
+
+ sz += sysfs_emit_at(page, sz, "\n");
+ return sz;
+}
+
+static int entity_index(const char *str)
+{
+ int i;
+
+ for (i = 1; i < ARRAY_SIZE(str_ost_entity_type); i++) {
+ if (sysfs_streq(str, str_ost_entity_type[i]))
+ return i;
+ }
+
+ return 0;
+}
+
+static ssize_t
+ost_t_policy_entity_store(struct config_item *item, const char *page,
+ size_t count)
+{
+ struct mutex *mutexp = &item->ci_group->cg_subsys->su_mutex;
+ struct ost_policy_node *pn = to_pdrv_policy_node(item);
+ int i;
+
+ i = entity_index(page);
+ if (i) {
+ mutex_lock(mutexp);
+ pn->entity_type = i;
+ mutex_unlock(mutexp);
+ } else {
+ return -EINVAL;
+ }
+
+ return count;
+}
+CONFIGFS_ATTR(ost_t_policy_, entity);
+
+static struct configfs_attribute *ost_t_policy_attrs[] = {
+ &ost_t_policy_attr_entity,
+ NULL,
+};
+
+static ssize_t
+notrace ost_write(struct stm_data *data, struct stm_output *output,
+ unsigned int chan, const char *buf, size_t count,
+ struct stm_source_data *source)
+{
+ struct ost_output *op = output->pdrv_private;
+ unsigned int c = output->channel + chan;
+ unsigned int m = output->master;
+ const unsigned char nil = 0;
+ u32 header = DATA_HEADER;
+ struct trc_hdr {
+ u16 version;
+ u16 magic;
+ u32 cpu;
+ u64 timestamp;
+ u64 tgid;
+ } hdr;
+ ssize_t sz;
+
+ /*
+ * Identify the source by entity type.
+ * If entity type is not set, return error value.
+ */
+ if (op->node.entity_type)
+ header |= (ost_entity_value[op->node.entity_type] << OST_FIELD_ENTITY);
+ else
+ return -EINVAL;
+
+ /*
+ * STP framing rules for OST frames:
+ * * the first packet of the OST frame is marked;
+ * * the last packet is a FLAG with timestamped tag.
+ */
+ /* Message layout: HEADER / DATA / TAIL */
+ /* HEADER */
+ sz = data->packet(data, m, c, STP_PACKET_DATA, STP_PACKET_MARKED,
+ 4, (u8 *)&header);
+ if (sz <= 0)
+ return sz;
+
+ /* DATA */
+ hdr.version = STM_MAKE_VERSION(0, 3);
+ hdr.magic = STM_HEADER_MAGIC;
+ hdr.cpu = raw_smp_processor_id();
+ hdr.timestamp = sched_clock();
+ hdr.tgid = task_tgid_nr(current);
+ sz = stm_data_write(data, m, c, false, &hdr, sizeof(hdr));
+ if (sz <= 0)
+ return sz;
+
+ sz = stm_data_write(data, m, c, false, buf, count);
+
+ /* TAIL */
+ if (sz > 0)
+ data->packet(data, m, c, STP_PACKET_FLAG,
+ STP_PACKET_TIMESTAMPED, 0, &nil);
+
+ return sz;
+}
+
+static const struct stm_protocol_driver ost_pdrv = {
+ .owner = THIS_MODULE,
+ .name = "p_ost",
+ .priv_sz = sizeof(struct ost_policy_node),
+ .write = ost_write,
+ .policy_attr = ost_t_policy_attrs,
+ .output_open = ost_output_open,
+ .output_close = ost_output_close,
+ .policy_node_init = ost_policy_node_init,
+};
+
+static int ost_stm_init(void)
+{
+ return stm_register_protocol(&ost_pdrv);
+}
+module_init(ost_stm_init);
+
+static void ost_stm_exit(void)
+{
+ stm_unregister_protocol(&ost_pdrv);
+}
+module_exit(ost_stm_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("MIPI Open System Trace STM framing protocol driver");
---
base-commit: 80dd246accce631c328ea43294e53b2b2dd2aa32
change-id: 20260521-stm_p_ost-3489f42a9e8c
Best regards,
--
Yingchao Deng <yingchao.deng@oss.qualcomm.com>
^ permalink raw reply related
* Re: [PATCH v4 11/24] iommu: Add iommu_report_device_broken() to quarantine a broken device
From: Jason Gunthorpe @ 2026-05-21 13:12 UTC (permalink / raw)
To: Nicolin Chen
Cc: Will Deacon, Robin Murphy, Joerg Roedel, Bjorn Helgaas,
Rafael J . Wysocki, Len Brown, Pranjal Shrivastava, Mostafa Saleh,
Lu Baolu, Kevin Tian, linux-arm-kernel, iommu, linux-kernel,
linux-acpi, linux-pci, vsethi, Shuai Xue
In-Reply-To: <ag35umFgzIRGZAHA@Asurada-Nvidia>
On Wed, May 20, 2026 at 11:13:14AM -0700, Nicolin Chen wrote:
> > > > We cannot eliminate parallel ATS invalidation. Two threads could be
> > > > concurrently processing the invs list. So it has to handle it, the driver
> > > > is going to have to tolerate a number of redundant error events.
> > >
> > > OK. That sounds like we still need a flag or locking so that at
> > > least pci_disable_ats() would not be called again. I will see
> > > what I can do.
> >
> > I think we can call pci_disable_ats() as many times as we want
>
> That triggers WARN_ON(!dev->ats_enabled) in pci_disable_ats :-(
IMHO I'd rather take that out than add a bunch of complication in the
iommu drivers..
> > Still, I'd feel better if it was definitive and we didn't rely on
> > this. This further points that the driver has to merge multiple error
> > notifications if it gets some AERs and a new "ATC ERROR" all for the
> > same key event.
>
> I feel some race here... Part of the complexity of this v4 is to deal
> with concurrent device reset during the async report() between IOMMU
> core and driver. Now, we add AER that could compete on the device side
> as well...
There are always going to be concurrent events; so long as the resets are
sequenced in an orderly way, it doesn't matter if they overlap.
Most likely the driver will have locking that prevents it from pushing
concurrent resets.
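To make the point about tolerating redundant events concrete, here is a
minimal userspace sketch using C11 atomics (not kernel code, and not taken
from this series). It assumes a per-device flag that the first reporter
wins; later reports are merged into the reset that is already pending. All
names (demo_dev, demo_report_error, demo_reset_done) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-device state; not an existing kernel structure. */
struct demo_dev {
	atomic_flag reset_pending;	/* set while a reset is queued or running */
	int resets_started;		/* only written by the caller that won the flag */
};

#define DEMO_DEV_INIT { ATOMIC_FLAG_INIT, 0 }

/*
 * Called from any error source. Returns true if this caller should
 * start the reset, false if the event was merged into a reset that is
 * already pending.
 */
static bool demo_report_error(struct demo_dev *dev)
{
	if (atomic_flag_test_and_set(&dev->reset_pending))
		return false;

	dev->resets_started++;
	return true;
}

/* Called by the winner once its reset has completed. */
static void demo_reset_done(struct demo_dev *dev)
{
	atomic_flag_clear(&dev->reset_pending);
}
```

With this shape, three back-to-back reports for the same event start one
reset, and a report that arrives after demo_reset_done() starts a new one.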
Jason
^ permalink raw reply
* [PATCH v6] stm: class: Add MIPI OST protocol support
From: Yingchao Deng @ 2026-05-21 13:08 UTC (permalink / raw)
To: Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers,
Jonathan Corbet, Shuah Khan, Alexander Shishkin, Alexandre Torgue
Cc: linux-kernel, linux-trace-kernel, linux-doc, linux-arm-kernel,
Tingwei Zhang, Yuanfang Zhang, Jinlong Mao, Yingchao Deng
Add MIPI OST (Open System Trace) protocol support for stm to format the
traces. The OST Protocol abstracts the underlying layers from the sending
and receiving applications, thus removing dependencies on the connection
media and platform implementation.
An OST over STP packet consists of Header/Payload/End. The Header is designed to
include the information required by all OST packets. Information that is
not shared by all packets is left to the higher layer protocols. Thus, the
OST Protocol Header can be regarded as the first part of a complete OST
Packet Header, while a higher layer header can be regarded as an extension
designed for a specific purpose.
+--------+--------+--------+--------+
| start  |version | entity |protocol|
+--------+--------+--------+--------+
|   stm version   |      magic      |
+-----------------------------------+
|                cpu                |
+-----------------------------------+
|             timestamp             |
|                                   |
+-----------------------------------+
|               tgid                |
|                                   |
+-----------------------------------+
|              payload              |
+-----------------------------------+
|           ...            |  end   |
+-----------------------------------+
The header consists of STARTSIMPLE/VERSION/ENTITY/PROTOCOL.
STARTSIMPLE is used to signal the beginning of a simplified OST protocol.
The Version field is a one-byte unsigned number identifying the version
of the OST Protocol. The Entity ID field is a one-byte unsigned number
that identifies the source.
Entity ID values (0~239) are defined and controlled by the TS owner, and
shall be unique for the whole TS. The configfs entity attribute allows the
user to configure which Entity ID is associated with each policy node.
The Protocol ID field is a one byte unsigned number identifying the higher
layer protocol of the OST Packet, i.e. identifying the format of the data
after the OST Protocol Header. The OST Control Protocol ID value represents
the common control protocol; the remaining Protocol ID values may be used
by any higher layer protocols capable of being transported by the OST
Protocol.
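
As a rough illustration of the base header layout described above, the
following userspace C sketch packs the four one-byte fields into a 32-bit
word and extracts the entity again. The shift positions and the 0x10
token/version values mirror the definitions in p_ost.c; the helper names
ost_pack_header() and ost_header_entity() are hypothetical and are not part
of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit offsets of the one-byte fields, as defined in p_ost.c. */
#define OST_FIELD_STARTSIMPLE	0
#define OST_FIELD_VERSION	8
#define OST_FIELD_ENTITY	16
#define OST_FIELD_PROTOCOL	24

#define OST_TOKEN_STARTSIMPLE	0x10
#define OST_VERSION_MIPI1	0x10
#define OST_CONTROL_PROTOCOL	0x0

/* Hypothetical helper: build the 32-bit base header for one entity ID. */
static inline uint32_t ost_pack_header(uint8_t entity)
{
	return ((uint32_t)OST_TOKEN_STARTSIMPLE << OST_FIELD_STARTSIMPLE) |
	       ((uint32_t)OST_VERSION_MIPI1 << OST_FIELD_VERSION) |
	       ((uint32_t)entity << OST_FIELD_ENTITY) |
	       ((uint32_t)OST_CONTROL_PROTOCOL << OST_FIELD_PROTOCOL);
}

/* Hypothetical helper: recover the Entity ID byte from a packed header. */
static inline uint8_t ost_header_entity(uint32_t header)
{
	return (header >> OST_FIELD_ENTITY) & 0xff;
}
```

For the ftrace entity (0x01) this yields 0x00011010, and for the diag
entity (0xEE) it yields 0x00EE1010.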
Co-developed-by: Tingwei Zhang <tingwei.zhang@oss.qualcomm.com>
Signed-off-by: Tingwei Zhang <tingwei.zhang@oss.qualcomm.com>
Co-developed-by: Yuanfang Zhang <yuanfang.zhang@oss.qualcomm.com>
Signed-off-by: Yuanfang Zhang <yuanfang.zhang@oss.qualcomm.com>
Co-developed-by: Jinlong Mao <jinlong.mao@oss.qualcomm.com>
Signed-off-by: Jinlong Mao <jinlong.mao@oss.qualcomm.com>
Signed-off-by: Yingchao Deng <yingchao.deng@oss.qualcomm.com>
---
Changes in v6:
1. Rebase on top of linux-next-20260518.
2. Fix Kconfig: 'default CONFIG_STM' -> 'default STM'.
3. Fix documentation grammar issues.
4. Add p_ost entry to Documentation/trace/index.rst.
5. Add missing priv_sz field to stm_protocol_driver registration.
6. Use kzalloc_obj() instead of kzalloc() in ost_output_open().
7. Add mutex protection in entity configfs store handler.
8. Keep the configfs entity attribute: entity ID values (0~239) are
defined and controlled by the TS owner and are deployment-specific.
stm_source_type only carries a small number of in-kernel source
classifications and cannot represent the full range of OST entity
assignments needed in practice. The configfs attribute allows each
policy node to declare its entity.
OST_ENTITY_TYPE_NONE is an enum sentinel (not entity ID 0) that causes
ost_write() to return -EINVAL when no entity is configured, preventing
emission of packets with an unintended entity field.
OST_ENTITY_DIAG (0xEE) is a TS-owner-defined value used by Qualcomm's
diagnostic framework as the standard entity identifier for diagnostic
trace sources.
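
A minimal userspace sketch of the sentinel handling and the bracketed
listing may help here. It is an approximation only: it uses strcmp() and
snprintf() where the driver uses sysfs_streq() and sysfs_emit_at(), so
unlike the driver it does not ignore a trailing newline, and it assumes the
caller's buffer is large enough. The demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum demo_entity { DEMO_NONE, DEMO_FTRACE, DEMO_CONSOLE, DEMO_DIAG, DEMO_NR };

static const char * const demo_names[DEMO_NR] = {
	[DEMO_NONE]	= "none",
	[DEMO_FTRACE]	= "ftrace",
	[DEMO_CONSOLE]	= "console",
	[DEMO_DIAG]	= "diag",
};

/* Map a name to its index; 0 (the sentinel) means "not a valid entity". */
static int demo_entity_index(const char *name)
{
	for (int i = 1; i < DEMO_NR; i++) {
		if (!strcmp(name, demo_names[i]))
			return i;
	}

	return 0;
}

/*
 * List the selectable entities, bracketing the current one. The
 * sentinel at index 0 is never listed. Assumes buf is large enough.
 */
static int demo_entity_show(char *buf, size_t len, int current)
{
	int sz = 0;

	for (int i = 1; i < DEMO_NR; i++) {
		if (i == current)
			sz += snprintf(buf + sz, len - sz, "[%s] ", demo_names[i]);
		else
			sz += snprintf(buf + sz, len - sz, "%s ", demo_names[i]);
	}

	sz += snprintf(buf + sz, len - sz, "\n");
	return sz;
}
```

Because the lookup starts at index 1, writing "none" is rejected in the same
way as an unknown name, so the sentinel cannot be selected from userspace.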
Link to v5: https://lore.kernel.org/all/20260129-p_ost-v5-1-2b14fff39428@oss.qualcomm.com/
Changes in v5:
1. Add Co-developed-by tag.
2. Use yearless copyright for new file.
Link to v4: https://lore.kernel.org/all/20251024-p_ost-v4-1-3652a06fd055@oss.qualcomm.com/
Changes in v4:
1. Delete unused variable 'i'.
2. Fix build error: call to undeclared function 'task_tgid_nr'.
Link to v3 - https://lore.kernel.org/all/20251022071834.1658684-1-yingchao.deng@oss.qualcomm.com/
Changes in v3:
1. Add more details about OST.
2. Delete 'entity_available' node, and 'entity' node will show available
and currently selected (shown in square brackets) entity.
3. Removed the usage of config_item->ci_group->cg_subsys->su_mutex.
Link to v2 - https://lore.kernel.org/all/20230419141328.37472-1-quic_jinlmao@quicinc.com/
---
.../ABI/testing/configfs-stp-policy-p_ost | 9 +
Documentation/trace/index.rst | 1 +
Documentation/trace/p_ost.rst | 39 ++++
drivers/hwtracing/stm/Kconfig | 14 ++
drivers/hwtracing/stm/Makefile | 2 +
drivers/hwtracing/stm/p_ost.c | 241 +++++++++++++++++++++
6 files changed, 306 insertions(+)
diff --git a/Documentation/ABI/testing/configfs-stp-policy-p_ost b/Documentation/ABI/testing/configfs-stp-policy-p_ost
new file mode 100644
index 000000000000..8fb160b50c40
--- /dev/null
+++ b/Documentation/ABI/testing/configfs-stp-policy-p_ost
@@ -0,0 +1,9 @@
+What: /config/stp-policy/<device>:p_ost.<policy>/<node>/entity
+Date: May 2026
+KernelVersion: 7.1
+Description:
+ Set the entity ID which identifies the trace source in the
+ OST packet header. Entity ID values (0~239) are defined by
+ the TS owner. Currently supported values are ftrace, console
+ and diag. RW.
+
diff --git a/Documentation/trace/index.rst b/Documentation/trace/index.rst
index 5d9bf4694d5d..9cd1e0b5af6d 100644
--- a/Documentation/trace/index.rst
+++ b/Documentation/trace/index.rst
@@ -72,6 +72,7 @@ interactions and system performance.
intel_th
stm
sys-t
+ p_ost
coresight/index
rv/index
hisi-ptt
diff --git a/Documentation/trace/p_ost.rst b/Documentation/trace/p_ost.rst
new file mode 100644
index 000000000000..2b92e2229653
--- /dev/null
+++ b/Documentation/trace/p_ost.rst
@@ -0,0 +1,39 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================
+MIPI OST over STP
+===================
+
+The OST (Open System Trace) driver is used with STM class devices to
+generate standardized trace stream. Trace sources can be identified
+by different entity IDs.
+
+CONFIG_STM_PROTO_OST is for p_ost driver enablement. Once this config
+is enabled, you can select the p_ost protocol by command below:
+
+# mkdir /sys/kernel/config/stp-policy/stm0:p_ost.policy
+
+The policy name format is extended like this:
+
+ <device_name>:<protocol_name>.<policy_name>
+
+With a coresight-stm device, it will look like "stm0:p_ost.policy".
+
+With the MIPI OST protocol driver, the attributes for each protocol node are:
+
+# mkdir /sys/kernel/config/stp-policy/stm0:p_ost.policy/default
+# ls /sys/kernel/config/stp-policy/stm0:p_ost.policy/default
+channels entity masters
+
+The entity here is the set of entities that p_ost supports. Currently
+p_ost supports ftrace, console and diag entities.
+
+Set entity:
+# echo 'ftrace' > /sys/kernel/config/stp-policy/stm0:p_ost.policy/default/entity
+
+Get available and currently selected (shown in square brackets) entity:
+# cat /sys/kernel/config/stp-policy/stm0:p_ost.policy/default/entity
+[ftrace] console diag
+
+See Documentation/ABI/testing/configfs-stp-policy-p_ost for more details.
+
diff --git a/drivers/hwtracing/stm/Kconfig b/drivers/hwtracing/stm/Kconfig
index cd7f0b0f3fbe..4c83da5d95a0 100644
--- a/drivers/hwtracing/stm/Kconfig
+++ b/drivers/hwtracing/stm/Kconfig
@@ -40,6 +40,20 @@ config STM_PROTO_SYS_T
If you don't know what this is, say N.
+config STM_PROTO_OST
+ tristate "MIPI OST STM framing protocol driver"
+ default STM
+ help
+ This is an implementation of MIPI OST protocol to be used
+ over the STP transport. In addition to the data payload, it
+ also carries additional metadata for entity, better
+ means of trace source identification, etc.
+
+ The receiving side must be able to decode this protocol in
+ addition to the MIPI STP, in order to extract the data.
+
+ If you don't know what this is, say N.
+
config STM_DUMMY
tristate "Dummy STM driver"
help
diff --git a/drivers/hwtracing/stm/Makefile b/drivers/hwtracing/stm/Makefile
index 1692fcd29277..d9c8615849b9 100644
--- a/drivers/hwtracing/stm/Makefile
+++ b/drivers/hwtracing/stm/Makefile
@@ -5,9 +5,11 @@ stm_core-y := core.o policy.o
obj-$(CONFIG_STM_PROTO_BASIC) += stm_p_basic.o
obj-$(CONFIG_STM_PROTO_SYS_T) += stm_p_sys-t.o
+obj-$(CONFIG_STM_PROTO_OST) += stm_p_ost.o
stm_p_basic-y := p_basic.o
stm_p_sys-t-y := p_sys-t.o
+stm_p_ost-y := p_ost.o
obj-$(CONFIG_STM_DUMMY) += dummy_stm.o
diff --git a/drivers/hwtracing/stm/p_ost.c b/drivers/hwtracing/stm/p_ost.c
new file mode 100644
index 000000000000..d2174872b761
--- /dev/null
+++ b/drivers/hwtracing/stm/p_ost.c
@@ -0,0 +1,241 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ *
+ * MIPI OST framing protocol for STM devices.
+ */
+
+#include <linux/pid.h>
+#include <linux/sched/clock.h>
+#include <linux/slab.h>
+#include <linux/stm.h>
+#include "stm.h"
+
+/*
+ * OST Base Protocol Header
+ *
+ * Position Bits Field Name
+ * 0 8 STARTSIMPLE
+ * 1 8 Version
+ * 2 8 Entity ID
+ * 3 8 Protocol ID
+ */
+#define OST_FIELD_STARTSIMPLE 0
+#define OST_FIELD_VERSION 8
+#define OST_FIELD_ENTITY 16
+#define OST_FIELD_PROTOCOL 24
+
+#define OST_TOKEN_STARTSIMPLE 0x10
+#define OST_VERSION_MIPI1 0x10
+
+/* entity id to identify the source */
+#define OST_ENTITY_FTRACE 0x01
+#define OST_ENTITY_CONSOLE 0x02
+#define OST_ENTITY_DIAG 0xEE
+
+#define OST_CONTROL_PROTOCOL 0x0
+
+#define DATA_HEADER ((OST_TOKEN_STARTSIMPLE << OST_FIELD_STARTSIMPLE) | \
+ (OST_VERSION_MIPI1 << OST_FIELD_VERSION) | \
+ (OST_CONTROL_PROTOCOL << OST_FIELD_PROTOCOL))
+
+#define STM_MAKE_VERSION(ma, mi) (((ma) << 8) | (mi))
+#define STM_HEADER_MAGIC (0x5953)
+
+enum ost_entity_type {
+ OST_ENTITY_TYPE_NONE,
+ OST_ENTITY_TYPE_FTRACE,
+ OST_ENTITY_TYPE_CONSOLE,
+ OST_ENTITY_TYPE_DIAG,
+};
+
+static const char * const str_ost_entity_type[] = {
+ [OST_ENTITY_TYPE_NONE] = "none",
+ [OST_ENTITY_TYPE_FTRACE] = "ftrace",
+ [OST_ENTITY_TYPE_CONSOLE] = "console",
+ [OST_ENTITY_TYPE_DIAG] = "diag",
+};
+
+static const u32 ost_entity_value[] = {
+ [OST_ENTITY_TYPE_NONE] = 0,
+ [OST_ENTITY_TYPE_FTRACE] = OST_ENTITY_FTRACE,
+ [OST_ENTITY_TYPE_CONSOLE] = OST_ENTITY_CONSOLE,
+ [OST_ENTITY_TYPE_DIAG] = OST_ENTITY_DIAG,
+};
+
+struct ost_policy_node {
+ enum ost_entity_type entity_type;
+};
+
+struct ost_output {
+ struct ost_policy_node node;
+};
+
+/* Set default entity type as none */
+static void ost_policy_node_init(void *priv)
+{
+ struct ost_policy_node *pn = priv;
+
+ pn->entity_type = OST_ENTITY_TYPE_NONE;
+}
+
+static int ost_output_open(void *priv, struct stm_output *output)
+{
+ struct ost_policy_node *pn = priv;
+ struct ost_output *opriv;
+
+ opriv = kzalloc_obj(*opriv, GFP_ATOMIC);
+ if (!opriv)
+ return -ENOMEM;
+
+ memcpy(&opriv->node, pn, sizeof(opriv->node));
+ output->pdrv_private = opriv;
+ return 0;
+}
+
+static void ost_output_close(struct stm_output *output)
+{
+ kfree(output->pdrv_private);
+}
+
+static ssize_t ost_t_policy_entity_show(struct config_item *item,
+ char *page)
+{
+ struct ost_policy_node *pn = to_pdrv_policy_node(item);
+ ssize_t sz = 0;
+ int i;
+
+ for (i = 1; i < ARRAY_SIZE(str_ost_entity_type); i++) {
+ if (i == pn->entity_type)
+ sz += sysfs_emit_at(page, sz, "[%s] ", str_ost_entity_type[i]);
+ else
+ sz += sysfs_emit_at(page, sz, "%s ", str_ost_entity_type[i]);
+ }
+
+ sz += sysfs_emit_at(page, sz, "\n");
+ return sz;
+}
+
+static int entity_index(const char *str)
+{
+ int i;
+
+ for (i = 1; i < ARRAY_SIZE(str_ost_entity_type); i++) {
+ if (sysfs_streq(str, str_ost_entity_type[i]))
+ return i;
+ }
+
+ return 0;
+}
+
+static ssize_t
+ost_t_policy_entity_store(struct config_item *item, const char *page,
+ size_t count)
+{
+ struct mutex *mutexp = &item->ci_group->cg_subsys->su_mutex;
+ struct ost_policy_node *pn = to_pdrv_policy_node(item);
+ int i;
+
+ i = entity_index(page);
+ if (i) {
+ mutex_lock(mutexp);
+ pn->entity_type = i;
+ mutex_unlock(mutexp);
+ } else {
+ return -EINVAL;
+ }
+
+ return count;
+}
+CONFIGFS_ATTR(ost_t_policy_, entity);
+
+static struct configfs_attribute *ost_t_policy_attrs[] = {
+ &ost_t_policy_attr_entity,
+ NULL,
+};
+
+static ssize_t
+notrace ost_write(struct stm_data *data, struct stm_output *output,
+ unsigned int chan, const char *buf, size_t count,
+ struct stm_source_data *source)
+{
+ struct ost_output *op = output->pdrv_private;
+ unsigned int c = output->channel + chan;
+ unsigned int m = output->master;
+ const unsigned char nil = 0;
+ u32 header = DATA_HEADER;
+ struct trc_hdr {
+ u16 version;
+ u16 magic;
+ u32 cpu;
+ u64 timestamp;
+ u64 tgid;
+ } hdr;
+ ssize_t sz;
+
+ /*
+ * Identify the source by entity type.
+ * If entity type is not set, return error value.
+ */
+ if (op->node.entity_type)
+ header |= (ost_entity_value[op->node.entity_type] << OST_FIELD_ENTITY);
+ else
+ return -EINVAL;
+
+ /*
+ * STP framing rules for OST frames:
+ * * the first packet of the OST frame is marked;
+ * * the last packet is a FLAG with timestamped tag.
+ */
+ /* Message layout: HEADER / DATA / TAIL */
+ /* HEADER */
+ sz = data->packet(data, m, c, STP_PACKET_DATA, STP_PACKET_MARKED,
+ 4, (u8 *)&header);
+ if (sz <= 0)
+ return sz;
+
+ /* DATA */
+ hdr.version = STM_MAKE_VERSION(0, 3);
+ hdr.magic = STM_HEADER_MAGIC;
+ hdr.cpu = raw_smp_processor_id();
+ hdr.timestamp = sched_clock();
+ hdr.tgid = task_tgid_nr(current);
+ sz = stm_data_write(data, m, c, false, &hdr, sizeof(hdr));
+ if (sz <= 0)
+ return sz;
+
+ sz = stm_data_write(data, m, c, false, buf, count);
+
+ /* TAIL */
+ if (sz > 0)
+ data->packet(data, m, c, STP_PACKET_FLAG,
+ STP_PACKET_TIMESTAMPED, 0, &nil);
+
+ return sz;
+}
+
+static const struct stm_protocol_driver ost_pdrv = {
+ .owner = THIS_MODULE,
+ .name = "p_ost",
+ .priv_sz = sizeof(struct ost_policy_node),
+ .write = ost_write,
+ .policy_attr = ost_t_policy_attrs,
+ .output_open = ost_output_open,
+ .output_close = ost_output_close,
+ .policy_node_init = ost_policy_node_init,
+};
+
+static int ost_stm_init(void)
+{
+ return stm_register_protocol(&ost_pdrv);
+}
+module_init(ost_stm_init);
+
+static void ost_stm_exit(void)
+{
+ stm_unregister_protocol(&ost_pdrv);
+}
+module_exit(ost_stm_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("MIPI Open System Trace STM framing protocol driver");
---
base-commit: 80dd246accce631c328ea43294e53b2b2dd2aa32
change-id: 20260521-stm_p_ost-3489f42a9e8c
Best regards,
--
Yingchao Deng <yingchao.deng@oss.qualcomm.com>
^ permalink raw reply related
* Re: [PATCH v2 3/3] KVM: arm64: Add fail-safe for refcounted pages in __pkvm_hyp_donate_host
From: Fuad Tabba @ 2026-05-21 13:07 UTC (permalink / raw)
To: Vincent Donnefort
Cc: maz, oliver.upton, joey.gouly, suzuki.poulose, yuzenghui,
catalin.marinas, will, linux-arm-kernel, kvmarm, kernel-team,
qperret
In-Reply-To: <20260521102149.804874-4-vdonnefort@google.com>
On Thu, 21 May 2026 at 11:22, Vincent Donnefort <vdonnefort@google.com> wrote:
>
> A previous bug in __pkvm_init_vm error path showed that the hypervisor
> could leak refcounted pages, (i.e. losing access to a page while its
> refcount is still elevated). This poses a threat to the pKVM state
> machine.
>
> Address this by introducing a fail-safe in n __pkvm_hyp_donate_host.
Stray n.
> Transitions are not a hot path so added security is worth the extra
> check.
>
> Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Cheers,
/fuad
>
> diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> index 42b0b648f32f..bb97d05b9b25 100644
> --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> @@ -855,6 +855,16 @@ static int __hyp_check_page_state_range(phys_addr_t phys, u64 size, enum pkvm_pa
> return 0;
> }
>
> +static int __hyp_check_page_count_range(phys_addr_t phys, u64 size)
> +{
> + for_each_hyp_page(page, phys, size) {
> + if (page->refcount)
> + return -EBUSY;
> + }
> +
> + return 0;
> +}
> +
> static bool guest_pte_is_poisoned(kvm_pte_t pte)
> {
> if (kvm_pte_valid(pte))
> @@ -1053,7 +1063,6 @@ int __pkvm_guest_unshare_host(struct pkvm_hyp_vcpu *vcpu, u64 gfn)
> int __pkvm_host_unshare_hyp(u64 pfn)
> {
> u64 phys = hyp_pfn_to_phys(pfn);
> - u64 virt = (u64)__hyp_va(phys);
> u64 size = PAGE_SIZE;
> int ret;
>
> @@ -1066,10 +1075,9 @@ int __pkvm_host_unshare_hyp(u64 pfn)
> ret = __hyp_check_page_state_range(phys, size, PKVM_PAGE_SHARED_BORROWED);
> if (ret)
> goto unlock;
> - if (hyp_page_count((void *)virt)) {
> - ret = -EBUSY;
> + ret = __hyp_check_page_count_range(phys, size);
> + if (ret)
> goto unlock;
> - }
>
> __hyp_set_page_state_range(phys, size, PKVM_NOPAGE);
> WARN_ON(__host_set_page_state_range(phys, size, PKVM_PAGE_OWNED));
> @@ -1132,6 +1140,10 @@ int __pkvm_hyp_donate_host(u64 pfn, u64 nr_pages)
> if (ret)
> goto unlock;
>
> + ret = __hyp_check_page_count_range(phys, size);
> + if (ret)
> + goto unlock;
> +
> __hyp_set_page_state_range(phys, size, PKVM_NOPAGE);
> WARN_ON(kvm_pgtable_hyp_unmap(&pkvm_pgtable, virt, size) != size);
> WARN_ON(host_stage2_set_owner_locked(phys, size, PKVM_ID_HOST));
> --
> 2.54.0.746.g67dd491aae-goog
>
^ permalink raw reply
* Re: [PATCH v2 2/3] KVM: arm64: Fix __pkvm_init_vm error path
From: Fuad Tabba @ 2026-05-21 13:07 UTC (permalink / raw)
To: Vincent Donnefort
Cc: maz, oliver.upton, joey.gouly, suzuki.poulose, yuzenghui,
catalin.marinas, will, linux-arm-kernel, kvmarm, kernel-team,
qperret, Sashiko
In-Reply-To: <20260521102149.804874-3-vdonnefort@google.com>
On Thu, 21 May 2026 at 11:22, Vincent Donnefort <vdonnefort@google.com> wrote:
>
> In the unlikely case where insert_vm_table_entry fails, __pkvm_init_vm
> releases the memory donated by the host for the PGD, but as the stage-2
> is still set up the hypervisor keeps a refcount on those pages,
> effectively leaking the references.
>
> Fix the rollback with the newly added kvm_guest_destroy_stage2().
>
> Fixes: 256b4668cd89 ("KVM: arm64: Introduce separate hypercalls for pKVM VM reservation and initialization")
> Reported-by: Sashiko <sashiko-bot@kernel.org>
> Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Cheers,
/fuad
>
> diff --git a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
> index 3cbfae0e3dda..4f2b871199cb 100644
> --- a/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
> +++ b/arch/arm64/kvm/hyp/include/nvhe/mem_protect.h
> @@ -56,6 +56,7 @@ int host_stage2_idmap_locked(phys_addr_t addr, u64 size, enum kvm_pgtable_prot p
> int host_stage2_set_owner_locked(phys_addr_t addr, u64 size, u8 owner_id);
> int kvm_host_prepare_stage2(void *pgt_pool_base);
> int kvm_guest_prepare_stage2(struct pkvm_hyp_vm *vm, void *pgd);
> +void kvm_guest_destroy_stage2(struct pkvm_hyp_vm *vm);
> void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt);
>
> int hyp_pin_shared_mem(void *from, void *to);
> diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> index 89eb20d4fee4..42b0b648f32f 100644
> --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> @@ -306,16 +306,21 @@ int kvm_guest_prepare_stage2(struct pkvm_hyp_vm *vm, void *pgd)
> return 0;
> }
>
> +void kvm_guest_destroy_stage2(struct pkvm_hyp_vm *vm)
> +{
> + guest_lock_component(vm);
> + kvm_pgtable_stage2_destroy(&vm->pgt);
> + vm->kvm.arch.mmu.pgd_phys = 0ULL;
> + guest_unlock_component(vm);
> +}
> +
> void reclaim_pgtable_pages(struct pkvm_hyp_vm *vm, struct kvm_hyp_memcache *mc)
> {
> struct hyp_page *page;
> void *addr;
>
> /* Dump all pgtable pages in the hyp_pool */
> - guest_lock_component(vm);
> - kvm_pgtable_stage2_destroy(&vm->pgt);
> - vm->kvm.arch.mmu.pgd_phys = 0ULL;
> - guest_unlock_component(vm);
> + kvm_guest_destroy_stage2(vm);
>
> /* Drain the hyp_pool into the memcache */
> addr = hyp_alloc_pages(&vm->pool, 0);
> diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
> index eb1c10120f9f..3b2c4fbc34d8 100644
> --- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
> +++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
> @@ -853,10 +853,12 @@ int __pkvm_init_vm(struct kvm *host_kvm, unsigned long vm_hva,
> /* Must be called last since this publishes the VM. */
> ret = insert_vm_table_entry(handle, hyp_vm);
> if (ret)
> - goto err_remove_mappings;
> + goto err_destroy_stage2;
>
> return 0;
>
> +err_destroy_stage2:
> + kvm_guest_destroy_stage2(hyp_vm);
> err_remove_mappings:
> unmap_donated_memory(hyp_vm, vm_size);
> unmap_donated_memory(pgd, pgd_size);
> --
> 2.54.0.746.g67dd491aae-goog
>
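
The shape of the fix is the usual reverse-order unwind: each error label
undoes one setup step and falls through to the labels for the earlier steps.
A minimal userspace sketch of that pattern, with hypothetical names and
booleans standing in for the real resources:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical state tracking which setup steps are currently live. */
struct demo_vm {
	bool mapped;
	bool stage2_ready;
	bool published;
};

static int demo_init_vm(struct demo_vm *vm, bool fail_stage2, bool fail_publish)
{
	int ret;

	vm->mapped = true;		/* step 1: map the donated memory */

	if (fail_stage2) {		/* step 2: prepare the stage-2 */
		ret = -ENOMEM;
		goto err_remove_mappings;
	}
	vm->stage2_ready = true;

	if (fail_publish) {		/* step 3: publish the VM, always last */
		ret = -EINVAL;
		goto err_destroy_stage2;
	}
	vm->published = true;

	return 0;

err_destroy_stage2:
	vm->stage2_ready = false;	/* undo step 2 before step 1 */
err_remove_mappings:
	vm->mapped = false;
	return ret;
}
```

Jumping to err_remove_mappings after step 2 has succeeded would skip the
stage-2 teardown, which is the kind of leak the patch closes.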
^ permalink raw reply
* Re: [PATCH v2 1/3] KVM: arm64: Reset page order in pKVM hyp_pool_init
From: Fuad Tabba @ 2026-05-21 13:07 UTC (permalink / raw)
To: Vincent Donnefort
Cc: maz, oliver.upton, joey.gouly, suzuki.poulose, yuzenghui,
catalin.marinas, will, linux-arm-kernel, kvmarm, kernel-team,
qperret, Sashiko
In-Reply-To: <20260521102149.804874-2-vdonnefort@google.com>
On Thu, 21 May 2026 at 11:22, Vincent Donnefort <vdonnefort@google.com> wrote:
>
> When a VM fails to initialise after its stage-2 hyp_pool has been
> initialised, that stage-2 must be torn down entirely. This requires
> resetting both the refcount and the order of its pages back to 0.
>
> Currently, reclaim_pgtable_pages() implicitly resets the page order by
> allocating the entire pool with order-0 granularity. However, in the VM
> initialisation error path, the addresses of the donated memory (the PGD)
> are already known, making it unnecessary to iterate over all pages in
> the pool.
>
> Since the vmemmap page order is a hyp_pool-specific field, leaving a
> non-zero order on hyp_pool destruction is harmless until another pool
> attempts to admit the page. Instead of resetting this field during
> destruction, reset it during pool initialization in hyp_pool_init().
> Note that pages added to the pool outside of the initial pool range
> (e.g., via guest_s2_zalloc_page()) must still have their order managed
> manually.
>
> While at it, add a WARN_ON() in the hyp_pool attach path to catch
> unexpected page orders that exceed the pool's max_order.
>
> Fixes: 256b4668cd89 ("KVM: arm64: Introduce separate hypercalls for pKVM VM reservation and initialization")
> Reported-by: Sashiko <sashiko-bot@kernel.org>
> Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
>
> diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> index 25f04629014e..89eb20d4fee4 100644
> --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> @@ -322,7 +322,6 @@ void reclaim_pgtable_pages(struct pkvm_hyp_vm *vm, struct kvm_hyp_memcache *mc)
> while (addr) {
> page = hyp_virt_to_page(addr);
> page->refcount = 0;
> - page->order = 0;
> push_hyp_memcache(mc, addr, hyp_virt_to_phys);
> WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(addr), 1));
> addr = hyp_alloc_pages(&vm->pool, 0);
> diff --git a/arch/arm64/kvm/hyp/nvhe/page_alloc.c b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
> index a1eb27a1a747..c3b3dc5a8ea7 100644
> --- a/arch/arm64/kvm/hyp/nvhe/page_alloc.c
> +++ b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
> @@ -97,6 +97,8 @@ static void __hyp_attach_page(struct hyp_pool *pool,
> u8 order = p->order;
> struct hyp_page *buddy;
>
> + WARN_ON(p->order > pool->max_order);
> +
Could you add a brief comment? It took me a minute to figure out what this
catches. IIUC it's not attach's own input, it's a stale p->order from way back
when an external page was popped from a memcache (today only via
guest_s2_zalloc_page()). Right?
With that:
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Cheers,
/fuad
> memset(hyp_page_to_virt(p), 0, PAGE_SIZE << p->order);
>
> /* Skip coalescing for 'external' pages being freed into the pool. */
> @@ -237,8 +239,10 @@ int hyp_pool_init(struct hyp_pool *pool, u64 pfn, unsigned int nr_pages,
>
> /* Init the vmemmap portion */
> p = hyp_phys_to_page(phys);
> - for (i = 0; i < nr_pages; i++)
> + for (i = 0; i < nr_pages; i++) {
> hyp_set_page_refcounted(&p[i]);
> + p[i].order = 0;
> + }
>
> /* Attach the unused pages to the buddy tree */
> for (i = reserved_pages; i < nr_pages; i++)
> --
> 2.54.0.746.g67dd491aae-goog
>
^ permalink raw reply
* Re: [PATCH v2 0/3] Fix __pkvm_init_vm error path
From: Fuad Tabba @ 2026-05-21 13:07 UTC (permalink / raw)
To: Vincent Donnefort
Cc: maz, oliver.upton, joey.gouly, suzuki.poulose, yuzenghui,
catalin.marinas, will, linux-arm-kernel, kvmarm, kernel-team,
qperret
In-Reply-To: <20260521102149.804874-1-vdonnefort@google.com>
On Thu, 21 May 2026 at 11:22, Vincent Donnefort <vdonnefort@google.com> wrote:
>
> Sashiko reported a potential refcount leak in the unlikely case where
> insert_vm_table_entry fails.
>
> While at it, I have added a fail-safe to __pkvm_hyp_donate_host to ensure this
> function doesn't allow leaking refcounted pages.
>
> Changes since v2:
>
> * Proactively init hyp_page order field in hyp_pool_init
>
> v1 (https://lore.kernel.org/all/20260521081250.655226-1-vdonnefort@google.com/)
>
> *** BLURB HERE ***
nit: missing BLURB :)
/fuad
>
> Vincent Donnefort (3):
> KVM: arm64: Reset page order in pKVM hyp_pool_init
> KVM: arm64: Fix __pkvm_init_vm error path
> KVM: arm64: Add fail-safe for refcounted pages in
> __pkvm_hyp_donate_host
>
> arch/arm64/kvm/hyp/include/nvhe/mem_protect.h | 1 +
> arch/arm64/kvm/hyp/nvhe/mem_protect.c | 34 ++++++++++++++-----
> arch/arm64/kvm/hyp/nvhe/page_alloc.c | 6 +++-
> arch/arm64/kvm/hyp/nvhe/pkvm.c | 4 ++-
> 4 files changed, 34 insertions(+), 11 deletions(-)
>
>
> base-commit: 5200f5f493f79f14bbdc349e402a40dfb32f23c8
> --
> 2.54.0.746.g67dd491aae-goog
>
^ permalink raw reply
* Re: [PATCH v4 1/3] PCI: Allow ATS to be always on for CXL.cache capable devices
From: Jason Gunthorpe @ 2026-05-21 13:05 UTC (permalink / raw)
To: Yi Liu
Cc: Nicolin Chen, will, robin.murphy, bhelgaas, joro, praan, baolu.lu,
kevin.tian, miko.lenczewski, linux-arm-kernel, iommu,
linux-kernel, linux-pci, dan.j.williams, jonathan.cameron, vsethi,
linux-cxl, nirmoyd
In-Reply-To: <80e7e1be-c384-470f-9949-8c0dbad165ac@intel.com>
On Thu, May 21, 2026 at 03:31:46PM +0800, Yi Liu wrote:
> Does this hardware behavior satisfy the security expectation you have in
> mind? Or do you still require that both the DTE bit and the PCI ATS
> capability be explicitly disabled when a blocking domain is in effect?
If the HW rejects translated TLPs then you should be clearing the ATS
enable bit in the device config space prior to rejecting them.
But it does seem secure enough as-is.
Jason
^ permalink raw reply
* [PATCH] irqchip/gic-v4: Harden against bogus command line
From: Mostafa Saleh @ 2026-05-21 13:05 UTC (permalink / raw)
To: linux-arm-kernel, linux-kernel; +Cc: maz, tglx, Mostafa Saleh
Accidentally setting “kvm-arm.vgic_v4_enable=1” on the wrong setup,
one that has GICv4 but no MSI controller in use (the device tree node
exists but is not used), caused a panic, as “gic_domain” is NULL and
the kernel attempted to access its ops.
Originally, I hit this on an older kernel, but I was able to reproduce
it on upstream with QEMU by hacking together this unreasonable setup.
[ 33.145536] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
[ 33.145658] Mem abort info:
[ 33.145751] ESR = 0x0000000096000006
...
[ 33.154057] CPU: 1 UID: 0 PID: 295 Comm: lkvm-static Not tainted 7.1.0-rc4-ge3f15ad3970e #5 PREEMPT
[ 33.156922] Hardware name: linux,dummy-virt (DT)
[ 33.158780] pstate: 81402005 (Nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 33.160340] pc : __irq_domain_instantiate+0x1d4/0x578
[ 33.162602] lr : __irq_domain_instantiate+0x1cc/0x578
Add a hardening check to avoid the NULL access, and fail the VM
creation in that case.
Signed-off-by: Mostafa Saleh <smostafa@google.com>
---
drivers/irqchip/irq-gic-v4.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/irqchip/irq-gic-v4.c b/drivers/irqchip/irq-gic-v4.c
index 8455b4a5fbb0..7e39f7eae85f 100644
--- a/drivers/irqchip/irq-gic-v4.c
+++ b/drivers/irqchip/irq-gic-v4.c
@@ -159,6 +159,9 @@ int its_alloc_vcpu_irqs(struct its_vm *vm)
{
int vpe_base_irq, i;
+ if (!gic_domain)
+ return -EINVAL;
+
vm->fwnode = irq_domain_alloc_named_id_fwnode("GICv4-vpe",
task_pid_nr(current));
if (!vm->fwnode)
--
2.54.0.669.g59709faab0-goog
^ permalink raw reply related
* Re: [PATCH v7 09/28] media: rockchip: rga: remove redundant rga_frame variables
From: Michael Tretter @ 2026-05-21 13:03 UTC (permalink / raw)
To: Sven Püschel
Cc: Jacob Chen, Ezequiel Garcia, Mauro Carvalho Chehab,
Heiko Stuebner, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Hans Verkuil, linux-media, linux-rockchip, linux-arm-kernel,
linux-kernel, devicetree, kernel, nicolas, sebastian.reichel,
p.zabel, Nicolas Dufresne
In-Reply-To: <20260521-spu-rga3-v7-9-3f33e8c7145f@pengutronix.de>
On Thu, 21 May 2026 00:44:14 +0200, Sven Püschel wrote:
> Remove the redundant rga_frame variables width, height and color space.
> The value of these variables is already contained in the pix member
> of rga_frame. The code also keeps these values in sync. Therefore drop
> them in favor of the existing pix member.
>
> Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> Signed-off-by: Sven Püschel <s.pueschel@pengutronix.de>
Reviewed-by: Michael Tretter <m.tretter@pengutronix.de>
> ---
> drivers/media/platform/rockchip/rga/rga-buf.c | 6 ++---
> drivers/media/platform/rockchip/rga/rga-hw.c | 6 ++---
> drivers/media/platform/rockchip/rga/rga.c | 32 ++++++++++-----------------
> drivers/media/platform/rockchip/rga/rga.h | 5 -----
> 4 files changed, 18 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/media/platform/rockchip/rga/rga-buf.c b/drivers/media/platform/rockchip/rga/rga-buf.c
> index 65fc0d5b4aa10..ffc6162b2e681 100644
> --- a/drivers/media/platform/rockchip/rga/rga-buf.c
> +++ b/drivers/media/platform/rockchip/rga/rga-buf.c
> @@ -103,10 +103,10 @@ static int get_plane_offset(struct rga_frame *f,
> if (plane == 0)
> return 0;
> if (plane == 1)
> - return stride * f->height;
> + return stride * f->pix.height;
> if (plane == 2)
> - return stride * f->height +
> - (stride * f->height / info->hdiv / info->vdiv);
> + return stride * f->pix.height +
> + (stride * f->pix.height / info->hdiv / info->vdiv);
>
> return -EINVAL;
> }
> diff --git a/drivers/media/platform/rockchip/rga/rga-hw.c b/drivers/media/platform/rockchip/rga/rga-hw.c
> index d1618bb247501..ec6c17504ca15 100644
> --- a/drivers/media/platform/rockchip/rga/rga-hw.c
> +++ b/drivers/media/platform/rockchip/rga/rga-hw.c
> @@ -53,7 +53,7 @@ rga_get_addr_offset(struct rga_frame *frm, struct rga_addr_offset *offset,
> x_div = frm->fmt->x_div;
> y_div = frm->fmt->y_div;
> uv_stride = frm->stride / x_div;
> - pixel_width = frm->stride / frm->width;
> + pixel_width = frm->stride / frm->pix.width;
>
> lt->y_off = offset->y_off + y * frm->stride + x * pixel_width;
> lt->u_off = offset->u_off + (y / y_div) * uv_stride + x / x_div;
> @@ -191,7 +191,7 @@ static void rga_cmd_set_trans_info(struct rga_ctx *ctx)
>
> if (RGA_COLOR_FMT_IS_YUV(ctx->in.fmt->hw_format) &&
> RGA_COLOR_FMT_IS_RGB(ctx->out.fmt->hw_format)) {
> - switch (ctx->in.colorspace) {
> + switch (ctx->in.pix.colorspace) {
> case V4L2_COLORSPACE_REC709:
> src_info.data.csc_mode = RGA_SRC_CSC_MODE_BT709_R0;
> break;
> @@ -203,7 +203,7 @@ static void rga_cmd_set_trans_info(struct rga_ctx *ctx)
>
> if (RGA_COLOR_FMT_IS_RGB(ctx->in.fmt->hw_format) &&
> RGA_COLOR_FMT_IS_YUV(ctx->out.fmt->hw_format)) {
> - switch (ctx->out.colorspace) {
> + switch (ctx->out.pix.colorspace) {
> case V4L2_COLORSPACE_REC709:
> dst_info.data.csc_mode = RGA_SRC_CSC_MODE_BT709_R0;
> break;
> diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c
> index c07207edffdb6..ca8d8a53dc251 100644
> --- a/drivers/media/platform/rockchip/rga/rga.c
> +++ b/drivers/media/platform/rockchip/rga/rga.c
> @@ -329,9 +329,6 @@ static struct rga_fmt *rga_fmt_find(u32 pixelformat)
> }
>
> static struct rga_frame def_frame = {
> - .width = DEFAULT_WIDTH,
> - .height = DEFAULT_HEIGHT,
> - .colorspace = V4L2_COLORSPACE_DEFAULT,
> .crop.left = 0,
> .crop.top = 0,
> .crop.width = DEFAULT_WIDTH,
> @@ -363,9 +360,9 @@ static int rga_open(struct file *file)
> ctx->out = def_frame;
>
> v4l2_fill_pixfmt_mp(&ctx->in.pix,
> - ctx->in.fmt->fourcc, ctx->out.width, ctx->out.height);
> + ctx->in.fmt->fourcc, DEFAULT_WIDTH, DEFAULT_HEIGHT);
> v4l2_fill_pixfmt_mp(&ctx->out.pix,
> - ctx->out.fmt->fourcc, ctx->out.width, ctx->out.height);
> + ctx->out.fmt->fourcc, DEFAULT_WIDTH, DEFAULT_HEIGHT);
>
> if (mutex_lock_interruptible(&rga->mutex)) {
> kfree(ctx);
> @@ -453,10 +450,8 @@ static int vidioc_g_fmt(struct file *file, void *priv, struct v4l2_format *f)
> if (IS_ERR(frm))
> return PTR_ERR(frm);
>
> - v4l2_fill_pixfmt_mp(pix_fmt, frm->fmt->fourcc, frm->width, frm->height);
> -
> + *pix_fmt = frm->pix;
> pix_fmt->field = V4L2_FIELD_NONE;
> - pix_fmt->colorspace = frm->colorspace;
>
> return 0;
> }
> @@ -505,27 +500,24 @@ static int vidioc_s_fmt(struct file *file, void *priv, struct v4l2_format *f)
> frm = rga_get_frame(ctx, f->type);
> if (IS_ERR(frm))
> return PTR_ERR(frm);
> - frm->width = pix_fmt->width;
> - frm->height = pix_fmt->height;
> frm->size = 0;
> for (i = 0; i < pix_fmt->num_planes; i++)
> frm->size += pix_fmt->plane_fmt[i].sizeimage;
> frm->fmt = rga_fmt_find(pix_fmt->pixelformat);
> frm->stride = pix_fmt->plane_fmt[0].bytesperline;
> - frm->colorspace = pix_fmt->colorspace;
>
> /* Reset crop settings */
> frm->crop.left = 0;
> frm->crop.top = 0;
> - frm->crop.width = frm->width;
> - frm->crop.height = frm->height;
> + frm->crop.width = pix_fmt->width;
> + frm->crop.height = pix_fmt->height;
>
> frm->pix = *pix_fmt;
>
> v4l2_dbg(debug, 1, &rga->v4l2_dev,
> "[%s] fmt - %p4cc %dx%d (stride %d, sizeimage %d)\n",
> V4L2_TYPE_IS_OUTPUT(f->type) ? "OUTPUT" : "CAPTURE",
> - &frm->fmt->fourcc, frm->width, frm->height,
> + &frm->fmt->fourcc, pix_fmt->width, pix_fmt->height,
> frm->stride, frm->size);
>
> for (i = 0; i < pix_fmt->num_planes; i++) {
> @@ -579,8 +571,8 @@ static int vidioc_g_selection(struct file *file, void *priv,
> } else {
> s->r.left = 0;
> s->r.top = 0;
> - s->r.width = f->width;
> - s->r.height = f->height;
> + s->r.width = f->pix.width;
> + s->r.height = f->pix.height;
> }
>
> return 0;
> @@ -629,8 +621,8 @@ static int vidioc_s_selection(struct file *file, void *priv,
> return -EINVAL;
> }
>
> - if (s->r.left + s->r.width > f->width ||
> - s->r.top + s->r.height > f->height ||
> + if (s->r.left + s->r.width > f->pix.width ||
> + s->r.top + s->r.height > f->pix.height ||
> s->r.width < MIN_WIDTH || s->r.height < MIN_HEIGHT) {
> v4l2_dbg(debug, 1, &rga->v4l2_dev, "unsupported crop value.\n");
> return -EINVAL;
> @@ -821,8 +813,8 @@ static int rga_probe(struct platform_device *pdev)
> goto rel_m2m;
> }
>
> - def_frame.stride = (def_frame.width * def_frame.fmt->depth) >> 3;
> - def_frame.size = def_frame.stride * def_frame.height;
> + def_frame.stride = (DEFAULT_WIDTH * def_frame.fmt->depth) >> 3;
> + def_frame.size = def_frame.stride * DEFAULT_HEIGHT;
>
> ret = video_register_device(vfd, VFL_TYPE_VIDEO, -1);
> if (ret) {
> diff --git a/drivers/media/platform/rockchip/rga/rga.h b/drivers/media/platform/rockchip/rga/rga.h
> index 477cf5b62bbb2..c4a3905a48f0d 100644
> --- a/drivers/media/platform/rockchip/rga/rga.h
> +++ b/drivers/media/platform/rockchip/rga/rga.h
> @@ -24,11 +24,6 @@ struct rga_fmt {
> };
>
> struct rga_frame {
> - /* Original dimensions */
> - u32 width;
> - u32 height;
> - u32 colorspace;
> -
> /* Crop */
> struct v4l2_rect crop;
>
>
> --
> 2.54.0
>
>
--
Pengutronix e.K. | Michael Tretter |
Steuerwalder Str. 21 | https://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
^ permalink raw reply
* Re: [PATCH v14 06/44] arm64: RMI: Check for RMI support at init
From: Marc Zyngier @ 2026-05-21 13:02 UTC (permalink / raw)
To: Steven Price
Cc: kvm, kvmarm, Catalin Marinas, Will Deacon, James Morse,
Oliver Upton, Suzuki K Poulose, Zenghui Yu, linux-arm-kernel,
linux-kernel, Joey Gouly, Alexandru Elisei, Christoffer Dall,
Fuad Tabba, linux-coco, Ganapatrao Kulkarni, Gavin Shan,
Shanker Donthineni, Alper Gun, Aneesh Kumar K . V, Emi Kisanuki,
Vishal Annapurve, WeiLin.Chang, Lorenzo.Pieralisi2
In-Reply-To: <20260513131757.116630-7-steven.price@arm.com>
On Wed, 13 May 2026 14:17:14 +0100,
Steven Price <steven.price@arm.com> wrote:
>
> Query the RMI version number and check if it is a compatible version.
> The first two feature registers are read and exposed for future code to
> use.
>
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
> v14:
> * This moves the basic RMI setup into the 'kernel' directory. This is
> because RMI will be used for some features outside of KVM so should
> be available even if KVM isn't compiled in.
> ---
> arch/arm64/include/asm/rmi_cmds.h | 3 ++
> arch/arm64/kernel/Makefile | 2 +-
> arch/arm64/kernel/cpufeature.c | 1 +
> arch/arm64/kernel/rmi.c | 65 +++++++++++++++++++++++++++++++
> 4 files changed, 70 insertions(+), 1 deletion(-)
> create mode 100644 arch/arm64/kernel/rmi.c
>
> diff --git a/arch/arm64/include/asm/rmi_cmds.h b/arch/arm64/include/asm/rmi_cmds.h
> index 04f7066894e9..9179934925c5 100644
> --- a/arch/arm64/include/asm/rmi_cmds.h
> +++ b/arch/arm64/include/asm/rmi_cmds.h
> @@ -10,6 +10,9 @@
>
> #include <asm/rmi_smc.h>
>
> +extern unsigned long rmm_feat_reg0;
> +extern unsigned long rmm_feat_reg1;
> +
> struct rtt_entry {
> unsigned long walk_level;
> unsigned long desc;
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index 74b76bb70452..d68f351aae75 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -34,7 +34,7 @@ obj-y := debug-monitors.o entry.o irq.o fpsimd.o \
> cpufeature.o alternative.o cacheinfo.o \
> smp.o smp_spin_table.o topology.o smccc-call.o \
> syscall.o proton-pack.o idle.o patching.o pi/ \
> - rsi.o jump_label.o
> + rsi.o jump_label.o rmi.o
>
> obj-$(CONFIG_COMPAT) += sys32.o signal32.o \
> sys_compat.o
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 6d53bb15cf7b..8bdd95a8c2de 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -292,6 +292,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar3[] = {
> static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
> ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_EL1_CSV3_SHIFT, 4, 0),
> ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_EL1_CSV2_SHIFT, 4, 0),
> + ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_EL1_RME_SHIFT, 4, 0),
> ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_EL1_DIT_SHIFT, 4, 0),
> ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_EL1_AMU_SHIFT, 4, 0),
> ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_EL1_MPAM_SHIFT, 4, 0),
> diff --git a/arch/arm64/kernel/rmi.c b/arch/arm64/kernel/rmi.c
> new file mode 100644
> index 000000000000..99c1ccc35c11
> --- /dev/null
> +++ b/arch/arm64/kernel/rmi.c
> @@ -0,0 +1,65 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2023-2025 ARM Ltd.
> + */
> +
> +#include <linux/memblock.h>
> +
> +#include <asm/rmi_cmds.h>
> +
> +unsigned long rmm_feat_reg0;
> +unsigned long rmm_feat_reg1;
What is the requirement for making those globally accessible? Can't
they be made static and use an accessor that returns them? Can the
variables be made __ro_after_init?
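For illustration only, a minimal sketch of the static-plus-accessor
shape asked about above, written as standalone C rather than kernel
code. The names rmi_feat_reg(), rmi_set_feat_reg() and
RMM_NR_FEAT_REGS are hypothetical and not part of the patch, and
__ro_after_init is stubbed out so the sketch builds in userspace.

```c
#include <assert.h>
#include <stddef.h>

/*
 * In the kernel, __ro_after_init places data in a section that becomes
 * read-only once init has finished. Stubbed out here (assumption) so the
 * sketch builds outside the kernel.
 */
#ifndef __ro_after_init
#define __ro_after_init
#endif

/* Hypothetical: the patch reads feature registers 0 and 1. */
#define RMM_NR_FEAT_REGS 2

/* File-local storage instead of two exported globals. */
static unsigned long rmm_feat_regs[RMM_NR_FEAT_REGS] __ro_after_init;

/* Hypothetical init-time setter, standing in for the rmi_features() calls. */
void rmi_set_feat_reg(size_t idx, unsigned long val)
{
	assert(idx < RMM_NR_FEAT_REGS);
	rmm_feat_regs[idx] = val;
}

/* Hypothetical accessor: returns 0 for an out-of-range index. */
unsigned long rmi_feat_reg(size_t idx)
{
	return idx < RMM_NR_FEAT_REGS ? rmm_feat_regs[idx] : 0;
}
```

With this shape, callers outside the file can only read the values,
and the storage could be write-protected after init.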
> +
> +static int rmi_check_version(void)
> +{
> + struct arm_smccc_res res;
> + unsigned short version_major, version_minor;
> + unsigned long host_version = RMI_ABI_VERSION(RMI_ABI_MAJOR_VERSION,
> + RMI_ABI_MINOR_VERSION);
> + unsigned long aa64pfr0 = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
> +
> + /* If RME isn't supported, then RMI can't be */
> + if (cpuid_feature_extract_unsigned_field(aa64pfr0, ID_AA64PFR0_EL1_RME_SHIFT) == 0)
> + return -ENXIO;
> +
> + arm_smccc_1_1_invoke(SMC_RMI_VERSION, host_version, &res);
> +
> + if (res.a0 == SMCCC_RET_NOT_SUPPORTED)
> + return -ENXIO;
> +
> + version_major = RMI_ABI_VERSION_GET_MAJOR(res.a1);
> + version_minor = RMI_ABI_VERSION_GET_MINOR(res.a1);
> +
> + if (res.a0 != RMI_SUCCESS) {
> + unsigned short high_version_major, high_version_minor;
> +
> + high_version_major = RMI_ABI_VERSION_GET_MAJOR(res.a2);
> + high_version_minor = RMI_ABI_VERSION_GET_MINOR(res.a2);
> +
> + pr_err("Unsupported RMI ABI (v%d.%d - v%d.%d) we want v%d.%d\n",
> + version_major, version_minor,
> + high_version_major, high_version_minor,
> + RMI_ABI_MAJOR_VERSION,
> + RMI_ABI_MINOR_VERSION);
> + return -ENXIO;
> + }
> +
> + pr_info("RMI ABI version %d.%d\n", version_major, version_minor);
> +
> + return 0;
> +}
> +
> +static int __init arm64_init_rmi(void)
> +{
> + /* Continue without realm support if we can't agree on a version */
> + if (rmi_check_version())
> + return 0;
> +
> + if (WARN_ON(rmi_features(0, &rmm_feat_reg0)))
> + return 0;
> + if (WARN_ON(rmi_features(1, &rmm_feat_reg1)))
> + return 0;
> +
> + return 0;
> +}
> +subsys_initcall(arm64_init_rmi);
Is there any reliance on this being executed before or after KVM's own
initialisation? If so, this should be captured.
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply
* Re: [PATCH v4 3/5] firmware: arm_ffa: Fix Endpoint Memory Access Descriptor offset calculation
From: Sudeep Holla @ 2026-05-21 12:55 UTC (permalink / raw)
To: Mostafa Saleh
Cc: op-tee, linux-kernel, Sudeep Holla, kvmarm, linux-arm-kernel, maz,
oupton, joey.gouly, suzuki.poulose, catalin.marinas,
jens.wiklander, sumit.garg, sebastianene, vdonnefort
In-Reply-To: <20260520204948.2440882-4-smostafa@google.com>
On Wed, May 20, 2026 at 08:49:46PM +0000, Mostafa Saleh wrote:
> From: Sebastian Ene <sebastianene@google.com>
>
> Use the descriptor's `ep_mem_offset` to calculate the start of the endpoint
> memory access array and to comply with the FF-A spec instead of defaulting
> to `sizeof(struct ffa_mem_region)`.
> This requires moving `ffa_mem_region_additional_setup()` earlier in the setup
> flow.
> Also, add sanity checks to ensure the calculated descriptor offsets do not
> exceed `max_fragsize`.
>
The core change remains the same as in v3 except for improved error
checking, so my review still applies.
Reviewed-by: Sudeep Holla <sudeep.holla@kernel.org>
--
Regards,
Sudeep
^ permalink raw reply
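The offset calculation described in the quoted commit message above can
be sketched roughly as follows. This is a standalone illustration, not
the driver code: struct mem_region_hdr and ep_mem_access_start() are
hypothetical, simplified stand-ins, and only the names ep_mem_offset
and max_fragsize come from the patch description.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the memory region descriptor. */
struct mem_region_hdr {
	uint32_t ep_count;      /* number of endpoint access descriptors */
	uint32_t ep_mem_size;   /* size of each access descriptor, in bytes */
	uint32_t ep_mem_offset; /* offset of the access array from the start */
};

/*
 * Locate the endpoint memory access array from the descriptor's own
 * ep_mem_offset instead of assuming it directly follows the header.
 * Returns NULL if the array would overlap the header or would not fit
 * within max_fragsize.
 */
void *ep_mem_access_start(void *buf, size_t max_fragsize)
{
	struct mem_region_hdr *hdr = buf;
	uint64_t len;
	size_t off;

	if (max_fragsize < sizeof(*hdr))
		return NULL;

	off = hdr->ep_mem_offset;
	/* 64-bit product so two 32-bit fields cannot overflow. */
	len = (uint64_t)hdr->ep_count * hdr->ep_mem_size;

	if (off < sizeof(*hdr) || off > max_fragsize ||
	    len > max_fragsize - off)
		return NULL;

	return (uint8_t *)buf + off;
}
```

Checking the length against max_fragsize - off, rather than adding off
and len first, keeps the bounds check itself free of overflow.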