Linux-ARM-Kernel Archive on lore.kernel.org
* [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups
@ 2026-05-21 13:25 Mark Rutland
  2026-05-21 13:25 ` [PATCH 01/18] KVM: arm64: Don't include <asm/fpsimdmacros.h> Mark Rutland
                   ` (18 more replies)
  0 siblings, 19 replies; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

Hi.

This series cleans up low-level FPSIMD/SVE/SME state management code,
making it easier to maintain and extend (e.g. adding SME support to
KVM), and enabling better debugging (e.g. by making SVE/SME save/restore
visible to KASAN and KCSAN).

This is purely cleanup; there are NO bugs addressed by this series.

The series aims to do a few key things:

* Make it harder to mis-manage in-memory SVE state and SME state. These
  are given opaque types (struct sve_state and struct sme_state), and
  the (awkward) calling convention for saving/restoring SVE state is
  simplified to take a pointer to the base of the state rather than a
  pointer to the FFR within the state.

* Minimize duplication between KVM and the rest of the kernel. The
  FPSIMD/SVE/SME routines are moved to inline assembly such that the
  same helper functions can be used everywhere, without the need to wrap
  assembly macros.

* Make the code easier to follow. Assembly sequences are minimized to
  avoid address generation and control-flow that can be written more
  clearly in C. Awkward assembly macros are removed where possible.

* Make it easier to debug state management. Explicit instrumentation is
  added to the save/restore functions so that KASAN and KCSAN can detect
  memory safety issues and concurrency issues.

  This instrumentation is inhibited for nVHE hyp objects, and does not
  adversely affect KVM. I've confirmed this by looking at compiler flags
  during the build, and by disassembling the relevant object files.

* Remove unnecessary code. By relying on assembler support for SVE and
  SME we can remove awkward assembly macros, making the code
  significantly simpler and easier to read.
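
As a rough illustration of the first point above (passing a pointer to
the base of the SVE state rather than a pointer to the FFR within it),
the following sketch contrasts the two conventions. It is not code from
the series: every demo_* name is hypothetical, and the layout (32 Z
registers, then 16 P registers, then the FFR) is an assumption modelled
on the signal-frame register layout.

```c
#include <stddef.h>

/* Hypothetical opaque type: incomplete, so callers cannot index into it. */
struct demo_sve_state;

/*
 * Assumed layout for a vector length of 'vl' bytes: 32 Z registers of 'vl'
 * bytes each, then 16 P registers of 'vl / 8' bytes each, then the FFR.
 */
static size_t demo_ffr_offset(size_t vl)
{
	return 32 * vl + 16 * (vl / 8);
}

/* Old style: every caller derives the FFR address before calling. */
static unsigned char *demo_old_arg(unsigned char *regs, size_t vl)
{
	return regs + demo_ffr_offset(vl);
}

/* New style: callers pass the base; the helper derives the FFR address. */
static unsigned char *demo_ffr_in(struct demo_sve_state *state, size_t vl)
{
	return (unsigned char *)state + demo_ffr_offset(vl);
}
```

With the base-pointer form, the offset arithmetic lives in one place,
and the incomplete type means a caller cannot accidentally pass some
other byte pointer without an explicit cast.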

I've compile-tested this with a variety of toolchains:

* GCC  8.1.0 + binutils 2.30
* GCC 11.1.0 + binutils 2.36.1
* GCC 12.1.0 + binutils 2.38
* GCC 15.2.0 + binutils 2.45
* LLVM 15.0.7
* LLVM 21.1.8

I've boot-tested on an SVE+SME capable model, both with KASAN enabled
and without KASAN enabled. All the FPSIMD/SVE/SME kselftests passed in
both configurations, without any KASAN splats. Unfortunately, with KCSAN
enabled, some tests hit timeouts (without any KCSAN splat), which I
believe is simply due to the overhead of KCSAN rather than some adverse
functional effect.

I've also boot-tested on an SVE+SME capable model, booting with KVM in each
of:

* VHE mode
* hVHE mode
* Protected mode

In each case I've boot-tested a v7.0 defconfig guest, both with SVE and
without SVE.

Mark.

Mark Rutland (18):
  KVM: arm64: Don't include <asm/fpsimdmacros.h>
  KVM: arm64: Don't override FFR save/restore argument
  KVM: arm64: pkvm: Save host FPMR in host cpu context
  KVM: arm64: pkvm: Remove struct cpu_sve_state
  arm64: fpsimd: Fold sve_init_regs() into do_sve_acc()
  arm64: fpsimd: Remove sve_set_vq() and sme_set_vq()
  arm64: fpsimd: Use assembler for SVE instructions
  arm64: fpsimd: Use assembler for baseline SME instructions
  arm64: fpsimd: Move sve_get_vl() and sme_get_vl() inline
  arm64: sysreg: Add FPCR and FPSR
  arm64: fpsimd: Split FPSR/FPCR from SVE save/restore
  arm64: fpsimd: Move fpsimd save/restore inline
  arm64: fpsimd: Use opaque type for SVE state
  arm64: fpsimd: Use opaque type for SME state
  arm64: fpsimd: Move SVE save/restore inline
  arm64: fpsimd: Move sve_flush_live() inline
  arm64: fpsimd: Move SME save/restore inline
  arm64: fpsimd: Remove <asm/fpsimdmacros.h>

 arch/arm64/Kconfig                      |   5 +
 arch/arm64/include/asm/fpsimd.h         | 369 ++++++++++++++++++++++--
 arch/arm64/include/asm/fpsimdmacros.h   | 357 -----------------------
 arch/arm64/include/asm/kvm_host.h       |  27 +-
 arch/arm64/include/asm/kvm_hyp.h        |   5 -
 arch/arm64/include/asm/kvm_pkvm.h       |   3 +-
 arch/arm64/include/asm/processor.h      |   7 +-
 arch/arm64/kernel/Makefile              |   2 +-
 arch/arm64/kernel/entry-common.c        |   8 +-
 arch/arm64/kernel/entry-fpsimd.S        | 134 ---------
 arch/arm64/kernel/fpsimd.c              |  90 +++---
 arch/arm64/kvm/arm.c                    |  16 +-
 arch/arm64/kvm/guest.c                  |   4 +-
 arch/arm64/kvm/hyp/entry.S              |   1 -
 arch/arm64/kvm/hyp/fpsimd.S             |  33 ---
 arch/arm64/kvm/hyp/include/hyp/switch.h |  23 +-
 arch/arm64/kvm/hyp/nvhe/Makefile        |   2 +-
 arch/arm64/kvm/hyp/nvhe/hyp-main.c      |  20 +-
 arch/arm64/kvm/hyp/nvhe/setup.c         |   4 +-
 arch/arm64/kvm/hyp/vhe/Makefile         |   2 +-
 arch/arm64/tools/sysreg                 |  45 +++
 21 files changed, 480 insertions(+), 677 deletions(-)
 delete mode 100644 arch/arm64/include/asm/fpsimdmacros.h
 delete mode 100644 arch/arm64/kernel/entry-fpsimd.S
 delete mode 100644 arch/arm64/kvm/hyp/fpsimd.S

-- 
2.30.2




* [PATCH 01/18] KVM: arm64: Don't include <asm/fpsimdmacros.h>
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-26 14:18   ` Mark Brown
  2026-05-27 10:10   ` Vladimir Murzin
  2026-05-21 13:25 ` [PATCH 02/18] KVM: arm64: Don't override FFR save/restore argument Mark Rutland
                   ` (17 subsequent siblings)
  18 siblings, 2 replies; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

There's no need for hyp/entry.S to include <asm/fpsimdmacros.h>.

The fpsimd macros have never been used by code in hyp/entry.S, and were
instead used by code in hyp/fpsimd.S.

Remove the unnecessary include.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/kvm/hyp/entry.S | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index 11a10d8f5beb2..308100ed25de9 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -8,7 +8,6 @@
 
 #include <asm/alternative.h>
 #include <asm/assembler.h>
-#include <asm/fpsimdmacros.h>
 #include <asm/kvm.h>
 #include <asm/kvm_arm.h>
 #include <asm/kvm_asm.h>
-- 
2.30.2




* [PATCH 02/18] KVM: arm64: Don't override FFR save/restore argument
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
  2026-05-21 13:25 ` [PATCH 01/18] KVM: arm64: Don't include <asm/fpsimdmacros.h> Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-26 14:27   ` Mark Brown
  2026-05-27 10:16   ` Vladimir Murzin
  2026-05-21 13:25 ` [PATCH 03/18] KVM: arm64: pkvm: Save host FPMR in host cpu context Mark Rutland
                   ` (16 subsequent siblings)
  18 siblings, 2 replies; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

The __sve_save_state() and __sve_restore_state() functions take a
parameter describing whether to save/restore the FFR, but both functions
silently override this with '1'. This has always been benign (and
callers have all passed 'true' since the parameter was introduced), but
clearly this is not intentional.

Historically, the functions always saved/restored the FFR, and there was
no parameter to control this.

In v5.16, the sve_save and sve_load assembly macros used by
__sve_save_state() and __sve_restore_state() were changed to make
saving/restoring FFR optional. The implementations of __sve_save_state()
and __sve_restore_state() were changed to pass '1' to their respective
macros, and the prototypes of __sve_save_state() and
__sve_restore_state() were unchanged. See commit:

  9f5848665788 ("arm64/sve: Make access to FFR optional")

In v6.10, the prototypes of __sve_save_state() and __sve_restore_state()
were changed to add 'save_ffr' and 'restore_ffr' parameters
respectively, but the implementations were not changed to stop passing 1
to their respective macros. All callers were changed to pass 'true' to
__sve_save_state() and __sve_restore_state(). See commit:

  45f4ea9bcfe9 ("KVM: arm64: Fix prototype for __sve_save_state/__sve_restore_state")

This is all benign, but clearly unintentional, and it gets in the way of
cleaning up the FPSIMD/SVE/SME code. Remove the unnecessary overriding.
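
To make the problem concrete: under the AArch64 procedure call standard
the third integer argument arrives in x2, so a 'mov x2, #1' at function
entry discards whatever the caller passed. A minimal C analogue of the
before and after behaviour might look like the following. The demo_*
functions are hypothetical and simply return the flag that would reach
the save/restore macro.

```c
#include <stdbool.h>

/* Before: the argument is overwritten, mirroring 'mov x2, #1'. */
static bool demo_ffr_flag_before(bool save_ffr)
{
	save_ffr = true;	/* caller's choice is silently discarded */
	return save_ffr;
}

/* After: the caller's argument is passed through unchanged. */
static bool demo_ffr_flag_after(bool save_ffr)
{
	return save_ffr;
}
```

Since all callers pass 'true', both forms behave identically today; the
difference only shows up for a caller passing 'false'.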

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/kvm/hyp/fpsimd.S | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/arm64/kvm/hyp/fpsimd.S b/arch/arm64/kvm/hyp/fpsimd.S
index e950875e31cee..6e16cbfc5df27 100644
--- a/arch/arm64/kvm/hyp/fpsimd.S
+++ b/arch/arm64/kvm/hyp/fpsimd.S
@@ -21,13 +21,11 @@ SYM_FUNC_START(__fpsimd_restore_state)
 SYM_FUNC_END(__fpsimd_restore_state)
 
 SYM_FUNC_START(__sve_restore_state)
-	mov	x2, #1
 	sve_load 0, x1, x2, 3
 	ret
 SYM_FUNC_END(__sve_restore_state)
 
 SYM_FUNC_START(__sve_save_state)
-	mov	x2, #1
 	sve_save 0, x1, x2, 3
 	ret
 SYM_FUNC_END(__sve_save_state)
-- 
2.30.2




* [PATCH 03/18] KVM: arm64: pkvm: Save host FPMR in host cpu context
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
  2026-05-21 13:25 ` [PATCH 01/18] KVM: arm64: Don't include <asm/fpsimdmacros.h> Mark Rutland
  2026-05-21 13:25 ` [PATCH 02/18] KVM: arm64: Don't override FFR save/restore argument Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-27 10:29   ` Vladimir Murzin
  2026-05-21 13:25 ` [PATCH 04/18] KVM: arm64: pkvm: Remove struct cpu_sve_state Mark Rutland
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

Protected KVM stores most of the host's system register state in
kvm_host_data::host_ctxt, which is an instance of struct
kvm_cpu_context. As kvm_cpu_context::sys_regs[] has a slot for FPMR, we
can store the host's FPMR there.

Do so, and remove kvm_host_data::fpmr.
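
The shape of this change can be sketched outside the kernel as follows.
This is only an illustration: the demo_* types, the enum, and the
accessor macro are hypothetical stand-ins loosely modelled on
kvm_cpu_context::sys_regs[] and ctxt_sys_reg(), not the real
definitions.

```c
#include <stdint.h>

/* Hypothetical register indices; the real enum has many more entries. */
enum demo_sysreg { DEMO_ZCR_EL1, DEMO_FPMR, DEMO_NR_SYS_REGS };

struct demo_cpu_context {
	uint64_t sys_regs[DEMO_NR_SYS_REGS];
};

struct demo_host_data {
	struct demo_cpu_context host_ctxt;
	/* No separate 'fpmr' member: the value lives in host_ctxt. */
};

/* Lvalue accessor for a named slot in the context's register array. */
#define demo_ctxt_sys_reg(ctxt, reg)	((ctxt)->sys_regs[DEMO_##reg])

/* Save a value read from hardware into the context's FPMR slot. */
static void demo_save_fpmr(struct demo_host_data *hd, uint64_t hw_fpmr)
{
	demo_ctxt_sys_reg(&hd->host_ctxt, FPMR) = hw_fpmr;
}

/* Fetch the saved value, e.g. to write it back to hardware. */
static uint64_t demo_saved_fpmr(const struct demo_host_data *hd)
{
	return demo_ctxt_sys_reg(&hd->host_ctxt, FPMR);
}
```

Keeping the value in the shared register array means the host's FPMR is
found the same way as the rest of the host's system register state.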

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h       | 3 ---
 arch/arm64/kvm/hyp/include/hyp/switch.h | 6 ++++--
 arch/arm64/kvm/hyp/nvhe/hyp-main.c      | 5 +++--
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 65eead8362e0b..42b1c4764a4bf 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -775,9 +775,6 @@ struct kvm_host_data {
 	 */
 	struct cpu_sve_state *sve_state;
 
-	/* Used by pKVM only. */
-	u64	fpmr;
-
 	/* Ownership of the FP regs */
 	enum {
 		FP_STATE_FREE,
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 98b2976837b11..cc4d011a2b380 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -554,6 +554,8 @@ static inline void fpsimd_lazy_switch_to_host(struct kvm_vcpu *vcpu)
 
 static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)
 {
+	struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
+
 	/*
 	 * Non-protected kvm relies on the host restoring its sve state.
 	 * Protected kvm restores the host's sve state as not to reveal that
@@ -562,11 +564,11 @@ static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)
 	if (system_supports_sve()) {
 		__hyp_sve_save_host();
 	} else {
-		__fpsimd_save_state(host_data_ptr(host_ctxt.fp_regs));
+		__fpsimd_save_state(&hctxt->fp_regs);
 	}
 
 	if (kvm_has_fpmr(kern_hyp_va(vcpu->kvm)))
-		*host_data_ptr(fpmr) = read_sysreg_s(SYS_FPMR);
+		ctxt_sys_reg(hctxt, FPMR) = read_sysreg_s(SYS_FPMR);
 }
 
 
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 06db299c37a89..db60f770060e5 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -66,6 +66,7 @@ static void fpsimd_sve_flush(void)
 
 static void fpsimd_sve_sync(struct kvm_vcpu *vcpu)
 {
+	struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
 	bool has_fpmr;
 
 	if (!guest_owns_fp_regs())
@@ -89,10 +90,10 @@ static void fpsimd_sve_sync(struct kvm_vcpu *vcpu)
 	if (system_supports_sve())
 		__hyp_sve_restore_host();
 	else
-		__fpsimd_restore_state(host_data_ptr(host_ctxt.fp_regs));
+		__fpsimd_restore_state(&hctxt->fp_regs);
 
 	if (has_fpmr)
-		write_sysreg_s(*host_data_ptr(fpmr), SYS_FPMR);
+		write_sysreg_s(ctxt_sys_reg(hctxt, FPMR), SYS_FPMR);
 
 	*host_data_ptr(fp_owner) = FP_STATE_HOST_OWNED;
 }
-- 
2.30.2




* [PATCH 04/18] KVM: arm64: pkvm: Remove struct cpu_sve_state
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
                   ` (2 preceding siblings ...)
  2026-05-21 13:25 ` [PATCH 03/18] KVM: arm64: pkvm: Save host FPMR in host cpu context Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-27 11:58   ` Vladimir Murzin
  2026-05-21 13:25 ` [PATCH 05/18] arm64: fpsimd: Fold sve_init_regs() into do_sve_acc() Mark Rutland
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

There's no need for struct cpu_sve_state. Code would be simpler and more
robust without it, and removing it will simplify further cleanups (e.g.
adding an opaque type for the SVE register state).

Protected KVM stores most of the host's system register state in
kvm_host_data::host_ctxt, which is an instance of struct
kvm_cpu_context. As kvm_cpu_context::sys_regs[] has a slot for ZCR_EL1,
we can store the host's ZCR_EL1 there.

While kvm_cpu_context::sys_regs doesn't have slots for FPSR and FPCR,
these are usually expected to be stored in struct user_fpsimd_state.
For historical reasons, __sve_save_state() and __sve_restore_state()
expect a pointer to fpsr *within* struct user_fpsimd_state, assuming the
fpcr will immediately follow, as per the order within struct
user_fpsimd_state. We currently match this ordering in struct
cpu_sve_state, but it would be simpler and more robust to use struct
user_fpsimd_state directly.

After moving ZCR_EL1, FPSR, and FPCR out of struct cpu_sve_state, all
that's left is sve_regs, which can be represented as a pointer without
need for a container struct. This is kept as a pointer to u8 (matching
the array type), as this permits the compiler to catch unbalanced
referencing/dereferencing, which is not possible for pointers to void.

Apply the above changes, and remove cpu_sve_state.
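
The ordering assumption described above can be illustrated with a small
stand-alone sketch. The struct below is a hypothetical stand-in that
mirrors the field order of struct user_fpsimd_state (using a byte array
in place of the 128-bit vector type), and demo_fpcr_from_fpsr() shows
what a callee effectively relies on when it is handed only a pointer to
fpsr.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in mirroring the field order of struct user_fpsimd_state. */
struct demo_fpsimd_state {
	unsigned char vregs[32][16];	/* 32 x 128-bit V registers */
	uint32_t fpsr;
	uint32_t fpcr;
	uint32_t reserved[2];
};

/* The "fpcr immediately follows fpsr" assumption, checked at build time. */
_Static_assert(offsetof(struct demo_fpsimd_state, fpcr) ==
	       offsetof(struct demo_fpsimd_state, fpsr) + sizeof(uint32_t),
	       "fpcr must immediately follow fpsr");

/* What a callee given only '&state->fpsr' effectively computes. */
static uint32_t *demo_fpcr_from_fpsr(uint32_t *pfpsr)
{
	return pfpsr + 1;
}
```

Using the existing structure directly means only one definition has to
keep this ordering, rather than two that must stay in sync.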

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h       | 18 ++----------------
 arch/arm64/include/asm/kvm_pkvm.h       |  3 +--
 arch/arm64/kvm/arm.c                    | 16 ++++++++--------
 arch/arm64/kvm/hyp/include/hyp/switch.h |  9 +++++----
 arch/arm64/kvm/hyp/nvhe/hyp-main.c      |  9 +++++----
 arch/arm64/kvm/hyp/nvhe/setup.c         |  4 ++--
 6 files changed, 23 insertions(+), 36 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 42b1c4764a4bf..ae24617380b8f 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -732,20 +732,6 @@ struct kvm_cpu_context {
 	u64 *vncr_array;
 };
 
-struct cpu_sve_state {
-	__u64 zcr_el1;
-
-	/*
-	 * Ordering is important since __sve_save_state/__sve_restore_state
-	 * relies on it.
-	 */
-	__u32 fpsr;
-	__u32 fpcr;
-
-	/* Must be SVE_VQ_BYTES (128 bit) aligned. */
-	__u8 sve_regs[];
-};
-
 /*
  * This structure is instantiated on a per-CPU basis, and contains
  * data that is:
@@ -771,9 +757,9 @@ struct kvm_host_data {
 
 	/*
 	 * Hyp VA.
-	 * sve_state is only used in pKVM and if system_supports_sve().
+	 * sve_regs is only used in pKVM and if system_supports_sve().
 	 */
-	struct cpu_sve_state *sve_state;
+	u8	*sve_regs;
 
 	/* Ownership of the FP regs */
 	enum {
diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h
index 2954b311128c7..74fedd9c5ff02 100644
--- a/arch/arm64/include/asm/kvm_pkvm.h
+++ b/arch/arm64/include/asm/kvm_pkvm.h
@@ -188,8 +188,7 @@ static inline size_t pkvm_host_sve_state_size(void)
 	if (!system_supports_sve())
 		return 0;
 
-	return size_add(sizeof(struct cpu_sve_state),
-			SVE_SIG_REGS_SIZE(sve_vq_from_vl(kvm_host_sve_max_vl)));
+	return SVE_SIG_REGS_SIZE(sve_vq_from_vl(kvm_host_sve_max_vl));
 }
 
 struct pkvm_mapping {
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 8bb2c7422cc8b..f9fc85a0344e1 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -2499,10 +2499,10 @@ static void __init teardown_hyp_mode(void)
 			continue;
 
 		if (free_sve) {
-			struct cpu_sve_state *sve_state;
+			u8 *sve_regs;
 
-			sve_state = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state;
-			free_pages((unsigned long) sve_state, pkvm_host_sve_state_order());
+			sve_regs = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs;
+			free_pages((unsigned long) sve_regs, pkvm_host_sve_state_order());
 		}
 
 		free_pages(kvm_nvhe_sym(kvm_arm_hyp_percpu_base)[cpu], nvhe_percpu_order());
@@ -2627,7 +2627,7 @@ static int init_pkvm_host_sve_state(void)
 		if (!page)
 			return -ENOMEM;
 
-		per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state = page_address(page);
+		per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs = page_address(page);
 	}
 
 	/*
@@ -2648,11 +2648,11 @@ static void finalize_init_hyp_mode(void)
 
 	if (system_supports_sve() && is_protected_kvm_enabled()) {
 		for_each_possible_cpu(cpu) {
-			struct cpu_sve_state *sve_state;
+			u8 *sve_regs;
 
-			sve_state = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state;
-			per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state =
-				kern_hyp_va(sve_state);
+			sve_regs = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs;
+			per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs =
+				kern_hyp_va(sve_regs);
 		}
 	}
 }
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index cc4d011a2b380..6512dd3f75ae4 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -484,12 +484,13 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
 
 static inline void __hyp_sve_save_host(void)
 {
-	struct cpu_sve_state *sve_state = *host_data_ptr(sve_state);
+	struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
+	u8 *sve_regs = *host_data_ptr(sve_regs);
 
-	sve_state->zcr_el1 = read_sysreg_el1(SYS_ZCR);
+	ctxt_sys_reg(hctxt, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);
 	write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
-	__sve_save_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
-			 &sve_state->fpsr,
+	__sve_save_state(sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
+			 &hctxt->fp_regs.fpsr,
 			 true);
 }
 
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index db60f770060e5..04a6d2e0ea73f 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -41,7 +41,8 @@ static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu)
 
 static void __hyp_sve_restore_host(void)
 {
-	struct cpu_sve_state *sve_state = *host_data_ptr(sve_state);
+	struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
+	u8 *sve_regs = *host_data_ptr(sve_regs);
 
 	/*
 	 * On saving/restoring host sve state, always use the maximum VL for
@@ -53,10 +54,10 @@ static void __hyp_sve_restore_host(void)
 	 * need to be revisited.
 	 */
 	write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
-	__sve_restore_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
-			    &sve_state->fpsr,
+	__sve_restore_state(sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
+			    &hctxt->fp_regs.fpsr,
 			    true);
-	write_sysreg_el1(sve_state->zcr_el1, SYS_ZCR);
+	write_sysreg_el1(ctxt_sys_reg(hctxt, ZCR_EL1), SYS_ZCR);
 }
 
 static void fpsimd_sve_flush(void)
diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index d461981616d90..cdaf53c833409 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -82,9 +82,9 @@ static int pkvm_create_host_sve_mappings(void)
 
 	for (i = 0; i < hyp_nr_cpus; i++) {
 		struct kvm_host_data *host_data = per_cpu_ptr(&kvm_host_data, i);
-		struct cpu_sve_state *sve_state = host_data->sve_state;
+		u8 *sve_regs = host_data->sve_regs;
 
-		start = kern_hyp_va(sve_state);
+		start = kern_hyp_va(sve_regs);
 		end = start + PAGE_ALIGN(pkvm_host_sve_state_size());
 		ret = pkvm_create_mappings(start, end, PAGE_HYP);
 		if (ret)
-- 
2.30.2




* [PATCH 05/18] arm64: fpsimd: Fold sve_init_regs() into do_sve_acc()
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
                   ` (3 preceding siblings ...)
  2026-05-21 13:25 ` [PATCH 04/18] KVM: arm64: pkvm: Remove struct cpu_sve_state Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-26 15:28   ` Mark Brown
  2026-05-27 12:05   ` Vladimir Murzin
  2026-05-21 13:25 ` [PATCH 06/18] arm64: fpsimd: Remove sve_set_vq() and sme_set_vq() Mark Rutland
                   ` (13 subsequent siblings)
  18 siblings, 2 replies; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

For historical reasons, do_sve_acc() is structurally different from
do_sme_acc(), and the logic to convert the task from FPSIMD to SVE is
out-of-line in sve_init_regs(). We only use sve_init_regs() within
do_sve_acc(), so it's not necessary for this to be a separate function.

Fold sve_init_regs() into do_sve_acc(), and simplify the associated
comments. This makes do_sve_acc() structurally similar to do_sme_acc(),
making it easier to see similarities and differences.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/kernel/fpsimd.c | 48 ++++++++++++++------------------------
 1 file changed, 17 insertions(+), 31 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 60a45d600b460..a8395cb303344 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1293,31 +1293,6 @@ void sme_suspend_exit(void)
 
 #endif /* CONFIG_ARM64_SME */
 
-static void sve_init_regs(void)
-{
-	/*
-	 * Convert the FPSIMD state to SVE, zeroing all the state that
-	 * is not shared with FPSIMD. If (as is likely) the current
-	 * state is live in the registers then do this there and
-	 * update our metadata for the current task including
-	 * disabling the trap, otherwise update our in-memory copy.
-	 * We are guaranteed to not be in streaming mode, we can only
-	 * take a SVE trap when not in streaming mode and we can't be
-	 * in streaming mode when taking a SME trap.
-	 */
-	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
-		unsigned long vq_minus_one =
-			sve_vq_from_vl(task_get_sve_vl(current)) - 1;
-		sve_set_vq(vq_minus_one);
-		sve_flush_live(true, vq_minus_one);
-		fpsimd_bind_task_to_cpu();
-	} else {
-		fpsimd_to_sve(current);
-		current->thread.fp_type = FP_STATE_SVE;
-		fpsimd_flush_task_state(current);
-	}
-}
-
 /*
  * Trapped SVE access
  *
@@ -1349,13 +1324,24 @@ void do_sve_acc(unsigned long esr, struct pt_regs *regs)
 		WARN_ON(1); /* SVE access shouldn't have trapped */
 
 	/*
-	 * Even if the task can have used streaming mode we can only
-	 * generate SVE access traps in normal SVE mode and
-	 * transitioning out of streaming mode may discard any
-	 * streaming mode state.  Always clear the high bits to avoid
-	 * any potential errors tracking what is properly initialised.
+	 * Convert the FPSIMD state to SVE. Stale SVE state can be present in
+	 * registers or memory, so we must zero all state that is not shared
+	 * with FPSIMD.
+	 *
+	 * SVE traps cannot be taken from streaming mode, so there cannot be
+	 * any effective streaming mode SVE state.
 	 */
-	sve_init_regs();
+	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
+		unsigned long vq_minus_one =
+			sve_vq_from_vl(task_get_sve_vl(current)) - 1;
+		sve_set_vq(vq_minus_one);
+		sve_flush_live(true, vq_minus_one);
+		fpsimd_bind_task_to_cpu();
+	} else {
+		fpsimd_to_sve(current);
+		current->thread.fp_type = FP_STATE_SVE;
+		fpsimd_flush_task_state(current);
+	}
 
 	put_cpu_fpsimd_context();
 }
-- 
2.30.2




* [PATCH 06/18] arm64: fpsimd: Remove sve_set_vq() and sme_set_vq()
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
                   ` (4 preceding siblings ...)
  2026-05-21 13:25 ` [PATCH 05/18] arm64: fpsimd: Fold sve_init_regs() into do_sve_acc() Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-26 15:42   ` Mark Brown
  2026-05-21 13:25 ` [PATCH 07/18] arm64: fpsimd: Use assembler for SVE instructions Mark Rutland
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

The sve_set_vq() and sme_set_vq() assembly functions (and the
sve_load_vq and sme_load_vq macros they use) are open-coded forms of
sysreg_clear_set*(). There's no need for these to be implemented
out-of-line in assembly, and the 'vq_minus_1' argument is unusual and
confusing.

Use sysreg_clear_set_s() directly, where the necessary 'vq - 1' encoding
is more obviously part of encoding the register value.

For now, sve_flush_live() is left with the unusual vq_minus_1 argument.
This will be addressed in subsequent patches.

There should be no functional change as a result of this patch.
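
For readers unfamiliar with the pattern, here is a minimal user-space
model of a clear/set update of a LEN field from a vector length. It is
a sketch under stated assumptions, not the kernel implementation: the
demo_* names are hypothetical, the register is modelled as a plain
struct, and the 4-bit mask is chosen for illustration only. Like the
removed macros, it skips the write when the value would not change.

```c
#include <stdint.h>

#define DEMO_LEN_MASK	0xfULL	/* assumed field mask, for illustration only */
#define DEMO_VQ_BYTES	16	/* one VQ is 128 bits */

/* Models a system register and counts how often it is written. */
struct demo_sysreg {
	uint64_t val;
	unsigned int writes;
};

static uint64_t demo_vq_from_vl(uint64_t vl)
{
	return vl / DEMO_VQ_BYTES;
}

/* Read-modify-write that skips the write when nothing changes. */
static void demo_clear_set(struct demo_sysreg *reg, uint64_t clear,
			   uint64_t set)
{
	uint64_t updated = (reg->val & ~clear) | set;

	if (updated != reg->val) {
		reg->val = updated;
		reg->writes++;
	}
}

/* The field holds 'vq - 1', so the encoding is visible at the call site. */
static void demo_set_vl(struct demo_sysreg *reg, uint64_t vl)
{
	uint64_t vq = demo_vq_from_vl(vl);

	demo_clear_set(reg, DEMO_LEN_MASK, vq - 1);
}
```

Writing 'vq - 1' where the register value is composed keeps the
off-by-one encoding next to the register it belongs to, instead of
hiding it in an argument name.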

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/fpsimd.h       |  2 --
 arch/arm64/include/asm/fpsimdmacros.h | 22 ----------------------
 arch/arm64/kernel/entry-fpsimd.S      | 10 ----------
 arch/arm64/kernel/fpsimd.c            | 24 +++++++++++++-----------
 4 files changed, 13 insertions(+), 45 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index d9d00b45ab115..8efa3c0402a7a 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -146,8 +146,6 @@ extern void sve_load_state(void const *state, u32 const *pfpsr,
 			   int restore_ffr);
 extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
 extern unsigned int sve_get_vl(void);
-extern void sve_set_vq(unsigned long vq_minus_1);
-extern void sme_set_vq(unsigned long vq_minus_1);
 extern void sme_save_state(void *state, int zt);
 extern void sme_load_state(void const *state, int zt);
 
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index cda81d009c9bd..adf33d2da40c3 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -265,28 +265,6 @@
 	.purgem _for__body
 .endm
 
-/* Update ZCR_EL1.LEN with the new VQ */
-.macro sve_load_vq xvqminus1, xtmp, xtmp2
-		mrs_s		\xtmp, SYS_ZCR_EL1
-		bic		\xtmp2, \xtmp, ZCR_ELx_LEN_MASK
-		orr		\xtmp2, \xtmp2, \xvqminus1
-		cmp		\xtmp2, \xtmp
-		b.eq		921f
-		msr_s		SYS_ZCR_EL1, \xtmp2	//self-synchronising
-921:
-.endm
-
-/* Update SMCR_EL1.LEN with the new VQ */
-.macro sme_load_vq xvqminus1, xtmp, xtmp2
-		mrs_s		\xtmp, SYS_SMCR_EL1
-		bic		\xtmp2, \xtmp, SMCR_ELx_LEN_MASK
-		orr		\xtmp2, \xtmp2, \xvqminus1
-		cmp		\xtmp2, \xtmp
-		b.eq		921f
-		msr_s		SYS_SMCR_EL1, \xtmp2	//self-synchronising
-921:
-.endm
-
 /* Preserve the first 128-bits of Znz and zero the rest. */
 .macro _sve_flush_z nz
 	_sve_check_zreg \nz
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 6325db1a2179c..88c555745b584 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -62,11 +62,6 @@ SYM_FUNC_START(sve_get_vl)
 	ret
 SYM_FUNC_END(sve_get_vl)
 
-SYM_FUNC_START(sve_set_vq)
-	sve_load_vq x0, x1, x2
-	ret
-SYM_FUNC_END(sve_set_vq)
-
 /*
  * Zero all SVE registers but the first 128-bits of each vector
  *
@@ -94,11 +89,6 @@ SYM_FUNC_START(sme_get_vl)
 	ret
 SYM_FUNC_END(sme_get_vl)
 
-SYM_FUNC_START(sme_set_vq)
-	sme_load_vq x0, x1, x2
-	ret
-SYM_FUNC_END(sme_set_vq)
-
 /*
  * Save the ZA and ZT state
  *
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index a8395cb303344..2578c2372c89e 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -377,8 +377,10 @@ static void task_fpsimd_load(void)
 			if (!thread_sm_enabled(&current->thread))
 				WARN_ON_ONCE(!test_and_set_thread_flag(TIF_SVE));
 
-			if (test_thread_flag(TIF_SVE))
-				sve_set_vq(sve_vq_from_vl(task_get_sve_vl(current)) - 1);
+			if (test_thread_flag(TIF_SVE)) {
+				unsigned long vq = sve_vq_from_vl(task_get_sve_vl(current));
+				sysreg_clear_set_s(SYS_ZCR_EL1, ZCR_ELx_LEN, vq - 1);
+			}
 
 			restore_sve_regs = true;
 			restore_ffr = true;
@@ -403,8 +405,10 @@ static void task_fpsimd_load(void)
 		unsigned long sme_vl = task_get_sme_vl(current);
 
 		/* Ensure VL is set up for restoring data */
-		if (test_thread_flag(TIF_SME))
-			sme_set_vq(sve_vq_from_vl(sme_vl) - 1);
+		if (test_thread_flag(TIF_SME)) {
+			unsigned long vq = sve_vq_from_vl(sme_vl);
+			sysreg_clear_set_s(SYS_SMCR_EL1, SMCR_ELx_LEN, vq - 1);
+		}
 
 		write_sysreg_s(current->thread.svcr, SYS_SVCR);
 
@@ -1332,10 +1336,9 @@ void do_sve_acc(unsigned long esr, struct pt_regs *regs)
 	 * any effective streaming mode SVE state.
 	 */
 	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
-		unsigned long vq_minus_one =
-			sve_vq_from_vl(task_get_sve_vl(current)) - 1;
-		sve_set_vq(vq_minus_one);
-		sve_flush_live(true, vq_minus_one);
+		unsigned long vq = sve_vq_from_vl(task_get_sve_vl(current));
+		sysreg_clear_set_s(SYS_ZCR_EL1, ZCR_ELx_LEN, vq - 1);
+		sve_flush_live(true, vq - 1);
 		fpsimd_bind_task_to_cpu();
 	} else {
 		fpsimd_to_sve(current);
@@ -1465,9 +1468,8 @@ void do_sme_acc(unsigned long esr, struct pt_regs *regs)
 		WARN_ON(1);
 
 	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
-		unsigned long vq_minus_one =
-			sve_vq_from_vl(task_get_sme_vl(current)) - 1;
-		sme_set_vq(vq_minus_one);
+		unsigned long vq = sve_vq_from_vl(task_get_sme_vl(current));
+		sysreg_clear_set_s(SYS_SMCR_EL1, SMCR_ELx_LEN, vq - 1);
 
 		fpsimd_bind_task_to_cpu();
 	} else {
-- 
2.30.2




* [PATCH 07/18] arm64: fpsimd: Use assembler for SVE instructions
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
                   ` (5 preceding siblings ...)
  2026-05-21 13:25 ` [PATCH 06/18] arm64: fpsimd: Remove sve_set_vq() and sme_set_vq() Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-26 15:43   ` Mark Brown
  2026-05-21 13:25 ` [PATCH 08/18] arm64: fpsimd: Use assembler for baseline SME instructions Mark Rutland
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

Historically we supported assemblers which could not assemble SVE
instructions. We dropped support for such assemblers in commit:

  118c40b7b503 ("kbuild: require gcc-8 and binutils-2.30")

Since that commit, all supported assemblers (binutils and LLVM) are
capable of assembling SVE instructions, and there's no need for us to
manually encode SVE instructions.

Rely on the assembler to encode SVE instructions, and remove the manual
encoding. The various _sve_<insn> macros are kept for now, and will be
cleaned up in subsequent patches.

There should be no functional change as a result of this patch.
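
As a rough illustration only (not part of this patch), the manual encoding
removed for STR (vector) can be modelled in C. The helper name
sve_str_v_encoding() is hypothetical, and the x0..x30 range check is an
assumption about what _check_general_reg accepts; the constant and bit
positions are taken from the removed .inst expression.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the word that the removed _sve_str_v macro
 * emitted via .inst for "STR Z<nz>, [X<nxbase>, #<offset>, MUL VL]".
 */
static uint32_t sve_str_v_encoding(unsigned int nz, unsigned int nxbase,
				   int offset)
{
	/* Two's complement view of the signed 9-bit immediate */
	uint32_t imm = (uint32_t)offset;

	/* Range checks mirroring _sve_check_zreg and _check_num */
	assert(nz <= 31);
	assert(nxbase <= 30);	/* assumed range of _check_general_reg */
	assert(offset >= -0x100 && offset <= 0xff);

	return 0xe5804000u
		| nz
		| (nxbase << 5)
		| ((imm & 7) << 10)		/* low 3 bits of the immediate */
		| ((imm & 0x1f8) << 13);	/* high 6 bits of the immediate */
}
```

With the assembler doing this work, the split immediate and the range
checks no longer need to be maintained by hand.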

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/fpsimdmacros.h | 64 +++++++--------------------
 1 file changed, 16 insertions(+), 48 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index adf33d2da40c3..1122eea6daacf 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -99,85 +99,53 @@
 	.endif
 .endm
 
-/* SVE instruction encodings for non-SVE-capable assemblers */
-/* (pre binutils 2.28, all kernel capable clang versions support SVE) */
+/* Deprecated macros for SVE instructions */
 
 /* STR (vector): STR Z\nz, [X\nxbase, #\offset, MUL VL] */
 .macro _sve_str_v nz, nxbase, offset=0
-	_sve_check_zreg \nz
-	_check_general_reg \nxbase
-	_check_num (\offset), -0x100, 0xff
-	.inst	0xe5804000			\
-		| (\nz)				\
-		| ((\nxbase) << 5)		\
-		| (((\offset) & 7) << 10)	\
-		| (((\offset) & 0x1f8) << 13)
+	.arch_extension sve
+	str	z\nz, [X\nxbase, #\offset, MUL VL]
 .endm
 
 /* LDR (vector): LDR Z\nz, [X\nxbase, #\offset, MUL VL] */
 .macro _sve_ldr_v nz, nxbase, offset=0
-	_sve_check_zreg \nz
-	_check_general_reg \nxbase
-	_check_num (\offset), -0x100, 0xff
-	.inst	0x85804000			\
-		| (\nz)				\
-		| ((\nxbase) << 5)		\
-		| (((\offset) & 7) << 10)	\
-		| (((\offset) & 0x1f8) << 13)
+	.arch_extension sve
+	ldr	z\nz, [X\nxbase, #\offset, MUL VL]
 .endm
 
 /* STR (predicate): STR P\np, [X\nxbase, #\offset, MUL VL] */
 .macro _sve_str_p np, nxbase, offset=0
-	_sve_check_preg \np
-	_check_general_reg \nxbase
-	_check_num (\offset), -0x100, 0xff
-	.inst	0xe5800000			\
-		| (\np)				\
-		| ((\nxbase) << 5)		\
-		| (((\offset) & 7) << 10)	\
-		| (((\offset) & 0x1f8) << 13)
+	.arch_extension sve
+	str	p\np, [X\nxbase, #\offset, MUL VL]
 .endm
 
 /* LDR (predicate): LDR P\np, [X\nxbase, #\offset, MUL VL] */
 .macro _sve_ldr_p np, nxbase, offset=0
-	_sve_check_preg \np
-	_check_general_reg \nxbase
-	_check_num (\offset), -0x100, 0xff
-	.inst	0x85800000			\
-		| (\np)				\
-		| ((\nxbase) << 5)		\
-		| (((\offset) & 7) << 10)	\
-		| (((\offset) & 0x1f8) << 13)
+	.arch_extension sve
+	ldr p\np, [x\nxbase, #\offset, MUL VL]
 .endm
 
 /* RDVL X\nx, #\imm */
 .macro _sve_rdvl nx, imm
-	_check_general_reg \nx
-	_check_num (\imm), -0x20, 0x1f
-	.inst	0x04bf5000			\
-		| (\nx)				\
-		| (((\imm) & 0x3f) << 5)
+	.arch_extension sve
+	rdvl x\nx, #\imm
 .endm
 
 /* RDFFR (unpredicated): RDFFR P\np.B */
 .macro _sve_rdffr np
-	_sve_check_preg \np
-	.inst	0x2519f000			\
-		| (\np)
+	.arch_extension sve
+	rdffr p\np\().b
 .endm
 
 /* WRFFR P\np.B */
 .macro _sve_wrffr np
-	_sve_check_preg \np
-	.inst	0x25289000			\
-		| ((\np) << 5)
+	wrffr p\np\().b
 .endm
 
 /* PFALSE P\np.B */
 .macro _sve_pfalse np
-	_sve_check_preg \np
-	.inst	0x2518e400			\
-		| (\np)
+	.arch_extension sve
+	pfalse	p\np\().b
 .endm
 
 /* SME instruction encodings for non-SME-capable assemblers */
-- 
2.30.2




* [PATCH 08/18] arm64: fpsimd: Use assembler for baseline SME instructions
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
                   ` (6 preceding siblings ...)
  2026-05-21 13:25 ` [PATCH 07/18] arm64: fpsimd: Use assembler for SVE instructions Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-26 15:45   ` Mark Brown
  2026-05-21 13:25 ` [PATCH 09/18] arm64: fpsimd: Move sve_get_vl() and sme_get_vl() inline Mark Rutland
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

We currently support assemblers which do not support SME instructions,
and have macros to manually encode SME instructions. This was
necessary historically as SME support was developed before assembler
support was widely available, but things have changed:

* All currently supported versions of LLVM support baseline SME
  instructions. Building the kernel requires LLVM 15+, while LLVM 13+
  supports SME.

* GNU binutils has supported baseline SME instructions since 2.38, which
  was released on 09 February 2022. Toolchains using this or later are
  widely available. For example Debian 12 (released on 10 June 2023)
  provides binutils 2.40. Toolchains provided by kernel.org provide
  binutils 2.38+ since the GCC 12.1.0 release (released between 06 May
  2022 and 17 August 2022).

* For various reasons, SME support was marked as BROKEN, and re-enabled
  in v6.16 (released on 27 July 2025). The earliest supported LTS kernel
  with SME support is v6.18.y; v6.18 was tagged on 30 November 2025, and
  contemporary toolchains (GCC 15.2 and binutils 2.45) supported
  baseline SME instructions.

* Any distribution which intends to support SME will presumably have a
  toolchain that supports baseline SME instructions such that userspace
  can be built.

Considering the above, there's no practical benefit to allowing SME to
be built when the toolchain doesn't support baseline SME instructions.

Make CONFIG_ARM64_SME depend on assembler support for SME, and remove
the manual encoding of SME instructions. The various _sme_<insn> macros
are kept for now, and will be cleaned up in subsequent patches.

A couple of SME2 instructions require a more recent toolchain, and are
left as-is for now. I've looked through releases of binutils and LLVM to
find when support was added, and noted this in a comment.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/Kconfig                    |  5 ++++
 arch/arm64/include/asm/fpsimdmacros.h | 38 +++++++++++----------------
 2 files changed, 20 insertions(+), 23 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fe60738e5943b..378e50fef247a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2247,10 +2247,15 @@ config ARM64_SVE
 	  booting the kernel.  If unsure and you are not observing these
 	  symptoms, you should assume that it is safe to say Y.
 
+config AS_HAS_SME
+	# Supported by LLVM 13+ and binutils 2.38+
+	def_bool $(as-instr,.arch_extension sme)
+
 config ARM64_SME
 	bool "ARM Scalable Matrix Extension support"
 	default y
 	depends on ARM64_SVE
+	depends on AS_HAS_SME
 	help
 	  The Scalable Matrix Extension (SME) is an extension to the AArch64
 	  execution state which utilises a substantial subset of the SVE
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index 1122eea6daacf..d0bdbbf2d44ad 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -148,46 +148,38 @@
 	pfalse	p\np\().b
 .endm
 
-/* SME instruction encodings for non-SME-capable assemblers */
-/* (pre binutils 2.38/LLVM 13) */
+/* Deprecated macros for SME instructions */
 
 /* RDSVL X\nx, #\imm */
 .macro _sme_rdsvl nx, imm
-	_check_general_reg \nx
-	_check_num (\imm), -0x20, 0x1f
-	.inst	0x04bf5800			\
-		| (\nx)				\
-		| (((\imm) & 0x3f) << 5)
+	.arch_extension sme
+	rdsvl x\nx, #\imm
 .endm
 
 /*
  * STR (vector from ZA array):
- *	STR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL]
+ *	STR ZA[W\nw, #\offset], [X\nxbase, #\offset, MUL VL]
  */
 .macro _sme_str_zav nw, nxbase, offset=0
-	_sme_check_wv \nw
-	_check_general_reg \nxbase
-	_check_num (\offset), -0x100, 0xff
-	.inst	0xe1200000			\
-		| (((\nw) & 3) << 13)		\
-		| ((\nxbase) << 5)		\
-		| ((\offset) & 7)
+	.arch_extension sme
+	str	za[w\nw, #\offset], [x\nxbase, #\offset, MUL VL]
 .endm
 
 /*
  * LDR (vector to ZA array):
- *	LDR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL]
+ *	LDR ZA[w\nw, #\offset], [X\nxbase, #\offset, MUL VL]
  */
 .macro _sme_ldr_zav nw, nxbase, offset=0
-	_sme_check_wv \nw
-	_check_general_reg \nxbase
-	_check_num (\offset), -0x100, 0xff
-	.inst	0xe1000000			\
-		| (((\nw) & 3) << 13)		\
-		| ((\nxbase) << 5)		\
-		| ((\offset) & 7)
+	.arch_extension sme
+	ldr	za[w\nw, #\offset], [x\nxbase, #\offset, MUL VL]
 .endm
 
+/*
+ * SME2 instruction encodings for older assemblers.
+ * Supported by binutils 2.41+.
+ * Supported by LLVM 16+
+ */
+
 /*
  * LDR (ZT0)
  *
-- 
2.30.2




* [PATCH 09/18] arm64: fpsimd: Move sve_get_vl() and sme_get_vl() inline
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
                   ` (7 preceding siblings ...)
  2026-05-21 13:25 ` [PATCH 08/18] arm64: fpsimd: Use assembler for baseline SME instructions Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-26 15:47   ` Mark Brown
  2026-05-21 13:25 ` [PATCH 10/18] arm64: sysreg: Add FPCR and FPSR Mark Rutland
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

The sve_get_vl() and sme_get_vl() functions are wrappers for the RDVL
and RDSVL instructions respectively. There's no need for those to be
out-of-line.

Replace the out-of-line assembly functions with equivalent inline
functions.

The _sve_rdvl assembly macro is unused, and so it is removed. The
_sme_rdsvl assembly macro is still used elsewhere, and so is kept for
now.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/fpsimd.h       | 31 +++++++++++++++++++++++++--
 arch/arm64/include/asm/fpsimdmacros.h |  6 ------
 arch/arm64/kernel/entry-fpsimd.S      | 10 ---------
 3 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 8efa3c0402a7a..36cf528e64971 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -22,6 +22,9 @@
 #include <linux/stddef.h>
 #include <linux/types.h>
 
+#define __SVE_PREAMBLE		".arch_extension sve\n"
+#define __SME_PREAMBLE		".arch_extension sme\n"
+
 /* Masks for extracting the FPSR and FPCR from the FPSCR */
 #define VFP_FPSCR_STAT_MASK	0xf800009f
 #define VFP_FPSCR_CTRL_MASK	0x07f79f00
@@ -141,11 +144,23 @@ static inline void *thread_zt_state(struct thread_struct *thread)
 	return thread->sme_state + ZA_SIG_REGS_SIZE(sme_vq);
 }
 
+static inline unsigned int sve_get_vl(void)
+{
+	unsigned int vl;
+
+	asm volatile(
+	__SVE_PREAMBLE
+	"	rdvl %x[vl], #1\n"
+	: [vl] "=r" (vl)
+	);
+
+	return vl;
+}
+
 extern void sve_save_state(void *state, u32 *pfpsr, int save_ffr);
 extern void sve_load_state(void const *state, u32 const *pfpsr,
 			   int restore_ffr);
 extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
-extern unsigned int sve_get_vl(void);
 extern void sme_save_state(void *state, int zt);
 extern void sme_load_state(void const *state, int zt);
 
@@ -400,8 +415,20 @@ static inline int sme_max_virtualisable_vl(void)
 	return vec_max_virtualisable_vl(ARM64_VEC_SME);
 }
 
+static inline unsigned int sme_get_vl(void)
+{
+	unsigned int vl;
+
+	asm volatile(
+	__SME_PREAMBLE
+	"	rdsvl %x[vl], #1\n"
+	: [vl] "=r" (vl)
+	);
+
+	return vl;
+}
+
 extern void sme_alloc(struct task_struct *task, bool flush);
-extern unsigned int sme_get_vl(void);
 extern int sme_set_current_vl(unsigned long arg);
 extern int sme_get_current_vl(void);
 extern void sme_suspend_exit(void);
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index d0bdbbf2d44ad..d75c9d4c9989b 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -125,12 +125,6 @@
 	ldr p\np, [x\nxbase, #\offset, MUL VL]
 .endm
 
-/* RDVL X\nx, #\imm */
-.macro _sve_rdvl nx, imm
-	.arch_extension sve
-	rdvl x\nx, #\imm
-.endm
-
 /* RDFFR (unpredicated): RDFFR P\np.B */
 .macro _sve_rdffr np
 	.arch_extension sve
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 88c555745b584..7f2d31dff8c17 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -57,11 +57,6 @@ SYM_FUNC_START(sve_load_state)
 	ret
 SYM_FUNC_END(sve_load_state)
 
-SYM_FUNC_START(sve_get_vl)
-	_sve_rdvl	0, 1
-	ret
-SYM_FUNC_END(sve_get_vl)
-
 /*
  * Zero all SVE registers but the first 128-bits of each vector
  *
@@ -84,11 +79,6 @@ SYM_FUNC_END(sve_flush_live)
 
 #ifdef CONFIG_ARM64_SME
 
-SYM_FUNC_START(sme_get_vl)
-	_sme_rdsvl	0, 1
-	ret
-SYM_FUNC_END(sme_get_vl)
-
 /*
  * Save the ZA and ZT state
  *
-- 
2.30.2




* [PATCH 10/18] arm64: sysreg: Add FPCR and FPSR
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
                   ` (8 preceding siblings ...)
  2026-05-21 13:25 ` [PATCH 09/18] arm64: fpsimd: Move sve_get_vl() and sme_get_vl() inline Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-26 15:55   ` Mark Brown
  2026-05-21 13:25 ` [PATCH 11/18] arm64: fpsimd: Split FPSR/FPCR from SVE save/restore Mark Rutland
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

Add sysreg definitions for FPCR and FPSR.

Some versions of LLVM will refuse to assemble accesses to FPCR and FPSR
unless the "fp" arch extension is enabled, which we don't currently do
for read_sysreg() and write_sysreg(). In general, handling feature
dependencies would complicate read_sysreg() and write_sysreg(), and it's
simpler to use read_sysreg_s() and write_sysreg_s() instead, requiring
sysreg definitions.

The values used can be found in ARM ARM issue M.b:

  https://developer.arm.com/documentation/ddi0487/mb/

... in sections:

* C5.2.8 ("FPCR, Floating-point Control Register")
* C5.2.10 ("FPSR, Floating-point Status Register")
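
As a sketch of how the five numbers on each "Sysreg" line map to the
value consumed by read_sysreg_s() and write_sysreg_s(), the packing can
be modelled in C as below. The helper name sys_reg_sketch() is
hypothetical, and the shift values are assumed to match the sys_reg()
macro in <asm/sysreg.h>.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of packing (op0, op1, CRn, CRm, op2) into the
 * encoding used for generic system register accesses.
 */
static uint32_t sys_reg_sketch(uint32_t op0, uint32_t op1, uint32_t crn,
			       uint32_t crm, uint32_t op2)
{
	/* Each field must fit its width: 2, 3, 4, 4 and 3 bits */
	assert(op0 < 4 && op1 < 8 && crn < 16 && crm < 16 && op2 < 8);

	return (op0 << 19) | (op1 << 16) | (crn << 12) | (crm << 8) |
	       (op2 << 5);
}
```

Under those assumptions, FPCR (3, 3, 4, 4, 0) and FPSR (3, 3, 4, 4, 1)
differ only in the op2 field.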

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/tools/sysreg | 45 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 6c3ff14e561e6..fa155cd856a5b 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -3790,6 +3790,51 @@ Field	1	ZA
 Field	0	SM
 EndSysreg
 
+Sysreg	FPCR	3	3	4	4	0
+Res0	63:27
+Field	26	AHP
+Field	25	DN
+Field	24	FZ
+Enum	23:22	RMode
+	0b00	RN
+	0b01	RP
+	0b10	RM
+	0b11	RZ
+EndEnum
+Field	21:20	Stride
+Field	19	FZ16
+Field	18:16	Len
+Field	15	IDE
+Res0	14
+Field	13	EBF
+Field	12	IXE
+Field	11	UFE
+Field	10	OFE
+Field	9	DZE
+Field	8	IOE
+Res0	7:3
+Field	2	NEP
+Field	1	AH
+Field	0	FIZ
+EndSysreg
+
+Sysreg	FPSR	3	3	4	4	1
+Res0	63:32
+Field	31	N
+Field	30	Q
+Field	29	C
+Field	28	V
+Field	27	QC
+Res0	26:8
+Field	7	IDC
+Res0	6:5
+Field	4	IXC
+Field	3	UFC
+Field	2	OFC
+Field	1	DZC
+Field	0	IOC
+EndSysreg
+
 Sysreg	FPMR	3	3	4	4	2
 Res0	63:38
 Field	37:32	LSCALE2
-- 
2.30.2




* [PATCH 11/18] arm64: fpsimd: Split FPSR/FPCR from SVE save/restore
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
                   ` (9 preceding siblings ...)
  2026-05-21 13:25 ` [PATCH 10/18] arm64: sysreg: Add FPCR and FPSR Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-26 16:28   ` Mark Brown
  2026-05-21 13:25 ` [PATCH 12/18] arm64: fpsimd: Move fpsimd save/restore inline Mark Rutland
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

Regardless of whether the vector registers are saved in FPSIMD or SVE
format, we store FPSR and FPCR in user_fpsimd_state::{fpsr,fpcr}.

For historical reasons, the functions which save/restore SVE context
take a pointer to user_fpsimd_state::fpsr, and use this to access both
user_fpsimd_state::fpsr and user_fpsimd_state::fpcr. This is
unnecessarily fragile.

Move the save/restore of FPSR and FPCR into separate helper functions
which take a pointer to user_fpsimd_state. I've used read_sysreg_s() and
write_sysreg_s() as contemporary versions of LLVM will refuse to
directly assemble accesses to FPCR or FPSR unless the "fp" arch
extension is enabled.

Note that the SVE assembly sequence for restoring FPCR uses an
unconditional write to FPCR. The plain FPSIMD assembly sequence has used
a conditional write to FPCR since 2014 in commit:

  5959e25729a5 ("arm64: fpsimd: avoid restoring fpcr if the contents haven't change")

... but this was not followed for the SVE restore assembly implemented
in 2017 in commit:

  1fc5dce78ad1 ("arm64/sve: Low-level SVE architectural state manipulation functions")

... so I've assumed that this doesn't actually matter in practice, and
implemented the C version matching the existing SVE assembly.

For the moment, fpsimd_save_state() and fpsimd_load_state() are left
as-is with their own logic to save/restore FPSR and FPCR. This will be
unified in subsequent patches.
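
To illustrate why the old convention is fragile, here is a minimal
user-space sketch (not kernel code). The struct fpsimd_state_sketch is a
hypothetical stand-in that mirrors the field order of
user_fpsimd_state, and struct fp_regs_sketch is a hypothetical model of
the two registers. The old assembly reached FPCR as "[pfpsr, #4]", which
only works while the layout asserted below holds; the helpers name the
fields instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring the field order of user_fpsimd_state */
struct fpsimd_state_sketch {
	uint8_t  vregs[32][16];
	uint32_t fpsr;
	uint32_t fpcr;
	uint32_t reserved[2];
};

/* The layout the old "pointer to fpsr, plus 4" convention relied upon */
_Static_assert(offsetof(struct fpsimd_state_sketch, fpcr) ==
	       offsetof(struct fpsimd_state_sketch, fpsr) + 4,
	       "fpcr must immediately follow fpsr");

/* Hypothetical model of the FPSR and FPCR system registers */
struct fp_regs_sketch {
	uint32_t fpsr;
	uint32_t fpcr;
};

/* Save both registers through named fields of the state structure */
static void save_common_sketch(struct fpsimd_state_sketch *state,
			       const struct fp_regs_sketch *hw)
{
	state->fpsr = hw->fpsr;
	state->fpcr = hw->fpcr;
}

/* Restore both registers; the FPCR write is unconditional */
static void load_common_sketch(const struct fpsimd_state_sketch *state,
			       struct fp_regs_sketch *hw)
{
	hw->fpsr = state->fpsr;
	hw->fpcr = state->fpcr;
}
```

Taking a pointer to the whole structure means a layout change would be
handled by the compiler rather than silently breaking a hard-coded
offset.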

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/fpsimd.h         | 17 ++++++++++++++---
 arch/arm64/include/asm/fpsimdmacros.h   | 13 ++-----------
 arch/arm64/include/asm/kvm_hyp.h        |  4 ++--
 arch/arm64/kernel/entry-fpsimd.S        | 10 ++++------
 arch/arm64/kernel/fpsimd.c              |  5 +++--
 arch/arm64/kvm/hyp/fpsimd.S             |  4 ++--
 arch/arm64/kvm/hyp/include/hyp/switch.h |  4 ++--
 arch/arm64/kvm/hyp/nvhe/hyp-main.c      |  5 +++--
 8 files changed, 32 insertions(+), 30 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 36cf528e64971..6fd5cdf5e5f17 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -74,6 +74,18 @@ static inline void cpacr_restore(unsigned long cpacr)
 
 struct task_struct;
 
+static inline void fpsimd_save_common(struct user_fpsimd_state *state)
+{
+	state->fpsr = read_sysreg_s(SYS_FPSR);
+	state->fpcr = read_sysreg_s(SYS_FPCR);
+}
+
+static inline void fpsimd_load_common(const struct user_fpsimd_state *state)
+{
+	write_sysreg_s(state->fpsr, SYS_FPSR);
+	write_sysreg_s(state->fpcr, SYS_FPCR);
+}
+
 extern void fpsimd_save_state(struct user_fpsimd_state *state);
 extern void fpsimd_load_state(struct user_fpsimd_state *state);
 
@@ -157,9 +169,8 @@ static inline unsigned int sve_get_vl(void)
 	return vl;
 }
 
-extern void sve_save_state(void *state, u32 *pfpsr, int save_ffr);
-extern void sve_load_state(void const *state, u32 const *pfpsr,
-			   int restore_ffr);
+extern void sve_save_state(void *state, int save_ffr);
+extern void sve_load_state(void const *state, int restore_ffr);
 extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
 extern void sme_save_state(void *state, int zt);
 extern void sme_load_state(void const *state, int zt);
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index d75c9d4c9989b..c79ae7ec1ff05 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -235,7 +235,7 @@
 		_sve_wrffr	0
 .endm
 
-.macro sve_save nxbase, xpfpsr, save_ffr, nxtmp
+.macro sve_save nxbase, save_ffr
  _for n, 0, 31,	_sve_str_v	\n, \nxbase, \n - 34
  _for n, 0, 15,	_sve_str_p	\n, \nxbase, \n - 16
 		cbz		\save_ffr, 921f
@@ -246,24 +246,15 @@
 922:
 		_sve_str_p	0, \nxbase
 		_sve_ldr_p	0, \nxbase, -16
-		mrs		x\nxtmp, fpsr
-		str		w\nxtmp, [\xpfpsr]
-		mrs		x\nxtmp, fpcr
-		str		w\nxtmp, [\xpfpsr, #4]
 .endm
 
-.macro sve_load nxbase, xpfpsr, restore_ffr, nxtmp
+.macro sve_load nxbase, restore_ffr
  _for n, 0, 31,	_sve_ldr_v	\n, \nxbase, \n - 34
 		cbz		\restore_ffr, 921f
 		_sve_ldr_p	0, \nxbase
 		_sve_wrffr	0
 921:
  _for n, 0, 15,	_sve_ldr_p	\n, \nxbase, \n - 16
-
-		ldr		w\nxtmp, [\xpfpsr]
-		msr		fpsr, x\nxtmp
-		ldr		w\nxtmp, [\xpfpsr, #4]
-		msr		fpcr, x\nxtmp
 .endm
 
 .macro sme_save_za nxbase, xvl, nw
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 8d06b62e7188c..0030cc1b52197 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -123,8 +123,8 @@ void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu);
 
 void __fpsimd_save_state(struct user_fpsimd_state *fp_regs);
 void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs);
-void __sve_save_state(void *sve_pffr, u32 *fpsr, int save_ffr);
-void __sve_restore_state(void *sve_pffr, u32 *fpsr, int restore_ffr);
+void __sve_save_state(void *sve, int save_ffr);
+void __sve_restore_state(void *sve, int restore_ffr);
 
 u64 __guest_enter(struct kvm_vcpu *vcpu);
 
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 7f2d31dff8c17..83fe9c32bbd1c 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -37,11 +37,10 @@ SYM_FUNC_END(fpsimd_load_state)
  * Save the SVE state
  *
  * x0 - pointer to buffer for state
- * x1 - pointer to storage for FPSR
- * x2 - Save FFR if non-zero
+ * x1 - Save FFR if non-zero
  */
 SYM_FUNC_START(sve_save_state)
-	sve_save 0, x1, x2, 3
+	sve_save 0, x1
 	ret
 SYM_FUNC_END(sve_save_state)
 
@@ -49,11 +48,10 @@ SYM_FUNC_END(sve_save_state)
  * Load the SVE state
  *
  * x0 - pointer to buffer for state
- * x1 - pointer to storage for FPSR
- * x2 - Restore FFR if non-zero
+ * x1 - Restore FFR if non-zero
  */
 SYM_FUNC_START(sve_load_state)
-	sve_load 0, x1, x2, 4
+	sve_load 0, x1
 	ret
 SYM_FUNC_END(sve_load_state)
 
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 2578c2372c89e..9806fea8fea7c 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -426,8 +426,8 @@ static void task_fpsimd_load(void)
 	if (restore_sve_regs) {
 		WARN_ON_ONCE(current->thread.fp_type != FP_STATE_SVE);
 		sve_load_state(sve_pffr(&current->thread),
-			       &current->thread.uw.fpsimd_state.fpsr,
 			       restore_ffr);
+		fpsimd_load_common(&current->thread.uw.fpsimd_state);
 	} else {
 		WARN_ON_ONCE(current->thread.fp_type != FP_STATE_FPSIMD);
 		fpsimd_load_state(&current->thread.uw.fpsimd_state);
@@ -509,7 +509,8 @@ static void fpsimd_save_user_state(void)
 
 		sve_save_state((char *)last->sve_state +
 					sve_ffr_offset(vl),
-			       &last->st->fpsr, save_ffr);
+			       save_ffr);
+		fpsimd_save_common(last->st);
 		*last->fp_type = FP_STATE_SVE;
 	} else {
 		fpsimd_save_state(last->st);
diff --git a/arch/arm64/kvm/hyp/fpsimd.S b/arch/arm64/kvm/hyp/fpsimd.S
index 6e16cbfc5df27..8575e32977d19 100644
--- a/arch/arm64/kvm/hyp/fpsimd.S
+++ b/arch/arm64/kvm/hyp/fpsimd.S
@@ -21,11 +21,11 @@ SYM_FUNC_START(__fpsimd_restore_state)
 SYM_FUNC_END(__fpsimd_restore_state)
 
 SYM_FUNC_START(__sve_restore_state)
-	sve_load 0, x1, x2, 3
+	sve_load 0, x1
 	ret
 SYM_FUNC_END(__sve_restore_state)
 
 SYM_FUNC_START(__sve_save_state)
-	sve_save 0, x1, x2, 3
+	sve_save 0, x1
 	ret
 SYM_FUNC_END(__sve_save_state)
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 6512dd3f75ae4..eb76a863ebb84 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -468,8 +468,8 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
 	 */
 	sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
 	__sve_restore_state(vcpu_sve_pffr(vcpu),
-			    &vcpu->arch.ctxt.fp_regs.fpsr,
 			    true);
+	fpsimd_load_common(&vcpu->arch.ctxt.fp_regs);
 
 	/*
 	 * The effective VL for a VM could differ from the max VL when running a
@@ -490,8 +490,8 @@ static inline void __hyp_sve_save_host(void)
 	ctxt_sys_reg(hctxt, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);
 	write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
 	__sve_save_state(sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
-			 &hctxt->fp_regs.fpsr,
 			 true);
+	fpsimd_save_common(&hctxt->fp_regs);
 }
 
 static inline void fpsimd_lazy_switch_to_guest(struct kvm_vcpu *vcpu)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 04a6d2e0ea73f..0be4577a67e7b 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -35,7 +35,8 @@ static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu)
 	 * on the VL, so use a consistent (i.e., the maximum) guest VL.
 	 */
 	sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
-	__sve_save_state(vcpu_sve_pffr(vcpu), &vcpu->arch.ctxt.fp_regs.fpsr, true);
+	__sve_save_state(vcpu_sve_pffr(vcpu), true);
+	fpsimd_save_common(&vcpu->arch.ctxt.fp_regs);
 	write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
 }
 
@@ -55,8 +56,8 @@ static void __hyp_sve_restore_host(void)
 	 */
 	write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
 	__sve_restore_state(sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
-			    &hctxt->fp_regs.fpsr,
 			    true);
+	fpsimd_load_common(&hctxt->fp_regs);
 	write_sysreg_el1(ctxt_sys_reg(hctxt, ZCR_EL1), SYS_ZCR);
 }
 
-- 
2.30.2




* [PATCH 12/18] arm64: fpsimd: Move fpsimd save/restore inline
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
                   ` (10 preceding siblings ...)
  2026-05-21 13:25 ` [PATCH 11/18] arm64: fpsimd: Split FPSR/FPCR from SVE save/restore Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-26 16:44   ` Mark Brown
  2026-05-21 13:25 ` [PATCH 13/18] arm64: fpsimd: Use opaque type for SVE state Mark Rutland
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

Currently the FPSIMD register save/restore sequences are written in
out-of-line assembly routines. While this works, it's somewhat painful:

* As KVM needs to be able to use the sequences in hyp code, separate
  assembly files are used for the regular kernel and KVM code. While the
  common logic is shared in assembly macros, this still requires some
  duplication, and has led to some trivial divergence.

* For historical reasons, the assembly macros take some register
  arguments as numerical indices (e.g. "fpsimd_save x0, 8" uses x0 and
  x8), which is simply confusing.

* For historical reasons, the SVE save/restore code and FPSIMD
  save/restore code have distinct sequences for FPSR and FPCR. Ideally
  this logic would be shared.

* The assembly sequences can't be instrumented, and so it's harder than
  necessary to catch memory safety issues.

To handle the above, move the FPSIMD register save/restore sequences to
inline assembly, and share the FPSR+FPCR save/restore with SVE.

Neither GCC nor LLVM instrument memory arguments to inline assembly, so
explicit instrumentation is added in the same manner as other assembly
routines. This instrumentation is implicitly disabled by Kbuild for nVHE
hyp code.

Note that I've used the SVE sequence for restoring FPCR, which uses an
unconditional write to FPCR. The plain FPSIMD assembly sequence used a
conditional write to FPCR since 2014 in commit:

  5959e25729a5 ("arm64: fpsimd: avoid restoring fpcr if the contents haven't change")

... but this was not followed for the SVE assembly implemented in 2017
in commit:

  1fc5dce78ad1 ("arm64/sve: Low-level SVE architectural state manipulation functions")

... so I've assumed that this doesn't actually matter in practice, and
I've erred in favour of the simpler sequence.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/fpsimd.h         | 68 ++++++++++++++++++++++++-
 arch/arm64/include/asm/fpsimdmacros.h   | 59 ---------------------
 arch/arm64/include/asm/kvm_hyp.h        |  2 -
 arch/arm64/kernel/entry-fpsimd.S        | 20 --------
 arch/arm64/kvm/hyp/fpsimd.S             | 10 ----
 arch/arm64/kvm/hyp/include/hyp/switch.h |  4 +-
 arch/arm64/kvm/hyp/nvhe/hyp-main.c      |  4 +-
 7 files changed, 70 insertions(+), 97 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 6fd5cdf5e5f17..19b373ad0ebf7 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -22,6 +22,8 @@
 #include <linux/stddef.h>
 #include <linux/types.h>
 
+#define __FPSIMD_PREAMBLE	".arch_extension fp\n" \
+				".arch_extension simd\n"
 #define __SVE_PREAMBLE		".arch_extension sve\n"
 #define __SME_PREAMBLE		".arch_extension sme\n"
 
@@ -86,8 +88,70 @@ static inline void fpsimd_load_common(const struct user_fpsimd_state *state)
 	write_sysreg_s(state->fpcr, SYS_FPCR);
 }
 
-extern void fpsimd_save_state(struct user_fpsimd_state *state);
-extern void fpsimd_load_state(struct user_fpsimd_state *state);
+static inline void fpsimd_save_vregs(struct user_fpsimd_state *state)
+{
+	instrument_write(state->vregs, sizeof(state->vregs));
+	asm volatile(
+	__FPSIMD_PREAMBLE
+	"	stp	q0,  q1,  [%[vregs], #16 * 0]\n"
+	"	stp	q2,  q3,  [%[vregs], #16 * 2]\n"
+	"	stp	q4,  q5,  [%[vregs], #16 * 4]\n"
+	"	stp	q6,  q7,  [%[vregs], #16 * 6]\n"
+	"	stp	q8,  q9,  [%[vregs], #16 * 8]\n"
+	"	stp	q10, q11, [%[vregs], #16 * 10]\n"
+	"	stp	q12, q13, [%[vregs], #16 * 12]\n"
+	"	stp	q14, q15, [%[vregs], #16 * 14]\n"
+	"	stp	q16, q17, [%[vregs], #16 * 16]\n"
+	"	stp	q18, q19, [%[vregs], #16 * 18]\n"
+	"	stp	q20, q21, [%[vregs], #16 * 20]\n"
+	"	stp	q22, q23, [%[vregs], #16 * 22]\n"
+	"	stp	q24, q25, [%[vregs], #16 * 24]\n"
+	"	stp	q26, q27, [%[vregs], #16 * 26]\n"
+	"	stp	q28, q29, [%[vregs], #16 * 28]\n"
+	"	stp	q30, q31, [%[vregs], #16 * 30]\n"
+	: "=Q" (state->vregs)
+	: [vregs] "r" (state->vregs)
+	);
+}
+
+static inline void fpsimd_load_vregs(const struct user_fpsimd_state *state)
+{
+	instrument_read(state->vregs, sizeof(state->vregs));
+	asm volatile(
+	__FPSIMD_PREAMBLE
+	"	ldp	q0,  q1,  [%[vregs], #16 * 0]\n"
+	"	ldp	q2,  q3,  [%[vregs], #16 * 2]\n"
+	"	ldp	q4,  q5,  [%[vregs], #16 * 4]\n"
+	"	ldp	q6,  q7,  [%[vregs], #16 * 6]\n"
+	"	ldp	q8,  q9,  [%[vregs], #16 * 8]\n"
+	"	ldp	q10, q11, [%[vregs], #16 * 10]\n"
+	"	ldp	q12, q13, [%[vregs], #16 * 12]\n"
+	"	ldp	q14, q15, [%[vregs], #16 * 14]\n"
+	"	ldp	q16, q17, [%[vregs], #16 * 16]\n"
+	"	ldp	q18, q19, [%[vregs], #16 * 18]\n"
+	"	ldp	q20, q21, [%[vregs], #16 * 20]\n"
+	"	ldp	q22, q23, [%[vregs], #16 * 22]\n"
+	"	ldp	q24, q25, [%[vregs], #16 * 24]\n"
+	"	ldp	q26, q27, [%[vregs], #16 * 26]\n"
+	"	ldp	q28, q29, [%[vregs], #16 * 28]\n"
+	"	ldp	q30, q31, [%[vregs], #16 * 30]\n"
+	:
+	: "Q" (state->vregs),
+	  [vregs] "r" (state->vregs)
+	);
+}
+
+static inline void fpsimd_save_state(struct user_fpsimd_state *state)
+{
+	fpsimd_save_vregs(state);
+	fpsimd_save_common(state);
+}
+
+static inline void fpsimd_load_state(const struct user_fpsimd_state *state)
+{
+	fpsimd_load_vregs(state);
+	fpsimd_load_common(state);
+}
 
 extern void fpsimd_thread_switch(struct task_struct *next);
 extern void fpsimd_flush_thread(void);
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index c79ae7ec1ff05..01b5e6d51ba79 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -8,65 +8,6 @@
 
 #include <asm/assembler.h>
 
-.macro fpsimd_save state, tmpnr
-	stp	q0, q1, [\state, #16 * 0]
-	stp	q2, q3, [\state, #16 * 2]
-	stp	q4, q5, [\state, #16 * 4]
-	stp	q6, q7, [\state, #16 * 6]
-	stp	q8, q9, [\state, #16 * 8]
-	stp	q10, q11, [\state, #16 * 10]
-	stp	q12, q13, [\state, #16 * 12]
-	stp	q14, q15, [\state, #16 * 14]
-	stp	q16, q17, [\state, #16 * 16]
-	stp	q18, q19, [\state, #16 * 18]
-	stp	q20, q21, [\state, #16 * 20]
-	stp	q22, q23, [\state, #16 * 22]
-	stp	q24, q25, [\state, #16 * 24]
-	stp	q26, q27, [\state, #16 * 26]
-	stp	q28, q29, [\state, #16 * 28]
-	stp	q30, q31, [\state, #16 * 30]!
-	mrs	x\tmpnr, fpsr
-	str	w\tmpnr, [\state, #16 * 2]
-	mrs	x\tmpnr, fpcr
-	str	w\tmpnr, [\state, #16 * 2 + 4]
-.endm
-
-.macro fpsimd_restore_fpcr state, tmp
-	/*
-	 * Writes to fpcr may be self-synchronising, so avoid restoring
-	 * the register if it hasn't changed.
-	 */
-	mrs	\tmp, fpcr
-	cmp	\tmp, \state
-	b.eq	9999f
-	msr	fpcr, \state
-9999:
-.endm
-
-/* Clobbers \state */
-.macro fpsimd_restore state, tmpnr
-	ldp	q0, q1, [\state, #16 * 0]
-	ldp	q2, q3, [\state, #16 * 2]
-	ldp	q4, q5, [\state, #16 * 4]
-	ldp	q6, q7, [\state, #16 * 6]
-	ldp	q8, q9, [\state, #16 * 8]
-	ldp	q10, q11, [\state, #16 * 10]
-	ldp	q12, q13, [\state, #16 * 12]
-	ldp	q14, q15, [\state, #16 * 14]
-	ldp	q16, q17, [\state, #16 * 16]
-	ldp	q18, q19, [\state, #16 * 18]
-	ldp	q20, q21, [\state, #16 * 20]
-	ldp	q22, q23, [\state, #16 * 22]
-	ldp	q24, q25, [\state, #16 * 24]
-	ldp	q26, q27, [\state, #16 * 26]
-	ldp	q28, q29, [\state, #16 * 28]
-	ldp	q30, q31, [\state, #16 * 30]!
-	ldr	w\tmpnr, [\state, #16 * 2]
-	msr	fpsr, x\tmpnr
-	ldr	w\tmpnr, [\state, #16 * 2 + 4]
-	fpsimd_restore_fpcr x\tmpnr, \state
-.endm
-
 /* Sanity-check macros to help avoid encoding garbage instructions */
 
 .macro _check_general_reg nr
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 0030cc1b52197..8c4602c8f4356 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -121,8 +121,6 @@ void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu);
 void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu);
 #endif
 
-void __fpsimd_save_state(struct user_fpsimd_state *fp_regs);
-void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs);
 void __sve_save_state(void *sve, int save_ffr);
 void __sve_restore_state(void *sve, int restore_ffr);
 
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 83fe9c32bbd1c..4fa00c94f28b7 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -11,26 +11,6 @@
 #include <asm/assembler.h>
 #include <asm/fpsimdmacros.h>
 
-/*
- * Save the FP registers.
- *
- * x0 - pointer to struct fpsimd_state
- */
-SYM_FUNC_START(fpsimd_save_state)
-	fpsimd_save x0, 8
-	ret
-SYM_FUNC_END(fpsimd_save_state)
-
-/*
- * Load the FP registers.
- *
- * x0 - pointer to struct fpsimd_state
- */
-SYM_FUNC_START(fpsimd_load_state)
-	fpsimd_restore x0, 8
-	ret
-SYM_FUNC_END(fpsimd_load_state)
-
 #ifdef CONFIG_ARM64_SVE
 
 /*
diff --git a/arch/arm64/kvm/hyp/fpsimd.S b/arch/arm64/kvm/hyp/fpsimd.S
index 8575e32977d19..beacec33b2541 100644
--- a/arch/arm64/kvm/hyp/fpsimd.S
+++ b/arch/arm64/kvm/hyp/fpsimd.S
@@ -10,16 +10,6 @@
 
 	.text
 
-SYM_FUNC_START(__fpsimd_save_state)
-	fpsimd_save	x0, 1
-	ret
-SYM_FUNC_END(__fpsimd_save_state)
-
-SYM_FUNC_START(__fpsimd_restore_state)
-	fpsimd_restore	x0, 1
-	ret
-SYM_FUNC_END(__fpsimd_restore_state)
-
 SYM_FUNC_START(__sve_restore_state)
 	sve_load 0, x1
 	ret
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index eb76a863ebb84..aaa43554fd8e6 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -565,7 +565,7 @@ static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)
 	if (system_supports_sve()) {
 		__hyp_sve_save_host();
 	} else {
-		__fpsimd_save_state(&hctxt->fp_regs);
+		fpsimd_save_state(&hctxt->fp_regs);
 	}
 
 	if (kvm_has_fpmr(kern_hyp_va(vcpu->kvm)))
@@ -625,7 +625,7 @@ static inline bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
 	if (sve_guest)
 		__hyp_sve_restore_guest(vcpu);
 	else
-		__fpsimd_restore_state(&vcpu->arch.ctxt.fp_regs);
+		fpsimd_load_state(&vcpu->arch.ctxt.fp_regs);
 
 	if (kvm_has_fpmr(kern_hyp_va(vcpu->kvm)))
 		write_sysreg_s(__vcpu_sys_reg(vcpu, FPMR), SYS_FPMR);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 0be4577a67e7b..627762ed7327f 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -83,7 +83,7 @@ static void fpsimd_sve_sync(struct kvm_vcpu *vcpu)
 	if (vcpu_has_sve(vcpu))
 		__hyp_sve_save_guest(vcpu);
 	else
-		__fpsimd_save_state(&vcpu->arch.ctxt.fp_regs);
+		fpsimd_save_state(&vcpu->arch.ctxt.fp_regs);
 
 	has_fpmr = kvm_has_fpmr(kern_hyp_va(vcpu->kvm));
 	if (has_fpmr)
@@ -92,7 +92,7 @@ static void fpsimd_sve_sync(struct kvm_vcpu *vcpu)
 	if (system_supports_sve())
 		__hyp_sve_restore_host();
 	else
-		__fpsimd_restore_state(&hctxt->fp_regs);
+		fpsimd_load_state(&hctxt->fp_regs);
 
 	if (has_fpmr)
 		write_sysreg_s(ctxt_sys_reg(hctxt, FPMR), SYS_FPMR);
-- 
2.30.2




* [PATCH 13/18] arm64: fpsimd: Use opaque type for SVE state
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
                   ` (11 preceding siblings ...)
  2026-05-21 13:25 ` [PATCH 12/18] arm64: fpsimd: Move fpsimd save/restore inline Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-26 16:53   ` Mark Brown
  2026-05-21 13:25 ` [PATCH 14/18] arm64: fpsimd: Use opaque type for SME state Mark Rutland
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

As the SVE state size can vary at runtime, we don't have a concrete type
for the in-memory SVE state, and pass this around using a pointer to
void. The functions which save/restore the SVE state have a very unusual
calling convention, expecting a pointer to the FFR *in the middle of*
the in-memory SVE state, which is also passed as a pointer to void.
Passing a pointer to the FFR also requires that callers find the live VL
and perform some arithmetic, which callers implement differently.

Using pointer to void means that it's very easy to introduce errors that
cannot be caught by the compiler (e.g. as 'void **' can be assigned to
'void *'). In general this is unnecessarily confusing and fragile.
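
As a standalone illustration of this hazard (not code from this patch;
'struct demo_state', take_void(), take_typed() and demo_mixup() are
hypothetical names), consider the following sketch. Passing the address
of the pointer compiles silently with 'void *', whereas the equivalent
mistake with a pointer to an opaque struct would be diagnosed as an
incompatible pointer type.

```c
#include <stddef.h>

struct demo_state;		/* Opaque: no definition is visible */

/* Accepts anything convertible to 'void *', including 'void **'. */
static const void *take_void(const void *state)
{
	return state;
}

/* Accepts only a pointer to 'struct demo_state'. */
static const void *take_typed(const struct demo_state *state)
{
	return state;
}

/* Returns 1 if the 'void *' call received the wrong address. */
static int demo_mixup(void)
{
	unsigned char buf[16] = { 0 };
	void *untyped = buf;
	struct demo_state *typed = (struct demo_state *)buf;

	/* Bug: passes the address of the pointer, yet compiles cleanly. */
	const void *wrong = take_void(&untyped);
	const void *right = take_typed(typed);

	/* take_typed(&typed) would be diagnosed by the compiler. */
	return wrong != (const void *)buf && right == (const void *)buf;
}
```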

Improve this by adding an opaque 'struct sve_state', and consistently
passing a pointer to this, performing the necessary offsetting *within*
the save/restore functions.

For the moment, the offsetting is performed in a new '_sve_pffr'
assembly macro, using the ADDVL and ADDPL instructions. These add a
multiple of the live vector length and predicate length respectively.
The ADDVL immediate range cannot encode 32, so this is split into two
increments of 16.
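
The arithmetic can be sketched in plain C as below. This is only an
illustration under the layout described above (32 Z registers of VL
bytes each, followed by 16 P registers of VL/8 bytes each, followed by
the FFR); the DEMO_* constants and demo_* helpers are hypothetical and
do not appear in the patch. ADDVL takes a signed 6-bit immediate (-32 to
31), which is why the 32 * VL step has to be split.

```c
#include <stddef.h>

#define DEMO_NUM_ZREGS	32	/* Z0..Z31, VL bytes each */
#define DEMO_NUM_PREGS	16	/* P0..P15, VL / 8 bytes each */

/* Byte offset of the FFR from the base of the state, for VL in bytes. */
static size_t demo_ffr_offset(size_t vl)
{
	size_t pl = vl / 8;

	return DEMO_NUM_ZREGS * vl + DEMO_NUM_PREGS * pl;
}

/* The same offset, stepping the way the '_sve_pffr' macro does. */
static size_t demo_ffr_offset_stepwise(size_t vl)
{
	size_t off = 0;

	off += 16 * vl;		/* addvl ptr, ptr, #16 */
	off += 16 * vl;		/* addvl ptr, ptr, #16 */
	off += 16 * (vl / 8);	/* addpl ptr, ptr, #16 */

	return off;
}
```

For a 128-bit vector length (VL = 16 bytes) both helpers give 544, and
for the architectural maximum of 256 bytes both give 8704.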

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/fpsimd.h         | 24 +++---------------------
 arch/arm64/include/asm/fpsimdmacros.h   |  9 +++++++++
 arch/arm64/include/asm/kvm_host.h       |  8 ++------
 arch/arm64/include/asm/kvm_hyp.h        |  4 ++--
 arch/arm64/include/asm/processor.h      |  4 +++-
 arch/arm64/kernel/fpsimd.c              | 21 ++++++++++-----------
 arch/arm64/kvm/arm.c                    |  4 ++--
 arch/arm64/kvm/guest.c                  |  4 ++--
 arch/arm64/kvm/hyp/include/hyp/switch.h |  8 +++-----
 arch/arm64/kvm/hyp/nvhe/hyp-main.c      |  7 +++----
 arch/arm64/kvm/hyp/nvhe/setup.c         |  2 +-
 11 files changed, 40 insertions(+), 55 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 19b373ad0ebf7..19e670ae67598 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -162,7 +162,7 @@ extern void fpsimd_update_current_state(struct user_fpsimd_state const *state);
 
 struct cpu_fp_state {
 	struct user_fpsimd_state *st;
-	void *sve_state;
+	struct sve_state *sve_state;
 	void *sme_state;
 	u64 *svcr;
 	u64 *fpmr;
@@ -195,24 +195,6 @@ extern void task_smstop_sm(struct task_struct *task);
 /* Maximum VL that SVE/SME VL-agnostic software can transparently support */
 #define VL_ARCH_MAX 0x100
 
-/* Offset of FFR in the SVE register dump */
-static inline size_t sve_ffr_offset(int vl)
-{
-	return SVE_SIG_FFR_OFFSET(sve_vq_from_vl(vl)) - SVE_SIG_REGS_OFFSET;
-}
-
-static inline void *sve_pffr(struct thread_struct *thread)
-{
-	unsigned int vl;
-
-	if (system_supports_sme() && thread_sm_enabled(thread))
-		vl = thread_get_sme_vl(thread);
-	else
-		vl = thread_get_sve_vl(thread);
-
-	return (char *)thread->sve_state + sve_ffr_offset(vl);
-}
-
 static inline void *thread_zt_state(struct thread_struct *thread)
 {
 	/* The ZT register state is stored immediately after the ZA state */
@@ -233,8 +215,8 @@ static inline unsigned int sve_get_vl(void)
 	return vl;
 }
 
-extern void sve_save_state(void *state, int save_ffr);
-extern void sve_load_state(void const *state, int restore_ffr);
+extern void sve_save_state(struct sve_state *state, int save_ffr);
+extern void sve_load_state(const struct sve_state *state, int restore_ffr);
 extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
 extern void sme_save_state(void *state, int zt);
 extern void sme_load_state(void const *state, int zt);
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index 01b5e6d51ba79..08f4863e67715 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -176,7 +176,15 @@
 		_sve_wrffr	0
 .endm
 
+.macro _sve_pffr ptr
+	.arch_extension sve
+	addvl	\ptr, \ptr, #16
+	addvl	\ptr, \ptr, #16
+	addpl	\ptr, \ptr, #16
+.endm
+
 .macro sve_save nxbase, save_ffr
+		_sve_pffr	x\nxbase
  _for n, 0, 31,	_sve_str_v	\n, \nxbase, \n - 34
  _for n, 0, 15,	_sve_str_p	\n, \nxbase, \n - 16
 		cbz		\save_ffr, 921f
@@ -190,6 +198,7 @@
 .endm
 
 .macro sve_load nxbase, restore_ffr
+		_sve_pffr	x\nxbase
  _for n, 0, 31,	_sve_ldr_v	\n, \nxbase, \n - 34
 		cbz		\restore_ffr, 921f
 		_sve_ldr_p	0, \nxbase
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index ae24617380b8f..a366509c5944e 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -759,7 +759,7 @@ struct kvm_host_data {
 	 * Hyp VA.
 	 * sve_regs is only used in pKVM and if system_supports_sve().
 	 */
-	u8	*sve_regs;
+	struct sve_state *sve_regs;
 
 	/* Ownership of the FP regs */
 	enum {
@@ -853,7 +853,7 @@ struct kvm_vcpu_arch {
 	 * floating point code saves the register state of a task it
 	 * records which view it saved in fp_type.
 	 */
-	void *sve_state;
+	struct sve_state *sve_state;
 	enum fp_type fp_type;
 	unsigned int sve_max_vl;
 
@@ -1097,10 +1097,6 @@ struct kvm_vcpu_arch {
 #define NESTED_SERROR_PENDING	__vcpu_single_flag(sflags, BIT(8))
 
 
-/* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
-#define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) +	\
-			     sve_ffr_offset((vcpu)->arch.sve_max_vl))
-
 #define vcpu_sve_max_vq(vcpu)	sve_vq_from_vl((vcpu)->arch.sve_max_vl)
 
 #define vcpu_sve_zcr_elx(vcpu)						\
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 8c4602c8f4356..38356eee592ad 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -121,8 +121,8 @@ void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu);
 void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu);
 #endif
 
-void __sve_save_state(void *sve, int save_ffr);
-void __sve_restore_state(void *sve, int restore_ffr);
+void __sve_save_state(struct sve_state *sve, int save_ffr);
+void __sve_restore_state(struct sve_state *sve, int restore_ffr);
 
 u64 __guest_enter(struct kvm_vcpu *vcpu);
 
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index e30c4c8e3a7a7..1c2ffd063baa8 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -130,6 +130,8 @@ enum fp_type {
 	FP_STATE_SVE,
 };
 
+struct sve_state;		/* Opaque type */
+
 struct cpu_context {
 	unsigned long x19;
 	unsigned long x20;
@@ -164,7 +166,7 @@ struct thread_struct {
 
 	enum fp_type		fp_type;	/* registers FPSIMD or SVE? */
 	unsigned int		fpsimd_cpu;
-	void			*sve_state;	/* SVE registers, if any */
+	struct sve_state	*sve_state;	/* SVE registers, if any */
 	void			*sme_state;	/* ZA and ZT state, if any */
 	unsigned int		vl[ARM64_VEC_MAX];	/* vector length */
 	unsigned int		vl_onexec[ARM64_VEC_MAX]; /* vl after next exec */
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 9806fea8fea7c..66d880d081671 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -425,8 +425,7 @@ static void task_fpsimd_load(void)
 
 	if (restore_sve_regs) {
 		WARN_ON_ONCE(current->thread.fp_type != FP_STATE_SVE);
-		sve_load_state(sve_pffr(&current->thread),
-			       restore_ffr);
+		sve_load_state(current->thread.sve_state, restore_ffr);
 		fpsimd_load_common(&current->thread.uw.fpsimd_state);
 	} else {
 		WARN_ON_ONCE(current->thread.fp_type != FP_STATE_FPSIMD);
@@ -507,9 +506,7 @@ static void fpsimd_save_user_state(void)
 			return;
 		}
 
-		sve_save_state((char *)last->sve_state +
-					sve_ffr_offset(vl),
-			       save_ffr);
+		sve_save_state(last->sve_state, save_ffr);
 		fpsimd_save_common(last->st);
 		*last->fp_type = FP_STATE_SVE;
 	} else {
@@ -641,7 +638,8 @@ static __uint128_t arm64_cpu_to_le128(__uint128_t x)
 
 #define arm64_le128_to_cpu(x) arm64_cpu_to_le128(x)
 
-static void __fpsimd_to_sve(void *sst, struct user_fpsimd_state const *fst,
+static void __fpsimd_to_sve(struct sve_state *sst,
+			    struct user_fpsimd_state const *fst,
 			    unsigned int vq)
 {
 	unsigned int i;
@@ -668,7 +666,7 @@ static void __fpsimd_to_sve(void *sst, struct user_fpsimd_state const *fst,
 static inline void fpsimd_to_sve(struct task_struct *task)
 {
 	unsigned int vq;
-	void *sst = task->thread.sve_state;
+	struct sve_state *sst = task->thread.sve_state;
 	struct user_fpsimd_state const *fst = &task->thread.uw.fpsimd_state;
 
 	if (!system_supports_sve() && !system_supports_sme())
@@ -692,7 +690,7 @@ static inline void fpsimd_to_sve(struct task_struct *task)
 static inline void sve_to_fpsimd(struct task_struct *task)
 {
 	unsigned int vq, vl;
-	void const *sst = task->thread.sve_state;
+	const struct sve_state *sst = task->thread.sve_state;
 	struct user_fpsimd_state *fst = &task->thread.uw.fpsimd_state;
 	unsigned int i;
 	__uint128_t const *p;
@@ -791,7 +789,7 @@ void fpsimd_sync_from_effective_state(struct task_struct *task)
 void fpsimd_sync_to_effective_state_zeropad(struct task_struct *task)
 {
 	unsigned int vq;
-	void *sst = task->thread.sve_state;
+	struct sve_state *sst = task->thread.sve_state;
 	struct user_fpsimd_state const *fst = &task->thread.uw.fpsimd_state;
 
 	if (task->thread.fp_type != FP_STATE_SVE)
@@ -809,7 +807,8 @@ static int change_live_vector_length(struct task_struct *task,
 {
 	unsigned int sve_vl = task_get_sve_vl(task);
 	unsigned int sme_vl = task_get_sme_vl(task);
-	void *sve_state = NULL, *sme_state = NULL;
+	struct sve_state *sve_state = NULL;
+	void *sme_state = NULL;
 
 	if (type == ARM64_VEC_SME)
 		sme_vl = vl;
@@ -1645,7 +1644,7 @@ static void fpsimd_flush_thread_vl(enum vec_type type)
 
 void fpsimd_flush_thread(void)
 {
-	void *sve_state = NULL;
+	struct sve_state *sve_state = NULL;
 	void *sme_state = NULL;
 
 	if (!system_supports_fpsimd())
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index f9fc85a0344e1..7a3db4d7dcdef 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -2499,7 +2499,7 @@ static void __init teardown_hyp_mode(void)
 			continue;
 
 		if (free_sve) {
-			u8 *sve_regs;
+			struct sve_state *sve_regs;
 
 			sve_regs = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs;
 			free_pages((unsigned long) sve_regs, pkvm_host_sve_state_order());
@@ -2648,7 +2648,7 @@ static void finalize_init_hyp_mode(void)
 
 	if (system_supports_sve() && is_protected_kvm_enabled()) {
 		for_each_possible_cpu(cpu) {
-			u8 *sve_regs;
+			struct sve_state *sve_regs;
 
 			sve_regs = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs;
 			per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs =
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 332c453b87cf8..b01d6622b8720 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -500,7 +500,7 @@ static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	if (!kvm_arm_vcpu_sve_finalized(vcpu))
 		return -EPERM;
 
-	if (copy_to_user(uptr, vcpu->arch.sve_state + region.koffset,
+	if (copy_to_user(uptr, (void *)vcpu->arch.sve_state + region.koffset,
 			 region.klen) ||
 	    clear_user(uptr + region.klen, region.upad))
 		return -EFAULT;
@@ -526,7 +526,7 @@ static int set_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 	if (!kvm_arm_vcpu_sve_finalized(vcpu))
 		return -EPERM;
 
-	if (copy_from_user(vcpu->arch.sve_state + region.koffset, uptr,
+	if (copy_from_user((void *)vcpu->arch.sve_state + region.koffset, uptr,
 			   region.klen))
 		return -EFAULT;
 
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index aaa43554fd8e6..72e658255cda7 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -467,8 +467,7 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
 	 * vCPU. Start off with the max VL so we can load the SVE state.
 	 */
 	sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
-	__sve_restore_state(vcpu_sve_pffr(vcpu),
-			    true);
+	__sve_restore_state(kern_hyp_va(vcpu->arch.sve_state), true);
 	fpsimd_load_common(&vcpu->arch.ctxt.fp_regs);
 
 	/*
@@ -485,12 +484,11 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
 static inline void __hyp_sve_save_host(void)
 {
 	struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
-	u8 *sve_regs = *host_data_ptr(sve_regs);
+	struct sve_state *sve_regs = *host_data_ptr(sve_regs);
 
 	ctxt_sys_reg(hctxt, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);
 	write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
-	__sve_save_state(sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
-			 true);
+	__sve_save_state(sve_regs, true);
 	fpsimd_save_common(&hctxt->fp_regs);
 }
 
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 627762ed7327f..72d025b2178a7 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -35,7 +35,7 @@ static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu)
 	 * on the VL, so use a consistent (i.e., the maximum) guest VL.
 	 */
 	sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
-	__sve_save_state(vcpu_sve_pffr(vcpu), true);
+	__sve_save_state(kern_hyp_va(vcpu->arch.sve_state), true);
 	fpsimd_save_common(&vcpu->arch.ctxt.fp_regs);
 	write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
 }
@@ -43,7 +43,7 @@ static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu)
 static void __hyp_sve_restore_host(void)
 {
 	struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
-	u8 *sve_regs = *host_data_ptr(sve_regs);
+	struct sve_state *sve_regs = *host_data_ptr(sve_regs);
 
 	/*
 	 * On saving/restoring host sve state, always use the maximum VL for
@@ -55,8 +55,7 @@ static void __hyp_sve_restore_host(void)
 	 * need to be revisited.
 	 */
 	write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
-	__sve_restore_state(sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
-			    true);
+	__sve_restore_state(sve_regs, true);
 	fpsimd_load_common(&hctxt->fp_regs);
 	write_sysreg_el1(ctxt_sys_reg(hctxt, ZCR_EL1), SYS_ZCR);
 }
diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index cdaf53c833409..77dbcfed05486 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -82,7 +82,7 @@ static int pkvm_create_host_sve_mappings(void)
 
 	for (i = 0; i < hyp_nr_cpus; i++) {
 		struct kvm_host_data *host_data = per_cpu_ptr(&kvm_host_data, i);
-		u8 *sve_regs = host_data->sve_regs;
+		struct sve_state *sve_regs = host_data->sve_regs;
 
 		start = kern_hyp_va(sve_regs);
 		end = start + PAGE_ALIGN(pkvm_host_sve_state_size());
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 14/18] arm64: fpsimd: Use opaque type for SME state
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
                   ` (12 preceding siblings ...)
  2026-05-21 13:25 ` [PATCH 13/18] arm64: fpsimd: Use opaque type for SVE state Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-26 16:56   ` Mark Brown
  2026-05-21 13:25 ` [PATCH 15/18] arm64: fpsimd: Move SVE save/restore inline Mark Rutland
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

As the SME state size can vary at runtime, we don't have a concrete type
for the in-memory SME state, and pass this around using a pointer to
void.

Using pointer to void means that it's very easy to introduce errors that
cannot be caught by the compiler (e.g. as 'void **' can be assigned to
'void *').

Improve this by adding an opaque 'struct sme_state', and consistently
passing a pointer to this.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/fpsimd.h    | 8 ++++----
 arch/arm64/include/asm/processor.h | 3 ++-
 arch/arm64/kernel/fpsimd.c         | 4 ++--
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 19e670ae67598..560814acc60c0 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -163,7 +163,7 @@ extern void fpsimd_update_current_state(struct user_fpsimd_state const *state);
 struct cpu_fp_state {
 	struct user_fpsimd_state *st;
 	struct sve_state *sve_state;
-	void *sme_state;
+	struct sme_state *sme_state;
 	u64 *svcr;
 	u64 *fpmr;
 	unsigned int sve_vl;
@@ -199,7 +199,7 @@ static inline void *thread_zt_state(struct thread_struct *thread)
 {
 	/* The ZT register state is stored immediately after the ZA state */
 	unsigned int sme_vq = sve_vq_from_vl(thread_get_sme_vl(thread));
-	return thread->sme_state + ZA_SIG_REGS_SIZE(sme_vq);
+	return (void *)thread->sme_state + ZA_SIG_REGS_SIZE(sme_vq);
 }
 
 static inline unsigned int sve_get_vl(void)
@@ -218,8 +218,8 @@ static inline unsigned int sve_get_vl(void)
 extern void sve_save_state(struct sve_state *state, int save_ffr);
 extern void sve_load_state(const struct sve_state *state, int restore_ffr);
 extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
-extern void sme_save_state(void *state, int zt);
-extern void sme_load_state(void const *state, int zt);
+extern void sme_save_state(struct sme_state *state, int zt);
+extern void sme_load_state(const struct sme_state *state, int zt);
 
 struct arm64_cpu_capabilities;
 extern void cpu_enable_fpsimd(const struct arm64_cpu_capabilities *__unused);
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 1c2ffd063baa8..7304d9cca3e85 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -131,6 +131,7 @@ enum fp_type {
 };
 
 struct sve_state;		/* Opaque type */
+struct sme_state;		/* Opaque type */
 
 struct cpu_context {
 	unsigned long x19;
@@ -167,7 +168,7 @@ struct thread_struct {
 	enum fp_type		fp_type;	/* registers FPSIMD or SVE? */
 	unsigned int		fpsimd_cpu;
 	struct sve_state	*sve_state;	/* SVE registers, if any */
-	void			*sme_state;	/* ZA and ZT state, if any */
+	struct sme_state	*sme_state;	/* ZA and ZT state, if any */
 	unsigned int		vl[ARM64_VEC_MAX];	/* vector length */
 	unsigned int		vl_onexec[ARM64_VEC_MAX]; /* vl after next exec */
 	unsigned long		fault_address;	/* fault info */
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 66d880d081671..f9b3eeacf130d 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -808,7 +808,7 @@ static int change_live_vector_length(struct task_struct *task,
 	unsigned int sve_vl = task_get_sve_vl(task);
 	unsigned int sme_vl = task_get_sme_vl(task);
 	struct sve_state *sve_state = NULL;
-	void *sme_state = NULL;
+	struct sme_state *sme_state = NULL;
 
 	if (type == ARM64_VEC_SME)
 		sme_vl = vl;
@@ -1645,7 +1645,7 @@ static void fpsimd_flush_thread_vl(enum vec_type type)
 void fpsimd_flush_thread(void)
 {
 	struct sve_state *sve_state = NULL;
-	void *sme_state = NULL;
+	struct sme_state *sme_state = NULL;
 
 	if (!system_supports_fpsimd())
 		return;
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 15/18] arm64: fpsimd: Move SVE save/restore inline
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
                   ` (13 preceding siblings ...)
  2026-05-21 13:25 ` [PATCH 14/18] arm64: fpsimd: Use opaque type for SME state Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-21 13:25 ` [PATCH 16/18] arm64: fpsimd: Move sve_flush_live() inline Mark Rutland
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

Currently the SVE register save/restore sequences are written in
out-of-line assembly routines. While this works, it's somewhat painful:

* As KVM needs to be able to use the sequences in hyp code, separate
  assembly files are used for the regular kernel and KVM code. While the
  common logic is shared in assembly macros, this still requires some
  duplication, and has led to some trivial divergence.

* As the SVE LDR/STR instructions have limited addressing modes, the
  assembly macros use an awkward pattern requiring negative offsets.
  This could be written more clearly with addresses being generated in C
  code.

* As the FFR does not always exist in streaming mode, some awkward
  conditional branching has been written in assembly which could be
  clearer in C (and would permit the compiler to optimize out
  unnecessary branches in some cases).

* For historical reasons, the assembly macros take some register
  arguments as numerical indices (e.g. "sve_save 0, x1" uses x0 and x1),
  which is simply confusing.

* For historical reasons, the SVE save/restore code and FPSIMD
  save/restore code have distinct sequences for FPSR and FPCR. Ideally
  this logic would be shared.

* The assembly sequences can't be instrumented, and so it's harder than
  necessary to catch memory safety issues.

To handle the above, move the SVE register save/restore sequences
to inline assembly.

Neither GCC nor LLVM instrument memory arguments to inline assembly, so
explicit instrumentation is added in the same manner as other assembly
routines. This instrumentation is implicitly disabled by Kbuild for nVHE
hyp code.
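
The explicit instrumentation pattern can be sketched in user space as
follows. This is an illustration only: demo_instrument_write() is a
hypothetical stand-in for the kernel's instrument_write() that simply
records the range it was given, and memcpy() stands in for the inline
assembly store sequence that the compiler cannot see into.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical hook: records the last range reported as written. */
static const void *demo_last_ptr;
static size_t demo_last_size;

static void demo_instrument_write(const void *ptr, size_t size)
{
	demo_last_ptr = ptr;
	demo_last_size = size;
}

/* Save 'size' bytes of live state to 'dst', reporting the access first. */
static void demo_save_regs(void *dst, const void *live, size_t size)
{
	/* Report the whole range before the opaque store sequence runs. */
	demo_instrument_write(dst, size);

	/* In the patch this step is inline assembly, not memcpy(). */
	memcpy(dst, live, size);
}
```

The point of the ordering is that the checker sees the full destination
range up front, so an undersized or stale buffer can be reported before
any bytes are stored.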

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/fpsimd.h         | 119 +++++++++++++++++++++++-
 arch/arm64/include/asm/fpsimdmacros.h   |  61 ------------
 arch/arm64/include/asm/kvm_hyp.h        |   3 -
 arch/arm64/kernel/entry-fpsimd.S        |  22 -----
 arch/arm64/kvm/hyp/fpsimd.S             |  21 -----
 arch/arm64/kvm/hyp/include/hyp/switch.h |   4 +-
 arch/arm64/kvm/hyp/nvhe/Makefile        |   2 +-
 arch/arm64/kvm/hyp/nvhe/hyp-main.c      |   4 +-
 arch/arm64/kvm/hyp/vhe/Makefile         |   2 +-
 9 files changed, 123 insertions(+), 115 deletions(-)
 delete mode 100644 arch/arm64/kvm/hyp/fpsimd.S

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 560814acc60c0..d005324bbcf3e 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -215,8 +215,123 @@ static inline unsigned int sve_get_vl(void)
 	return vl;
 }
 
-extern void sve_save_state(struct sve_state *state, int save_ffr);
-extern void sve_load_state(const struct sve_state *state, int restore_ffr);
+#define FOR_EACH_Z_REG(idx_str, asm_str)											\
+	"	.irp " idx_str ",0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31\n"	\
+	asm_str	"\n"														\
+	"	.endr\n"
+
+#define FOR_EACH_P_REG(idx_str, asm_str)											\
+	"	.irp " idx_str ",0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15\n"	\
+	asm_str	"\n"								\
+	"	.endr\n"
+
+static inline void __sve_save_z(struct sve_state *state, unsigned long vl)
+{
+	instrument_write(state, SVE_NUM_ZREGS * vl);
+	asm volatile(
+	__SVE_PREAMBLE
+	FOR_EACH_Z_REG("n", "str	z\\n, [%[zregs], #\\n, MUL VL]")
+	:
+	: [zregs] "r" (state)
+	: "memory"
+	);
+}
+
+static inline void __sve_load_z(const struct sve_state *state, unsigned long vl)
+{
+	instrument_read(state, SVE_NUM_ZREGS * vl);
+	asm volatile(
+	__SVE_PREAMBLE
+	FOR_EACH_Z_REG("n", "ldr	z\\n, [%[zregs], #\\n, MUL VL]")
+	:
+	: [zregs] "r" (state)
+	: "memory"
+	);
+}
+
+static inline void __sve_save_p(struct sve_state *state, unsigned long vl, bool ffr)
+{
+	void *pregs = (void *)state + SVE_NUM_ZREGS * vl;
+	unsigned long pl = vl / 8;
+	void *pffr = pregs + SVE_NUM_PREGS * pl;
+
+	instrument_write(pregs, SVE_NUM_PREGS * pl);
+	asm volatile(
+	__SVE_PREAMBLE
+	FOR_EACH_P_REG("n", "str	p\\n, [%[pregs], #\\n, MUL VL]\n")
+	:
+	: [pregs] "r" (pregs)
+	: "memory"
+	);
+
+	instrument_write(pffr, pl);
+	if (ffr) {
+		asm volatile(
+		__SVE_PREAMBLE
+		"	rdffr	p0.b\n"
+		"	str	p0, [%[pffr]]\n"
+		"	ldr	p0, [%[pregs]]\n"
+		:
+		: [pregs] "r" (pregs),
+		  [pffr] "r" (pffr)
+		: "memory"
+		);
+	} else {
+		asm volatile(
+		__SVE_PREAMBLE
+		"	pfalse	p0.b\n"
+		"	str	p0, [%[pffr]]\n"
+		"	ldr	p0, [%[pregs]]\n"
+		:
+		: [pregs] "r" (pregs),
+		  [pffr] "r" (pffr)
+		: "memory"
+		);
+	}
+}
+
+static inline void __sve_load_p(const struct sve_state *state, unsigned long vl, bool ffr)
+{
+	const void *pregs = (const void *)state + SVE_NUM_ZREGS * vl;
+	unsigned long pl = vl / 8;
+	const void *pffr = pregs + SVE_NUM_PREGS * pl;
+
+	if (ffr) {
+		instrument_read(pffr, pl);
+		asm volatile(
+		__SVE_PREAMBLE
+		"	ldr	p0, [%[pffr]]\n"
+		"	wrffr	p0.b\n"
+		:
+		: [pffr] "r" (pffr)
+		: "memory"
+		);
+	}
+
+	instrument_read(pregs, SVE_NUM_PREGS * pl);
+	asm volatile(
+	__SVE_PREAMBLE
+	FOR_EACH_P_REG("n", "ldr	p\\n, [%[pregs], #\\n, MUL VL]\n")
+	:
+	: [pregs] "r" (pregs)
+	: "memory"
+	);
+}
+
+static inline void sve_save_state(struct sve_state *state, bool ffr)
+{
+	unsigned long vl = sve_get_vl();
+	__sve_save_z(state, vl);
+	__sve_save_p(state, vl, ffr);
+}
+
+static inline void sve_load_state(const struct sve_state *state, bool ffr)
+{
+	unsigned long vl = sve_get_vl();
+	__sve_load_z(state, vl);
+	__sve_load_p(state, vl, ffr);
+}
+
 extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
 extern void sme_save_state(struct sme_state *state, int zt);
 extern void sme_load_state(const struct sme_state *state, int zt);
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index 08f4863e67715..ebf8b47313e90 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -42,36 +42,6 @@
 
 /* Deprecated macros for SVE instructions */
 
-/* STR (vector): STR Z\nz, [X\nxbase, #\offset, MUL VL] */
-.macro _sve_str_v nz, nxbase, offset=0
-	.arch_extension sve
-	str	z\nz, [X\nxbase, #\offset, MUL VL]
-.endm
-
-/* LDR (vector): LDR Z\nz, [X\nxbase, #\offset, MUL VL] */
-.macro _sve_ldr_v nz, nxbase, offset=0
-	.arch_extension sve
-	ldr	z\nz, [X\nxbase, #\offset, MUL VL]
-.endm
-
-/* STR (predicate): STR P\np, [X\nxbase, #\offset, MUL VL] */
-.macro _sve_str_p np, nxbase, offset=0
-	.arch_extension sve
-	str	p\np, [X\nxbase, #\offset, MUL VL]
-.endm
-
-/* LDR (predicate): LDR P\np, [X\nxbase, #\offset, MUL VL] */
-.macro _sve_ldr_p np, nxbase, offset=0
-	.arch_extension sve
-	ldr p\np, [x\nxbase, #\offset, MUL VL]
-.endm
-
-/* RDFFR (unpredicated): RDFFR P\np.B */
-.macro _sve_rdffr np
-	.arch_extension sve
-	rdffr p\np\().b
-.endm
-
 /* WRFFR P\np.B */
 .macro _sve_wrffr np
 	wrffr p\np\().b
@@ -176,37 +146,6 @@
 		_sve_wrffr	0
 .endm
 
-.macro _sve_pffr ptr
-	.arch_extension sve
-	addvl	\ptr, \ptr, #16
-	addvl	\ptr, \ptr, #16
-	addpl	\ptr, \ptr, #16
-.endm
-
-.macro sve_save nxbase, save_ffr
-		_sve_pffr	x\nxbase
- _for n, 0, 31,	_sve_str_v	\n, \nxbase, \n - 34
- _for n, 0, 15,	_sve_str_p	\n, \nxbase, \n - 16
-		cbz		\save_ffr, 921f
-		_sve_rdffr	0
-		b		922f
-921:
-		_sve_pfalse	0			// Zero out FFR
-922:
-		_sve_str_p	0, \nxbase
-		_sve_ldr_p	0, \nxbase, -16
-.endm
-
-.macro sve_load nxbase, restore_ffr
-		_sve_pffr	x\nxbase
- _for n, 0, 31,	_sve_ldr_v	\n, \nxbase, \n - 34
-		cbz		\restore_ffr, 921f
-		_sve_ldr_p	0, \nxbase
-		_sve_wrffr	0
-921:
- _for n, 0, 15,	_sve_ldr_p	\n, \nxbase, \n - 16
-.endm
-
 .macro sme_save_za nxbase, xvl, nw
 	mov	w\nw, #0
 
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 38356eee592ad..ad19de1d0654f 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -121,9 +121,6 @@ void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu);
 void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu);
 #endif
 
-void __sve_save_state(struct sve_state *sve, int save_ffr);
-void __sve_restore_state(struct sve_state *sve, int restore_ffr);
-
 u64 __guest_enter(struct kvm_vcpu *vcpu);
 
 bool kvm_host_psci_handler(struct kvm_cpu_context *host_ctxt, u32 func_id);
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 4fa00c94f28b7..0575d90e6dffb 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -13,28 +13,6 @@
 
 #ifdef CONFIG_ARM64_SVE
 
-/*
- * Save the SVE state
- *
- * x0 - pointer to buffer for state
- * x1 - Save FFR if non-zero
- */
-SYM_FUNC_START(sve_save_state)
-	sve_save 0, x1
-	ret
-SYM_FUNC_END(sve_save_state)
-
-/*
- * Load the SVE state
- *
- * x0 - pointer to buffer for state
- * x1 - Restore FFR if non-zero
- */
-SYM_FUNC_START(sve_load_state)
-	sve_load 0, x1
-	ret
-SYM_FUNC_END(sve_load_state)
-
 /*
  * Zero all SVE registers but the first 128-bits of each vector
  *
diff --git a/arch/arm64/kvm/hyp/fpsimd.S b/arch/arm64/kvm/hyp/fpsimd.S
deleted file mode 100644
index beacec33b2541..0000000000000
--- a/arch/arm64/kvm/hyp/fpsimd.S
+++ /dev/null
@@ -1,21 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * Copyright (C) 2015 - ARM Ltd
- * Author: Marc Zyngier <marc.zyngier@arm.com>
- */
-
-#include <linux/linkage.h>
-
-#include <asm/fpsimdmacros.h>
-
-	.text
-
-SYM_FUNC_START(__sve_restore_state)
-	sve_load 0, x1
-	ret
-SYM_FUNC_END(__sve_restore_state)
-
-SYM_FUNC_START(__sve_save_state)
-	sve_save 0, x1
-	ret
-SYM_FUNC_END(__sve_save_state)
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 72e658255cda7..41c60c9eea423 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -467,7 +467,7 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
 	 * vCPU. Start off with the max VL so we can load the SVE state.
 	 */
 	sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
-	__sve_restore_state(kern_hyp_va(vcpu->arch.sve_state), true);
+	sve_load_state(kern_hyp_va(vcpu->arch.sve_state), true);
 	fpsimd_load_common(&vcpu->arch.ctxt.fp_regs);
 
 	/*
@@ -488,7 +488,7 @@ static inline void __hyp_sve_save_host(void)
 
 	ctxt_sys_reg(hctxt, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);
 	write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
-	__sve_save_state(sve_regs, true);
+	sve_save_state(sve_regs, true);
 	fpsimd_save_common(&hctxt->fp_regs);
 }
 
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 62cdfbff75625..f57450ebcb498 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -26,7 +26,7 @@ hyp-obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o
 	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o page_alloc.o \
 	 cache.o setup.o mm.o mem_protect.o sys_regs.o pkvm.o stacktrace.o ffa.o
 hyp-obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
-	 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o ../vgic-v5-sr.o
+	 ../hyp-entry.o ../exception.o ../pgtable.o ../vgic-v5-sr.o
 hyp-obj-y += ../../../kernel/smccc-call.o
 hyp-obj-$(CONFIG_LIST_HARDENED) += list_debug.o
 hyp-obj-$(CONFIG_NVHE_EL2_TRACING) += clock.o trace.o events.o
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 72d025b2178a7..5c43943f24380 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -35,7 +35,7 @@ static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu)
 	 * on the VL, so use a consistent (i.e., the maximum) guest VL.
 	 */
 	sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
-	__sve_save_state(kern_hyp_va(vcpu->arch.sve_state), true);
+	sve_save_state(kern_hyp_va(vcpu->arch.sve_state), true);
 	fpsimd_save_common(&vcpu->arch.ctxt.fp_regs);
 	write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
 }
@@ -55,7 +55,7 @@ static void __hyp_sve_restore_host(void)
 	 * need to be revisited.
 	 */
 	write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
-	__sve_restore_state(sve_regs, true);
+	sve_load_state(sve_regs, true);
 	fpsimd_load_common(&hctxt->fp_regs);
 	write_sysreg_el1(ctxt_sys_reg(hctxt, ZCR_EL1), SYS_ZCR);
 }
diff --git a/arch/arm64/kvm/hyp/vhe/Makefile b/arch/arm64/kvm/hyp/vhe/Makefile
index 9695328bbd96e..d6b3475145c0e 100644
--- a/arch/arm64/kvm/hyp/vhe/Makefile
+++ b/arch/arm64/kvm/hyp/vhe/Makefile
@@ -10,4 +10,4 @@ CFLAGS_switch.o += -Wno-override-init
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
-	 ../fpsimd.o ../hyp-entry.o ../exception.o ../vgic-v5-sr.o
+	 ../hyp-entry.o ../exception.o ../vgic-v5-sr.o
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 16/18] arm64: fpsimd: Move sve_flush_live() inline
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
                   ` (14 preceding siblings ...)
  2026-05-21 13:25 ` [PATCH 15/18] arm64: fpsimd: Move SVE save/restore inline Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-21 13:25 ` [PATCH 17/18] arm64: fpsimd: Move SME save/restore inline Mark Rutland
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

Currently sve_flush_live() is written in out-of-line assembly. It would
be nice if we could move it inline such that control flow can be written
more clearly in C, and to permit the removal of otherwise unused
assembly macros.

The 'flush_ffr' argument is redundant as sve_flush_live() is always
called from non-streaming mode, and all callers pass 'true'. Remove the
argument and make it a requirement that the function is called from
non-streaming mode.

The 'vq_minus_1' argument is unnecessary, as sve_flush_live() can read
the live VL directly using the RDVL instruction (wrapped by the
sve_get_vl() helper function).

Move the function to C, with the simplifications above.
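
For illustration only (not part of the patch), the old and new conditions for
flushing the Z registers should agree for any valid VL. The function names
below are hypothetical, and SVE_VQ_BYTES is defined locally for the sketch:

```c
#include <assert.h>
#include <stdbool.h>

#define SVE_VQ_BYTES	16	/* one quadword; defined locally for this sketch */

static_assert(SVE_VQ_BYTES * 8 == 128, "a quadword is 128 bits");

/* Old-style check: based on the (VQ - 1) value passed in by the caller. */
static inline bool z_flush_needed_from_vq(unsigned long vq_minus_1)
{
	return vq_minus_1 != 0;
}

/* New-style check: based on the live VL in bytes, as read via RDVL. */
static inline bool z_flush_needed_from_vl(unsigned long vl)
{
	return vl > SVE_VQ_BYTES;
}
```

A VL of 16 bytes means each Z register is exactly 128 bits, so there is no
state beyond the FPSIMD view to zero.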

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/fpsimd.h       | 26 +++++++++++++++++++++++-
 arch/arm64/include/asm/fpsimdmacros.h | 29 ---------------------------
 arch/arm64/kernel/entry-common.c      |  8 ++------
 arch/arm64/kernel/entry-fpsimd.S      | 22 --------------------
 arch/arm64/kernel/fpsimd.c            |  2 +-
 5 files changed, 28 insertions(+), 59 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index d005324bbcf3e..550987b36206a 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -332,7 +332,31 @@ static inline void sve_load_state(const struct sve_state *state, bool ffr)
 	__sve_load_p(state, vl, ffr);
 }
 
-extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
+
+/*
+ * Zero all SVE registers except for the first 128 bits of each vector.
+ *
+ * The caller must ensure that the VL has been configured and the CPU must be
+ * in non-streaming mode.
+ */
+static inline void sve_flush_live(void)
+{
+	unsigned long vl = sve_get_vl();
+
+	if (vl > sizeof(__uint128_t)) {
+		asm volatile(
+		__FPSIMD_PREAMBLE
+		FOR_EACH_Z_REG("n", "mov	v\\n\\().16b, v\\n\\().16b")
+		);
+	}
+
+	asm volatile(
+	__SVE_PREAMBLE
+	FOR_EACH_P_REG("n", "pfalse	p\\n\\().b")
+	"	wrffr	p0.b\n"
+	);
+}
+
 extern void sme_save_state(struct sme_state *state, int zt);
 extern void sme_load_state(const struct sme_state *state, int zt);
 
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index ebf8b47313e90..9e352b5c6b764 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -40,19 +40,6 @@
 	.endif
 .endm
 
-/* Deprecated macros for SVE instructions */
-
-/* WRFFR P\np.B */
-.macro _sve_wrffr np
-	wrffr p\np\().b
-.endm
-
-/* PFALSE P\np.B */
-.macro _sve_pfalse np
-	.arch_extension sve
-	pfalse	p\np\().b
-.endm
-
 /* Deprecated macros for SME instructions */
 
 /* RDSVL X\nx, #\imm */
@@ -130,22 +117,6 @@
 	.purgem _for__body
 .endm
 
-/* Preserve the first 128-bits of Znz and zero the rest. */
-.macro _sve_flush_z nz
-	_sve_check_zreg \nz
-	mov	v\nz\().16b, v\nz\().16b
-.endm
-
-.macro sve_flush_z
- _for n, 0, 31, _sve_flush_z	\n
-.endm
-.macro sve_flush_p
- _for n, 0, 15, _sve_pfalse	\n
-.endm
-.macro sve_flush_ffr
-		_sve_wrffr	0
-.endm
-
 .macro sme_save_za nxbase, xvl, nw
 	mov	w\nw, #0
 
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index cb54335465f66..2352297330e12 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -237,12 +237,8 @@ static inline void fpsimd_syscall_enter(void)
 	if (!system_supports_sve())
 		return;
 
-	if (test_thread_flag(TIF_SVE)) {
-		unsigned int sve_vq_minus_one;
-
-		sve_vq_minus_one = sve_vq_from_vl(task_get_sve_vl(current)) - 1;
-		sve_flush_live(true, sve_vq_minus_one);
-	}
+	if (test_thread_flag(TIF_SVE))
+		sve_flush_live();
 
 	/*
 	 * Any live non-FPSIMD SVE state has been zeroed. Allow
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 0575d90e6dffb..bff941eea9566 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -11,28 +11,6 @@
 #include <asm/assembler.h>
 #include <asm/fpsimdmacros.h>
 
-#ifdef CONFIG_ARM64_SVE
-
-/*
- * Zero all SVE registers but the first 128-bits of each vector
- *
- * VQ must already be configured by caller, any further updates of VQ
- * will need to ensure that the register state remains valid.
- *
- * x0 = include FFR?
- * x1 = VQ - 1
- */
-SYM_FUNC_START(sve_flush_live)
-	cbz		x1, 1f	// A VQ-1 of 0 is 128 bits so no extra Z state
-	sve_flush_z
-1:	sve_flush_p
-	tbz		x0, #0, 2f
-	sve_flush_ffr
-2:	ret
-SYM_FUNC_END(sve_flush_live)
-
-#endif /* CONFIG_ARM64_SVE */
-
 #ifdef CONFIG_ARM64_SME
 
 /*
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index f9b3eeacf130d..42177b439b3c7 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1338,7 +1338,7 @@ void do_sve_acc(unsigned long esr, struct pt_regs *regs)
 	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
 		unsigned long vq = sve_vq_from_vl(task_get_sve_vl(current));
 		sysreg_clear_set_s(SYS_ZCR_EL1, ZCR_ELx_LEN, vq - 1);
-		sve_flush_live(true, vq - 1);
+		sve_flush_live();
 		fpsimd_bind_task_to_cpu();
 	} else {
 		fpsimd_to_sve(current);
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 17/18] arm64: fpsimd: Move SME save/restore inline
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
                   ` (15 preceding siblings ...)
  2026-05-21 13:25 ` [PATCH 16/18] arm64: fpsimd: Move sve_flush_live() inline Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-26 14:08   ` Mark Rutland
  2026-05-21 13:25 ` [PATCH 18/18] arm64: fpsimd: Remove <asm/fpsimdmacros.h> Mark Rutland
  2026-05-27  8:07 ` [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Marc Zyngier
  18 siblings, 1 reply; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

Currently the SME register save/restore sequences are written in
out-of-line assembly routines. While this works, it's somewhat painful:

* For KVM to use the sequences, portions of the logic will need to be
  duplicated in KVM hyp code. While the common logic can be shared in
  assembly macros, this is very likely to lead to unnecessary divergence
  and be a maintenance burden.

* For historical reasons, the assembly macros take some register
  arguments as numerical indices (e.g. "sme_save_za 0, x2, 12" uses x0, x2,
  and x12), which is simply confusing.

* Address generation and control flow are far clearer in C than in
  assembly.

* The assembly sequences can't be instrumented, and so it's harder than
  necessary to catch memory safety issues.

To handle the above, move the SME register save/restore sequences
to inline assembly.
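
As a rough sketch of the address generation involved (not part of the patch),
the in-memory layout implied by the diff places ZA array vector v at
"v * svl" bytes from the base, with ZT0 following the whole ZA array. The
helper names below are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of ZA array vector v from the base. */
static inline size_t sme_za_row_offset(size_t svl, size_t v)
{
	assert(v < svl);	/* ZA has svl vectors of svl bytes each */
	return v * svl;
}

/* Hypothetical helper: byte offset of ZT0, which follows the ZA array. */
static inline size_t sme_zt0_offset(size_t svl)
{
	return svl * svl;
}
```

Here svl is the streaming vector length in bytes, as returned by
sme_get_vl() in the diff.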

Neither GCC nor LLVM instrument memory arguments to inline assembly, so
explicit instrumentation is added in the same manner as other assembly
routines. This instrumentation is implicitly disabled by Kbuild for nVHE
hyp code.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/fpsimd.h       | 100 +++++++++++++++++++++++++-
 arch/arm64/include/asm/fpsimdmacros.h |  76 --------------------
 arch/arm64/kernel/Makefile            |   2 +-
 arch/arm64/kernel/entry-fpsimd.S      |  48 -------------
 4 files changed, 98 insertions(+), 128 deletions(-)
 delete mode 100644 arch/arm64/kernel/entry-fpsimd.S

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 550987b36206a..12f222f64b8d5 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -357,9 +357,6 @@ static inline void sve_flush_live(void)
 	);
 }
 
-extern void sme_save_state(struct sme_state *state, int zt);
-extern void sme_load_state(const struct sme_state *state, int zt);
-
 struct arm64_cpu_capabilities;
 extern void cpu_enable_fpsimd(const struct arm64_cpu_capabilities *__unused);
 extern void cpu_enable_sve(const struct arm64_cpu_capabilities *__unused);
@@ -639,6 +636,100 @@ static inline size_t __sme_state_size(unsigned int sme_vl)
 	return size;
 }
 
+static inline void __sme_save_za(struct sme_state *state, unsigned long svl)
+{
+	/* The <Wv> argument to STR (array vector) can only encode W12-W15 */
+	register unsigned long v asm ("12");
+
+	instrument_write(state, svl * svl);
+	for (v = 0; v < svl; v++) {
+		void *pav = (void *)state + v * svl;
+
+		asm volatile(
+		__SME_PREAMBLE
+		"	str	za[%w[v], #0], [%[pav]]\n"
+		:
+		: [v] "r" (v),
+		  [pav] "r" (pav)
+		: "memory"
+		);
+	}
+}
+
+static inline void __sme_load_za(const struct sme_state *state, unsigned long svl)
+{
+	/* The <Wv> argument to LDR (array vector) can only encode W12-W15 */
+	register unsigned long v asm ("12");
+
+	instrument_read(state, svl * svl);
+	for (v = 0; v < svl; v++) {
+		const void *pav = (const void *)state + v * svl;
+
+		asm volatile(
+		__SME_PREAMBLE
+		"	ldr	za[%w[v], #0], [%[pav]]\n"
+		:
+		: [v] "r" (v),
+		  [pav] "r" (pav)
+		: "memory"
+		);
+	}
+}
+
+static inline void __sme_save_zt(struct sme_state *state, unsigned long svl)
+{
+	void *pzt = (void *)state + svl * svl;
+
+	instrument_write(pzt, svl);
+	asm volatile(
+	__DEFINE_ASM_GPR_NUMS
+	/*
+	 * STR ZT0, [<Xn|SP>]
+	 * Supported by binutils 2.41+.
+	 * Supported by LLVM 16+
+	 */
+	"	.inst	0xe13f8000 | ((.L__gpr_num_%[pzt]) << 5)\n"
+	:
+	: [pzt] "r" (pzt)
+	: "memory");
+}
+
+static inline void __sme_load_zt(const struct sme_state *state, unsigned long svl)
+{
+	void *pzt = (void *)state + svl * svl;
+
+	instrument_read(pzt, svl);
+	asm volatile(
+	__DEFINE_ASM_GPR_NUMS
+	/*
+	 * LDR ZT0, [<Xn|SP>]
+	 * Supported by binutils 2.41+.
+	 * Supported by LLVM 16+
+	 */
+	"	.inst	0xe11f8000 | ((.L__gpr_num_%[pzt]) << 5)\n"
+	:
+	: [pzt] "r" (pzt)
+	: "memory");
+}
+
+static inline void sme_save_state(struct sme_state *state, bool zt)
+{
+	unsigned long svl = sme_get_vl();
+
+	__sme_save_za(state, svl);
+	if (zt)
+		__sme_save_zt(state, svl);
+}
+
+static inline void sme_load_state(struct sme_state *state, bool zt)
+{
+	unsigned long svl = sme_get_vl();
+
+	__sme_load_za(state, svl);
+	if (zt)
+		__sme_load_zt(state, svl);
+}
+
 /*
  * Return how many bytes of memory are required to store the full SME
  * specific state for task, given task's currently configured vector
@@ -695,6 +786,9 @@ static inline size_t sme_state_size(struct task_struct const *task)
 	return 0;
 }
 
+static inline void sme_save_state(struct sme_state *state, bool zt) { BUILD_BUG(); }
+static inline void sme_load_state(const struct sme_state *state, bool zt) { BUILD_BUG(); }
+
 static inline void sme_enter_from_user_mode(void) { }
 static inline void sme_exit_to_user_mode(void) { }
 
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index 9e352b5c6b764..a763fd03ffef3 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -40,60 +40,6 @@
 	.endif
 .endm
 
-/* Deprecated macros for SME instructions */
-
-/* RDSVL X\nx, #\imm */
-.macro _sme_rdsvl nx, imm
-	.arch_extension sme
-	rdsvl x\nx, #\imm
-.endm
-
-/*
- * STR (vector from ZA array):
- *	STR ZA[W\nw, #\offset], [X\nxbase, #\offset, MUL VL]
- */
-.macro _sme_str_zav nw, nxbase, offset=0
-	.arch_extension sme
-	str	za[w\nw, #\offset], [x\nxbase, #\offset, MUL VL]
-.endm
-
-/*
- * LDR (vector to ZA array):
- *	LDR ZA[w\nw, #\offset], [X\nxbase, #\offset, MUL VL]
- */
-.macro _sme_ldr_zav nw, nxbase, offset=0
-	.arch_extension sme
-	ldr	za[w\nw, #\offset], [x\nxbase, #\offset, MUL VL]
-.endm
-
-/*
- * SME2 instruction encodings for older assemblers.
- * Supported by binutils 2.41+.
- * Supported by LLVM 16+
- */
-
-/*
- * LDR (ZT0)
- *
- *	LDR ZT0, nx
- */
-.macro _ldr_zt nx
-	_check_general_reg \nx
-	.inst	0xe11f8000	\
-		 | (\nx << 5)
-.endm
-
-/*
- * STR (ZT0)
- *
- *	STR ZT0, nx
- */
-.macro _str_zt nx
-	_check_general_reg \nx
-	.inst	0xe13f8000		\
-		| (\nx << 5)
-.endm
-
 .macro __for from:req, to:req
 	.if (\from) == (\to)
 		_for__body %\from
@@ -116,25 +62,3 @@
 
 	.purgem _for__body
 .endm
-
-.macro sme_save_za nxbase, xvl, nw
-	mov	w\nw, #0
-
-423:
-	_sme_str_zav \nw, \nxbase
-	add	x\nxbase, x\nxbase, \xvl
-	add	x\nw, x\nw, #1
-	cmp	\xvl, x\nw
-	bne	423b
-.endm
-
-.macro sme_load_za nxbase, xvl, nw
-	mov	w\nw, #0
-
-423:
-	_sme_ldr_zav \nw, \nxbase
-	add	x\nxbase, x\nxbase, \xvl
-	add	x\nw, x\nw, #1
-	cmp	\xvl, x\nw
-	bne	423b
-.endm
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 74b76bb704523..d2690c3ec5288 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -27,7 +27,7 @@ KCOV_INSTRUMENT_idle.o := n
 
 # Object file lists.
 obj-y			:= debug-monitors.o entry.o irq.o fpsimd.o		\
-			   entry-common.o entry-fpsimd.o process.o ptrace.o	\
+			   entry-common.o process.o ptrace.o			\
 			   setup.o signal.o sys.o stacktrace.o time.o traps.o	\
 			   io.o vdso.o hyp-stub.o psci.o cpu_ops.o		\
 			   return_address.o cpuinfo.o cpu_errata.o		\
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
deleted file mode 100644
index bff941eea9566..0000000000000
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ /dev/null
@@ -1,48 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * FP/SIMD state saving and restoring
- *
- * Copyright (C) 2012 ARM Ltd.
- * Author: Catalin Marinas <catalin.marinas@arm.com>
- */
-
-#include <linux/linkage.h>
-
-#include <asm/assembler.h>
-#include <asm/fpsimdmacros.h>
-
-#ifdef CONFIG_ARM64_SME
-
-/*
- * Save the ZA and ZT state
- *
- * x0 - pointer to buffer for state
- * x1 - number of ZT registers to save
- */
-SYM_FUNC_START(sme_save_state)
-	_sme_rdsvl	2, 1		// x2 = VL/8
-	sme_save_za 0, x2, 12		// Leaves x0 pointing to the end of ZA
-
-	cbz	x1, 1f
-	_str_zt 0
-1:
-	ret
-SYM_FUNC_END(sme_save_state)
-
-/*
- * Load the ZA and ZT state
- *
- * x0 - pointer to buffer for state
- * x1 - number of ZT registers to save
- */
-SYM_FUNC_START(sme_load_state)
-	_sme_rdsvl	2, 1		// x2 = VL/8
-	sme_load_za 0, x2, 12		// Leaves x0 pointing to the end of ZA
-
-	cbz	x1, 1f
-	_ldr_zt 0
-1:
-	ret
-SYM_FUNC_END(sme_load_state)
-
-#endif /* CONFIG_ARM64_SME */
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 18/18] arm64: fpsimd: Remove <asm/fpsimdmacros.h>
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
                   ` (16 preceding siblings ...)
  2026-05-21 13:25 ` [PATCH 17/18] arm64: fpsimd: Move SME save/restore inline Mark Rutland
@ 2026-05-21 13:25 ` Mark Rutland
  2026-05-27  8:07 ` [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Marc Zyngier
  18 siblings, 0 replies; 45+ messages in thread
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
	tabba, will

We no longer need any of the remaining macros in <asm/fpsimdmacros.h>.

Remove all of it.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/fpsimdmacros.h | 64 ---------------------------
 1 file changed, 64 deletions(-)
 delete mode 100644 arch/arm64/include/asm/fpsimdmacros.h

diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
deleted file mode 100644
index a763fd03ffef3..0000000000000
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ /dev/null
@@ -1,64 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * FP/SIMD state saving and restoring macros
- *
- * Copyright (C) 2012 ARM Ltd.
- * Author: Catalin Marinas <catalin.marinas@arm.com>
- */
-
-#include <asm/assembler.h>
-
-/* Sanity-check macros to help avoid encoding garbage instructions */
-
-.macro _check_general_reg nr
-	.if (\nr) < 0 || (\nr) > 30
-		.error "Bad register number \nr."
-	.endif
-.endm
-
-.macro _sve_check_zreg znr
-	.if (\znr) < 0 || (\znr) > 31
-		.error "Bad Scalable Vector Extension vector register number \znr."
-	.endif
-.endm
-
-.macro _sve_check_preg pnr
-	.if (\pnr) < 0 || (\pnr) > 15
-		.error "Bad Scalable Vector Extension predicate register number \pnr."
-	.endif
-.endm
-
-.macro _check_num n, min, max
-	.if (\n) < (\min) || (\n) > (\max)
-		.error "Number \n out of range [\min,\max]"
-	.endif
-.endm
-
-.macro _sme_check_wv v
-	.if (\v) < 12 || (\v) > 15
-		.error "Bad vector select register \v."
-	.endif
-.endm
-
-.macro __for from:req, to:req
-	.if (\from) == (\to)
-		_for__body %\from
-	.else
-		__for %\from, %((\from) + ((\to) - (\from)) / 2)
-		__for %((\from) + ((\to) - (\from)) / 2 + 1), %\to
-	.endif
-.endm
-
-.macro _for var:req, from:req, to:req, insn:vararg
-	.macro _for__body \var:req
-		.noaltmacro
-		\insn
-		.altmacro
-	.endm
-
-	.altmacro
-	__for \from, \to
-	.noaltmacro
-
-	.purgem _for__body
-.endm
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH 17/18] arm64: fpsimd: Move SME save/restore inline
  2026-05-21 13:25 ` [PATCH 17/18] arm64: fpsimd: Move SME save/restore inline Mark Rutland
@ 2026-05-26 14:08   ` Mark Rutland
  2026-05-26 14:39     ` Vladimir Murzin
  0 siblings, 1 reply; 45+ messages in thread
From: Mark Rutland @ 2026-05-26 14:08 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, maz, oupton, tabba, will

On Thu, May 21, 2026 at 02:25:55PM +0100, Mark Rutland wrote:
> +static inline void __sme_save_za(struct sme_state *state, unsigned long svl)
> +{
> +	/* The <Wv> argument to STR (array vector) can only encode W12-W15 */
> +	register unsigned long v asm ("12");

Sorry, I had meant to put "x12" here, but evidently GCC and LLVM accept
"12" on its own.

For clarity (e.g. to match the comment) I'll change that to "w12" and
make the type unsigned int. Likewise in __sme_load_za().

Mark.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 01/18] KVM: arm64: Don't include <asm/fpsimdmacros.h>
  2026-05-21 13:25 ` [PATCH 01/18] KVM: arm64: Don't include <asm/fpsimdmacros.h> Mark Rutland
@ 2026-05-26 14:18   ` Mark Brown
  2026-05-27 10:10   ` Vladimir Murzin
  1 sibling, 0 replies; 45+ messages in thread
From: Mark Brown @ 2026-05-26 14:18 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kvmarm, catalin.marinas, james.morse, maz,
	oupton, tabba, will

[-- Attachment #1: Type: text/plain, Size: 338 bytes --]

On Thu, May 21, 2026 at 02:25:39PM +0100, Mark Rutland wrote:
> There's no need for hyp/entry.S to include <asm/fpsimdmacros.h>.
> 
> The fpsimd macros have never been used by code in hyp/entry.S, and were
> instead used by code in hyp/fpsimd.S.
> 
> Remove the unnecessary include.

Reviewed-by: Mark Brown <broonie@kernel.org>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 02/18] KVM: arm64: Don't override FFR save/restore argument
  2026-05-21 13:25 ` [PATCH 02/18] KVM: arm64: Don't override FFR save/restore argument Mark Rutland
@ 2026-05-26 14:27   ` Mark Brown
  2026-05-27 10:16   ` Vladimir Murzin
  1 sibling, 0 replies; 45+ messages in thread
From: Mark Brown @ 2026-05-26 14:27 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kvmarm, catalin.marinas, james.morse, maz,
	oupton, tabba, will

[-- Attachment #1: Type: text/plain, Size: 428 bytes --]

On Thu, May 21, 2026 at 02:25:40PM +0100, Mark Rutland wrote:
> The __sve_save_state() and __sve_restore_state() functions take a
> parameter describing whether to save/restore the FFR, but both functions
> silently override this with '1'. This has always been benign (and
> callers have all passed 'true' since the parameter was introduced), but
> clearly this is not intentional.

Reviewed-by: Mark Brown <broonie@kernel.org>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 17/18] arm64: fpsimd: Move SME save/restore inline
  2026-05-26 14:08   ` Mark Rutland
@ 2026-05-26 14:39     ` Vladimir Murzin
  2026-05-26 15:28       ` Mark Rutland
  0 siblings, 1 reply; 45+ messages in thread
From: Vladimir Murzin @ 2026-05-26 14:39 UTC (permalink / raw)
  To: Mark Rutland, linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, maz, oupton, tabba, will

Hi Mark,

On 5/26/26 15:08, Mark Rutland wrote:
> On Thu, May 21, 2026 at 02:25:55PM +0100, Mark Rutland wrote:
>> +static inline void __sme_save_za(struct sme_state *state, unsigned long svl)
>> +{
>> +	/* The <Wv> argument to STR (array vector) can only encode W12-W15 */
>> +	register unsigned long v asm ("12");
> Sorry, I had meant to put "x12" here, but evidently GCC and LLVM accept
> "12" on its own.
> 
> For clarity (e.g. to match the comment) I'll change that to "w12" and
> make the type unsigned int. Likewise in __sme_load_za().
> 

I suspect you are intentionally not using the "Ucj" constraint to limit the register allocator;
if so, I'm wondering why?

Cheers
Vladimir


> Mark.
> 



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 17/18] arm64: fpsimd: Move SME save/restore inline
  2026-05-26 14:39     ` Vladimir Murzin
@ 2026-05-26 15:28       ` Mark Rutland
  2026-05-26 16:38         ` Mark Rutland
  0 siblings, 1 reply; 45+ messages in thread
From: Mark Rutland @ 2026-05-26 15:28 UTC (permalink / raw)
  To: Vladimir Murzin
  Cc: linux-arm-kernel, kvmarm, broonie, catalin.marinas, james.morse,
	maz, oupton, tabba, will

On Tue, May 26, 2026 at 03:39:56PM +0100, Vladimir Murzin wrote:
> Hi Mark,
> 
> On 5/26/26 15:08, Mark Rutland wrote:
> > On Thu, May 21, 2026 at 02:25:55PM +0100, Mark Rutland wrote:
> >> +static inline void __sme_save_za(struct sme_state *state, unsigned long svl)
> >> +{
> >> +	/* The <Wv> argument to STR (array vector) can only encode W12-W15 */
> >> +	register unsigned long v asm ("12");
> > Sorry, I had meant to put "x12" here, but evidently GCC and LLVM accept
> > "12" on its own.
> > 
> > For clarity (e.g. to match the comment) I'll change that to "w12" and
> > make the type unsigned int. Likewise in __sme_load_za().
> 
> I suspect you are intentionally not using the "Ucj" constraint to limit the register allocator;
> if so, I'm wondering why?

Thanks for the suggestion; that was ignorance rather than intent.

I was not aware of "Ucj" as it doesn't appear on the public GCC
documentation:

  https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html

Looking at the machine description file, that's marked with '@internal',
so IIUC GCC folk don't seem to expect/want people to use it. That said,
LLVM seems to support it.

I'll go check that all relevant toolchains support this, and poke GCC
folk to see if they're happy to promote that to a public constraint.

If that's all good, I'll move over to "Ucj". If not, I'll update the
commit message and/or comments to explain why.

Mark.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 05/18] arm64: fpsimd: Fold sve_init_regs() into do_sve_acc()
  2026-05-21 13:25 ` [PATCH 05/18] arm64: fpsimd: Fold sve_init_regs() into do_sve_acc() Mark Rutland
@ 2026-05-26 15:28   ` Mark Brown
  2026-05-27 12:05   ` Vladimir Murzin
  1 sibling, 0 replies; 45+ messages in thread
From: Mark Brown @ 2026-05-26 15:28 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kvmarm, catalin.marinas, james.morse, maz,
	oupton, tabba, will

On Thu, May 21, 2026 at 02:25:43PM +0100, Mark Rutland wrote:
> For historical reasons, do_sve_acc() is structurally different from
> do_sme_acc(), and the logic to convert the task from FPSIMD to SVE is
> out-of-line in sve_init_regs(). We only use sve_init_regs() within
> do_sve_acc(), so it's not necessary for this to be a separate function.

Reviewed-by: Mark Brown <broonie@kernel.org>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 06/18] arm64: fpsimd: Remove sve_set_vq() and sme_set_vq()
  2026-05-21 13:25 ` [PATCH 06/18] arm64: fpsimd: Remove sve_set_vq() and sme_set_vq() Mark Rutland
@ 2026-05-26 15:42   ` Mark Brown
  0 siblings, 0 replies; 45+ messages in thread
From: Mark Brown @ 2026-05-26 15:42 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kvmarm, catalin.marinas, james.morse, maz,
	oupton, tabba, will

On Thu, May 21, 2026 at 02:25:44PM +0100, Mark Rutland wrote:
> The sve_set_vq() and sme_set_vq() assembly functions (and the
> sve_load_vq and sme_load_vq macros they use) are open-coded forms of
> sysreg_clear_set*(). There's no need for these to be implemented
> out-of-line in assembly, and the 'vq_minus_1' argument is unusual and
> confusing.

Reviewed-by: Mark Brown <broonie@kernel.org>
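
For illustration, a minimal sketch of the clear/set pattern the quoted
text refers to, written as plain C over a register value rather than a
real system register access. The helper names clear_set() and
zcr_with_vq() are hypothetical and are not taken from the series; the
only architectural assumption is that the LEN field of ZCR_ELx and
SMCR_ELx occupies bits [3:0] and holds the vector length in quadwords
minus one.

```c
#include <stdint.h>

/* ZCR_ELx.LEN / SMCR_ELx.LEN live in bits [3:0] of the register. */
#define VEC_LEN_MASK	UINT64_C(0xf)

/*
 * Hypothetical stand-in for a clear/set helper: clear the bits in
 * 'clear', then set the bits in 'set', leaving all other bits alone.
 */
static inline uint64_t clear_set(uint64_t old, uint64_t clear, uint64_t set)
{
	return (old & ~clear) | set;
}

/*
 * Compute the new control register value for a vector length of 'vq'
 * quadwords. Callers pass 'vq' itself; the "minus one" encoding is
 * handled here rather than leaking into the calling convention.
 */
static inline uint64_t zcr_with_vq(uint64_t zcr, unsigned int vq)
{
	return clear_set(zcr, VEC_LEN_MASK, (uint64_t)(vq - 1) & VEC_LEN_MASK);
}
```

Keeping the "- 1" inside the helper is one way to avoid the unusual
'vq_minus_1' argument mentioned above; whether the series does exactly
this is not shown in this part of the thread.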

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 07/18] arm64: fpsimd: Use assembler for SVE instructions
  2026-05-21 13:25 ` [PATCH 07/18] arm64: fpsimd: Use assembler for SVE instructions Mark Rutland
@ 2026-05-26 15:43   ` Mark Brown
  0 siblings, 0 replies; 45+ messages in thread
From: Mark Brown @ 2026-05-26 15:43 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kvmarm, catalin.marinas, james.morse, maz,
	oupton, tabba, will

On Thu, May 21, 2026 at 02:25:45PM +0100, Mark Rutland wrote:
> Historically we supported assemblers which could not assemble SVE
> instructions. We dropped support for such assemblers in commit:
> 
>   118c40b7b503 ("kbuild: require gcc-8 and binutils-2.30")
> 
> Since that commit, all supported assemblers (binutils and LLVM) are
> capable of assembling SVE instructions, and there's no need for us to
> manually encode SVE instructions.

Oh, finally.  I hadn't checked in a while:

Reviewed-by: Mark Brown <broonie@kernel.org>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 08/18] arm64: fpsimd: Use assembler for baseline SME instructions
  2026-05-21 13:25 ` [PATCH 08/18] arm64: fpsimd: Use assembler for baseline SME instructions Mark Rutland
@ 2026-05-26 15:45   ` Mark Brown
  0 siblings, 0 replies; 45+ messages in thread
From: Mark Brown @ 2026-05-26 15:45 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kvmarm, catalin.marinas, james.morse, maz,
	oupton, tabba, will

On Thu, May 21, 2026 at 02:25:46PM +0100, Mark Rutland wrote:
> We currently support assemblers which do not support SME instructions,
> and have macros to manually encode SME instructions. This was
> necessary historically as SME support was developed before assembler
> support was widely available, but things have changed:

Reviewed-by: Mark Brown <broonie@kernel.org>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 09/18] arm64: fpsimd: Move sve_get_vl() and sme_get_vl() inline
  2026-05-21 13:25 ` [PATCH 09/18] arm64: fpsimd: Move sve_get_vl() and sme_get_vl() inline Mark Rutland
@ 2026-05-26 15:47   ` Mark Brown
  0 siblings, 0 replies; 45+ messages in thread
From: Mark Brown @ 2026-05-26 15:47 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kvmarm, catalin.marinas, james.morse, maz,
	oupton, tabba, will

On Thu, May 21, 2026 at 02:25:47PM +0100, Mark Rutland wrote:
> The sve_get_vl() and sme_get_vl() functions are wrappers for the RDVL
> and RDSVL instructions respectively. There's no need for those to be
> out-of-line.

Reviewed-by: Mark Brown <broonie@kernel.org>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 10/18] arm64: sysreg: Add FPCR and FPSR
  2026-05-21 13:25 ` [PATCH 10/18] arm64: sysreg: Add FPCR and FPSR Mark Rutland
@ 2026-05-26 15:55   ` Mark Brown
  2026-05-26 16:51     ` Mark Rutland
  0 siblings, 1 reply; 45+ messages in thread
From: Mark Brown @ 2026-05-26 15:55 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kvmarm, catalin.marinas, james.morse, maz,
	oupton, tabba, will

On Thu, May 21, 2026 at 02:25:48PM +0100, Mark Rutland wrote:
> Add sysreg definitions for FPCR and FPSR.

> +Sysreg	FPCR	3	3	4	4	0

This looks good.

> +Sysreg	FPSR	3	3	4	4	1
> +Res0	63:32
> +Field	31	N
> +Field	30	Q

DDI0487 M.b and DDI0601 2026-03 both call this field Z.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 11/18] arm64: fpsimd: Split FPSR/FPCR from SVE save/restore
  2026-05-21 13:25 ` [PATCH 11/18] arm64: fpsimd: Split FPSR/FPCR from SVE save/restore Mark Rutland
@ 2026-05-26 16:28   ` Mark Brown
  0 siblings, 0 replies; 45+ messages in thread
From: Mark Brown @ 2026-05-26 16:28 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kvmarm, catalin.marinas, james.morse, maz,
	oupton, tabba, will

On Thu, May 21, 2026 at 02:25:49PM +0100, Mark Rutland wrote:
> Regardless of whether the vector registers are saved in FPSIMD or SVE
> format, we store FPSR and FPCR in user_fpsimd_state::{fpsr,fpcr}.

...

> Note that the SVE assembly sequence for restoring FPCR uses an
> unconditional write to FPCR. The plain FPSIMD assembly sequence has used
> a conditional write to FPCR since 2014 in commit:

>   5959e25729a5 ("arm64: fpsimd: avoid restoring fpcr if the contents haven't change")

> ... but this was not followed for the SVE restore assembly implemented
> in 2017 in commit:

>   1fc5dce78ad1 ("arm64/sve: Low-level SVE architectural state manipulation functions")

> ... so I've assumed that this doesn't actually matter in practice, and
> implemented the C version matching the existing SVE assembly.

> For the moment, fpsimd_save_state() and fpsimd_load_state() are left
> as-is with their own logic to save/restore FPSR and FPCR. This will be
> unified in subsequent patches.

There is a possibility that it only matters for older, FPSIMD only CPUs
or just that nobody got round to benchmarking this on physical CPUs with
SVE and in fact a similar optimisation is also useful there.  I'm a bit
wary of dropping the optimisation without any verification of the
performance impact, but equally I'm not aware of a specific benchmark
that showed the impact or even if there was one in the first place.  The
changelog sounds like the optimisation might've been written based on
inspection alone, I don't know if anyone will remember more than a
decade later.

Having said all that given that a conditional update is simple to
implement in C it seems safer to add one in the SVE path than to drop
it from the FPSIMD path.
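
As a rough sketch of what a conditional update could look like in C
(not the code from the series): the accessor names below are
hypothetical, the write counter exists only to make the difference
observable, and on non-AArch64 hosts a plain variable stands in for the
register so the logic can still be exercised.

```c
#include <stdint.h>

#if defined(__aarch64__)
/* Real accessors: FPCR can be read and written directly with MRS/MSR. */
static inline uint64_t fpcr_read(void)
{
	uint64_t val;

	__asm__ volatile("mrs %0, fpcr" : "=r" (val));
	return val;
}

static inline void fpcr_write_raw(uint64_t val)
{
	__asm__ volatile("msr fpcr, %0" : : "r" (val));
}
#else
/* Software stand-in so the logic can be exercised on other hosts. */
static uint64_t fake_fpcr;

static inline uint64_t fpcr_read(void)
{
	return fake_fpcr;
}

static inline void fpcr_write_raw(uint64_t val)
{
	fake_fpcr = val;
}
#endif

/* Counts register writes, purely so the two variants can be compared. */
static unsigned long fpcr_write_count;

static inline void fpcr_write(uint64_t val)
{
	fpcr_write_count++;
	fpcr_write_raw(val);
}

/* Matches the existing SVE sequence: always write the saved value. */
static inline void fpcr_restore_unconditional(uint64_t val)
{
	fpcr_write(val);
}

/* Matches the existing FPSIMD sequence: skip the write if unchanged. */
static inline void fpcr_restore_conditional(uint64_t val)
{
	if (fpcr_read() != val)
		fpcr_write(val);
}
```

The conditional variant trades an extra register read and a branch for
a possibly avoided write; whether that is a win on any given CPU is
exactly the open question above.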

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 17/18] arm64: fpsimd: Move SME save/restore inline
  2026-05-26 15:28       ` Mark Rutland
@ 2026-05-26 16:38         ` Mark Rutland
  2026-05-27  9:00           ` Vladimir Murzin
  0 siblings, 1 reply; 45+ messages in thread
From: Mark Rutland @ 2026-05-26 16:38 UTC (permalink / raw)
  To: Vladimir Murzin
  Cc: linux-arm-kernel, kvmarm, broonie, catalin.marinas, james.morse,
	maz, oupton, tabba, will

On Tue, May 26, 2026 at 04:28:17PM +0100, Mark Rutland wrote:
> On Tue, May 26, 2026 at 03:39:56PM +0100, Vladimir Murzin wrote:
> > Hi Mark,
> > 
> > On 5/26/26 15:08, Mark Rutland wrote:
> > > On Thu, May 21, 2026 at 02:25:55PM +0100, Mark Rutland wrote:
> > >> +static inline void __sme_save_za(struct sme_state *state, unsigned long svl)
> > >> +{
> > >> +	/* The <Wv> argument to STR (array vector) can only encode W12-W15 */
> > >> +	register unsigned long v asm ("12");
> > > Sorry, I had meant to put "x12" here, but evidently GCC and LLVM accept
> > > "12" on its own.
> > > 
> > > For clarity (e.g. to match the comment) I'll change that to "w12" and
> > > make the type unsigned int. Likewise in __sme_load_za().
> > 
> > I suspect you are intentionally not using the "Ucj" constraint to limit the register allocator;
> > if so, I'm wondering why?
> 
> Thanks for the suggestion; that was ignorance rather than intent.
> 
> I was not aware of "Ucj" as it doesn't appear on the public GCC
> documentation:
> 
>   https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html
> 
> Looking at the machine description file, that's marked with '@internal',
> so IIUC GCC folk don't seem to expect/want people to use it. That said,
> LLVM seems to support it.
> 
> I'll go check that all relevant toolchains support this, and poke GCC
> folk to see if they're happy to promote that to a public constraint.

GCC folk seem happy to make this public, which is great! I'll cross-link
a thread here if/when patches appear.

In the short term, using "Ucj" would require bumping our minimum
supported toolchain necessary for SME:

* GCC gained "Ucj" in 14.1.0, tagged on 7 May 2024.

* LLVM gained "Ucj" in 18.1.0, tagged on 27 Feb 2024.

... so using that would require adding a dependency on a newer
toolchain, e.g. via a CC_HAS_UCJ_CONSTRAINT to match the existing
CC_HAS_K_CONSTRAINT.

Aligned with the rationale on patch 8, v6.16 (tagged 27 July 2025) was
contemporary with GCC 15.1.0 (tagged 25 April 2025) and LLVM 20.1.0
(tagged 4 March 2025), both of which supported "Ucj".

> If that's all good, I'll move over to "Ucj". If not, I'll update the
> commit message and/or comments to explain why.

If Will and Catalin are happy to depend on a toolchain as above, I'll go
add the necessary CC_HAS_UCJ_CONSTRAINT Kconfig logic.

Otherwise I'll go note the above in a comment, and stick with the
register variable for now.
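
For reference, a minimal sketch of the register variable approach,
under a few loud assumptions: the function name copy_via_w12() is
hypothetical, a plain MOV stands in for the SME STR (array vector)
instruction so that no SME-capable assembler or hardware is needed, and
non-AArch64 hosts get a trivial fallback. With a toolchain that accepts
"Ucj", the explicit register variable would presumably be replaced by
that constraint on the operand.

```c
#if defined(__aarch64__)
/*
 * Pin 'v' to w12 using a local register variable, then use it as an
 * input operand. The compiler must place 'v' in w12 for the asm, which
 * is what an instruction limited to W12-W15 would need.
 */
static inline unsigned int copy_via_w12(unsigned int in)
{
	register unsigned int v __asm__("w12") = in;
	unsigned int out;

	/* Stand-in for the real instruction: copy w12 to the output. */
	__asm__ volatile("mov %w0, %w1" : "=r" (out) : "r" (v));
	return out;
}
#else
/* Fallback for other hosts: no register pinning, same result. */
static inline unsigned int copy_via_w12(unsigned int in)
{
	return in;
}
#endif
```

The register variable forces one specific register, whereas a
constraint covering W12-W15 would leave the choice among those four to
the register allocator.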

Mark.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 12/18] arm64: fpsimd: Move fpsimd save/restore inline
  2026-05-21 13:25 ` [PATCH 12/18] arm64: fpsimd: Move fpsimd save/restore inline Mark Rutland
@ 2026-05-26 16:44   ` Mark Brown
  0 siblings, 0 replies; 45+ messages in thread
From: Mark Brown @ 2026-05-26 16:44 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kvmarm, catalin.marinas, james.morse, maz,
	oupton, tabba, will

On Thu, May 21, 2026 at 02:25:50PM +0100, Mark Rutland wrote:

> Note that I've used the SVE sequence for restoring FPCR, which uses an
> unconditional write to FPCR. The plain FPSIMD assembly sequence used a
> conditional write to FPCR since 2014 in commit:

>   5959e25729a5 ("arm64: fpsimd: avoid restoring fpcr if the contents haven't change")

> ... but this was not followed for the SVE assembly implemented in 2017
> in commit:

>   1fc5dce78ad1 ("arm64/sve: Low-level SVE architectural state manipulation functions")

> ... so I've assumed that this doesn't actually matter in practice, and
> I've erred in favour of the simpler sequence.

As I said on the earlier patch I'm a bit nervous about assuming this
doesn't matter for anyone without verifying (though I wouldn't be
surprised if that turned out to be the case) but that's internal to that
patch and this is obviously a great improvement so:

Reviewed-by: Mark Brown <broonie@kernel.org>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 10/18] arm64: sysreg: Add FPCR and FPSR
  2026-05-26 15:55   ` Mark Brown
@ 2026-05-26 16:51     ` Mark Rutland
  2026-05-26 16:54       ` Mark Brown
  0 siblings, 1 reply; 45+ messages in thread
From: Mark Rutland @ 2026-05-26 16:51 UTC (permalink / raw)
  To: Mark Brown
  Cc: linux-arm-kernel, kvmarm, catalin.marinas, james.morse, maz,
	oupton, tabba, will

On Tue, May 26, 2026 at 04:55:44PM +0100, Mark Brown wrote:
> On Thu, May 21, 2026 at 02:25:48PM +0100, Mark Rutland wrote:
> > Add sysreg definitions for FPCR and FPSR.
> 
> > +Sysreg	FPCR	3	3	4	4	0
> 
> This looks good.
> 
> > +Sysreg	FPSR	3	3	4	4	1
> > +Res0	63:32
> > +Field	31	N
> > +Field	30	Q
> 
> DDI0487 M.b and DDI0601 2026-03 both call this field Z.

Sorry about that; fixed locally now, and I've double-checked the rest.

Mark.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 13/18] arm64: fpsimd: Use opaque type for SVE state
  2026-05-21 13:25 ` [PATCH 13/18] arm64: fpsimd: Use opaque type for SVE state Mark Rutland
@ 2026-05-26 16:53   ` Mark Brown
  0 siblings, 0 replies; 45+ messages in thread
From: Mark Brown @ 2026-05-26 16:53 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kvmarm, catalin.marinas, james.morse, maz,
	oupton, tabba, will

On Thu, May 21, 2026 at 02:25:51PM +0100, Mark Rutland wrote:

> As the SVE state size can vary at runtime, we don't have a concrete type
> for the in-memory SVE state, and pass this around using a pointer to
> void. The functions which save/restore the SVE state have a very unusual
> calling convention, expecting a pointer to the FFR *in the middle of*
> the in-memory SVE state, which is also passed as a pointer to void.
> Passing a pointer to the FFR also requires that callers find the live VL
> and perform some arithmetic, which callers implement differently.

...

> Improve this by adding an opaque 'struct sve_state', and consistently
> passing a pointer to this, performing the necessary offsetting *within*
> the save/restore functions.

This is much more ergonomic:

Reviewed-by: Mark Brown <broonie@kernel.org>
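
To make the calling convention change concrete, here is a minimal
sketch assuming the in-memory layout is the 32 Z registers, then the 16
P registers, then the FFR, with each Z register being VL bytes and each
P register and the FFR being VL/8 bytes. The helper names are
hypothetical and the real series may structure this differently.

```c
#include <stddef.h>

/* Opaque: callers can only hold pointers, never look inside. */
struct sve_state;

#define SVE_NUM_ZREGS	32
#define SVE_NUM_PREGS	16

/*
 * Byte offset of the FFR from the base of the state, for a vector
 * length of 'vl' bytes (assumed layout: Z regs, then P regs, then FFR).
 */
static inline size_t sve_ffr_offset_bytes(unsigned int vl)
{
	return SVE_NUM_ZREGS * (size_t)vl + SVE_NUM_PREGS * (size_t)(vl / 8);
}

/*
 * The offsetting happens here, inside the helper, so callers only ever
 * pass the base pointer and the vector length.
 */
static inline void *sve_state_ffr(struct sve_state *state, unsigned int vl)
{
	return (char *)state + sve_ffr_offset_bytes(vl);
}
```

Because struct sve_state is a distinct type, passing a pointer of the
wrong type (for example a 'struct sve_state **') draws a compiler
diagnostic, which a 'void *' parameter would silently accept.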

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 10/18] arm64: sysreg: Add FPCR and FPSR
  2026-05-26 16:51     ` Mark Rutland
@ 2026-05-26 16:54       ` Mark Brown
  0 siblings, 0 replies; 45+ messages in thread
From: Mark Brown @ 2026-05-26 16:54 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kvmarm, catalin.marinas, james.morse, maz,
	oupton, tabba, will

On Tue, May 26, 2026 at 05:51:53PM +0100, Mark Rutland wrote:
> On Tue, May 26, 2026 at 04:55:44PM +0100, Mark Brown wrote:

> > > +Field	30	Q

> > DDI0487 M.b and DDI0601 2026-03 both call this field Z.

> Sorry about that; fixed locally now, and I've double-checked the rest.

Yes, I'd also checked everything else - sorry, should've mentioned that
it was only that field.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 14/18] arm64: fpsimd: Use opaque type for SME state
  2026-05-21 13:25 ` [PATCH 14/18] arm64: fpsimd: Use opaque type for SME state Mark Rutland
@ 2026-05-26 16:56   ` Mark Brown
  0 siblings, 0 replies; 45+ messages in thread
From: Mark Brown @ 2026-05-26 16:56 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kvmarm, catalin.marinas, james.morse, maz,
	oupton, tabba, will

On Thu, May 21, 2026 at 02:25:52PM +0100, Mark Rutland wrote:
> As the SME state size can vary at runtime, we don't have a concrete type
> for the in-memory SME state, and pass this around using a pointer to
> void.
> 
> Using pointer to void means that it's very easy to introduce errors that
> cannot be caught by the compiler (e.g. as 'void **' can be assigned to
> 'void *').
> 
> Improve this by adding an opaque 'struct sme_state', and consistently
> passing a pointer to this.

Reviewed-by: Mark Brown <broonie@kernel.org>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups
  2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
                   ` (17 preceding siblings ...)
  2026-05-21 13:25 ` [PATCH 18/18] arm64: fpsimd: Remove <asm/fpsimdmacros.h> Mark Rutland
@ 2026-05-27  8:07 ` Marc Zyngier
  2026-05-27 10:32   ` Mark Rutland
  18 siblings, 1 reply; 45+ messages in thread
From: Marc Zyngier @ 2026-05-27  8:07 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kvmarm, broonie, catalin.marinas, james.morse,
	oupton, tabba, will

On Thu, 21 May 2026 14:25:38 +0100,
Mark Rutland <mark.rutland@arm.com> wrote:
> 
> Hi.
> 
> This series cleans up low-level FPSIMD/SVE/SME state management code,
> making it easier to maintain and extend (e.g. adding SME support to
> KVM), and enabling better debugging (e.g. by making SVE/SME save/restore
> visible to KASAN and KCSAN).
> 
> This is purely cleanup, there are NO bugs addressed by this series.

I had a look throughout, and couldn't see anything untoward other than
the couple of nits that were already pointed out. Killing the horrible
asm macros definitely brings a bit of fresh air to this code base.

Given the sensitivity of the change, I'd like this to simmer in -next
for a bit. How do you want this to be merged? I'm happy to take the
whole thing in kvmarm, and share the branch with arm64.

I assume you'll post a v2 anyway?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 17/18] arm64: fpsimd: Move SME save/restore inline
  2026-05-26 16:38         ` Mark Rutland
@ 2026-05-27  9:00           ` Vladimir Murzin
  0 siblings, 0 replies; 45+ messages in thread
From: Vladimir Murzin @ 2026-05-27  9:00 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, kvmarm, broonie, catalin.marinas, james.morse,
	maz, oupton, tabba, will

Hi Mark,

On 5/26/26 17:38, Mark Rutland wrote:
> On Tue, May 26, 2026 at 04:28:17PM +0100, Mark Rutland wrote:
>> On Tue, May 26, 2026 at 03:39:56PM +0100, Vladimir Murzin wrote:
>>> Hi Mark,
>>>
>>> On 5/26/26 15:08, Mark Rutland wrote:
>>>> On Thu, May 21, 2026 at 02:25:55PM +0100, Mark Rutland wrote:
>>>>> +static inline void __sme_save_za(struct sme_state *state, unsigned long svl)
>>>>> +{
>>>>> +	/* The <Wv> argument to STR (array vector) can only encode W12-W15 */
>>>>> +	register unsigned long v asm ("12");
>>>> Sorry, I had meant to put "x12" here, but evidently GCC and LLVM accept
>>>> "12" on its own.
>>>>
>>>> For clarity (e.g. to match the comment) I'll change that to "w12" and
>>>> make the type unsigned int. Likewise in __sme_load_za().
>>> I suspect you are intentionally not using the "Ucj" constraint to limit the register allocator;
>>> if so, I'm wondering why?
>> Thanks for the suggestion; that was ignorance rather than intent.
>>
>> I was not aware of "Ucj" as it doesn't appear on the public GCC
>> documentation:
>>
>>   https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html
>>
>> Looking at the machine description file, that's marked with '@internal',
>> so IIUC GCC folk don't seem to expect/want people to use it. That said,
>> LLVM seems to support it.
>>
>> I'll go check that all relevant toolchains support this, and poke GCC
>> folk to see if they're happy to promote that to a public constraint.
> GCC folk seem happy to make this public, which is great! I'll cross-link
> a thread here if/when patches appear.
> 
> In the short term, using "Ucj" would require bumping our minimum
> supported toolchain necessary for SME:
> 
> * GCC gained "Ucj" in 14.1.0, tagged on 7 May 2024.
> 
> * LLVM gained "Ucj" in 18.1.0, tagged on 27 Feb 2024.
> 
> ... so using that would require adding a dependency on a newer
> toolchain, e.g. via a CC_HAS_UCJ_CONSTRAINT to match the existing
> CC_HAS_K_CONSTRAINT.
> 
> Aligned with the rationale on patch 8, v6.16 (tagged 27 July 2025) was
> contemporary with GCC 15.1.0 (tagged 25 April 2025) and LLVM 20.1.0
> (tagged 4 March 2025), both of which supported "Ucj".
> 
>> If that's all good, I'll move over to "Ucj". If not, I'll update the
>> commit message and/or comments to explain why.
> If Will and Catalin are happy to depend on a toolchain as above, I'll go
> add the necessary CC_HAS_UCJ_CONSTRAINT Kconfig logic.
> 
> Otherwise I'll go note the above in a comment, and stick with the
> register variable for now.
> 
> Mark.
> 

Wow, I had no intention of generating this amount of work for
you - thanks for digging into that! FWIW, either way works for me :)

Cheers
Vladimir


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 01/18] KVM: arm64: Don't include <asm/fpsimdmacros.h>
  2026-05-21 13:25 ` [PATCH 01/18] KVM: arm64: Don't include <asm/fpsimdmacros.h> Mark Rutland
  2026-05-26 14:18   ` Mark Brown
@ 2026-05-27 10:10   ` Vladimir Murzin
  1 sibling, 0 replies; 45+ messages in thread
From: Vladimir Murzin @ 2026-05-27 10:10 UTC (permalink / raw)
  To: Mark Rutland, linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, maz, oupton, tabba, will

On 5/21/26 14:25, Mark Rutland wrote:
> There's no need for hyp/entry.S to include <asm/fpsimdmacros.h>.
> 
> The fpsimd macros have never been used by code in hyp/entry.S, and were
> instead used by code in hyp/fpsimd.S.
> 
> Remove the unnecessary include.
> 
> There should be no functional change as a result of this patch.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Fuad Tabba <tabba@google.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Oliver Upton <oupton@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/kvm/hyp/entry.S | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
> index 11a10d8f5beb2..308100ed25de9 100644
> --- a/arch/arm64/kvm/hyp/entry.S
> +++ b/arch/arm64/kvm/hyp/entry.S
> @@ -8,7 +8,6 @@
>  
>  #include <asm/alternative.h>
>  #include <asm/assembler.h>
> -#include <asm/fpsimdmacros.h>
>  #include <asm/kvm.h>
>  #include <asm/kvm_arm.h>
>  #include <asm/kvm_asm.h>
> -- 2.30.2
> 

FWIW,

Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 02/18] KVM: arm64: Don't override FFR save/restore argument
  2026-05-21 13:25 ` [PATCH 02/18] KVM: arm64: Don't override FFR save/restore argument Mark Rutland
  2026-05-26 14:27   ` Mark Brown
@ 2026-05-27 10:16   ` Vladimir Murzin
  1 sibling, 0 replies; 45+ messages in thread
From: Vladimir Murzin @ 2026-05-27 10:16 UTC (permalink / raw)
  To: Mark Rutland, linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, maz, oupton, tabba, will

On 5/21/26 14:25, Mark Rutland wrote:
> The __sve_save_state() and __sve_restore_state() functions take a
> parameter describing whether to save/restore the FFR, but both functions
> silently override this with '1'. This has always been benign (and
> callers have all passed 'true' since the parameter was introduced), but
> clearly this is not intentional.
> 
> Historically, the functions always saved/restored the FFR, and there was
> no parameter to control this.
> 
> In v5.16, the sve_save and sve_load assembly macros used by
> __sve_save_state() and __sve_restore_state() were changed to make
> saving/restoring FFR optional. The implementations of __sve_save_state()
> and __sve_restore_state() were changed to pass '1' to their respective
> macros, and the prototypes of __sve_save_state() and
> __sve_restore_state() were unchanged. See commit:
> 
>   9f5848665788 ("arm64/sve: Make access to FFR optional")
> 
> In v6.10, the prototypes of __sve_save_state() and __sve_restore_state()
> were changed to add 'save_ffr' and 'restore_ffr' parameters
> respectively, but the implementations were not changed to stop passing 1
> to their respective macros. All callers were changed to pass 'true' to
> __sve_save_state() and __sve_restore_state(). See commit:
> 
>   45f4ea9bcfe9 ("KVM: arm64: Fix prototype for __sve_save_state/__sve_restore_state")
> 
> This is all benign, but clearly unintentional, and it gets in the way of
> cleaning up the FPSIMD/SVE/SME code. Remove the unnecessary overriding.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Fuad Tabba <tabba@google.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Oliver Upton <oupton@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/kvm/hyp/fpsimd.S | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/fpsimd.S b/arch/arm64/kvm/hyp/fpsimd.S
> index e950875e31cee..6e16cbfc5df27 100644
> --- a/arch/arm64/kvm/hyp/fpsimd.S
> +++ b/arch/arm64/kvm/hyp/fpsimd.S
> @@ -21,13 +21,11 @@ SYM_FUNC_START(__fpsimd_restore_state)
>  SYM_FUNC_END(__fpsimd_restore_state)
>  
>  SYM_FUNC_START(__sve_restore_state)
> -	mov	x2, #1
>  	sve_load 0, x1, x2, 3
>  	ret
>  SYM_FUNC_END(__sve_restore_state)
>  
>  SYM_FUNC_START(__sve_save_state)
> -	mov	x2, #1
>  	sve_save 0, x1, x2, 3
>  	ret
>  SYM_FUNC_END(__sve_save_state)
> -- 2.30.2
> 

FWIW,

Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 03/18] KVM: arm64: pkvm: Save host FPMR in host cpu context
  2026-05-21 13:25 ` [PATCH 03/18] KVM: arm64: pkvm: Save host FPMR in host cpu context Mark Rutland
@ 2026-05-27 10:29   ` Vladimir Murzin
  0 siblings, 0 replies; 45+ messages in thread
From: Vladimir Murzin @ 2026-05-27 10:29 UTC (permalink / raw)
  To: Mark Rutland, linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, maz, oupton, tabba, will

On 5/21/26 14:25, Mark Rutland wrote:
> Protected KVM stores most of the host's system register state in
> kvm_host_data::host_ctxt, which is an instance of struct
> kvm_cpu_context. As kvm_cpu_context::sys_regs[] has a slot for FPMR, we
> can store the host's FPMR there.
> 
> Do so, and remove kvm_host_data::fpmr.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Fuad Tabba <tabba@google.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Oliver Upton <oupton@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/kvm_host.h       | 3 ---
>  arch/arm64/kvm/hyp/include/hyp/switch.h | 6 ++++--
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c      | 5 +++--
>  3 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 65eead8362e0b..42b1c4764a4bf 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -775,9 +775,6 @@ struct kvm_host_data {
>  	 */
>  	struct cpu_sve_state *sve_state;
>  
> -	/* Used by pKVM only. */
> -	u64	fpmr;
> -
>  	/* Ownership of the FP regs */
>  	enum {
>  		FP_STATE_FREE,
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index 98b2976837b11..cc4d011a2b380 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -554,6 +554,8 @@ static inline void fpsimd_lazy_switch_to_host(struct kvm_vcpu *vcpu)
>  
>  static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)
>  {
> +	struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
> +
>  	/*
>  	 * Non-protected kvm relies on the host restoring its sve state.
>  	 * Protected kvm restores the host's sve state as not to reveal that
> @@ -562,11 +564,11 @@ static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)
>  	if (system_supports_sve()) {
>  		__hyp_sve_save_host();
>  	} else {
> -		__fpsimd_save_state(host_data_ptr(host_ctxt.fp_regs));
> +		__fpsimd_save_state(&hctxt->fp_regs);
>  	}
>  
>  	if (kvm_has_fpmr(kern_hyp_va(vcpu->kvm)))
> -		*host_data_ptr(fpmr) = read_sysreg_s(SYS_FPMR);
> +		ctxt_sys_reg(hctxt, FPMR) = read_sysreg_s(SYS_FPMR);
>  }
>  
>  
> diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> index 06db299c37a89..db60f770060e5 100644
> --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> @@ -66,6 +66,7 @@ static void fpsimd_sve_flush(void)
>  
>  static void fpsimd_sve_sync(struct kvm_vcpu *vcpu)
>  {
> +	struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
>  	bool has_fpmr;
>  
>  	if (!guest_owns_fp_regs())
> @@ -89,10 +90,10 @@ static void fpsimd_sve_sync(struct kvm_vcpu *vcpu)
>  	if (system_supports_sve())
>  		__hyp_sve_restore_host();
>  	else
> -		__fpsimd_restore_state(host_data_ptr(host_ctxt.fp_regs));
> +		__fpsimd_restore_state(&hctxt->fp_regs);
>  
>  	if (has_fpmr)
> -		write_sysreg_s(*host_data_ptr(fpmr), SYS_FPMR);
> +		write_sysreg_s(ctxt_sys_reg(hctxt, FPMR), SYS_FPMR);
>  
>  	*host_data_ptr(fp_owner) = FP_STATE_HOST_OWNED;
>  }
> -- 2.30.2
> 

FWIW,

Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>




* Re: [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups
  2026-05-27  8:07 ` [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Marc Zyngier
@ 2026-05-27 10:32   ` Mark Rutland
  0 siblings, 0 replies; 45+ messages in thread
From: Mark Rutland @ 2026-05-27 10:32 UTC (permalink / raw)
  To: Marc Zyngier, will, catalin.marinas
  Cc: linux-arm-kernel, kvmarm, broonie, james.morse, oupton, tabba

On Wed, May 27, 2026 at 09:07:31AM +0100, Marc Zyngier wrote:
> On Thu, 21 May 2026 14:25:38 +0100,
> Mark Rutland <mark.rutland@arm.com> wrote:
> > This series cleans up low-level FPSIMD/SVE/SME state management code,
> > making it easier to maintain and extend (e.g. adding SME support to
> > KVM), and enabling better debugging (e.g. by making SVE/SME save/restore
> > visible to KASAN and KCSAN).
> > 
> > This is purely cleanup, there are NO bugs addressed by this series.
> 
> I had a look throughout, and couldn't see anything untoward other than
> the couple of nits that were already pointed out. Killing the horrible
> asm macros definitely brings a bit of fresh air to this code base.
> 
> Given the sensitivity of the change, I'd like this to simmer in -next
> for a bit. How do you want this to be merged? I'm happy to take the
> whole thing in kvmarm, and share the branch with arm64.

That works for me.

Catalin, Will, any preference?

> I assume you'll post a v2 anyway?

Yep; given the bits that need fixups, I'll send a v2 by the end of this
week unless something explodes.

Mark.



* Re: [PATCH 04/18] KVM: arm64: pkvm: Remove struct cpu_sve_state
  2026-05-21 13:25 ` [PATCH 04/18] KVM: arm64: pkvm: Remove struct cpu_sve_state Mark Rutland
@ 2026-05-27 11:58   ` Vladimir Murzin
  0 siblings, 0 replies; 45+ messages in thread
From: Vladimir Murzin @ 2026-05-27 11:58 UTC (permalink / raw)
  To: Mark Rutland, linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, maz, oupton, tabba, will

Hi Mark,

On 5/21/26 14:25, Mark Rutland wrote:
> There's no need for struct cpu_sve_state. Code would be simpler and more
> robust without it, and removing it will simplify further cleanups (e.g.
> adding an opaque type for the sve register state).
> 
> Protected KVM stores most of the host's system register state in
> kvm_host_data::host_ctxt, which is an instance of struct
> kvm_cpu_context. As kvm_cpu_context::sys_regs[] has a slot for ZCR_EL1,
> we can store the host's ZCR_EL1 there.
> 
> While kvm_cpu_context::sys_regs doesn't have slots for FPSR and FPCR,
> these are usually expected to be stored in struct user_fpsimd_state.
> For historical reasons, __sve_save_state() and __sve_restore_state()
> expect a pointer to fpsr *within* struct user_fpsimd_state, assuming the
> fpcr will immediately follow, as per the order within struct
> user_fpsimd_state. We currently match this ordering in struct
> cpu_sve_state, but it would be simpler and more robust to use struct
> user_fpsimd_state directly.
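
The layout assumption described in the paragraph above (a callee given a pointer to fpsr expects fpcr to follow immediately) can be checked at compile time. The following is only a sketch: `struct fpsimd_state_sketch` is a simplified, hypothetical mirror of struct user_fpsimd_state, and `fpcr_follows_fpsr()` is an illustrative helper, not an existing kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical mirror of struct user_fpsimd_state. */
struct fpsimd_state_sketch {
	uint64_t vregs[32][2];	/* 32 x 128-bit vector registers */
	uint32_t fpsr;
	uint32_t fpcr;
	uint32_t reserved[2];
};

/* Fails the build if fpcr does not sit directly after fpsr. */
static_assert(offsetof(struct fpsimd_state_sketch, fpcr) ==
	      offsetof(struct fpsimd_state_sketch, fpsr) + sizeof(uint32_t),
	      "fpcr must immediately follow fpsr");

/* Runtime view of the same property, for illustration only. */
static int fpcr_follows_fpsr(void)
{
	return offsetof(struct fpsimd_state_sketch, fpcr) ==
	       offsetof(struct fpsimd_state_sketch, fpsr) + sizeof(uint32_t);
}
```

Using the one real structure everywhere means the ordering only has to hold in a single definition, rather than being duplicated in a second struct.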
> 
> After moving ZCR_EL1, FPSR, and FPCR out of struct cpu_sve_state, all
> that's left is sve_regs, which can be represented as a pointer without
> need for a container struct. This is kept as a pointer to u8 (matching
> the array type), as this permits the compiler to catch unbalanced
> referencing/dereferencing, which is not possible for pointers to void.
> 
> Apply the above changes, and remove cpu_sve_state.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Fuad Tabba <tabba@google.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Oliver Upton <oupton@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/include/asm/kvm_host.h       | 18 ++----------------
>  arch/arm64/include/asm/kvm_pkvm.h       |  3 +--
>  arch/arm64/kvm/arm.c                    | 16 ++++++++--------
>  arch/arm64/kvm/hyp/include/hyp/switch.h |  9 +++++----
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c      |  9 +++++----
>  arch/arm64/kvm/hyp/nvhe/setup.c         |  4 ++--
>  6 files changed, 23 insertions(+), 36 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 42b1c4764a4bf..ae24617380b8f 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -732,20 +732,6 @@ struct kvm_cpu_context {
>  	u64 *vncr_array;
>  };
>  
> -struct cpu_sve_state {
> -	__u64 zcr_el1;
> -
> -	/*
> -	 * Ordering is important since __sve_save_state/__sve_restore_state
> -	 * relies on it.
> -	 */
> -	__u32 fpsr;
> -	__u32 fpcr;
> -
> -	/* Must be SVE_VQ_BYTES (128 bit) aligned. */
> -	__u8 sve_regs[];


It seems that the requirement (driven by SVE ldr/str) is
satisfied with the new sve_regs pointing to the start of the
page.

I'm not sure whether we want to keep the comment (or perhaps
enforce this with explicit checks) so that future refactoring
doesn't lead to time spent debugging alignment faults...
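
As a rough illustration of what an explicit check could look like, here is a user-space sketch; it is an assumption about the shape of such a check, not something from the patch. `SVE_VQ_BYTES_SKETCH` and `sve_regs_aligned()` are hypothetical names, and kernel code would presumably use existing helpers such as IS_ALIGNED() with a WARN_ON() instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 128 bits, matching the "SVE_VQ_BYTES (128 bit)" note in the removed comment. */
#define SVE_VQ_BYTES_SKETCH 16

/* Returns true if p is suitably aligned to serve as the base of the SVE register buffer. */
static bool sve_regs_aligned(const void *p)
{
	return ((uintptr_t)p % SVE_VQ_BYTES_SKETCH) == 0;
}
```

A page-aligned base address trivially passes this check, which matches the observation above about sve_regs pointing to the start of the page.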

> -};
> -
>  /*
>   * This structure is instantiated on a per-CPU basis, and contains
>   * data that is:
> @@ -771,9 +757,9 @@ struct kvm_host_data {
>  
>  	/*
>  	 * Hyp VA.
> -	 * sve_state is only used in pKVM and if system_supports_sve().
> +	 * sve_regs is only used in pKVM and if system_supports_sve().
>  	 */
> -	struct cpu_sve_state *sve_state;
> +	u8	*sve_regs;
>  
>  	/* Ownership of the FP regs */
>  	enum {
> diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h
> index 2954b311128c7..74fedd9c5ff02 100644
> --- a/arch/arm64/include/asm/kvm_pkvm.h
> +++ b/arch/arm64/include/asm/kvm_pkvm.h
> @@ -188,8 +188,7 @@ static inline size_t pkvm_host_sve_state_size(void)
>  	if (!system_supports_sve())
>  		return 0;
>  
> -	return size_add(sizeof(struct cpu_sve_state),
> -			SVE_SIG_REGS_SIZE(sve_vq_from_vl(kvm_host_sve_max_vl)));
> +	return SVE_SIG_REGS_SIZE(sve_vq_from_vl(kvm_host_sve_max_vl));
>  }
>  
>  struct pkvm_mapping {
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 8bb2c7422cc8b..f9fc85a0344e1 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -2499,10 +2499,10 @@ static void __init teardown_hyp_mode(void)
>  			continue;
>  
>  		if (free_sve) {
> -			struct cpu_sve_state *sve_state;
> +			u8 *sve_regs;
>  
> -			sve_state = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state;
> -			free_pages((unsigned long) sve_state, pkvm_host_sve_state_order());
> +			sve_regs = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs;
> +			free_pages((unsigned long) sve_regs, pkvm_host_sve_state_order());
>  		}
>  
>  		free_pages(kvm_nvhe_sym(kvm_arm_hyp_percpu_base)[cpu], nvhe_percpu_order());
> @@ -2627,7 +2627,7 @@ static int init_pkvm_host_sve_state(void)
>  		if (!page)
>  			return -ENOMEM;
>  
> -		per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state = page_address(page);
> +		per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs = page_address(page);
>  	}
>  
>  	/*
> @@ -2648,11 +2648,11 @@ static void finalize_init_hyp_mode(void)
>  
>  	if (system_supports_sve() && is_protected_kvm_enabled()) {
>  		for_each_possible_cpu(cpu) {
> -			struct cpu_sve_state *sve_state;
> +			u8 *sve_regs;
>  
> -			sve_state = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state;
> -			per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state =
> -				kern_hyp_va(sve_state);
> +			sve_regs = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs;
> +			per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs =
> +				kern_hyp_va(sve_regs);
>  		}
>  	}
>  }
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index cc4d011a2b380..6512dd3f75ae4 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -484,12 +484,13 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
>  
>  static inline void __hyp_sve_save_host(void)
>  {
> -	struct cpu_sve_state *sve_state = *host_data_ptr(sve_state);
> +	struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
> +	u8 *sve_regs = *host_data_ptr(sve_regs);
>  
> -	sve_state->zcr_el1 = read_sysreg_el1(SYS_ZCR);
> +	ctxt_sys_reg(hctxt, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);
>  	write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
> -	__sve_save_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
> -			 &sve_state->fpsr,
> +	__sve_save_state(sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
> +			 &hctxt->fp_regs.fpsr,
>  			 true);
>  }
>  
> diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> index db60f770060e5..04a6d2e0ea73f 100644
> --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> @@ -41,7 +41,8 @@ static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu)
>  
>  static void __hyp_sve_restore_host(void)
>  {
> -	struct cpu_sve_state *sve_state = *host_data_ptr(sve_state);
> +	struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
> +	u8 *sve_regs = *host_data_ptr(sve_regs);
>  
>  	/*
>  	 * On saving/restoring host sve state, always use the maximum VL for
> @@ -53,10 +54,10 @@ static void __hyp_sve_restore_host(void)
>  	 * need to be revisited.
>  	 */
>  	write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
> -	__sve_restore_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
> -			    &sve_state->fpsr,
> +	__sve_restore_state(sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
> +			    &hctxt->fp_regs.fpsr,
>  			    true);
> -	write_sysreg_el1(sve_state->zcr_el1, SYS_ZCR);
> +	write_sysreg_el1(ctxt_sys_reg(hctxt, ZCR_EL1), SYS_ZCR);
>  }
>  
>  static void fpsimd_sve_flush(void)
> diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
> index d461981616d90..cdaf53c833409 100644
> --- a/arch/arm64/kvm/hyp/nvhe/setup.c
> +++ b/arch/arm64/kvm/hyp/nvhe/setup.c
> @@ -82,9 +82,9 @@ static int pkvm_create_host_sve_mappings(void)
>  
>  	for (i = 0; i < hyp_nr_cpus; i++) {
>  		struct kvm_host_data *host_data = per_cpu_ptr(&kvm_host_data, i);
> -		struct cpu_sve_state *sve_state = host_data->sve_state;
> +		u8 *sve_regs = host_data->sve_regs;
>  
> -		start = kern_hyp_va(sve_state);
> +		start = kern_hyp_va(sve_regs);
>  		end = start + PAGE_ALIGN(pkvm_host_sve_state_size());
>  		ret = pkvm_create_mappings(start, end, PAGE_HYP);
>  		if (ret)
> -- 2.30.2
> 

FWIW,

Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>




* Re: [PATCH 05/18] arm64: fpsimd: Fold sve_init_regs() into do_sve_acc()
  2026-05-21 13:25 ` [PATCH 05/18] arm64: fpsimd: Fold sve_init_regs() into do_sve_acc() Mark Rutland
  2026-05-26 15:28   ` Mark Brown
@ 2026-05-27 12:05   ` Vladimir Murzin
  1 sibling, 0 replies; 45+ messages in thread
From: Vladimir Murzin @ 2026-05-27 12:05 UTC (permalink / raw)
  To: Mark Rutland, linux-arm-kernel, kvmarm
  Cc: broonie, catalin.marinas, james.morse, maz, oupton, tabba, will

On 5/21/26 14:25, Mark Rutland wrote:
> For historical reasons, do_sve_acc() is structurally different from
> do_sme_acc(), and the logic to convert the task from FPSIMD to SVE is
> out-of-line in sve_init_regs(). We only use sve_init_regs() within
> do_sve_acc(), so it's not necessary for this to be a separate function.
> 
> Fold sve_init_regs() into do_sve_acc(), and simplify the associated
> comments. This makes do_sve_acc() structurally similar to do_sme_acc(),
> making it easier to see similarities and differences.
> 
> There should be no functional change as a result of this patch.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Fuad Tabba <tabba@google.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Oliver Upton <oupton@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/kernel/fpsimd.c | 48 ++++++++++++++------------------------
>  1 file changed, 17 insertions(+), 31 deletions(-)
> 
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 60a45d600b460..a8395cb303344 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -1293,31 +1293,6 @@ void sme_suspend_exit(void)
>  
>  #endif /* CONFIG_ARM64_SME */
>  
> -static void sve_init_regs(void)
> -{
> -	/*
> -	 * Convert the FPSIMD state to SVE, zeroing all the state that
> -	 * is not shared with FPSIMD. If (as is likely) the current
> -	 * state is live in the registers then do this there and
> -	 * update our metadata for the current task including
> -	 * disabling the trap, otherwise update our in-memory copy.
> -	 * We are guaranteed to not be in streaming mode, we can only
> -	 * take a SVE trap when not in streaming mode and we can't be
> -	 * in streaming mode when taking a SME trap.
> -	 */
> -	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
> -		unsigned long vq_minus_one =
> -			sve_vq_from_vl(task_get_sve_vl(current)) - 1;
> -		sve_set_vq(vq_minus_one);
> -		sve_flush_live(true, vq_minus_one);
> -		fpsimd_bind_task_to_cpu();
> -	} else {
> -		fpsimd_to_sve(current);
> -		current->thread.fp_type = FP_STATE_SVE;
> -		fpsimd_flush_task_state(current);
> -	}
> -}
> -
>  /*
>   * Trapped SVE access
>   *
> @@ -1349,13 +1324,24 @@ void do_sve_acc(unsigned long esr, struct pt_regs *regs)
>  		WARN_ON(1); /* SVE access shouldn't have trapped */
>  
>  	/*
> -	 * Even if the task can have used streaming mode we can only
> -	 * generate SVE access traps in normal SVE mode and
> -	 * transitioning out of streaming mode may discard any
> -	 * streaming mode state.  Always clear the high bits to avoid
> -	 * any potential errors tracking what is properly initialised.
> +	 * Convert the FPSIMD state to SVE. Stale SVE state can be present in
> +	 * registers or memory, so we must zero all state that is not shared
> +	 * with FPSIMD.
> +	 *
> +	 * SVE traps cannot be taken from streaming mode, so there cannot be
> +	 * any effective streaming mode SVE state.
>  	 */
> -	sve_init_regs();
> +	if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
> +		unsigned long vq_minus_one =
> +			sve_vq_from_vl(task_get_sve_vl(current)) - 1;
> +		sve_set_vq(vq_minus_one);
> +		sve_flush_live(true, vq_minus_one);
> +		fpsimd_bind_task_to_cpu();
> +	} else {
> +		fpsimd_to_sve(current);
> +		current->thread.fp_type = FP_STATE_SVE;
> +		fpsimd_flush_task_state(current);
> +	}
>  
>  	put_cpu_fpsimd_context();
>  }
> -- 2.30.2
> 

FWIW,

Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>




end of thread, other threads:[~2026-05-27 12:05 UTC | newest]

Thread overview: 45+ messages
2026-05-21 13:25 [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Mark Rutland
2026-05-21 13:25 ` [PATCH 01/18] KVM: arm64: Don't include <asm/fpsimdmacros.h> Mark Rutland
2026-05-26 14:18   ` Mark Brown
2026-05-27 10:10   ` Vladimir Murzin
2026-05-21 13:25 ` [PATCH 02/18] KVM: arm64: Don't override FFR save/restore argument Mark Rutland
2026-05-26 14:27   ` Mark Brown
2026-05-27 10:16   ` Vladimir Murzin
2026-05-21 13:25 ` [PATCH 03/18] KVM: arm64: pkvm: Save host FPMR in host cpu context Mark Rutland
2026-05-27 10:29   ` Vladimir Murzin
2026-05-21 13:25 ` [PATCH 04/18] KVM: arm64: pkvm: Remove struct cpu_sve_state Mark Rutland
2026-05-27 11:58   ` Vladimir Murzin
2026-05-21 13:25 ` [PATCH 05/18] arm64: fpsimd: Fold sve_init_regs() into do_sve_acc() Mark Rutland
2026-05-26 15:28   ` Mark Brown
2026-05-27 12:05   ` Vladimir Murzin
2026-05-21 13:25 ` [PATCH 06/18] arm64: fpsimd: Remove sve_set_vq() and sme_set_vq() Mark Rutland
2026-05-26 15:42   ` Mark Brown
2026-05-21 13:25 ` [PATCH 07/18] arm64: fpsimd: Use assembler for SVE instructions Mark Rutland
2026-05-26 15:43   ` Mark Brown
2026-05-21 13:25 ` [PATCH 08/18] arm64: fpsimd: Use assembler for baseline SME instructions Mark Rutland
2026-05-26 15:45   ` Mark Brown
2026-05-21 13:25 ` [PATCH 09/18] arm64: fpsimd: Move sve_get_vl() and sme_get_vl() inline Mark Rutland
2026-05-26 15:47   ` Mark Brown
2026-05-21 13:25 ` [PATCH 10/18] arm64: sysreg: Add FPCR and FPSR Mark Rutland
2026-05-26 15:55   ` Mark Brown
2026-05-26 16:51     ` Mark Rutland
2026-05-26 16:54       ` Mark Brown
2026-05-21 13:25 ` [PATCH 11/18] arm64: fpsimd: Split FPSR/FPCR from SVE save/restore Mark Rutland
2026-05-26 16:28   ` Mark Brown
2026-05-21 13:25 ` [PATCH 12/18] arm64: fpsimd: Move fpsimd save/restore inline Mark Rutland
2026-05-26 16:44   ` Mark Brown
2026-05-21 13:25 ` [PATCH 13/18] arm64: fpsimd: Use opaque type for SVE state Mark Rutland
2026-05-26 16:53   ` Mark Brown
2026-05-21 13:25 ` [PATCH 14/18] arm64: fpsimd: Use opaque type for SME state Mark Rutland
2026-05-26 16:56   ` Mark Brown
2026-05-21 13:25 ` [PATCH 15/18] arm64: fpsimd: Move SVE save/restore inline Mark Rutland
2026-05-21 13:25 ` [PATCH 16/18] arm64: fpsimd: Move sve_flush_live() inline Mark Rutland
2026-05-21 13:25 ` [PATCH 17/18] arm64: fpsimd: Move SME save/restore inline Mark Rutland
2026-05-26 14:08   ` Mark Rutland
2026-05-26 14:39     ` Vladimir Murzin
2026-05-26 15:28       ` Mark Rutland
2026-05-26 16:38         ` Mark Rutland
2026-05-27  9:00           ` Vladimir Murzin
2026-05-21 13:25 ` [PATCH 18/18] arm64: fpsimd: Remove <asm/fpsimdmacros.h> Mark Rutland
2026-05-27  8:07 ` [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups Marc Zyngier
2026-05-27 10:32   ` Mark Rutland
