* Re: [PATCH 5/6] firmware: samsung: acpm: Add TMU protocol support
From: Alexey Klimov @ 2026-05-21 13:37 UTC (permalink / raw)
To: Peter Griffin
Cc: Tudor Ambarus, Krzysztof Kozlowski, Michael Turquette,
Stephen Boyd, Lee Jones, Alim Akhtar, Sylwester Nawrocki,
Chanwoo Choi, André Draszik, linux-kernel, linux-samsung-soc,
linux-arm-kernel, linux-clk, jyescas, kernel-team,
Krzysztof Kozlowski
In-Reply-To: <CADrjBPqiooFC9o56bOAg-j7908ssPtrzff1reNe6eXmu7hcA=w@mail.gmail.com>
On Thu May 21, 2026 at 9:25 AM BST, Peter Griffin wrote:
> Hi Alexey,
>
> On Wed, 20 May 2026 at 22:01, Alexey Klimov <alexey.klimov@linaro.org> wrote:
>>
>> Hi Tudor,
>>
>> On Tue May 19, 2026 at 4:46 PM BST, Tudor Ambarus wrote:
>> > Hi, Alexey,
>> >
>> > On 5/18/26 2:24 PM, Alexey Klimov wrote:
>> >> Thinking further about this I'd humbly suggest that even
>> >>
>> >> if (fw_err >= 0)
>> >> return 0;
>> >>
>> >> pr_debug_ratelimited("ACPM tmu call returned: %x\n", fw_err);
>> >> or pr_debug(...);
>> >>
>> >> if (fw_err == -1)
>> >> return -EACCES;
>> >>
>> >> some debug message would do.
>> >> Perhaps we need some convertation, for instance as it is done in scmi
>> >> code (scmi_to_linux_errno(), scmi_linux_errmap[]). But I don't have any
>> >> data for mapping acpm errors to some human meanings.
>> >
>> > I did that for the pmic helpers. I don't need any debug prints for
>> > gs101 TMU as I have clear instructions from firmware: 0 for success,
>> > -1 for error.
>>
>> This doesn't look like a right approach for upstreaming a ACPM TMU
>> framework.
>>
>> You are trying to submit a gs101-specific implementation masquerading
>> it as a generic ACPM TMU framework, while explicitly pushing the
>> refactoring work onto the next developer to add support for other
>> SoCs in this generic ACPM code.
>>
>> The ACPM TMU protocol implementation on Exynos850 is different: it uses
>> different error codes, and half of the calls in this 'generic' driver
>> are not even implemented in the Exynos850 firmware. Relying on a
>> hardcoded if (fw_err == -1) in a driver named generic ACPM is broken
>> by design and may silently swallow critical firmware errors on other
>> SoCs.
>>
>> What about such options below?
>> - rename the driver to reflect reality: rename this specifically to
>> gs101-acpm-tmu-something to reflect that it is tailored for gs101-s;
>>
>> or
>> - abstract the firmware error handling paths through driver_data or
>> a dedicated ops structure now, so that other SoCs can cleanly hook into
>> it without having to rewrite the logic later.
>
> AFAIK it's pretty normal not to add new hooks before they are
> required. I think the approach taken in this series makes sense, as
> it's the developer adding support for SoC #2 who best knows what the
> differences are on their platform versus what exists upstream.
> Similarly, the developer adding support for SoC #3 may have different
> requirements to e850 and gs101 and that developer is best placed to
> refactor and add hooks or quirks that are required for that platform.
>
> Let's not try to boil the ocean with this series, it's targeting GS101
> support. We can evolve it for future SoCs as those requirements and
> differences become clear.
Peter, I agree we shouldn't bother about hypothetical SoCs. However,
Exynos850 is not hypothetical (I guess SoC #2 in this text). It is
possible to take set of patches from maillist, copy acpm DT node from
gs101 (minimal compatible rename) and it will be Exynos SoC with enabled
ACPM. I am also actively working on it. Hooks or whatever other way of
handling firmware error codes are required now.
We actually already know the differences: different error codes and
only 4 ACPM TMU calls are implemented on e850: TMU init, TMU read
temperature, TMU suspend/resume.
Extra TMU calls are fine since we can just do not use them from thermal
driver but hiding correct error codes, well, that's another story.
We already know this way of handling firmware codes is not ACPM TMU
generic.
The ACPM TMU part of gs101 ACPM firmware might be a vendor-specific fork.
We shouldn't assume it strictly adheres to the reference Samsung ACPM TMU
design.
Alexey
^ permalink raw reply
* Re: [PATCH v22 08/13] mfd: core: Add firmware-node support to MFD cells
From: Bartosz Golaszewski @ 2026-05-21 13:36 UTC (permalink / raw)
To: Lee Jones
Cc: Shivendra Pratap, Sebastian Reichel, Mark Rutland,
Lorenzo Pieralisi, Rafael J. Wysocki, Daniel Lezcano,
Christian Loehle, Ulf Hansson, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Bjorn Andersson, Konrad Dybcio, Arnd Bergmann,
Souvik Chakravarty, Andy Yan, Matthias Brugger, John Stultz,
Moritz Fischer, Sudeep Holla, linux-pm, linux-kernel,
linux-arm-msm, linux-arm-kernel, devicetree, Florian Fainelli,
Krzysztof Kozlowski, Dmitry Baryshkov, Mukesh Ojha, Andre Draszik,
Greg Kroah-Hartman, Kathiravan Thirumoorthy, Srinivas Kandagatla,
Bartosz Golaszewski
In-Reply-To: <20260521132419.GA3591266@google.com>
On Thu, May 21, 2026 at 3:24 PM Lee Jones <lee@kernel.org> wrote:
>
> >
> > I suggested it because of its flexibility. The alternative I had in
> > mind is something like a new field in mfd_cell:
> >
> > const char *cell_node_name;
> >
> > Which - if set - would tell MFD to look up an fwnode that's a child of
> > the parent device's node by name - as it may not have a compatible.
>
> Remind me why the chlid device can't look-up its own fwnode?
>
Oh sure it can, but should it? I'm not sure it's logically sound to
have the child device reach into the parent, look up the fwnode and
then assign it to itself after it's already attached to the driver.
This should be done at the subsystem level before the device is
registered.
Bart
^ permalink raw reply
* Re: [PATCH v2 1/3] KVM: arm64: Reset page order in pKVM hyp_pool_init
From: Fuad Tabba @ 2026-05-21 13:30 UTC (permalink / raw)
To: Vincent Donnefort
Cc: maz, oliver.upton, joey.gouly, suzuki.poulose, yuzenghui,
catalin.marinas, will, linux-arm-kernel, kvmarm, kernel-team,
qperret, Sashiko
In-Reply-To: <ag8GvtAonB6LNB5m@google.com>
On Thu, 21 May 2026 at 14:21, Vincent Donnefort <vdonnefort@google.com> wrote:
>
> On Thu, May 21, 2026 at 02:07:36PM +0100, Fuad Tabba wrote:
> > On Thu, 21 May 2026 at 11:22, Vincent Donnefort <vdonnefort@google.com> wrote:
> > >
> > > When a VM fails to initialise after its stage-2 hyp_pool has been
> > > initialised, that stage-2 must be torn down entirely. This requires
> > > resetting both the refcount and the order of its pages back to 0.
> > >
> > > Currently, reclaim_pgtable_pages() implicitly resets the page order by
> > > allocating the entire pool with order-0 granularity. However, in the VM
> > > initialisation error path, the addresses of the donated memory (the PGD)
> > > are already known, making it unnecessary to iterate over all pages in
> > > the pool.
> > >
> > > Since the vmemmap page order is a hyp_pool-specific field, leaving a
> > > non-zero order on hyp_pool destruction is harmless until another pool
> > > attempts to admit the page. Instead of resetting this field during
> > > destruction, reset it during pool initialization in hyp_pool_init().
> > > Note that pages added to the pool outside of the initial pool range
> > > (e.g., via guest_s2_zalloc_page()) must still have their order managed
> > > manually.
> > >
> > > While at it, add a WARN_ON() in the hyp_pool attach path to catch
> > > unexpected page orders that exceed the pool's max_order.
> > >
> > > Fixes: 256b4668cd89 ("KVM: arm64: Introduce separate hypercalls for pKVM VM reservation and initialization")
> > > Reported-by: Sashiko <sashiko-bot@kernel.org>
> > > Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
> > >
> > > diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > > index 25f04629014e..89eb20d4fee4 100644
> > > --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > > +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > > @@ -322,7 +322,6 @@ void reclaim_pgtable_pages(struct pkvm_hyp_vm *vm, struct kvm_hyp_memcache *mc)
> > > while (addr) {
> > > page = hyp_virt_to_page(addr);
> > > page->refcount = 0;
> > > - page->order = 0;
> > > push_hyp_memcache(mc, addr, hyp_virt_to_phys);
> > > WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(addr), 1));
> > > addr = hyp_alloc_pages(&vm->pool, 0);
> > > diff --git a/arch/arm64/kvm/hyp/nvhe/page_alloc.c b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
> > > index a1eb27a1a747..c3b3dc5a8ea7 100644
> > > --- a/arch/arm64/kvm/hyp/nvhe/page_alloc.c
> > > +++ b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
> > > @@ -97,6 +97,8 @@ static void __hyp_attach_page(struct hyp_pool *pool,
> > > u8 order = p->order;
> > > struct hyp_page *buddy;
> > >
> > > + WARN_ON(p->order > pool->max_order);
> > > +
> >
> > Could you add a brief comment? It took me a minute to figure out what this
> > catches. IIUC it's not attach's own input, it's a stale p->order from way back
> > when an external page was popped from a memcache (today only via
> > guest_s2_zalloc_page()). Right?
>
> I think it'd be self explanatory if that was next the page_add_to_list, but that
> wouldn't protect the memset (that's really best-effort though).
>
> How about?
>
> /*
> * A page with an order bigger than the pool's max is an 'external' page
> * whose order hasn't been reset before being added to the pool.
> */
>
> But now I am thinking I can do way better: we can easily identify external
> pages, so I could just force the order to 0 in that case.
>
> WDYS?
Yeah, Sounds better. The WARN's scope was actually narrower than the
real risk. Forcing order = 0 on entry covers all of that and removes
the implicit caller obligation the WARN was best-effort enforcing.
The memset is trivially safe then too (PAGE_SIZE << 0, regardless of
what was in the vmemmap).
Cheers,
/fuad
>
> >
> > With that.
> >
> > Reviewed-by: Fuad Tabba <tabba@google.com>
> > Tested-by: Fuad Tabba <tabba@google.com>
> >
> > Cheers,
> > /fuad
> >
> >
> >
> >
> > > memset(hyp_page_to_virt(p), 0, PAGE_SIZE << p->order);
> > >
> > > /* Skip coalescing for 'external' pages being freed into the pool. */
> > > @@ -237,8 +239,10 @@ int hyp_pool_init(struct hyp_pool *pool, u64 pfn, unsigned int nr_pages,
> > >
> > > /* Init the vmemmap portion */
> > > p = hyp_phys_to_page(phys);
> > > - for (i = 0; i < nr_pages; i++)
> > > + for (i = 0; i < nr_pages; i++) {
> > > hyp_set_page_refcounted(&p[i]);
> > > + p[i].order = 0;
> > > + }
> > >
> > > /* Attach the unused pages to the buddy tree */
> > > for (i = reserved_pages; i < nr_pages; i++)
> > > --
> > > 2.54.0.746.g67dd491aae-goog
> > >
^ permalink raw reply
* Re: [PATCH v14 07/44] arm64: RMI: Configure the RMM with the host's page size
From: Marc Zyngier @ 2026-05-21 13:30 UTC (permalink / raw)
To: Steven Price
Cc: kvm, kvmarm, Catalin Marinas, Will Deacon, James Morse,
Oliver Upton, Suzuki K Poulose, Zenghui Yu, linux-arm-kernel,
linux-kernel, Joey Gouly, Alexandru Elisei, Christoffer Dall,
Fuad Tabba, linux-coco, Ganapatrao Kulkarni, Gavin Shan,
Shanker Donthineni, Alper Gun, Aneesh Kumar K . V, Emi Kisanuki,
Vishal Annapurve, WeiLin.Chang, Lorenzo.Pieralisi2
In-Reply-To: <20260513131757.116630-8-steven.price@arm.com>
On Wed, 13 May 2026 14:17:15 +0100,
Steven Price <steven.price@arm.com> wrote:
>
> RMM v2.0 brings the ability to set the RMM's granule size. Check the
> feature registers and configure the RMM so that it matches the host's
> page size. This means that operations can be done with a granulatity
> equal to PAGE_SIZE.
>
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
> Changes since v13:
> * Moved out of KVM.
> ---
> arch/arm64/kernel/rmi.c | 42 +++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 42 insertions(+)
>
> diff --git a/arch/arm64/kernel/rmi.c b/arch/arm64/kernel/rmi.c
> index 99c1ccc35c11..a14ead5dedda 100644
> --- a/arch/arm64/kernel/rmi.c
> +++ b/arch/arm64/kernel/rmi.c
> @@ -49,6 +49,45 @@ static int rmi_check_version(void)
> return 0;
> }
>
> +static int rmi_configure(void)
> +{
> + struct rmm_config *config __free(free_page) = NULL;
> + unsigned long ret;
> +
> + config = (struct rmm_config *)get_zeroed_page(GFP_KERNEL);
> + if (!config)
> + return -ENOMEM;
This is the sort of buggy construct that is highlighted in
include/linux/cleanup.h: initialising the object for cleanup with
NULL, and only later assigning the expected value.
It may not matter here, but it will catch you (or more probably me) in
the future.
> +
> + switch (PAGE_SIZE) {
> + case SZ_4K:
> + config->rmi_granule_size = RMI_GRANULE_SIZE_4KB;
> + break;
> + case SZ_16K:
> + config->rmi_granule_size = RMI_GRANULE_SIZE_16KB;
> + break;
> + case SZ_64K:
> + config->rmi_granule_size = RMI_GRANULE_SIZE_64KB;
> + break;
> + default:
> + pr_err("Unsupported PAGE_SIZE for RMM\n");
Do you really anticipate PAGE_SIZE being any other value? This is 100%
dead code. If you want to be extra cautious, have a BUILD_BUg_ON().
> + return -EINVAL;
> + }
> +
> + ret = rmi_rmm_config_set(virt_to_phys(config));
> + if (ret) {
> + pr_err("RMM config set failed\n");
> + return -EINVAL;
> + }
What is the live cycle of the page when the call succeeds? Is it
switched back to the NS PAS and allowed to be freed?
> +
> + ret = rmi_rmm_activate();
> + if (ret) {
> + pr_err("RMM activate failed\n");
> + return -ENXIO;
> + }
> +
> + return 0;
> +}
> +
> static int __init arm64_init_rmi(void)
> {
> /* Continue without realm support if we can't agree on a version */
> @@ -60,6 +99,9 @@ static int __init arm64_init_rmi(void)
> if (WARN_ON(rmi_features(1, &rmm_feat_reg1)))
> return 0;
>
> + if (rmi_configure())
> + return 0;
> +
> return 0;
> }
> subsys_initcall(arm64_init_rmi);
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply
* [PATCH 14/18] arm64: fpsimd: Use opaque type for SME state
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
As the SME state size can vary at runtime, we don't have a concrete type
for the in-memory SME state, and pass this around using a pointer to
void.
Using pointer to void means that it's very easy to introduce errors that
cannot be caught by the compiler (e.g. as 'void **' can be assigned to
'void *').
Improve this by adding an opaque 'struct sve_state', and consistently
passing a pointer to this.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/fpsimd.h | 8 ++++----
arch/arm64/include/asm/processor.h | 3 ++-
arch/arm64/kernel/fpsimd.c | 4 ++--
3 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 19e670ae67598..560814acc60c0 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -163,7 +163,7 @@ extern void fpsimd_update_current_state(struct user_fpsimd_state const *state);
struct cpu_fp_state {
struct user_fpsimd_state *st;
struct sve_state *sve_state;
- void *sme_state;
+ struct sme_state *sme_state;
u64 *svcr;
u64 *fpmr;
unsigned int sve_vl;
@@ -199,7 +199,7 @@ static inline void *thread_zt_state(struct thread_struct *thread)
{
/* The ZT register state is stored immediately after the ZA state */
unsigned int sme_vq = sve_vq_from_vl(thread_get_sme_vl(thread));
- return thread->sme_state + ZA_SIG_REGS_SIZE(sme_vq);
+ return (void *)thread->sme_state + ZA_SIG_REGS_SIZE(sme_vq);
}
static inline unsigned int sve_get_vl(void)
@@ -218,8 +218,8 @@ static inline unsigned int sve_get_vl(void)
extern void sve_save_state(struct sve_state *state, int save_ffr);
extern void sve_load_state(const struct sve_state *state, int restore_ffr);
extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
-extern void sme_save_state(void *state, int zt);
-extern void sme_load_state(void const *state, int zt);
+extern void sme_save_state(struct sme_state *state, int zt);
+extern void sme_load_state(const struct sme_state *state, int zt);
struct arm64_cpu_capabilities;
extern void cpu_enable_fpsimd(const struct arm64_cpu_capabilities *__unused);
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 1c2ffd063baa8..7304d9cca3e85 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -131,6 +131,7 @@ enum fp_type {
};
struct sve_state; /* Opaque type */
+struct sme_state; /* Opaque type */
struct cpu_context {
unsigned long x19;
@@ -167,7 +168,7 @@ struct thread_struct {
enum fp_type fp_type; /* registers FPSIMD or SVE? */
unsigned int fpsimd_cpu;
struct sve_state *sve_state; /* SVE registers, if any */
- void *sme_state; /* ZA and ZT state, if any */
+ struct sme_state *sme_state; /* ZA and ZT state, if any */
unsigned int vl[ARM64_VEC_MAX]; /* vector length */
unsigned int vl_onexec[ARM64_VEC_MAX]; /* vl after next exec */
unsigned long fault_address; /* fault info */
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 66d880d081671..f9b3eeacf130d 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -808,7 +808,7 @@ static int change_live_vector_length(struct task_struct *task,
unsigned int sve_vl = task_get_sve_vl(task);
unsigned int sme_vl = task_get_sme_vl(task);
struct sve_state *sve_state = NULL;
- void *sme_state = NULL;
+ struct sme_state *sme_state = NULL;
if (type == ARM64_VEC_SME)
sme_vl = vl;
@@ -1645,7 +1645,7 @@ static void fpsimd_flush_thread_vl(enum vec_type type)
void fpsimd_flush_thread(void)
{
struct sve_state *sve_state = NULL;
- void *sme_state = NULL;
+ struct sme_state *sme_state = NULL;
if (!system_supports_fpsimd())
return;
--
2.30.2
^ permalink raw reply related
* [PATCH 18/18] arm64: fpsimd: Remove <asm/fpsimdmacros.h>
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
We no longer need any of the remaining macros in <asm/fpsimdmacros.h>.
Remove all of it.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/fpsimdmacros.h | 64 ---------------------------
1 file changed, 64 deletions(-)
delete mode 100644 arch/arm64/include/asm/fpsimdmacros.h
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
deleted file mode 100644
index a763fd03ffef3..0000000000000
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ /dev/null
@@ -1,64 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * FP/SIMD state saving and restoring macros
- *
- * Copyright (C) 2012 ARM Ltd.
- * Author: Catalin Marinas <catalin.marinas@arm.com>
- */
-
-#include <asm/assembler.h>
-
-/* Sanity-check macros to help avoid encoding garbage instructions */
-
-.macro _check_general_reg nr
- .if (\nr) < 0 || (\nr) > 30
- .error "Bad register number \nr."
- .endif
-.endm
-
-.macro _sve_check_zreg znr
- .if (\znr) < 0 || (\znr) > 31
- .error "Bad Scalable Vector Extension vector register number \znr."
- .endif
-.endm
-
-.macro _sve_check_preg pnr
- .if (\pnr) < 0 || (\pnr) > 15
- .error "Bad Scalable Vector Extension predicate register number \pnr."
- .endif
-.endm
-
-.macro _check_num n, min, max
- .if (\n) < (\min) || (\n) > (\max)
- .error "Number \n out of range [\min,\max]"
- .endif
-.endm
-
-.macro _sme_check_wv v
- .if (\v) < 12 || (\v) > 15
- .error "Bad vector select register \v."
- .endif
-.endm
-
-.macro __for from:req, to:req
- .if (\from) == (\to)
- _for__body %\from
- .else
- __for %\from, %((\from) + ((\to) - (\from)) / 2)
- __for %((\from) + ((\to) - (\from)) / 2 + 1), %\to
- .endif
-.endm
-
-.macro _for var:req, from:req, to:req, insn:vararg
- .macro _for__body \var:req
- .noaltmacro
- \insn
- .altmacro
- .endm
-
- .altmacro
- __for \from, \to
- .noaltmacro
-
- .purgem _for__body
-.endm
--
2.30.2
^ permalink raw reply related
* [PATCH 15/18] arm64: fpsimd: Move SVE save/restore inline
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
Currently the SVE register save/restore sequences are written in
out-of-line assembly routines. While this works, it's somewhat painful:
* As KVM needs to be able to use the sequences in hyp code, separate
assembly files are used for the regular kernel and KVM code. While the
common logic is shared in assembly macros, this still requires some
duplication, and has lead to some trivial divergence.
* As the SVE LDR/STR instrucitons have limited addressing modes, the
assembly macros use an awkward pattern requiring negative offsets.
This could be written more clearly with addresses being generated in C
code.
* As the FFR does not always exist in streaming mode, some awkward
conditional branching has been written in assembly which could be
clearer in C (and would permit the compiler to optimize out
unnecessary branches in some cases).
* For historical reasons, the assembly macros take some register
arguments as numerical indices (e.g. "sve_save 0, x1" uses x0 and x1),
which is simply confusing.
* For historical reasons, the SVE save/restore code and FPSIMD
save/restore code have a distinct sequences for FPSR and FPCR. Ideally
this logic would be shared.
* The assembly sequences can't be instrumented, and so it's harder than
necessary to catch memory safety issues.
To handle the above, move the SVE register save/restore sequences
to inline assembly.
Neither GCC nor LLVM instrument memory arguments to inline assembly, so
explicit instrumentation is added in the same manner as other assembly
routines. This instrumentation is implicitly disabled by Kbuild for nVHE
hyp code.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/fpsimd.h | 119 +++++++++++++++++++++++-
arch/arm64/include/asm/fpsimdmacros.h | 61 ------------
arch/arm64/include/asm/kvm_hyp.h | 3 -
arch/arm64/kernel/entry-fpsimd.S | 22 -----
arch/arm64/kvm/hyp/fpsimd.S | 21 -----
arch/arm64/kvm/hyp/include/hyp/switch.h | 4 +-
arch/arm64/kvm/hyp/nvhe/Makefile | 2 +-
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 4 +-
arch/arm64/kvm/hyp/vhe/Makefile | 2 +-
9 files changed, 123 insertions(+), 115 deletions(-)
delete mode 100644 arch/arm64/kvm/hyp/fpsimd.S
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 560814acc60c0..d005324bbcf3e 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -215,8 +215,123 @@ static inline unsigned int sve_get_vl(void)
return vl;
}
-extern void sve_save_state(struct sve_state *state, int save_ffr);
-extern void sve_load_state(const struct sve_state *state, int restore_ffr);
+#define FOR_EACH_Z_REG(idx_str, asm_str) \
+ " .irp " idx_str ",0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31\n" \
+ asm_str "\n" \
+ " .endr\n"
+
+#define FOR_EACH_P_REG(idx_str, asm_str) \
+ " .irp " idx_str ",0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15\n" \
+ asm_str "\n" \
+ " .endr\n"
+
+static inline void __sve_save_z(struct sve_state *state, unsigned long vl)
+{
+ instrument_write(state, SVE_NUM_ZREGS * vl);
+ asm volatile(
+ __SVE_PREAMBLE
+ FOR_EACH_Z_REG("n", "str z\\n, [%[zregs], #\\n, MUL VL]")
+ :
+ : [zregs] "r" (state)
+ : "memory"
+ );
+}
+
+static inline void __sve_load_z(const struct sve_state *state, unsigned long vl)
+{
+ instrument_read(state, SVE_NUM_ZREGS * vl);
+ asm volatile(
+ __SVE_PREAMBLE
+ FOR_EACH_Z_REG("n", "ldr z\\n, [%[zregs], #\\n, MUL VL]")
+ :
+ : [zregs] "r" (state)
+ : "memory"
+ );
+}
+
+static inline void __sve_save_p(struct sve_state *state, unsigned long vl, bool ffr)
+{
+ void *pregs = (void *)state + SVE_NUM_ZREGS * vl;
+ unsigned long pl = vl / 8;
+ void *pffr = pregs + SVE_NUM_PREGS * pl;
+
+ instrument_write(pregs, SVE_NUM_PREGS * pl);
+ asm volatile(
+ __SVE_PREAMBLE
+ FOR_EACH_P_REG("n", "str p\\n, [%[pregs], #\\n, MUL VL]\n")
+ :
+ : [pregs] "r" (pregs)
+ : "memory"
+ );
+
+ instrument_write(pffr, pl);
+ if (ffr) {
+ asm volatile(
+ __SVE_PREAMBLE
+ " rdffr p0.b\n"
+ " str p0, [%[pffr]]\n"
+ " ldr p0, [%[pregs]]\n"
+ :
+ : [pregs] "r" (pregs),
+ [pffr] "r" (pffr)
+ : "memory"
+ );
+ } else {
+ asm volatile(
+ __SVE_PREAMBLE
+ " pfalse p0.b\n"
+ " str p0, [%[pffr]]\n"
+ " ldr p0, [%[pregs]]\n"
+ :
+ : [pregs] "r" (pregs),
+ [pffr] "r" (pffr)
+ : "memory"
+ );
+ }
+}
+
+static inline void __sve_load_p(const struct sve_state *state, unsigned long vl, bool ffr)
+{
+ const void *pregs = (const void *)state + SVE_NUM_ZREGS * vl;
+ unsigned long pl = vl / 8;
+ const void *pffr = pregs + SVE_NUM_PREGS * pl;
+
+ if (ffr) {
+ instrument_read(pffr, pl);
+ asm volatile(
+ __SVE_PREAMBLE
+ " ldr p0, [%[pffr]]\n"
+ " wrffr p0.b\n"
+ :
+ : [pffr] "r" (pffr)
+ : "memory"
+ );
+ }
+
+ instrument_read(pregs, SVE_NUM_PREGS * pl);
+ asm volatile(
+ __SVE_PREAMBLE
+ FOR_EACH_P_REG("n", "ldr p\\n, [%[pregs], #\\n, MUL VL]\n")
+ :
+ : [pregs] "r" (pregs)
+ : "memory"
+ );
+}
+
+static inline void sve_save_state(struct sve_state *state, bool ffr)
+{
+ unsigned long vl = sve_get_vl();
+ __sve_save_z(state, vl);
+ __sve_save_p(state, vl, ffr);
+}
+
+static inline void sve_load_state(const struct sve_state *state, bool ffr)
+{
+ unsigned long vl = sve_get_vl();
+ __sve_load_z(state, vl);
+ __sve_load_p(state, vl, ffr);
+}
+
extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
extern void sme_save_state(struct sme_state *state, int zt);
extern void sme_load_state(const struct sme_state *state, int zt);
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index 08f4863e67715..ebf8b47313e90 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -42,36 +42,6 @@
/* Deprecated macros for SVE instructions */
-/* STR (vector): STR Z\nz, [X\nxbase, #\offset, MUL VL] */
-.macro _sve_str_v nz, nxbase, offset=0
- .arch_extension sve
- str z\nz, [X\nxbase, #\offset, MUL VL]
-.endm
-
-/* LDR (vector): LDR Z\nz, [X\nxbase, #\offset, MUL VL] */
-.macro _sve_ldr_v nz, nxbase, offset=0
- .arch_extension sve
- ldr z\nz, [X\nxbase, #\offset, MUL VL]
-.endm
-
-/* STR (predicate): STR P\np, [X\nxbase, #\offset, MUL VL] */
-.macro _sve_str_p np, nxbase, offset=0
- .arch_extension sve
- str p\np, [X\nxbase, #\offset, MUL VL]
-.endm
-
-/* LDR (predicate): LDR P\np, [X\nxbase, #\offset, MUL VL] */
-.macro _sve_ldr_p np, nxbase, offset=0
- .arch_extension sve
- ldr p\np, [x\nxbase, #\offset, MUL VL]
-.endm
-
-/* RDFFR (unpredicated): RDFFR P\np.B */
-.macro _sve_rdffr np
- .arch_extension sve
- rdffr p\np\().b
-.endm
-
/* WRFFR P\np.B */
.macro _sve_wrffr np
wrffr p\np\().b
@@ -176,37 +146,6 @@
_sve_wrffr 0
.endm
-.macro _sve_pffr ptr
- .arch_extension sve
- addvl \ptr, \ptr, #16
- addvl \ptr, \ptr, #16
- addpl \ptr, \ptr, #16
-.endm
-
-.macro sve_save nxbase, save_ffr
- _sve_pffr x\nxbase
- _for n, 0, 31, _sve_str_v \n, \nxbase, \n - 34
- _for n, 0, 15, _sve_str_p \n, \nxbase, \n - 16
- cbz \save_ffr, 921f
- _sve_rdffr 0
- b 922f
-921:
- _sve_pfalse 0 // Zero out FFR
-922:
- _sve_str_p 0, \nxbase
- _sve_ldr_p 0, \nxbase, -16
-.endm
-
-.macro sve_load nxbase, restore_ffr
- _sve_pffr x\nxbase
- _for n, 0, 31, _sve_ldr_v \n, \nxbase, \n - 34
- cbz \restore_ffr, 921f
- _sve_ldr_p 0, \nxbase
- _sve_wrffr 0
-921:
- _for n, 0, 15, _sve_ldr_p \n, \nxbase, \n - 16
-.endm
-
.macro sme_save_za nxbase, xvl, nw
mov w\nw, #0
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 38356eee592ad..ad19de1d0654f 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -121,9 +121,6 @@ void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu);
void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu);
#endif
-void __sve_save_state(struct sve_state *sve, int save_ffr);
-void __sve_restore_state(struct sve_state *sve, int restore_ffr);
-
u64 __guest_enter(struct kvm_vcpu *vcpu);
bool kvm_host_psci_handler(struct kvm_cpu_context *host_ctxt, u32 func_id);
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 4fa00c94f28b7..0575d90e6dffb 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -13,28 +13,6 @@
#ifdef CONFIG_ARM64_SVE
-/*
- * Save the SVE state
- *
- * x0 - pointer to buffer for state
- * x1 - Save FFR if non-zero
- */
-SYM_FUNC_START(sve_save_state)
- sve_save 0, x1
- ret
-SYM_FUNC_END(sve_save_state)
-
-/*
- * Load the SVE state
- *
- * x0 - pointer to buffer for state
- * x1 - Restore FFR if non-zero
- */
-SYM_FUNC_START(sve_load_state)
- sve_load 0, x1
- ret
-SYM_FUNC_END(sve_load_state)
-
/*
* Zero all SVE registers but the first 128-bits of each vector
*
diff --git a/arch/arm64/kvm/hyp/fpsimd.S b/arch/arm64/kvm/hyp/fpsimd.S
deleted file mode 100644
index beacec33b2541..0000000000000
--- a/arch/arm64/kvm/hyp/fpsimd.S
+++ /dev/null
@@ -1,21 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * Copyright (C) 2015 - ARM Ltd
- * Author: Marc Zyngier <marc.zyngier@arm.com>
- */
-
-#include <linux/linkage.h>
-
-#include <asm/fpsimdmacros.h>
-
- .text
-
-SYM_FUNC_START(__sve_restore_state)
- sve_load 0, x1
- ret
-SYM_FUNC_END(__sve_restore_state)
-
-SYM_FUNC_START(__sve_save_state)
- sve_save 0, x1
- ret
-SYM_FUNC_END(__sve_save_state)
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 72e658255cda7..41c60c9eea423 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -467,7 +467,7 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
* vCPU. Start off with the max VL so we can load the SVE state.
*/
sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
- __sve_restore_state(kern_hyp_va(vcpu->arch.sve_state), true);
+ sve_load_state(kern_hyp_va(vcpu->arch.sve_state), true);
fpsimd_load_common(&vcpu->arch.ctxt.fp_regs);
/*
@@ -488,7 +488,7 @@ static inline void __hyp_sve_save_host(void)
ctxt_sys_reg(hctxt, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);
write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
- __sve_save_state(sve_regs, true);
+ sve_save_state(sve_regs, true);
fpsimd_save_common(&hctxt->fp_regs);
}
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 62cdfbff75625..f57450ebcb498 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -26,7 +26,7 @@ hyp-obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o
hyp-main.o hyp-smp.o psci-relay.o early_alloc.o page_alloc.o \
cache.o setup.o mm.o mem_protect.o sys_regs.o pkvm.o stacktrace.o ffa.o
hyp-obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
- ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o ../vgic-v5-sr.o
+ ../hyp-entry.o ../exception.o ../pgtable.o ../vgic-v5-sr.o
hyp-obj-y += ../../../kernel/smccc-call.o
hyp-obj-$(CONFIG_LIST_HARDENED) += list_debug.o
hyp-obj-$(CONFIG_NVHE_EL2_TRACING) += clock.o trace.o events.o
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 72d025b2178a7..5c43943f24380 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -35,7 +35,7 @@ static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu)
* on the VL, so use a consistent (i.e., the maximum) guest VL.
*/
sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
- __sve_save_state(kern_hyp_va(vcpu->arch.sve_state), true);
+ sve_save_state(kern_hyp_va(vcpu->arch.sve_state), true);
fpsimd_save_common(&vcpu->arch.ctxt.fp_regs);
write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
}
@@ -55,7 +55,7 @@ static void __hyp_sve_restore_host(void)
* need to be revisited.
*/
write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
- __sve_restore_state(sve_regs, true);
+ sve_load_state(sve_regs, true);
fpsimd_load_common(&hctxt->fp_regs);
write_sysreg_el1(ctxt_sys_reg(hctxt, ZCR_EL1), SYS_ZCR);
}
diff --git a/arch/arm64/kvm/hyp/vhe/Makefile b/arch/arm64/kvm/hyp/vhe/Makefile
index 9695328bbd96e..d6b3475145c0e 100644
--- a/arch/arm64/kvm/hyp/vhe/Makefile
+++ b/arch/arm64/kvm/hyp/vhe/Makefile
@@ -10,4 +10,4 @@ CFLAGS_switch.o += -Wno-override-init
obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o
obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
- ../fpsimd.o ../hyp-entry.o ../exception.o ../vgic-v5-sr.o
+ ../hyp-entry.o ../exception.o ../vgic-v5-sr.o
--
2.30.2
^ permalink raw reply related
* [PATCH 16/18] arm64: fpsimd: Move sve_flush_live() inline
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
Currently sve_flush_live() is written in out-of-line assembly. It would
be nice if we could move it inline such that control flow can be written
more clearly in C, and to permit the removal of otherwise unused
assembly macros.
The 'flush_ffr' argument is redundant as sve_flush_live() is always
called from non-streaming mode, and all callers pass 'true'. Remove the
argument and make it a requirement that the function is called from
non-streaming mode.
The 'vq_minus_1' argument is unnecessary, as sve_flush_live() can read
the live VL directly using the RDVL instruction (wrapped by the
sve_get_vl() helper function).
Move the function to C, with the simplifications above.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/fpsimd.h | 26 +++++++++++++++++++++++-
arch/arm64/include/asm/fpsimdmacros.h | 29 ---------------------------
arch/arm64/kernel/entry-common.c | 8 ++------
arch/arm64/kernel/entry-fpsimd.S | 22 --------------------
arch/arm64/kernel/fpsimd.c | 2 +-
5 files changed, 28 insertions(+), 59 deletions(-)
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index d005324bbcf3e..550987b36206a 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -332,7 +332,31 @@ static inline void sve_load_state(const struct sve_state *state, bool ffr)
__sve_load_p(state, vl, ffr);
}
-extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
+
+/*
+ * Zero all SVE registers except for the first 128 bits of each vector.
+ *
+ * The caller must ensure that the VL has been configured and the CPU must be
+ * in non-streaming mode.
+ */
+static inline void sve_flush_live(void)
+{
+ unsigned long vl = sve_get_vl();
+
+ if (vl > sizeof(__uint128_t)) {
+ asm volatile(
+ __FPSIMD_PREAMBLE
+ FOR_EACH_Z_REG("n", "mov v\\n\\().16b, v\\n\\().16b")
+ );
+ }
+
+ asm volatile(
+ __SVE_PREAMBLE
+ FOR_EACH_P_REG("n", "pfalse p\\n\\().b")
+ " wrffr p0.b\n"
+ );
+}
+
extern void sme_save_state(struct sme_state *state, int zt);
extern void sme_load_state(const struct sme_state *state, int zt);
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index ebf8b47313e90..9e352b5c6b764 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -40,19 +40,6 @@
.endif
.endm
-/* Deprecated macros for SVE instructions */
-
-/* WRFFR P\np.B */
-.macro _sve_wrffr np
- wrffr p\np\().b
-.endm
-
-/* PFALSE P\np.B */
-.macro _sve_pfalse np
- .arch_extension sve
- pfalse p\np\().b
-.endm
-
/* Deprecated macros for SME instructions */
/* RDSVL X\nx, #\imm */
@@ -130,22 +117,6 @@
.purgem _for__body
.endm
-/* Preserve the first 128-bits of Znz and zero the rest. */
-.macro _sve_flush_z nz
- _sve_check_zreg \nz
- mov v\nz\().16b, v\nz\().16b
-.endm
-
-.macro sve_flush_z
- _for n, 0, 31, _sve_flush_z \n
-.endm
-.macro sve_flush_p
- _for n, 0, 15, _sve_pfalse \n
-.endm
-.macro sve_flush_ffr
- _sve_wrffr 0
-.endm
-
.macro sme_save_za nxbase, xvl, nw
mov w\nw, #0
diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index cb54335465f66..2352297330e12 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -237,12 +237,8 @@ static inline void fpsimd_syscall_enter(void)
if (!system_supports_sve())
return;
- if (test_thread_flag(TIF_SVE)) {
- unsigned int sve_vq_minus_one;
-
- sve_vq_minus_one = sve_vq_from_vl(task_get_sve_vl(current)) - 1;
- sve_flush_live(true, sve_vq_minus_one);
- }
+ if (test_thread_flag(TIF_SVE))
+ sve_flush_live();
/*
* Any live non-FPSIMD SVE state has been zeroed. Allow
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 0575d90e6dffb..bff941eea9566 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -11,28 +11,6 @@
#include <asm/assembler.h>
#include <asm/fpsimdmacros.h>
-#ifdef CONFIG_ARM64_SVE
-
-/*
- * Zero all SVE registers but the first 128-bits of each vector
- *
- * VQ must already be configured by caller, any further updates of VQ
- * will need to ensure that the register state remains valid.
- *
- * x0 = include FFR?
- * x1 = VQ - 1
- */
-SYM_FUNC_START(sve_flush_live)
- cbz x1, 1f // A VQ-1 of 0 is 128 bits so no extra Z state
- sve_flush_z
-1: sve_flush_p
- tbz x0, #0, 2f
- sve_flush_ffr
-2: ret
-SYM_FUNC_END(sve_flush_live)
-
-#endif /* CONFIG_ARM64_SVE */
-
#ifdef CONFIG_ARM64_SME
/*
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index f9b3eeacf130d..42177b439b3c7 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1338,7 +1338,7 @@ void do_sve_acc(unsigned long esr, struct pt_regs *regs)
if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
unsigned long vq = sve_vq_from_vl(task_get_sve_vl(current));
sysreg_clear_set_s(SYS_ZCR_EL1, ZCR_ELx_LEN, vq - 1);
- sve_flush_live(true, vq - 1);
+ sve_flush_live();
fpsimd_bind_task_to_cpu();
} else {
fpsimd_to_sve(current);
--
2.30.2
^ permalink raw reply related
* [PATCH 13/18] arm64: fpsimd: Use opaque type for SVE state
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
As the SVE state size can vary at runtime, we don't have a concrete type
for the in-memory SVE state, and pass this around using a pointer to
void. The functions which save/restore the SVE state have a very unusual
calling convention, expecting a pointer to the FFR *in the middle of*
the in-memory SVE state, which is also passed as a pointer to void.
Passing a pointer to the FFR also requires that callers find the live VL
and perform some arithmetic, which callers implement differently.
Using pointer to void means that it's very easy to introduce errors that
cannot be caught by the compiler (e.g. as 'void **' can be assigned to
'void *'). In general this is unnecessarily confusing and fragile.
Improve this by adding an opaque 'struct sve_state', and consistently
passing a pointer to this, performing the necessary offsetting *within*
the save/restore functions.
For the moment, the offsetting is performed in a new '_sve_pffr'
assembly macro, using the ADDVL and ADDPL instructions. These add a
multiple of the live vector length and predicate length respectively.
The ADDVL immediate range cannot encode 32, so this is split into two
increments of 16.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/fpsimd.h | 24 +++---------------------
arch/arm64/include/asm/fpsimdmacros.h | 9 +++++++++
arch/arm64/include/asm/kvm_host.h | 8 ++------
arch/arm64/include/asm/kvm_hyp.h | 4 ++--
arch/arm64/include/asm/processor.h | 4 +++-
arch/arm64/kernel/fpsimd.c | 21 ++++++++++-----------
arch/arm64/kvm/arm.c | 4 ++--
arch/arm64/kvm/guest.c | 4 ++--
arch/arm64/kvm/hyp/include/hyp/switch.h | 8 +++-----
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 7 +++----
arch/arm64/kvm/hyp/nvhe/setup.c | 2 +-
11 files changed, 40 insertions(+), 55 deletions(-)
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 19b373ad0ebf7..19e670ae67598 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -162,7 +162,7 @@ extern void fpsimd_update_current_state(struct user_fpsimd_state const *state);
struct cpu_fp_state {
struct user_fpsimd_state *st;
- void *sve_state;
+ struct sve_state *sve_state;
void *sme_state;
u64 *svcr;
u64 *fpmr;
@@ -195,24 +195,6 @@ extern void task_smstop_sm(struct task_struct *task);
/* Maximum VL that SVE/SME VL-agnostic software can transparently support */
#define VL_ARCH_MAX 0x100
-/* Offset of FFR in the SVE register dump */
-static inline size_t sve_ffr_offset(int vl)
-{
- return SVE_SIG_FFR_OFFSET(sve_vq_from_vl(vl)) - SVE_SIG_REGS_OFFSET;
-}
-
-static inline void *sve_pffr(struct thread_struct *thread)
-{
- unsigned int vl;
-
- if (system_supports_sme() && thread_sm_enabled(thread))
- vl = thread_get_sme_vl(thread);
- else
- vl = thread_get_sve_vl(thread);
-
- return (char *)thread->sve_state + sve_ffr_offset(vl);
-}
-
static inline void *thread_zt_state(struct thread_struct *thread)
{
/* The ZT register state is stored immediately after the ZA state */
@@ -233,8 +215,8 @@ static inline unsigned int sve_get_vl(void)
return vl;
}
-extern void sve_save_state(void *state, int save_ffr);
-extern void sve_load_state(void const *state, int restore_ffr);
+extern void sve_save_state(struct sve_state *state, int save_ffr);
+extern void sve_load_state(const struct sve_state *state, int restore_ffr);
extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
extern void sme_save_state(void *state, int zt);
extern void sme_load_state(void const *state, int zt);
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index 01b5e6d51ba79..08f4863e67715 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -176,7 +176,15 @@
_sve_wrffr 0
.endm
+.macro _sve_pffr ptr
+ .arch_extension sve
+ addvl \ptr, \ptr, #16
+ addvl \ptr, \ptr, #16
+ addpl \ptr, \ptr, #16
+.endm
+
.macro sve_save nxbase, save_ffr
+ _sve_pffr x\nxbase
_for n, 0, 31, _sve_str_v \n, \nxbase, \n - 34
_for n, 0, 15, _sve_str_p \n, \nxbase, \n - 16
cbz \save_ffr, 921f
@@ -190,6 +198,7 @@
.endm
.macro sve_load nxbase, restore_ffr
+ _sve_pffr x\nxbase
_for n, 0, 31, _sve_ldr_v \n, \nxbase, \n - 34
cbz \restore_ffr, 921f
_sve_ldr_p 0, \nxbase
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index ae24617380b8f..a366509c5944e 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -759,7 +759,7 @@ struct kvm_host_data {
* Hyp VA.
* sve_regs is only used in pKVM and if system_supports_sve().
*/
- u8 *sve_regs;
+ struct sve_state *sve_regs;
/* Ownership of the FP regs */
enum {
@@ -853,7 +853,7 @@ struct kvm_vcpu_arch {
* floating point code saves the register state of a task it
* records which view it saved in fp_type.
*/
- void *sve_state;
+ struct sve_state *sve_state;
enum fp_type fp_type;
unsigned int sve_max_vl;
@@ -1097,10 +1097,6 @@ struct kvm_vcpu_arch {
#define NESTED_SERROR_PENDING __vcpu_single_flag(sflags, BIT(8))
-/* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
-#define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) + \
- sve_ffr_offset((vcpu)->arch.sve_max_vl))
-
#define vcpu_sve_max_vq(vcpu) sve_vq_from_vl((vcpu)->arch.sve_max_vl)
#define vcpu_sve_zcr_elx(vcpu) \
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 8c4602c8f4356..38356eee592ad 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -121,8 +121,8 @@ void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu);
void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu);
#endif
-void __sve_save_state(void *sve, int save_ffr);
-void __sve_restore_state(void *sve, int restore_ffr);
+void __sve_save_state(struct sve_state *sve, int save_ffr);
+void __sve_restore_state(struct sve_state *sve, int restore_ffr);
u64 __guest_enter(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index e30c4c8e3a7a7..1c2ffd063baa8 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -130,6 +130,8 @@ enum fp_type {
FP_STATE_SVE,
};
+struct sve_state; /* Opaque type */
+
struct cpu_context {
unsigned long x19;
unsigned long x20;
@@ -164,7 +166,7 @@ struct thread_struct {
enum fp_type fp_type; /* registers FPSIMD or SVE? */
unsigned int fpsimd_cpu;
- void *sve_state; /* SVE registers, if any */
+ struct sve_state *sve_state; /* SVE registers, if any */
void *sme_state; /* ZA and ZT state, if any */
unsigned int vl[ARM64_VEC_MAX]; /* vector length */
unsigned int vl_onexec[ARM64_VEC_MAX]; /* vl after next exec */
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 9806fea8fea7c..66d880d081671 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -425,8 +425,7 @@ static void task_fpsimd_load(void)
if (restore_sve_regs) {
WARN_ON_ONCE(current->thread.fp_type != FP_STATE_SVE);
- sve_load_state(sve_pffr(¤t->thread),
- restore_ffr);
+ sve_load_state(current->thread.sve_state, restore_ffr);
fpsimd_load_common(¤t->thread.uw.fpsimd_state);
} else {
WARN_ON_ONCE(current->thread.fp_type != FP_STATE_FPSIMD);
@@ -507,9 +506,7 @@ static void fpsimd_save_user_state(void)
return;
}
- sve_save_state((char *)last->sve_state +
- sve_ffr_offset(vl),
- save_ffr);
+ sve_save_state(last->sve_state, save_ffr);
fpsimd_save_common(last->st);
*last->fp_type = FP_STATE_SVE;
} else {
@@ -641,7 +638,8 @@ static __uint128_t arm64_cpu_to_le128(__uint128_t x)
#define arm64_le128_to_cpu(x) arm64_cpu_to_le128(x)
-static void __fpsimd_to_sve(void *sst, struct user_fpsimd_state const *fst,
+static void __fpsimd_to_sve(struct sve_state *sst,
+ struct user_fpsimd_state const *fst,
unsigned int vq)
{
unsigned int i;
@@ -668,7 +666,7 @@ static void __fpsimd_to_sve(void *sst, struct user_fpsimd_state const *fst,
static inline void fpsimd_to_sve(struct task_struct *task)
{
unsigned int vq;
- void *sst = task->thread.sve_state;
+ struct sve_state *sst = task->thread.sve_state;
struct user_fpsimd_state const *fst = &task->thread.uw.fpsimd_state;
if (!system_supports_sve() && !system_supports_sme())
@@ -692,7 +690,7 @@ static inline void fpsimd_to_sve(struct task_struct *task)
static inline void sve_to_fpsimd(struct task_struct *task)
{
unsigned int vq, vl;
- void const *sst = task->thread.sve_state;
+ const struct sve_state *sst = task->thread.sve_state;
struct user_fpsimd_state *fst = &task->thread.uw.fpsimd_state;
unsigned int i;
__uint128_t const *p;
@@ -791,7 +789,7 @@ void fpsimd_sync_from_effective_state(struct task_struct *task)
void fpsimd_sync_to_effective_state_zeropad(struct task_struct *task)
{
unsigned int vq;
- void *sst = task->thread.sve_state;
+ struct sve_state *sst = task->thread.sve_state;
struct user_fpsimd_state const *fst = &task->thread.uw.fpsimd_state;
if (task->thread.fp_type != FP_STATE_SVE)
@@ -809,7 +807,8 @@ static int change_live_vector_length(struct task_struct *task,
{
unsigned int sve_vl = task_get_sve_vl(task);
unsigned int sme_vl = task_get_sme_vl(task);
- void *sve_state = NULL, *sme_state = NULL;
+ struct sve_state *sve_state = NULL;
+ void *sme_state = NULL;
if (type == ARM64_VEC_SME)
sme_vl = vl;
@@ -1645,7 +1644,7 @@ static void fpsimd_flush_thread_vl(enum vec_type type)
void fpsimd_flush_thread(void)
{
- void *sve_state = NULL;
+ struct sve_state *sve_state = NULL;
void *sme_state = NULL;
if (!system_supports_fpsimd())
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index f9fc85a0344e1..7a3db4d7dcdef 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -2499,7 +2499,7 @@ static void __init teardown_hyp_mode(void)
continue;
if (free_sve) {
- u8 *sve_regs;
+ struct sve_state *sve_regs;
sve_regs = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs;
free_pages((unsigned long) sve_regs, pkvm_host_sve_state_order());
@@ -2648,7 +2648,7 @@ static void finalize_init_hyp_mode(void)
if (system_supports_sve() && is_protected_kvm_enabled()) {
for_each_possible_cpu(cpu) {
- u8 *sve_regs;
+ struct sve_state *sve_regs;
sve_regs = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs;
per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs =
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 332c453b87cf8..b01d6622b8720 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -500,7 +500,7 @@ static int get_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
if (!kvm_arm_vcpu_sve_finalized(vcpu))
return -EPERM;
- if (copy_to_user(uptr, vcpu->arch.sve_state + region.koffset,
+ if (copy_to_user(uptr, (void *)vcpu->arch.sve_state + region.koffset,
region.klen) ||
clear_user(uptr + region.klen, region.upad))
return -EFAULT;
@@ -526,7 +526,7 @@ static int set_sve_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
if (!kvm_arm_vcpu_sve_finalized(vcpu))
return -EPERM;
- if (copy_from_user(vcpu->arch.sve_state + region.koffset, uptr,
+ if (copy_from_user((void *)vcpu->arch.sve_state + region.koffset, uptr,
region.klen))
return -EFAULT;
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index aaa43554fd8e6..72e658255cda7 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -467,8 +467,7 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
* vCPU. Start off with the max VL so we can load the SVE state.
*/
sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
- __sve_restore_state(vcpu_sve_pffr(vcpu),
- true);
+ __sve_restore_state(kern_hyp_va(vcpu->arch.sve_state), true);
fpsimd_load_common(&vcpu->arch.ctxt.fp_regs);
/*
@@ -485,12 +484,11 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
static inline void __hyp_sve_save_host(void)
{
struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
- u8 *sve_regs = *host_data_ptr(sve_regs);
+ struct sve_state *sve_regs = *host_data_ptr(sve_regs);
ctxt_sys_reg(hctxt, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);
write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
- __sve_save_state(sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
- true);
+ __sve_save_state(sve_regs, true);
fpsimd_save_common(&hctxt->fp_regs);
}
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 627762ed7327f..72d025b2178a7 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -35,7 +35,7 @@ static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu)
* on the VL, so use a consistent (i.e., the maximum) guest VL.
*/
sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
- __sve_save_state(vcpu_sve_pffr(vcpu), true);
+ __sve_save_state(kern_hyp_va(vcpu->arch.sve_state), true);
fpsimd_save_common(&vcpu->arch.ctxt.fp_regs);
write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
}
@@ -43,7 +43,7 @@ static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu)
static void __hyp_sve_restore_host(void)
{
struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
- u8 *sve_regs = *host_data_ptr(sve_regs);
+ struct sve_state *sve_regs = *host_data_ptr(sve_regs);
/*
* On saving/restoring host sve state, always use the maximum VL for
@@ -55,8 +55,7 @@ static void __hyp_sve_restore_host(void)
* need to be revisited.
*/
write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
- __sve_restore_state(sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
- true);
+ __sve_restore_state(sve_regs, true);
fpsimd_load_common(&hctxt->fp_regs);
write_sysreg_el1(ctxt_sys_reg(hctxt, ZCR_EL1), SYS_ZCR);
}
diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index cdaf53c833409..77dbcfed05486 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -82,7 +82,7 @@ static int pkvm_create_host_sve_mappings(void)
for (i = 0; i < hyp_nr_cpus; i++) {
struct kvm_host_data *host_data = per_cpu_ptr(&kvm_host_data, i);
- u8 *sve_regs = host_data->sve_regs;
+ struct sve_state *sve_regs = host_data->sve_regs;
start = kern_hyp_va(sve_regs);
end = start + PAGE_ALIGN(pkvm_host_sve_state_size());
--
2.30.2
^ permalink raw reply related
* [PATCH 17/18] arm64: fpsimd: Move SME save/restore inline
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
Currently the SVE register save/restore sequences are written in
out-of-line assembly routines. While this works, it's somewhat painful:
* For KVM to use the sequences, portions of the logic will need to be
duplicated in KVM hyp code. While the common logic can be shared in
assembly macros, this is very likely to lead to unnecessary divergence
and be a maintenance burden.
* For historical reasons, the assembly macros take some register
arguments as numerical indices (e.g. "sme_save_za 0, x2, 12" uses x0, x1, and
x12), which is simply confusing.
* Address generation and control flow are far clearer in C than in
assembly.
* The assembly sequences can't be instrumented, and so it's harder than
necessary to catch memory safety issues.
To handle the above, move the SME register save/restore sequences
to inline assembly.
Neither GCC nor LLVM instrument memory arguments to inline assembly, so
explicit instrumentation is added in the same manner as other assembly
routines. This instrumentation is implicitly disabled by Kbuild for nVHE
hyp code.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/fpsimd.h | 100 +++++++++++++++++++++++++-
arch/arm64/include/asm/fpsimdmacros.h | 76 --------------------
arch/arm64/kernel/Makefile | 2 +-
arch/arm64/kernel/entry-fpsimd.S | 48 -------------
4 files changed, 98 insertions(+), 128 deletions(-)
delete mode 100644 arch/arm64/kernel/entry-fpsimd.S
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 550987b36206a..12f222f64b8d5 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -357,9 +357,6 @@ static inline void sve_flush_live(void)
);
}
-extern void sme_save_state(struct sme_state *state, int zt);
-extern void sme_load_state(const struct sme_state *state, int zt);
-
struct arm64_cpu_capabilities;
extern void cpu_enable_fpsimd(const struct arm64_cpu_capabilities *__unused);
extern void cpu_enable_sve(const struct arm64_cpu_capabilities *__unused);
@@ -639,6 +636,100 @@ static inline size_t __sme_state_size(unsigned int sme_vl)
return size;
}
+static inline void __sme_save_za(struct sme_state *state, unsigned long svl)
+{
+ /* The <Wv> argument to STR (array vector) can only encode W12-W15 */
+ register unsigned long v asm ("12");
+
+ instrument_write(state, svl * svl);
+ for (v = 0; v < svl; v++) {
+ void *pav = (void *)state + v * svl;
+
+ asm volatile(
+ __SME_PREAMBLE
+ " str za[%w[v], #0], [%[pav]]\n"
+ :
+ : [v] "r" (v),
+ [pav] "r" (pav)
+ : "memory"
+ );
+ }
+}
+
+static inline void __sme_load_za(struct sme_state *state, unsigned long svl)
+{
+ /* The <Wv> argument to LDR (array vector) can only encode W12-W15 */
+ register unsigned long v asm ("12");
+
+ instrument_read(state, svl * svl);
+ for (v = 0; v < svl; v++) {
+ void *pav = (void *)state + v * svl;
+
+ asm volatile(
+ __SME_PREAMBLE
+ " ldr za[%w[v], #0], [%[pav]]\n"
+ :
+ : [v] "r" (v),
+ [pav] "r" (pav)
+ : "memory"
+ );
+ }
+}
+
+static inline void __sme_save_zt(struct sme_state *state, unsigned long svl)
+{
+ void *pzt = (void *)state + svl * svl;
+
+ instrument_write(pzt, svl);
+ asm volatile(
+ __DEFINE_ASM_GPR_NUMS
+ /*
+ * STR ZT0, [<Xn|SP>]
+ * Supported by binutils 2.41+.
+ * Supported by LLVM 16+
+ */
+ " .inst 0xe13f8000 | ((.L__gpr_num_%[pzt]) << 5)\n"
+ :
+ : [pzt] "r" (pzt)
+ : "memory");
+}
+
+static inline void __sme_load_zt(const struct sme_state *state, unsigned long svl)
+{
+ void *pzt = (void *)state + svl * svl;
+
+ instrument_read(pzt, svl);
+ asm volatile(
+ __DEFINE_ASM_GPR_NUMS
+ /*
+ * LDR ZT0, [<Xn|SP>]
+ * Supported by binutils 2.41+.
+ * Supported by LLVM 16+
+ */
+ " .inst 0xe11f8000 | ((.L__gpr_num_%[pzt]) << 5)\n"
+ :
+ : [pzt] "r" (pzt)
+ : "memory");
+}
+
+static inline void sme_save_state(struct sme_state *state, bool zt)
+{
+ unsigned long svl = sme_get_vl();
+
+ __sme_save_za(state, svl);
+ if (zt)
+ __sme_save_zt(state, svl);
+}
+
+static inline void sme_load_state(struct sme_state *state, bool zt)
+{
+ unsigned long svl = sme_get_vl();
+
+ __sme_load_za(state, svl);
+ if (zt)
+ __sme_load_zt(state, svl);
+}
+
/*
* Return how many bytes of memory are required to store the full SME
* specific state for task, given task's currently configured vector
@@ -695,6 +786,9 @@ static inline size_t sme_state_size(struct task_struct const *task)
return 0;
}
+static inline void sme_save_state(struct sme_state *state, bool zt) { BUILD_BUG(); }
+static inline void sme_load_state(const struct sme_state *state, bool zt) { BUILD_BUG(); }
+
static inline void sme_enter_from_user_mode(void) { }
static inline void sme_exit_to_user_mode(void) { }
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index 9e352b5c6b764..a763fd03ffef3 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -40,60 +40,6 @@
.endif
.endm
-/* Deprecated macros for SME instructions */
-
-/* RDSVL X\nx, #\imm */
-.macro _sme_rdsvl nx, imm
- .arch_extension sme
- rdsvl x\nx, #\imm
-.endm
-
-/*
- * STR (vector from ZA array):
- * STR ZA[W\nw, #\offset], [X\nxbase, #\offset, MUL VL]
- */
-.macro _sme_str_zav nw, nxbase, offset=0
- .arch_extension sme
- str za[w\nw, #\offset], [x\nxbase, #\offset, MUL VL]
-.endm
-
-/*
- * LDR (vector to ZA array):
- * LDR ZA[w\nw, #\offset], [X\nxbase, #\offset, MUL VL]
- */
-.macro _sme_ldr_zav nw, nxbase, offset=0
- .arch_extension sme
- ldr za[w\nw, #\offset], [x\nxbase, #\offset, MUL VL]
-.endm
-
-/*
- * SME2 instruction encodings for older assemblers.
- * Supported by binutils 2.41+.
- * Supported by LLVM 16+
- */
-
-/*
- * LDR (ZT0)
- *
- * LDR ZT0, nx
- */
-.macro _ldr_zt nx
- _check_general_reg \nx
- .inst 0xe11f8000 \
- | (\nx << 5)
-.endm
-
-/*
- * STR (ZT0)
- *
- * STR ZT0, nx
- */
-.macro _str_zt nx
- _check_general_reg \nx
- .inst 0xe13f8000 \
- | (\nx << 5)
-.endm
-
.macro __for from:req, to:req
.if (\from) == (\to)
_for__body %\from
@@ -116,25 +62,3 @@
.purgem _for__body
.endm
-
-.macro sme_save_za nxbase, xvl, nw
- mov w\nw, #0
-
-423:
- _sme_str_zav \nw, \nxbase
- add x\nxbase, x\nxbase, \xvl
- add x\nw, x\nw, #1
- cmp \xvl, x\nw
- bne 423b
-.endm
-
-.macro sme_load_za nxbase, xvl, nw
- mov w\nw, #0
-
-423:
- _sme_ldr_zav \nw, \nxbase
- add x\nxbase, x\nxbase, \xvl
- add x\nw, x\nw, #1
- cmp \xvl, x\nw
- bne 423b
-.endm
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 74b76bb704523..d2690c3ec5288 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -27,7 +27,7 @@ KCOV_INSTRUMENT_idle.o := n
# Object file lists.
obj-y := debug-monitors.o entry.o irq.o fpsimd.o \
- entry-common.o entry-fpsimd.o process.o ptrace.o \
+ entry-common.o process.o ptrace.o \
setup.o signal.o sys.o stacktrace.o time.o traps.o \
io.o vdso.o hyp-stub.o psci.o cpu_ops.o \
return_address.o cpuinfo.o cpu_errata.o \
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
deleted file mode 100644
index bff941eea9566..0000000000000
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ /dev/null
@@ -1,48 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * FP/SIMD state saving and restoring
- *
- * Copyright (C) 2012 ARM Ltd.
- * Author: Catalin Marinas <catalin.marinas@arm.com>
- */
-
-#include <linux/linkage.h>
-
-#include <asm/assembler.h>
-#include <asm/fpsimdmacros.h>
-
-#ifdef CONFIG_ARM64_SME
-
-/*
- * Save the ZA and ZT state
- *
- * x0 - pointer to buffer for state
- * x1 - number of ZT registers to save
- */
-SYM_FUNC_START(sme_save_state)
- _sme_rdsvl 2, 1 // x2 = VL/8
- sme_save_za 0, x2, 12 // Leaves x0 pointing to the end of ZA
-
- cbz x1, 1f
- _str_zt 0
-1:
- ret
-SYM_FUNC_END(sme_save_state)
-
-/*
- * Load the ZA and ZT state
- *
- * x0 - pointer to buffer for state
- * x1 - number of ZT registers to save
- */
-SYM_FUNC_START(sme_load_state)
- _sme_rdsvl 2, 1 // x2 = VL/8
- sme_load_za 0, x2, 12 // Leaves x0 pointing to the end of ZA
-
- cbz x1, 1f
- _ldr_zt 0
-1:
- ret
-SYM_FUNC_END(sme_load_state)
-
-#endif /* CONFIG_ARM64_SME */
--
2.30.2
^ permalink raw reply related
* [PATCH 11/18] arm64: fpsimd: Split FPSR/FPCR from SVE save/restore
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
Regardless of whether the vector registers are saved in FPSIMD or SVE
format, we store FPSR and FPCR in user_fpsimd_state::{fpsr,fpcr}.
For historical reasons, the functions which save/restore SVE context
take a pointer to user_fpsimd_state::fpsr, and use this to access both
user_fpsimd_state::fpsr and user_fpsimd_state::fpcr. This is
unnecessarily fragile.
Move the save/restore of FPSR and FPCR into separate helper functions
which take a pointer to user_fpsimd_state. I've used read_sysreg_s() and
write_sysreg_s() as contemporary versions of LLVM will refuse to
directly assemble accesses to FPCR or FPSR unless the "fp" arch
extension is enabled.
Note that the SVE assembly sequence for restoring FPCR uses an
unconditional write to FPCR. The plain FPSIMD assembly sequence has used
a conditional write to FPCR since 2014 in commit:
5959e25729a5 ("arm64: fpsimd: avoid restoring fpcr if the contents haven't change")
... but this was not followed for the SVE restore assembly implemented
in 2017 in commit:
1fc5dce78ad1 ("arm64/sve: Low-level SVE architectural state manipulation functions")
... so I've assumed that this doesn't actually matter in practice, and
implemented the C version matching the existing SVE assembly.
For the moment, fpsimd_save_state() and fpsimd_load_state() are left
as-is with their own logic to save/restore FPSR and FPCR. This will be
unified in subsequent patches.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/fpsimd.h | 17 ++++++++++++++---
arch/arm64/include/asm/fpsimdmacros.h | 13 ++-----------
arch/arm64/include/asm/kvm_hyp.h | 4 ++--
arch/arm64/kernel/entry-fpsimd.S | 10 ++++------
arch/arm64/kernel/fpsimd.c | 5 +++--
arch/arm64/kvm/hyp/fpsimd.S | 4 ++--
arch/arm64/kvm/hyp/include/hyp/switch.h | 4 ++--
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 5 +++--
8 files changed, 32 insertions(+), 30 deletions(-)
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 36cf528e64971..6fd5cdf5e5f17 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -74,6 +74,18 @@ static inline void cpacr_restore(unsigned long cpacr)
struct task_struct;
+static inline void fpsimd_save_common(struct user_fpsimd_state *state)
+{
+ state->fpsr = read_sysreg_s(SYS_FPSR);
+ state->fpcr = read_sysreg_s(SYS_FPCR);
+}
+
+static inline void fpsimd_load_common(const struct user_fpsimd_state *state)
+{
+ write_sysreg_s(state->fpsr, SYS_FPSR);
+ write_sysreg_s(state->fpcr, SYS_FPCR);
+}
+
extern void fpsimd_save_state(struct user_fpsimd_state *state);
extern void fpsimd_load_state(struct user_fpsimd_state *state);
@@ -157,9 +169,8 @@ static inline unsigned int sve_get_vl(void)
return vl;
}
-extern void sve_save_state(void *state, u32 *pfpsr, int save_ffr);
-extern void sve_load_state(void const *state, u32 const *pfpsr,
- int restore_ffr);
+extern void sve_save_state(void *state, int save_ffr);
+extern void sve_load_state(void const *state, int restore_ffr);
extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
extern void sme_save_state(void *state, int zt);
extern void sme_load_state(void const *state, int zt);
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index d75c9d4c9989b..c79ae7ec1ff05 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -235,7 +235,7 @@
_sve_wrffr 0
.endm
-.macro sve_save nxbase, xpfpsr, save_ffr, nxtmp
+.macro sve_save nxbase, save_ffr
_for n, 0, 31, _sve_str_v \n, \nxbase, \n - 34
_for n, 0, 15, _sve_str_p \n, \nxbase, \n - 16
cbz \save_ffr, 921f
@@ -246,24 +246,15 @@
922:
_sve_str_p 0, \nxbase
_sve_ldr_p 0, \nxbase, -16
- mrs x\nxtmp, fpsr
- str w\nxtmp, [\xpfpsr]
- mrs x\nxtmp, fpcr
- str w\nxtmp, [\xpfpsr, #4]
.endm
-.macro sve_load nxbase, xpfpsr, restore_ffr, nxtmp
+.macro sve_load nxbase, restore_ffr
_for n, 0, 31, _sve_ldr_v \n, \nxbase, \n - 34
cbz \restore_ffr, 921f
_sve_ldr_p 0, \nxbase
_sve_wrffr 0
921:
_for n, 0, 15, _sve_ldr_p \n, \nxbase, \n - 16
-
- ldr w\nxtmp, [\xpfpsr]
- msr fpsr, x\nxtmp
- ldr w\nxtmp, [\xpfpsr, #4]
- msr fpcr, x\nxtmp
.endm
.macro sme_save_za nxbase, xvl, nw
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 8d06b62e7188c..0030cc1b52197 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -123,8 +123,8 @@ void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu);
void __fpsimd_save_state(struct user_fpsimd_state *fp_regs);
void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs);
-void __sve_save_state(void *sve_pffr, u32 *fpsr, int save_ffr);
-void __sve_restore_state(void *sve_pffr, u32 *fpsr, int restore_ffr);
+void __sve_save_state(void *sve, int save_ffr);
+void __sve_restore_state(void *sve, int restore_ffr);
u64 __guest_enter(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 7f2d31dff8c17..83fe9c32bbd1c 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -37,11 +37,10 @@ SYM_FUNC_END(fpsimd_load_state)
* Save the SVE state
*
* x0 - pointer to buffer for state
- * x1 - pointer to storage for FPSR
- * x2 - Save FFR if non-zero
+ * x1 - Save FFR if non-zero
*/
SYM_FUNC_START(sve_save_state)
- sve_save 0, x1, x2, 3
+ sve_save 0, x1
ret
SYM_FUNC_END(sve_save_state)
@@ -49,11 +48,10 @@ SYM_FUNC_END(sve_save_state)
* Load the SVE state
*
* x0 - pointer to buffer for state
- * x1 - pointer to storage for FPSR
- * x2 - Restore FFR if non-zero
+ * x1 - Restore FFR if non-zero
*/
SYM_FUNC_START(sve_load_state)
- sve_load 0, x1, x2, 4
+ sve_load 0, x1
ret
SYM_FUNC_END(sve_load_state)
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 2578c2372c89e..9806fea8fea7c 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -426,8 +426,8 @@ static void task_fpsimd_load(void)
if (restore_sve_regs) {
WARN_ON_ONCE(current->thread.fp_type != FP_STATE_SVE);
sve_load_state(sve_pffr(¤t->thread),
- ¤t->thread.uw.fpsimd_state.fpsr,
restore_ffr);
+ fpsimd_load_common(¤t->thread.uw.fpsimd_state);
} else {
WARN_ON_ONCE(current->thread.fp_type != FP_STATE_FPSIMD);
fpsimd_load_state(¤t->thread.uw.fpsimd_state);
@@ -509,7 +509,8 @@ static void fpsimd_save_user_state(void)
sve_save_state((char *)last->sve_state +
sve_ffr_offset(vl),
- &last->st->fpsr, save_ffr);
+ save_ffr);
+ fpsimd_save_common(last->st);
*last->fp_type = FP_STATE_SVE;
} else {
fpsimd_save_state(last->st);
diff --git a/arch/arm64/kvm/hyp/fpsimd.S b/arch/arm64/kvm/hyp/fpsimd.S
index 6e16cbfc5df27..8575e32977d19 100644
--- a/arch/arm64/kvm/hyp/fpsimd.S
+++ b/arch/arm64/kvm/hyp/fpsimd.S
@@ -21,11 +21,11 @@ SYM_FUNC_START(__fpsimd_restore_state)
SYM_FUNC_END(__fpsimd_restore_state)
SYM_FUNC_START(__sve_restore_state)
- sve_load 0, x1, x2, 3
+ sve_load 0, x1
ret
SYM_FUNC_END(__sve_restore_state)
SYM_FUNC_START(__sve_save_state)
- sve_save 0, x1, x2, 3
+ sve_save 0, x1
ret
SYM_FUNC_END(__sve_save_state)
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 6512dd3f75ae4..eb76a863ebb84 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -468,8 +468,8 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
*/
sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
__sve_restore_state(vcpu_sve_pffr(vcpu),
- &vcpu->arch.ctxt.fp_regs.fpsr,
true);
+ fpsimd_load_common(&vcpu->arch.ctxt.fp_regs);
/*
* The effective VL for a VM could differ from the max VL when running a
@@ -490,8 +490,8 @@ static inline void __hyp_sve_save_host(void)
ctxt_sys_reg(hctxt, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);
write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
__sve_save_state(sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
- &hctxt->fp_regs.fpsr,
true);
+ fpsimd_save_common(&hctxt->fp_regs);
}
static inline void fpsimd_lazy_switch_to_guest(struct kvm_vcpu *vcpu)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 04a6d2e0ea73f..0be4577a67e7b 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -35,7 +35,8 @@ static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu)
* on the VL, so use a consistent (i.e., the maximum) guest VL.
*/
sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
- __sve_save_state(vcpu_sve_pffr(vcpu), &vcpu->arch.ctxt.fp_regs.fpsr, true);
+ __sve_save_state(vcpu_sve_pffr(vcpu), true);
+ fpsimd_save_common(&vcpu->arch.ctxt.fp_regs);
write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
}
@@ -55,8 +56,8 @@ static void __hyp_sve_restore_host(void)
*/
write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
__sve_restore_state(sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
- &hctxt->fp_regs.fpsr,
true);
+ fpsimd_load_common(&hctxt->fp_regs);
write_sysreg_el1(ctxt_sys_reg(hctxt, ZCR_EL1), SYS_ZCR);
}
--
2.30.2
^ permalink raw reply related
* [PATCH 10/18] arm64: sysreg: Add FPCR and FPSR
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
Add sysreg definitions for FPCR and FPSR.
Some versions of LLVM will refuse to assemble accesses to FPCR and FPSR
unless the "fp" arch extension is enabled, which we don't currently do
for read_sysreg() and write_sysreg(). In general, handling feature
dependencies would complicate read_sysreg() and write_sysreg(), and it's
simpler to use read_sysreg_s() and write_sysreg_s() instead, requiring
sysreg definitions.
The values used can be found in ARM ARM issue M.b:
https://developer.arm.com/documentation/ddi0487/mb/
... in sections:
* C5.2.8 ("FPCR, Floating-point Control Register")
* C5.2.10 ("FPSR, Floating-point Status Register")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/tools/sysreg | 45 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 45 insertions(+)
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 6c3ff14e561e6..fa155cd856a5b 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -3790,6 +3790,51 @@ Field 1 ZA
Field 0 SM
EndSysreg
+Sysreg FPCR 3 3 4 4 0
+Res0 63:27
+Field 26 AHP
+Field 25 DN
+Field 24 FZ
+Enum 23:22 RMode
+ 0b00 RN
+ 0b01 RP
+ 0b10 RM
+ 0b11 RZ
+EndEnum
+Field 21:20 Stride
+Field 19 FZ16
+Field 18:16 Len
+Field 15 IDE
+Res0 14
+Field 13 EBF
+Field 12 IXE
+Field 11 UFE
+Field 10 OFE
+Field 9 DZE
+Field 8 IOE
+Res0 7:3
+Field 2 NEP
+Field 1 AH
+Field 0 FIZ
+EndSysreg
+
+Sysreg FPSR 3 3 4 4 1
+Res0 63:32
+Field 31 N
+Field 30 Q
+Field 29 C
+Field 28 V
+Field 27 QC
+Res0 26:8
+Field 7 IDC
+Res0 6:5
+Field 4 IXC
+Field 3 UFC
+Field 2 OFC
+Field 1 DZC
+Field 0 IOC
+EndSysreg
+
Sysreg FPMR 3 3 4 4 2
Res0 63:38
Field 37:32 LSCALE2
--
2.30.2
^ permalink raw reply related
* [PATCH 12/18] arm64: fpsimd: Move fpsimd save/restore inline
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
Currently the FPSIMD register save/restore sequences are written in
out-of-line assembly routines. While this works, it's somewhat painful:
* As KVM needs to be able to use the sequences in hyp code, separate
assembly files are used for the regular kernel and KVM code. While the
common logic is shared in assembly macros, this still requires some
duplication, and has lead to some trivial divergence.
* For historical reasons, the assembly macros take some register
arguments as numerical indices (e.g. "fpsimd_save x0, 8" uses x0 and
x8), which is simply confusing.
* For historical reasons, the SVE save/restore code and FPSIMD
save/restore code have distinct sequences for FPSR and FPCR. Ideally
this logic would be shared.
* The assembly sequences can't be instrumented, and so it's harder than
necessary to catch memory safety issues.
To handle the above, move the FPSIMD register save/restore sequences to
inline assembly, and share the FPSR+FPCR save/restore with SVE.
Neither GCC nor LLVM instrument memory arguments to inline assembly, so
explicit instrumentation is added in the same manner as other assembly
routines. This instrumentation is implicitly disabled by Kbuild for nVHE
hyp code.
Note that I've used the SVE sequence for restoring FPCR, which uses an
unconditional write to FPCR. The plain FPSIMD assembly sequence used a
conditional write to FPCR since 2014 in commit:
5959e25729a5 ("arm64: fpsimd: avoid restoring fpcr if the contents haven't change")
... but this was not followed for the SVE assembly implemented in 2017
in commit:
1fc5dce78ad1 ("arm64/sve: Low-level SVE architectural state manipulation functions")
... so I've assumed that this doesn't actually matter in practice, and
I've erred in favour of the simpler sequence.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/fpsimd.h | 68 ++++++++++++++++++++++++-
arch/arm64/include/asm/fpsimdmacros.h | 59 ---------------------
arch/arm64/include/asm/kvm_hyp.h | 2 -
arch/arm64/kernel/entry-fpsimd.S | 20 --------
arch/arm64/kvm/hyp/fpsimd.S | 10 ----
arch/arm64/kvm/hyp/include/hyp/switch.h | 4 +-
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 4 +-
7 files changed, 70 insertions(+), 97 deletions(-)
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 6fd5cdf5e5f17..19b373ad0ebf7 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -22,6 +22,8 @@
#include <linux/stddef.h>
#include <linux/types.h>
+#define __FPSIMD_PREAMBLE ".arch_extension fp\n" \
+ ".arch_extension simd\n"
#define __SVE_PREAMBLE ".arch_extension sve\n"
#define __SME_PREAMBLE ".arch_extension sme\n"
@@ -86,8 +88,70 @@ static inline void fpsimd_load_common(const struct user_fpsimd_state *state)
write_sysreg_s(state->fpcr, SYS_FPCR);
}
-extern void fpsimd_save_state(struct user_fpsimd_state *state);
-extern void fpsimd_load_state(struct user_fpsimd_state *state);
+static inline void fpsimd_save_vregs(struct user_fpsimd_state *state)
+{
+ instrument_write(state->vregs, sizeof(state->vregs));
+ asm volatile(
+ __FPSIMD_PREAMBLE
+ " stp q0, q1, [%[vregs], #16 * 0]\n"
+ " stp q2, q3, [%[vregs], #16 * 2]\n"
+ " stp q4, q5, [%[vregs], #16 * 4]\n"
+ " stp q6, q7, [%[vregs], #16 * 6]\n"
+ " stp q8, q9, [%[vregs], #16 * 8]\n"
+ " stp q10, q11, [%[vregs], #16 * 10]\n"
+ " stp q12, q13, [%[vregs], #16 * 12]\n"
+ " stp q14, q15, [%[vregs], #16 * 14]\n"
+ " stp q16, q17, [%[vregs], #16 * 16]\n"
+ " stp q18, q19, [%[vregs], #16 * 18]\n"
+ " stp q20, q21, [%[vregs], #16 * 20]\n"
+ " stp q22, q23, [%[vregs], #16 * 22]\n"
+ " stp q24, q25, [%[vregs], #16 * 24]\n"
+ " stp q26, q27, [%[vregs], #16 * 26]\n"
+ " stp q28, q29, [%[vregs], #16 * 28]\n"
+ " stp q30, q31, [%[vregs], #16 * 30]\n"
+ : "=Q" (state->vregs)
+ : [vregs] "r" (state->vregs)
+ );
+}
+
+static inline void fpsimd_load_vregs(const struct user_fpsimd_state *state)
+{
+ instrument_read(state->vregs, sizeof(state->vregs));
+ asm volatile(
+ __FPSIMD_PREAMBLE
+ " ldp q0, q1, [%[vregs], #16 * 0]\n"
+ " ldp q2, q3, [%[vregs], #16 * 2]\n"
+ " ldp q4, q5, [%[vregs], #16 * 4]\n"
+ " ldp q6, q7, [%[vregs], #16 * 6]\n"
+ " ldp q8, q9, [%[vregs], #16 * 8]\n"
+ " ldp q10, q11, [%[vregs], #16 * 10]\n"
+ " ldp q12, q13, [%[vregs], #16 * 12]\n"
+ " ldp q14, q15, [%[vregs], #16 * 14]\n"
+ " ldp q16, q17, [%[vregs], #16 * 16]\n"
+ " ldp q18, q19, [%[vregs], #16 * 18]\n"
+ " ldp q20, q21, [%[vregs], #16 * 20]\n"
+ " ldp q22, q23, [%[vregs], #16 * 22]\n"
+ " ldp q24, q25, [%[vregs], #16 * 24]\n"
+ " ldp q26, q27, [%[vregs], #16 * 26]\n"
+ " ldp q28, q29, [%[vregs], #16 * 28]\n"
+ " ldp q30, q31, [%[vregs], #16 * 30]\n"
+ :
+ : "Q" (state->vregs),
+ [vregs] "r" (state->vregs)
+ );
+}
+
+static inline void fpsimd_save_state(struct user_fpsimd_state *state)
+{
+ fpsimd_save_vregs(state);
+ fpsimd_save_common(state);
+}
+
+static inline void fpsimd_load_state(const struct user_fpsimd_state *state)
+{
+ fpsimd_load_vregs(state);
+ fpsimd_load_common(state);
+}
extern void fpsimd_thread_switch(struct task_struct *next);
extern void fpsimd_flush_thread(void);
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index c79ae7ec1ff05..01b5e6d51ba79 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -8,65 +8,6 @@
#include <asm/assembler.h>
-.macro fpsimd_save state, tmpnr
- stp q0, q1, [\state, #16 * 0]
- stp q2, q3, [\state, #16 * 2]
- stp q4, q5, [\state, #16 * 4]
- stp q6, q7, [\state, #16 * 6]
- stp q8, q9, [\state, #16 * 8]
- stp q10, q11, [\state, #16 * 10]
- stp q12, q13, [\state, #16 * 12]
- stp q14, q15, [\state, #16 * 14]
- stp q16, q17, [\state, #16 * 16]
- stp q18, q19, [\state, #16 * 18]
- stp q20, q21, [\state, #16 * 20]
- stp q22, q23, [\state, #16 * 22]
- stp q24, q25, [\state, #16 * 24]
- stp q26, q27, [\state, #16 * 26]
- stp q28, q29, [\state, #16 * 28]
- stp q30, q31, [\state, #16 * 30]!
- mrs x\tmpnr, fpsr
- str w\tmpnr, [\state, #16 * 2]
- mrs x\tmpnr, fpcr
- str w\tmpnr, [\state, #16 * 2 + 4]
-.endm
-
-.macro fpsimd_restore_fpcr state, tmp
- /*
- * Writes to fpcr may be self-synchronising, so avoid restoring
- * the register if it hasn't changed.
- */
- mrs \tmp, fpcr
- cmp \tmp, \state
- b.eq 9999f
- msr fpcr, \state
-9999:
-.endm
-
-/* Clobbers \state */
-.macro fpsimd_restore state, tmpnr
- ldp q0, q1, [\state, #16 * 0]
- ldp q2, q3, [\state, #16 * 2]
- ldp q4, q5, [\state, #16 * 4]
- ldp q6, q7, [\state, #16 * 6]
- ldp q8, q9, [\state, #16 * 8]
- ldp q10, q11, [\state, #16 * 10]
- ldp q12, q13, [\state, #16 * 12]
- ldp q14, q15, [\state, #16 * 14]
- ldp q16, q17, [\state, #16 * 16]
- ldp q18, q19, [\state, #16 * 18]
- ldp q20, q21, [\state, #16 * 20]
- ldp q22, q23, [\state, #16 * 22]
- ldp q24, q25, [\state, #16 * 24]
- ldp q26, q27, [\state, #16 * 26]
- ldp q28, q29, [\state, #16 * 28]
- ldp q30, q31, [\state, #16 * 30]!
- ldr w\tmpnr, [\state, #16 * 2]
- msr fpsr, x\tmpnr
- ldr w\tmpnr, [\state, #16 * 2 + 4]
- fpsimd_restore_fpcr x\tmpnr, \state
-.endm
-
/* Sanity-check macros to help avoid encoding garbage instructions */
.macro _check_general_reg nr
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 0030cc1b52197..8c4602c8f4356 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -121,8 +121,6 @@ void __debug_save_host_buffers_nvhe(struct kvm_vcpu *vcpu);
void __debug_restore_host_buffers_nvhe(struct kvm_vcpu *vcpu);
#endif
-void __fpsimd_save_state(struct user_fpsimd_state *fp_regs);
-void __fpsimd_restore_state(struct user_fpsimd_state *fp_regs);
void __sve_save_state(void *sve, int save_ffr);
void __sve_restore_state(void *sve, int restore_ffr);
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 83fe9c32bbd1c..4fa00c94f28b7 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -11,26 +11,6 @@
#include <asm/assembler.h>
#include <asm/fpsimdmacros.h>
-/*
- * Save the FP registers.
- *
- * x0 - pointer to struct fpsimd_state
- */
-SYM_FUNC_START(fpsimd_save_state)
- fpsimd_save x0, 8
- ret
-SYM_FUNC_END(fpsimd_save_state)
-
-/*
- * Load the FP registers.
- *
- * x0 - pointer to struct fpsimd_state
- */
-SYM_FUNC_START(fpsimd_load_state)
- fpsimd_restore x0, 8
- ret
-SYM_FUNC_END(fpsimd_load_state)
-
#ifdef CONFIG_ARM64_SVE
/*
diff --git a/arch/arm64/kvm/hyp/fpsimd.S b/arch/arm64/kvm/hyp/fpsimd.S
index 8575e32977d19..beacec33b2541 100644
--- a/arch/arm64/kvm/hyp/fpsimd.S
+++ b/arch/arm64/kvm/hyp/fpsimd.S
@@ -10,16 +10,6 @@
.text
-SYM_FUNC_START(__fpsimd_save_state)
- fpsimd_save x0, 1
- ret
-SYM_FUNC_END(__fpsimd_save_state)
-
-SYM_FUNC_START(__fpsimd_restore_state)
- fpsimd_restore x0, 1
- ret
-SYM_FUNC_END(__fpsimd_restore_state)
-
SYM_FUNC_START(__sve_restore_state)
sve_load 0, x1
ret
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index eb76a863ebb84..aaa43554fd8e6 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -565,7 +565,7 @@ static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)
if (system_supports_sve()) {
__hyp_sve_save_host();
} else {
- __fpsimd_save_state(&hctxt->fp_regs);
+ fpsimd_save_state(&hctxt->fp_regs);
}
if (kvm_has_fpmr(kern_hyp_va(vcpu->kvm)))
@@ -625,7 +625,7 @@ static inline bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
if (sve_guest)
__hyp_sve_restore_guest(vcpu);
else
- __fpsimd_restore_state(&vcpu->arch.ctxt.fp_regs);
+ fpsimd_load_state(&vcpu->arch.ctxt.fp_regs);
if (kvm_has_fpmr(kern_hyp_va(vcpu->kvm)))
write_sysreg_s(__vcpu_sys_reg(vcpu, FPMR), SYS_FPMR);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 0be4577a67e7b..627762ed7327f 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -83,7 +83,7 @@ static void fpsimd_sve_sync(struct kvm_vcpu *vcpu)
if (vcpu_has_sve(vcpu))
__hyp_sve_save_guest(vcpu);
else
- __fpsimd_save_state(&vcpu->arch.ctxt.fp_regs);
+ fpsimd_save_state(&vcpu->arch.ctxt.fp_regs);
has_fpmr = kvm_has_fpmr(kern_hyp_va(vcpu->kvm));
if (has_fpmr)
@@ -92,7 +92,7 @@ static void fpsimd_sve_sync(struct kvm_vcpu *vcpu)
if (system_supports_sve())
__hyp_sve_restore_host();
else
- __fpsimd_restore_state(&hctxt->fp_regs);
+ fpsimd_load_state(&hctxt->fp_regs);
if (has_fpmr)
write_sysreg_s(ctxt_sys_reg(hctxt, FPMR), SYS_FPMR);
--
2.30.2
^ permalink raw reply related
* [PATCH 08/18] arm64: fpsimd: Use assembler for baseline SME instructions
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
We currently support assemblers which do not support SME instructions,
and have macros to manually encode SME instructions. This was
necessary historically as SME support was developed before assembler
support was widely available, but things have changed:
* All currently supported versions of LLVM support baseline SME
instructions. Building the kernel requires LLVM 15+, while LLVM 13+
supports SME.
* GNU binutils has supported baseline SME instructions since 2.38, which
was released on 09 February 2022. Toolchains using this or later are
widely available. For example Debian 12 (released on 10 June 2023)
provides binutils 2.40. Toolchains provided kernel.org provide
binutils 2.38+ since the GCC 12.1.0 release (released between 06 May
2022 and 17 August 2022).
* For various reasons, SME support was marked as BROKEN, and re-enabled
in v6.16 (released on 27 July 2025). The earliest support LTS kernel
with SME support is v6.18.y, v6.18 was tagged on 30 November 2025, and
contemporary toolchains (GCC 15.2 and binutils 2.45) supported
baseline SME instructions.
* Any distribution which intends to support SME will presumably have a
toolchain that supports baseline SME instructions such that userspace
can be built.
Considering the above, there's no practical benefit to allowing SME to
be built when the toolchain doesn't support baseline SME instructions.
Make CONFIG_ARM64_SME depend on assembler support for SME, and remove
the manual encoding of SME instructions. The various _sme_<insn> macros
are kept for now, and will be cleaned up in subsequent patches.
A couple of SME2 instructions require a more recent toolchain, and are
left as-is for now. I've looked through releases of binutils and LLVM to
find when support was added, and noted this in a comment.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/Kconfig | 5 ++++
arch/arm64/include/asm/fpsimdmacros.h | 38 +++++++++++----------------
2 files changed, 20 insertions(+), 23 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fe60738e5943b..378e50fef247a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2247,10 +2247,15 @@ config ARM64_SVE
booting the kernel. If unsure and you are not observing these
symptoms, you should assume that it is safe to say Y.
+config AS_HAS_SME
+ # Supported by LLVM 13+ and binutils 2.38+
+ def_bool $(as-instr,.arch_extension sme)
+
config ARM64_SME
bool "ARM Scalable Matrix Extension support"
default y
depends on ARM64_SVE
+ depends on AS_HAS_SME
help
The Scalable Matrix Extension (SME) is an extension to the AArch64
execution state which utilises a substantial subset of the SVE
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index 1122eea6daacf..d0bdbbf2d44ad 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -148,46 +148,38 @@
pfalse p\np\().b
.endm
-/* SME instruction encodings for non-SME-capable assemblers */
-/* (pre binutils 2.38/LLVM 13) */
+/* Deprecated macros for SME instructions */
/* RDSVL X\nx, #\imm */
.macro _sme_rdsvl nx, imm
- _check_general_reg \nx
- _check_num (\imm), -0x20, 0x1f
- .inst 0x04bf5800 \
- | (\nx) \
- | (((\imm) & 0x3f) << 5)
+ .arch_extension sme
+ rdsvl x\nx, #\imm
.endm
/*
* STR (vector from ZA array):
- * STR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL]
+ * STR ZA[W\nw, #\offset], [X\nxbase, #\offset, MUL VL]
*/
.macro _sme_str_zav nw, nxbase, offset=0
- _sme_check_wv \nw
- _check_general_reg \nxbase
- _check_num (\offset), -0x100, 0xff
- .inst 0xe1200000 \
- | (((\nw) & 3) << 13) \
- | ((\nxbase) << 5) \
- | ((\offset) & 7)
+ .arch_extension sme
+ str za[w\nw, #\offset], [x\nxbase, #\offset, MUL VL]
.endm
/*
* LDR (vector to ZA array):
- * LDR ZA[\nw, #\offset], [X\nxbase, #\offset, MUL VL]
+ * LDR ZA[w\nw, #\offset], [X\nxbase, #\offset, MUL VL]
*/
.macro _sme_ldr_zav nw, nxbase, offset=0
- _sme_check_wv \nw
- _check_general_reg \nxbase
- _check_num (\offset), -0x100, 0xff
- .inst 0xe1000000 \
- | (((\nw) & 3) << 13) \
- | ((\nxbase) << 5) \
- | ((\offset) & 7)
+ .arch_extension sme
+ ldr za[w\nw, #\offset], [x\nxbase, #\offset, MUL VL]
.endm
+/*
+ * SME2 instruction encodings for older assemblers.
+ * Supported by binutils 2.41+.
+ * Supported by LLVM 16+
+ */
+
/*
* LDR (ZT0)
*
--
2.30.2
^ permalink raw reply related
* [PATCH 09/18] arm64: fpsimd: Move sve_get_vl() and sme_get_vl() inline
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
The sve_get_vl() and sme_get_vl() functions are wrappers for the RDVL
and RDSVL instructions respectively. There's no need for those to be
out-of-line.
Replace the out-of-line assembly functions with equivalent inline
functions.
The _sve_rdvl assembly macro is unused, and so it is removed. The
_sme_rdsvl assembly macro is still used elsewhere, and so is kept for
now.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/fpsimd.h | 31 +++++++++++++++++++++++++--
arch/arm64/include/asm/fpsimdmacros.h | 6 ------
arch/arm64/kernel/entry-fpsimd.S | 10 ---------
3 files changed, 29 insertions(+), 18 deletions(-)
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 8efa3c0402a7a..36cf528e64971 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -22,6 +22,9 @@
#include <linux/stddef.h>
#include <linux/types.h>
+#define __SVE_PREAMBLE ".arch_extension sve\n"
+#define __SME_PREAMBLE ".arch_extension sme\n"
+
/* Masks for extracting the FPSR and FPCR from the FPSCR */
#define VFP_FPSCR_STAT_MASK 0xf800009f
#define VFP_FPSCR_CTRL_MASK 0x07f79f00
@@ -141,11 +144,23 @@ static inline void *thread_zt_state(struct thread_struct *thread)
return thread->sme_state + ZA_SIG_REGS_SIZE(sme_vq);
}
+static inline unsigned int sve_get_vl(void)
+{
+ unsigned int vl;
+
+ asm volatile(
+ __SVE_PREAMBLE
+ " rdvl %x[vl], #1\n"
+ : [vl] "=r" (vl)
+ );
+
+ return vl;
+}
+
extern void sve_save_state(void *state, u32 *pfpsr, int save_ffr);
extern void sve_load_state(void const *state, u32 const *pfpsr,
int restore_ffr);
extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
-extern unsigned int sve_get_vl(void);
extern void sme_save_state(void *state, int zt);
extern void sme_load_state(void const *state, int zt);
@@ -400,8 +415,20 @@ static inline int sme_max_virtualisable_vl(void)
return vec_max_virtualisable_vl(ARM64_VEC_SME);
}
+static inline unsigned int sme_get_vl(void)
+{
+ unsigned int vl;
+
+ asm volatile(
+ __SME_PREAMBLE
+ " rdsvl %x[vl], #1\n"
+ : [vl] "=r" (vl)
+ );
+
+ return vl;
+}
+
extern void sme_alloc(struct task_struct *task, bool flush);
-extern unsigned int sme_get_vl(void);
extern int sme_set_current_vl(unsigned long arg);
extern int sme_get_current_vl(void);
extern void sme_suspend_exit(void);
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index d0bdbbf2d44ad..d75c9d4c9989b 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -125,12 +125,6 @@
ldr p\np, [x\nxbase, #\offset, MUL VL]
.endm
-/* RDVL X\nx, #\imm */
-.macro _sve_rdvl nx, imm
- .arch_extension sve
- rdvl x\nx, #\imm
-.endm
-
/* RDFFR (unpredicated): RDFFR P\np.B */
.macro _sve_rdffr np
.arch_extension sve
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 88c555745b584..7f2d31dff8c17 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -57,11 +57,6 @@ SYM_FUNC_START(sve_load_state)
ret
SYM_FUNC_END(sve_load_state)
-SYM_FUNC_START(sve_get_vl)
- _sve_rdvl 0, 1
- ret
-SYM_FUNC_END(sve_get_vl)
-
/*
* Zero all SVE registers but the first 128-bits of each vector
*
@@ -84,11 +79,6 @@ SYM_FUNC_END(sve_flush_live)
#ifdef CONFIG_ARM64_SME
-SYM_FUNC_START(sme_get_vl)
- _sme_rdsvl 0, 1
- ret
-SYM_FUNC_END(sme_get_vl)
-
/*
* Save the ZA and ZT state
*
--
2.30.2
^ permalink raw reply related
* [PATCH 06/18] arm64: fpsimd: Remove sve_set_vq() and sme_set_vq()
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
The sve_set_vq() and sme_set_vq() assembly functions (and the
sve_load_vq and sme_load_vq macros they use) are open-coded forms of
sysreg_clear_set*(). There's no need for these to be implemented
out-of-line in assembly, and the 'vq_minus_1' argument is unusual and
confusing.
Use sysreg_clear_set_s() directly, where the necessary 'vq - 1' encoding
is more obviously part of encoding the register value.
For now, sve_flush_live() is left with the unusual vq_minus_1 argument.
This will be addressed in subsequent patches.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/fpsimd.h | 2 --
arch/arm64/include/asm/fpsimdmacros.h | 22 ----------------------
arch/arm64/kernel/entry-fpsimd.S | 10 ----------
arch/arm64/kernel/fpsimd.c | 24 +++++++++++++-----------
4 files changed, 13 insertions(+), 45 deletions(-)
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index d9d00b45ab115..8efa3c0402a7a 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -146,8 +146,6 @@ extern void sve_load_state(void const *state, u32 const *pfpsr,
int restore_ffr);
extern void sve_flush_live(bool flush_ffr, unsigned long vq_minus_1);
extern unsigned int sve_get_vl(void);
-extern void sve_set_vq(unsigned long vq_minus_1);
-extern void sme_set_vq(unsigned long vq_minus_1);
extern void sme_save_state(void *state, int zt);
extern void sme_load_state(void const *state, int zt);
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index cda81d009c9bd..adf33d2da40c3 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -265,28 +265,6 @@
.purgem _for__body
.endm
-/* Update ZCR_EL1.LEN with the new VQ */
-.macro sve_load_vq xvqminus1, xtmp, xtmp2
- mrs_s \xtmp, SYS_ZCR_EL1
- bic \xtmp2, \xtmp, ZCR_ELx_LEN_MASK
- orr \xtmp2, \xtmp2, \xvqminus1
- cmp \xtmp2, \xtmp
- b.eq 921f
- msr_s SYS_ZCR_EL1, \xtmp2 //self-synchronising
-921:
-.endm
-
-/* Update SMCR_EL1.LEN with the new VQ */
-.macro sme_load_vq xvqminus1, xtmp, xtmp2
- mrs_s \xtmp, SYS_SMCR_EL1
- bic \xtmp2, \xtmp, SMCR_ELx_LEN_MASK
- orr \xtmp2, \xtmp2, \xvqminus1
- cmp \xtmp2, \xtmp
- b.eq 921f
- msr_s SYS_SMCR_EL1, \xtmp2 //self-synchronising
-921:
-.endm
-
/* Preserve the first 128-bits of Znz and zero the rest. */
.macro _sve_flush_z nz
_sve_check_zreg \nz
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 6325db1a2179c..88c555745b584 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -62,11 +62,6 @@ SYM_FUNC_START(sve_get_vl)
ret
SYM_FUNC_END(sve_get_vl)
-SYM_FUNC_START(sve_set_vq)
- sve_load_vq x0, x1, x2
- ret
-SYM_FUNC_END(sve_set_vq)
-
/*
* Zero all SVE registers but the first 128-bits of each vector
*
@@ -94,11 +89,6 @@ SYM_FUNC_START(sme_get_vl)
ret
SYM_FUNC_END(sme_get_vl)
-SYM_FUNC_START(sme_set_vq)
- sme_load_vq x0, x1, x2
- ret
-SYM_FUNC_END(sme_set_vq)
-
/*
* Save the ZA and ZT state
*
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index a8395cb303344..2578c2372c89e 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -377,8 +377,10 @@ static void task_fpsimd_load(void)
if (!thread_sm_enabled(¤t->thread))
WARN_ON_ONCE(!test_and_set_thread_flag(TIF_SVE));
- if (test_thread_flag(TIF_SVE))
- sve_set_vq(sve_vq_from_vl(task_get_sve_vl(current)) - 1);
+ if (test_thread_flag(TIF_SVE)) {
+ unsigned long vq = sve_vq_from_vl(task_get_sve_vl(current));
+ sysreg_clear_set_s(SYS_ZCR_EL1, ZCR_ELx_LEN, vq - 1);
+ }
restore_sve_regs = true;
restore_ffr = true;
@@ -403,8 +405,10 @@ static void task_fpsimd_load(void)
unsigned long sme_vl = task_get_sme_vl(current);
/* Ensure VL is set up for restoring data */
- if (test_thread_flag(TIF_SME))
- sme_set_vq(sve_vq_from_vl(sme_vl) - 1);
+ if (test_thread_flag(TIF_SME)) {
+ unsigned long vq = sve_vq_from_vl(sme_vl);
+ sysreg_clear_set_s(SYS_SMCR_EL1, SMCR_ELx_LEN, vq - 1);
+ }
write_sysreg_s(current->thread.svcr, SYS_SVCR);
@@ -1332,10 +1336,9 @@ void do_sve_acc(unsigned long esr, struct pt_regs *regs)
* any effective streaming mode SVE state.
*/
if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
- unsigned long vq_minus_one =
- sve_vq_from_vl(task_get_sve_vl(current)) - 1;
- sve_set_vq(vq_minus_one);
- sve_flush_live(true, vq_minus_one);
+ unsigned long vq = sve_vq_from_vl(task_get_sve_vl(current));
+ sysreg_clear_set_s(SYS_ZCR_EL1, ZCR_ELx_LEN, vq - 1);
+ sve_flush_live(true, vq - 1);
fpsimd_bind_task_to_cpu();
} else {
fpsimd_to_sve(current);
@@ -1465,9 +1468,8 @@ void do_sme_acc(unsigned long esr, struct pt_regs *regs)
WARN_ON(1);
if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
- unsigned long vq_minus_one =
- sve_vq_from_vl(task_get_sme_vl(current)) - 1;
- sme_set_vq(vq_minus_one);
+ unsigned long vq = sve_vq_from_vl(task_get_sme_vl(current));
+ sysreg_clear_set_s(SYS_SMCR_EL1, SMCR_ELx_LEN, vq - 1);
fpsimd_bind_task_to_cpu();
} else {
--
2.30.2
^ permalink raw reply related
* [PATCH 07/18] arm64: fpsimd: Use assembler for SVE instructions
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
Historically we supported assemblers which could not assemble SVE
instructions. We dropped support for such assemblers in commit:
118c40b7b503 ("kbuild: require gcc-8 and binutils-2.30")
Since that commit, all supported assemblers (binutils and LLVM) are
capable of assembling SVE instructions, and there's no need for us to
manually encode SVE instructions.
Rely on the assembler to encode SVE instructions, and remove the manual
encoding. The various _sve_<insn> macros are kept for now, and will be
cleaned up in subsequent patches.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/fpsimdmacros.h | 64 +++++++--------------------
1 file changed, 16 insertions(+), 48 deletions(-)
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index adf33d2da40c3..1122eea6daacf 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -99,85 +99,53 @@
.endif
.endm
-/* SVE instruction encodings for non-SVE-capable assemblers */
-/* (pre binutils 2.28, all kernel capable clang versions support SVE) */
+/* Deprecated macros for SVE instructions */
/* STR (vector): STR Z\nz, [X\nxbase, #\offset, MUL VL] */
.macro _sve_str_v nz, nxbase, offset=0
- _sve_check_zreg \nz
- _check_general_reg \nxbase
- _check_num (\offset), -0x100, 0xff
- .inst 0xe5804000 \
- | (\nz) \
- | ((\nxbase) << 5) \
- | (((\offset) & 7) << 10) \
- | (((\offset) & 0x1f8) << 13)
+ .arch_extension sve
+ str z\nz, [X\nxbase, #\offset, MUL VL]
.endm
/* LDR (vector): LDR Z\nz, [X\nxbase, #\offset, MUL VL] */
.macro _sve_ldr_v nz, nxbase, offset=0
- _sve_check_zreg \nz
- _check_general_reg \nxbase
- _check_num (\offset), -0x100, 0xff
- .inst 0x85804000 \
- | (\nz) \
- | ((\nxbase) << 5) \
- | (((\offset) & 7) << 10) \
- | (((\offset) & 0x1f8) << 13)
+ .arch_extension sve
+ ldr z\nz, [X\nxbase, #\offset, MUL VL]
.endm
/* STR (predicate): STR P\np, [X\nxbase, #\offset, MUL VL] */
.macro _sve_str_p np, nxbase, offset=0
- _sve_check_preg \np
- _check_general_reg \nxbase
- _check_num (\offset), -0x100, 0xff
- .inst 0xe5800000 \
- | (\np) \
- | ((\nxbase) << 5) \
- | (((\offset) & 7) << 10) \
- | (((\offset) & 0x1f8) << 13)
+ .arch_extension sve
+ str p\np, [X\nxbase, #\offset, MUL VL]
.endm
/* LDR (predicate): LDR P\np, [X\nxbase, #\offset, MUL VL] */
.macro _sve_ldr_p np, nxbase, offset=0
- _sve_check_preg \np
- _check_general_reg \nxbase
- _check_num (\offset), -0x100, 0xff
- .inst 0x85800000 \
- | (\np) \
- | ((\nxbase) << 5) \
- | (((\offset) & 7) << 10) \
- | (((\offset) & 0x1f8) << 13)
+ .arch_extension sve
+ ldr p\np, [x\nxbase, #\offset, MUL VL]
.endm
/* RDVL X\nx, #\imm */
.macro _sve_rdvl nx, imm
- _check_general_reg \nx
- _check_num (\imm), -0x20, 0x1f
- .inst 0x04bf5000 \
- | (\nx) \
- | (((\imm) & 0x3f) << 5)
+ .arch_extension sve
+ rdvl x\nx, #\imm
.endm
/* RDFFR (unpredicated): RDFFR P\np.B */
.macro _sve_rdffr np
- _sve_check_preg \np
- .inst 0x2519f000 \
- | (\np)
+ .arch_extension sve
+ rdffr p\np\().b
.endm
/* WRFFR P\np.B */
.macro _sve_wrffr np
- _sve_check_preg \np
- .inst 0x25289000 \
- | ((\np) << 5)
+ wrffr p\np\().b
.endm
/* PFALSE P\np.B */
.macro _sve_pfalse np
- _sve_check_preg \np
- .inst 0x2518e400 \
- | (\np)
+ .arch_extension sve
+ pfalse p\np\().b
.endm
/* SME instruction encodings for non-SME-capable assemblers */
--
2.30.2
^ permalink raw reply related
* [PATCH 03/18] KVM: arm64: pkvm: Save host FPMR in host cpu context
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
Protected KVM stores most of the host's system register state in
kvm_host_data::host_ctxt, which is an instance of struct
kvm_cpu_context. As kvm_cpu_context::sys_regs[] has a slot for FPMR, we
can store the host's FPMR there.
Do so, and remove kvm_host_data::fpmr.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/kvm_host.h | 3 ---
arch/arm64/kvm/hyp/include/hyp/switch.h | 6 ++++--
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 5 +++--
3 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 65eead8362e0b..42b1c4764a4bf 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -775,9 +775,6 @@ struct kvm_host_data {
*/
struct cpu_sve_state *sve_state;
- /* Used by pKVM only. */
- u64 fpmr;
-
/* Ownership of the FP regs */
enum {
FP_STATE_FREE,
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 98b2976837b11..cc4d011a2b380 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -554,6 +554,8 @@ static inline void fpsimd_lazy_switch_to_host(struct kvm_vcpu *vcpu)
static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)
{
+ struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
+
/*
* Non-protected kvm relies on the host restoring its sve state.
* Protected kvm restores the host's sve state as not to reveal that
@@ -562,11 +564,11 @@ static void kvm_hyp_save_fpsimd_host(struct kvm_vcpu *vcpu)
if (system_supports_sve()) {
__hyp_sve_save_host();
} else {
- __fpsimd_save_state(host_data_ptr(host_ctxt.fp_regs));
+ __fpsimd_save_state(&hctxt->fp_regs);
}
if (kvm_has_fpmr(kern_hyp_va(vcpu->kvm)))
- *host_data_ptr(fpmr) = read_sysreg_s(SYS_FPMR);
+ ctxt_sys_reg(hctxt, FPMR) = read_sysreg_s(SYS_FPMR);
}
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 06db299c37a89..db60f770060e5 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -66,6 +66,7 @@ static void fpsimd_sve_flush(void)
static void fpsimd_sve_sync(struct kvm_vcpu *vcpu)
{
+ struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
bool has_fpmr;
if (!guest_owns_fp_regs())
@@ -89,10 +90,10 @@ static void fpsimd_sve_sync(struct kvm_vcpu *vcpu)
if (system_supports_sve())
__hyp_sve_restore_host();
else
- __fpsimd_restore_state(host_data_ptr(host_ctxt.fp_regs));
+ __fpsimd_restore_state(&hctxt->fp_regs);
if (has_fpmr)
- write_sysreg_s(*host_data_ptr(fpmr), SYS_FPMR);
+ write_sysreg_s(ctxt_sys_reg(hctxt, FPMR), SYS_FPMR);
*host_data_ptr(fp_owner) = FP_STATE_HOST_OWNED;
}
--
2.30.2
^ permalink raw reply related
* [PATCH 05/18] arm64: fpsimd: Fold sve_init_regs() into do_sve_acc()
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
For historical reasons, do_sve_acc() is structurally different from
do_sme_acc(), and the logic to convert the task from FPSIMD to SVE is
out-of-line in sve_init_regs(). We only use sve_init_regs() within
do_sme_acc(), so it's not necessary for this to be a separate function.
Fold sve_init_regs() into do_sve_acc(), and simplify the associated
comments. This makes do_sve_acc() structurally similar to do_sme_acc(),
making it easier to see similarities and differences.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/kernel/fpsimd.c | 48 ++++++++++++++------------------------
1 file changed, 17 insertions(+), 31 deletions(-)
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 60a45d600b460..a8395cb303344 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1293,31 +1293,6 @@ void sme_suspend_exit(void)
#endif /* CONFIG_ARM64_SME */
-static void sve_init_regs(void)
-{
- /*
- * Convert the FPSIMD state to SVE, zeroing all the state that
- * is not shared with FPSIMD. If (as is likely) the current
- * state is live in the registers then do this there and
- * update our metadata for the current task including
- * disabling the trap, otherwise update our in-memory copy.
- * We are guaranteed to not be in streaming mode, we can only
- * take a SVE trap when not in streaming mode and we can't be
- * in streaming mode when taking a SME trap.
- */
- if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
- unsigned long vq_minus_one =
- sve_vq_from_vl(task_get_sve_vl(current)) - 1;
- sve_set_vq(vq_minus_one);
- sve_flush_live(true, vq_minus_one);
- fpsimd_bind_task_to_cpu();
- } else {
- fpsimd_to_sve(current);
- current->thread.fp_type = FP_STATE_SVE;
- fpsimd_flush_task_state(current);
- }
-}
-
/*
* Trapped SVE access
*
@@ -1349,13 +1324,24 @@ void do_sve_acc(unsigned long esr, struct pt_regs *regs)
WARN_ON(1); /* SVE access shouldn't have trapped */
/*
- * Even if the task can have used streaming mode we can only
- * generate SVE access traps in normal SVE mode and
- * transitioning out of streaming mode may discard any
- * streaming mode state. Always clear the high bits to avoid
- * any potential errors tracking what is properly initialised.
+ * Convert the FPSIMD state to SVE. Stale SVE state can be present in
+ * registers or memory, so we must zero all state that is not shared
+ * with FPSIMD.
+ *
+ * SVE traps cannot be taken from streaming mode, so there cannot be
+ * any effective streaming mode SVE state.
*/
- sve_init_regs();
+ if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
+ unsigned long vq_minus_one =
+ sve_vq_from_vl(task_get_sve_vl(current)) - 1;
+ sve_set_vq(vq_minus_one);
+ sve_flush_live(true, vq_minus_one);
+ fpsimd_bind_task_to_cpu();
+ } else {
+ fpsimd_to_sve(current);
+ current->thread.fp_type = FP_STATE_SVE;
+ fpsimd_flush_task_state(current);
+ }
put_cpu_fpsimd_context();
}
--
2.30.2
^ permalink raw reply related
* [PATCH 04/18] KVM: arm64: pkvm: Remove struct cpu_sve_state
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
There's no need for struct cpu_sve_state. Code would be simpler and more
robust without it, and removing it will simplify further cleanups (e.g.
adding an opaque type for the sve register state).
Protected KVM stores most of the host's system register state in
kvm_host_data::host_ctxt, which is an instance of struct
kvm_cpu_context. As kvm_cpu_context::sys_regs[] has a slot for ZCR_EL1,
we can store the host's ZCR_EL1 there.
While kvm_cpu_context::sys_regs doesn't have slots for FPSR and FPCR,
these are usually expected to be stored in struct user_fpsimd_state.
For historical reasons, __sve_save_state and __sve_restore_state()
expect a pointer to fpsr *within* struct user_fpsimd_state, assuming the
fpcr will immediately follow, as per the order within struct
user_fpsimd_state. We currently match this ordering in struct
cpu_sve_state, but it would be simpler and more robust to use struct
user_fpsimd_state directly.
After moving ZCR_EL1, FPSR, and FPCR out of struct cpu_sve_state, all
that's left is sve_regs, which can be represented as a pointer without
need for a container struct. This is kept as a pointer to u8 (matching
the array type), as this permits the compiler to catch unbalanced
referencing/dereferencing, which is not possible for pointers to void.
Apply the above changes, and remove cpu_sve_state.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/include/asm/kvm_host.h | 18 ++----------------
arch/arm64/include/asm/kvm_pkvm.h | 3 +--
arch/arm64/kvm/arm.c | 16 ++++++++--------
arch/arm64/kvm/hyp/include/hyp/switch.h | 9 +++++----
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 9 +++++----
arch/arm64/kvm/hyp/nvhe/setup.c | 4 ++--
6 files changed, 23 insertions(+), 36 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 42b1c4764a4bf..ae24617380b8f 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -732,20 +732,6 @@ struct kvm_cpu_context {
u64 *vncr_array;
};
-struct cpu_sve_state {
- __u64 zcr_el1;
-
- /*
- * Ordering is important since __sve_save_state/__sve_restore_state
- * relies on it.
- */
- __u32 fpsr;
- __u32 fpcr;
-
- /* Must be SVE_VQ_BYTES (128 bit) aligned. */
- __u8 sve_regs[];
-};
-
/*
* This structure is instantiated on a per-CPU basis, and contains
* data that is:
@@ -771,9 +757,9 @@ struct kvm_host_data {
/*
* Hyp VA.
- * sve_state is only used in pKVM and if system_supports_sve().
+ * sve_regs is only used in pKVM and if system_supports_sve().
*/
- struct cpu_sve_state *sve_state;
+ u8 *sve_regs;
/* Ownership of the FP regs */
enum {
diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h
index 2954b311128c7..74fedd9c5ff02 100644
--- a/arch/arm64/include/asm/kvm_pkvm.h
+++ b/arch/arm64/include/asm/kvm_pkvm.h
@@ -188,8 +188,7 @@ static inline size_t pkvm_host_sve_state_size(void)
if (!system_supports_sve())
return 0;
- return size_add(sizeof(struct cpu_sve_state),
- SVE_SIG_REGS_SIZE(sve_vq_from_vl(kvm_host_sve_max_vl)));
+ return SVE_SIG_REGS_SIZE(sve_vq_from_vl(kvm_host_sve_max_vl));
}
struct pkvm_mapping {
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 8bb2c7422cc8b..f9fc85a0344e1 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -2499,10 +2499,10 @@ static void __init teardown_hyp_mode(void)
continue;
if (free_sve) {
- struct cpu_sve_state *sve_state;
+ u8 *sve_regs;
- sve_state = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state;
- free_pages((unsigned long) sve_state, pkvm_host_sve_state_order());
+ sve_regs = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs;
+ free_pages((unsigned long) sve_regs, pkvm_host_sve_state_order());
}
free_pages(kvm_nvhe_sym(kvm_arm_hyp_percpu_base)[cpu], nvhe_percpu_order());
@@ -2627,7 +2627,7 @@ static int init_pkvm_host_sve_state(void)
if (!page)
return -ENOMEM;
- per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state = page_address(page);
+ per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs = page_address(page);
}
/*
@@ -2648,11 +2648,11 @@ static void finalize_init_hyp_mode(void)
if (system_supports_sve() && is_protected_kvm_enabled()) {
for_each_possible_cpu(cpu) {
- struct cpu_sve_state *sve_state;
+ u8 *sve_regs;
- sve_state = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state;
- per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_state =
- kern_hyp_va(sve_state);
+ sve_regs = per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs;
+ per_cpu_ptr_nvhe_sym(kvm_host_data, cpu)->sve_regs =
+ kern_hyp_va(sve_regs);
}
}
}
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index cc4d011a2b380..6512dd3f75ae4 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -484,12 +484,13 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
static inline void __hyp_sve_save_host(void)
{
- struct cpu_sve_state *sve_state = *host_data_ptr(sve_state);
+ struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
+ u8 *sve_regs = *host_data_ptr(sve_regs);
- sve_state->zcr_el1 = read_sysreg_el1(SYS_ZCR);
+ ctxt_sys_reg(hctxt, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);
write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
- __sve_save_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
- &sve_state->fpsr,
+ __sve_save_state(sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
+ &hctxt->fp_regs.fpsr,
true);
}
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index db60f770060e5..04a6d2e0ea73f 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -41,7 +41,8 @@ static void __hyp_sve_save_guest(struct kvm_vcpu *vcpu)
static void __hyp_sve_restore_host(void)
{
- struct cpu_sve_state *sve_state = *host_data_ptr(sve_state);
+ struct kvm_cpu_context *hctxt = host_data_ptr(host_ctxt);
+ u8 *sve_regs = *host_data_ptr(sve_regs);
/*
* On saving/restoring host sve state, always use the maximum VL for
@@ -53,10 +54,10 @@ static void __hyp_sve_restore_host(void)
* need to be revisited.
*/
write_sysreg_s(sve_vq_from_vl(kvm_host_sve_max_vl) - 1, SYS_ZCR_EL2);
- __sve_restore_state(sve_state->sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
- &sve_state->fpsr,
+ __sve_restore_state(sve_regs + sve_ffr_offset(kvm_host_sve_max_vl),
+ &hctxt->fp_regs.fpsr,
true);
- write_sysreg_el1(sve_state->zcr_el1, SYS_ZCR);
+ write_sysreg_el1(ctxt_sys_reg(hctxt, ZCR_EL1), SYS_ZCR);
}
static void fpsimd_sve_flush(void)
diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index d461981616d90..cdaf53c833409 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -82,9 +82,9 @@ static int pkvm_create_host_sve_mappings(void)
for (i = 0; i < hyp_nr_cpus; i++) {
struct kvm_host_data *host_data = per_cpu_ptr(&kvm_host_data, i);
- struct cpu_sve_state *sve_state = host_data->sve_state;
+ u8 *sve_regs = host_data->sve_regs;
- start = kern_hyp_va(sve_state);
+ start = kern_hyp_va(sve_regs);
end = start + PAGE_ALIGN(pkvm_host_sve_state_size());
ret = pkvm_create_mappings(start, end, PAGE_HYP);
if (ret)
--
2.30.2
^ permalink raw reply related
* [PATCH 02/18] KVM: arm64: Don't override FFR save/restore argument
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
The __sve_save_state() and __sve_restore_state() functions take a
parameter describing whether to save/restore the FFR, but both functions
silently override this with '1'. This has always been benign (and
callers have all passed 'true' since the parameter was introduced), but
clearly this is not intentional.
Historically, the functions always saved/restored the FFR, and there was
no parameter to control this.
In v5.16, the sve_save and sve_load assembly macros used by
__sve_save_state() and __sve_restore_state() were changed to make
saving/restoring FFR optional. The implementations of __sve_save_state()
and __sve_restore_state() were changed to pass '1' to their respective
macros, and the prototypes of __sve_save_state() and
__sve_restore_state() were unchanged. See commit:
9f5848665788 ("arm64/sve: Make access to FFR optional")
In v6.10, the prototypes of __sve_save_state() and __sve_restore_state()
were changed to add 'save_ffr' and 'restore_ffr' parameters
respectively, but the implementations were not changed to stop passing 1
to their respective macros. All callers were changed to pass 'true' to
__sve_save_state() and __sve_restore_state(). See commit:
45f4ea9bcfe9 ("KVM: arm64: Fix prototype for __sve_save_state/__sve_restore_state")
This is all benign, but clearly unintentional, and it gets in the way of
cleaning up the FPSIMD/SVE/SME code. Remove the unnecessary overriding.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/kvm/hyp/fpsimd.S | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/arm64/kvm/hyp/fpsimd.S b/arch/arm64/kvm/hyp/fpsimd.S
index e950875e31cee..6e16cbfc5df27 100644
--- a/arch/arm64/kvm/hyp/fpsimd.S
+++ b/arch/arm64/kvm/hyp/fpsimd.S
@@ -21,13 +21,11 @@ SYM_FUNC_START(__fpsimd_restore_state)
SYM_FUNC_END(__fpsimd_restore_state)
SYM_FUNC_START(__sve_restore_state)
- mov x2, #1
sve_load 0, x1, x2, 3
ret
SYM_FUNC_END(__sve_restore_state)
SYM_FUNC_START(__sve_save_state)
- mov x2, #1
sve_save 0, x1, x2, 3
ret
SYM_FUNC_END(__sve_save_state)
--
2.30.2
^ permalink raw reply related
* [PATCH 00/18] arm64+KVM: FPSIMD/SVE/SME cleanups
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
Hi.
This series cleans up low-level FPSIMD/SVE/SME state management code,
making it easier to maintain and extend (e.g. adding SME support to
KVM), and enabling better debugging (e.g. by making SVE/SME save/restore
visible to KASAN and KCSAN).
This is purely cleanup, there are NO bugs addressed by this series.
The series aims to do a few key things:
* Make it harder to mis-manage in-memory SVE state and SME state. These
are given opaque types (struct sve_state and struct sme_state), and
the (awkward) calling convention for saving/restoring SVE state is
simplified to take a pointer to the base of the state rather than a
pointer to the FFR within the state.
* Minimize duplications between KVM and the rest of the kernel. The
FPSIMD/SVE/SME routines are moved to inline assembly such that the
same helper functions can be used everywhere, without the need to wrap
assembly macros.
* Make the code easier to follow. Assembly sequences are minimized to
avoid address generation and control-flow that can be written more
clearly in C. Awkward assembly macros are removed where possible.
* Make it easier to debug state management. Explicit instrumentation is
added to the save/restore functions so that KASAN and KCSAN can detect
memory safety issues and concurrency issues.
This instrumentation is inhibited for nVHE hyp objects, and does not
adversely affect KVM. I've confirmed by looking at compiler flags
during the build, and disassembling the relevant object files.
* Remove unnecessary code. By relying on assembler support for SVE and
SME we can remove awkward assembly macros, making the code
significantly simpler and easier to read.
I've compile-tested this with a variety of toolchains:
* GCC 8.1.0 + binutils 2.30
* GCC 11.1.0 + binutils 2.36.1
* GCC 12.1.0 + binutils 2.38
* GCC 15.2.0 + binutils 2.45
* LLVM 15.0.7
* LLVM 21.1.8
I've boot-tested on an SVE+SME capable model, both with KASAN enabled
and without KASAN enabled. All the FPSIMD/SVE/SME kselftests passed in
both configurations, without any KASAN splats. Unfortunately, with KCSAN
enabled, some tests hit timeouts (without any KCSAN splat), which I
believe is simply due to the overhead of KCSAN rather than some adverse
functional effect.
I've boot-tested on an SVE+SME capable model, booting with KVM in each
of:
* VHE mode
* hVHE mode
* Protected mode
In each case I've boot-tested a v7.0 defconfig guest, both with SVE and
without SVE.
Mark.
Mark Rutland (18):
KVM: arm64: Don't include <asm/fpsimdmacros.h>
KVM: arm64: Don't override FFR save/restore argument
KVM: arm64: pkvm: Save host FPMR in host cpu context
KVM: arm64: pkvm: Remove struct cpu_sve_state
arm64: fpsimd: Fold sve_init_regs() into do_sve_acc()
arm64: fpsimd: Remove sve_set_vq() and sme_set_vq()
arm64: fpsimd: Use assembler for SVE instructions
arm64: fpsimd: Use assembler for baseline SME instructions
arm64: fpsimd: Move sve_get_vl() and sme_get_vl() inline
arm64: sysreg: Add FPCR and FPSR
arm64: fpsimd: Split FPSR/FPCR from SVE save/restore
arm64: fpsimd: Move fpsimd save/restore inline
arm64: fpsimd: Use opaque type for SVE state
arm64: fpsimd: Use opaque type for SME state
arm64: fpsimd: Move SVE save/restore inline
arm64: fpsimd: Move sve_flush_live() inline
arm64: fpsimd: Move SME save/restore inline
arm64: fpsimd: Remove <asm/fpsimdmacros.h>
arch/arm64/Kconfig | 5 +
arch/arm64/include/asm/fpsimd.h | 369 ++++++++++++++++++++++--
arch/arm64/include/asm/fpsimdmacros.h | 357 -----------------------
arch/arm64/include/asm/kvm_host.h | 27 +-
arch/arm64/include/asm/kvm_hyp.h | 5 -
arch/arm64/include/asm/kvm_pkvm.h | 3 +-
arch/arm64/include/asm/processor.h | 7 +-
arch/arm64/kernel/Makefile | 2 +-
arch/arm64/kernel/entry-common.c | 8 +-
arch/arm64/kernel/entry-fpsimd.S | 134 ---------
arch/arm64/kernel/fpsimd.c | 90 +++---
arch/arm64/kvm/arm.c | 16 +-
arch/arm64/kvm/guest.c | 4 +-
arch/arm64/kvm/hyp/entry.S | 1 -
arch/arm64/kvm/hyp/fpsimd.S | 33 ---
arch/arm64/kvm/hyp/include/hyp/switch.h | 23 +-
arch/arm64/kvm/hyp/nvhe/Makefile | 2 +-
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 20 +-
arch/arm64/kvm/hyp/nvhe/setup.c | 4 +-
arch/arm64/kvm/hyp/vhe/Makefile | 2 +-
arch/arm64/tools/sysreg | 45 +++
21 files changed, 480 insertions(+), 677 deletions(-)
delete mode 100644 arch/arm64/include/asm/fpsimdmacros.h
delete mode 100644 arch/arm64/kernel/entry-fpsimd.S
delete mode 100644 arch/arm64/kvm/hyp/fpsimd.S
--
2.30.2
^ permalink raw reply
* [PATCH 01/18] KVM: arm64: Don't include <asm/fpsimdmacros.h>
From: Mark Rutland @ 2026-05-21 13:25 UTC (permalink / raw)
To: linux-arm-kernel, kvmarm
Cc: broonie, catalin.marinas, james.morse, mark.rutland, maz, oupton,
tabba, will
In-Reply-To: <20260521132556.584676-1-mark.rutland@arm.com>
There's no need for hyp/entry.S to include <asm/fpsimdmacros.h>.
The fpsimd macros have never been used by code in hyp/entry.S, and were
instead used by code in hyp/fpsimd.S.
Remove the unnecessary include.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fuad Tabba <tabba@google.com>
Cc: James Morse <james.morse@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Oliver Upton <oupton@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
arch/arm64/kvm/hyp/entry.S | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index 11a10d8f5beb2..308100ed25de9 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -8,7 +8,6 @@
#include <asm/alternative.h>
#include <asm/assembler.h>
-#include <asm/fpsimdmacros.h>
#include <asm/kvm.h>
#include <asm/kvm_arm.h>
#include <asm/kvm_asm.h>
--
2.30.2
^ permalink raw reply related
* Re: [PATCH v22 08/13] mfd: core: Add firmware-node support to MFD cells
From: Lee Jones @ 2026-05-21 13:24 UTC (permalink / raw)
To: Bartosz Golaszewski
Cc: Shivendra Pratap, Sebastian Reichel, Mark Rutland,
Lorenzo Pieralisi, Rafael J. Wysocki, Daniel Lezcano,
Christian Loehle, Ulf Hansson, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, Bjorn Andersson, Konrad Dybcio, Arnd Bergmann,
Souvik Chakravarty, Andy Yan, Matthias Brugger, John Stultz,
Moritz Fischer, Sudeep Holla, linux-pm, linux-kernel,
linux-arm-msm, linux-arm-kernel, devicetree, Florian Fainelli,
Krzysztof Kozlowski, Dmitry Baryshkov, Mukesh Ojha, Andre Draszik,
Greg Kroah-Hartman, Kathiravan Thirumoorthy, Srinivas Kandagatla,
Bartosz Golaszewski
In-Reply-To: <CAMRc=MfqaCjiALZyVBHQs=Taft1M9xmNTFvQHWPrd5PgcTfJDQ@mail.gmail.com>
On Thu, 21 May 2026, Bartosz Golaszewski wrote:
> On Thu, May 21, 2026 at 1:26 PM Lee Jones <lee@kernel.org> wrote:
> >
> > On Thu, 14 May 2026, Shivendra Pratap wrote:
> >
> > > MFD core has no way to register a child device using an explicit firmware
> > > node. This prevents drivers from registering child nodes when those nodes
> > > do not define a compatible string. One such example is the PSCI
> > > "reboot-mode" node, which omits a compatible string as it describes
> > > boot-states provided by the underlying firmware.
> > >
> > > Extend struct mfd_cell with a callback that allows drivers to provide an
> > > explicit firmware node. The node is added to the MFD child device during
> > > registration when none is assigned by device tree, ACPI, or software
> > > matching.
> > >
> > > Suggested-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
> > > Signed-off-by: Shivendra Pratap <shivendra.pratap@oss.qualcomm.com>
> > > ---
> > > drivers/mfd/mfd-core.c | 30 ++++++++++++++++++++++++++++++
> > > include/linux/mfd/core.h | 14 ++++++++++++++
> > > 2 files changed, 44 insertions(+)
> > >
> > > diff --git a/drivers/mfd/mfd-core.c b/drivers/mfd/mfd-core.c
> > > index 7aa32b90cf1eb7fa0a05bf3dc506e60a262c9850..cc2a2a924d6d3044e29a9f864b536ee325ed797b 100644
> > > --- a/drivers/mfd/mfd-core.c
> > > +++ b/drivers/mfd/mfd-core.c
> > > @@ -10,6 +10,7 @@
> > > #include <linux/kernel.h>
> > > #include <linux/platform_device.h>
> > > #include <linux/acpi.h>
> > > +#include <linux/fwnode.h>
> > > #include <linux/list.h>
> > > #include <linux/property.h>
> > > #include <linux/mfd/core.h>
> > > @@ -148,6 +149,11 @@ static int mfd_match_of_node_to_dev(struct platform_device *pdev,
> > > return 0;
> > > }
> > >
> > > +static void mfd_child_fwnode_put(void *data)
> > > +{
> > > + fwnode_handle_put(data);
> > > +}
> > > +
> > > static int mfd_add_device(struct device *parent, int id,
> > > const struct mfd_cell *cell,
> > > struct resource *mem_base,
> > > @@ -156,6 +162,7 @@ static int mfd_add_device(struct device *parent, int id,
> > > struct resource *res;
> > > struct platform_device *pdev;
> > > struct mfd_of_node_entry *of_entry, *tmp;
> > > + struct fwnode_handle *fwnode;
> > > bool disabled = false;
> > > int ret = -ENOMEM;
> > > int platform_id;
> > > @@ -224,6 +231,29 @@ static int mfd_add_device(struct device *parent, int id,
> > >
> > > mfd_acpi_add_device(cell, pdev);
> > >
> > > + if (!pdev->dev.fwnode && cell->get_child_fwnode) {
> > > + fwnode = cell->get_child_fwnode(parent);
> > > + if (fwnode) {
> > > + device_set_node(&pdev->dev, fwnode);
> > > +
> > > + /*
> > > + * platform_device_release() drops only of_node refs.
> > > + * Track non-OF fwnodes explicitly so they are put on
> > > + * all teardown paths.
> > > + */
> > > + if (!to_of_node(fwnode)) {
> > > + ret = devm_add_action(&pdev->dev,
> > > + mfd_child_fwnode_put,
> > > + fwnode);
> > > + if (ret) {
> > > + device_set_node(&pdev->dev, NULL);
> > > + fwnode_handle_put(fwnode);
> > > + goto fail_of_entry;
> > > + }
> > > + }
> > > + }
> > > + }
> >
> > mfd_add_device() is getting very busy now with support for all of these
> > different registration APIs. Suggest that we start breaking them out.
> >
> > > +
> > > if (cell->pdata_size) {
> > > ret = platform_device_add_data(pdev,
> > > cell->platform_data, cell->pdata_size);
> > > diff --git a/include/linux/mfd/core.h b/include/linux/mfd/core.h
> > > index faeea7abd688f223fb0b31cde0a9b69dfe2a61ff..abfc26c057d6ee46947ba2b6f2e99f420e74b127 100644
> > > --- a/include/linux/mfd/core.h
> > > +++ b/include/linux/mfd/core.h
> > > @@ -50,6 +50,7 @@
> > > #define MFD_DEP_LEVEL_HIGH 1
> > >
> > > struct irq_domain;
> > > +struct fwnode_handle;
> > > struct software_node;
> > >
> > > /* Matches ACPI PNP id, either _HID or _CID, or ACPI _ADR */
> > > @@ -80,6 +81,19 @@ struct mfd_cell {
> > >
> > > /* Software node for the device. */
> > > const struct software_node *swnode;
> > > + /*
> > > + * Callback to return an explicit firmware node.
> > > + * @parent: MFD parent device passed to mfd_add_devices().
> > > + *
> > > + * Called only if OF/ACPI matching did not assign a fwnode.
> > > + * Ownership of the returned reference is transferred to MFD core.
> > > + *
> > > + * Return a referenced fwnode or NULL if none is available.
> > > + *
> > > + * mfd_cell must be zero-initialized or get_child_fwnode must be NULL
> > > + * when unused.
> > > + */
> > > + struct fwnode_handle *(*get_child_fwnode)(struct device *parent);
> >
> > I'm very much against pointers to functions if they can be avoided. Why
> > does fwnode need this and none of the other APIs do?
> >
>
> I suggested it because of its flexibility. The alternative I had in
> mind is something like a new field in mfd_cell:
>
> const char *cell_node_name;
>
> Which - if set - would tell MFD to look up an fwnode that's a child of
> the parent device's node by name - as it may not have a compatible.
Remind me why the chlid device can't look-up its own fwnode?
--
Lee Jones
^ permalink raw reply
* Re: [PATCH v2 1/3] KVM: arm64: Reset page order in pKVM hyp_pool_init
From: Vincent Donnefort @ 2026-05-21 13:21 UTC (permalink / raw)
To: Fuad Tabba
Cc: maz, oliver.upton, joey.gouly, suzuki.poulose, yuzenghui,
catalin.marinas, will, linux-arm-kernel, kvmarm, kernel-team,
qperret, Sashiko
In-Reply-To: <CA+EHjTxPoxjvMTZX5w+UyVgC=W3VUSDoOQ-tCDLfnae16SqoMQ@mail.gmail.com>
On Thu, May 21, 2026 at 02:07:36PM +0100, Fuad Tabba wrote:
> On Thu, 21 May 2026 at 11:22, Vincent Donnefort <vdonnefort@google.com> wrote:
> >
> > When a VM fails to initialise after its stage-2 hyp_pool has been
> > initialised, that stage-2 must be torn down entirely. This requires
> > resetting both the refcount and the order of its pages back to 0.
> >
> > Currently, reclaim_pgtable_pages() implicitly resets the page order by
> > allocating the entire pool with order-0 granularity. However, in the VM
> > initialisation error path, the addresses of the donated memory (the PGD)
> > are already known, making it unnecessary to iterate over all pages in
> > the pool.
> >
> > Since the vmemmap page order is a hyp_pool-specific field, leaving a
> > non-zero order on hyp_pool destruction is harmless until another pool
> > attempts to admit the page. Instead of resetting this field during
> > destruction, reset it during pool initialization in hyp_pool_init().
> > Note that pages added to the pool outside of the initial pool range
> > (e.g., via guest_s2_zalloc_page()) must still have their order managed
> > manually.
> >
> > While at it, add a WARN_ON() in the hyp_pool attach path to catch
> > unexpected page orders that exceed the pool's max_order.
> >
> > Fixes: 256b4668cd89 ("KVM: arm64: Introduce separate hypercalls for pKVM VM reservation and initialization")
> > Reported-by: Sashiko <sashiko-bot@kernel.org>
> > Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
> >
> > diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > index 25f04629014e..89eb20d4fee4 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > @@ -322,7 +322,6 @@ void reclaim_pgtable_pages(struct pkvm_hyp_vm *vm, struct kvm_hyp_memcache *mc)
> > while (addr) {
> > page = hyp_virt_to_page(addr);
> > page->refcount = 0;
> > - page->order = 0;
> > push_hyp_memcache(mc, addr, hyp_virt_to_phys);
> > WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(addr), 1));
> > addr = hyp_alloc_pages(&vm->pool, 0);
> > diff --git a/arch/arm64/kvm/hyp/nvhe/page_alloc.c b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
> > index a1eb27a1a747..c3b3dc5a8ea7 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/page_alloc.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
> > @@ -97,6 +97,8 @@ static void __hyp_attach_page(struct hyp_pool *pool,
> > u8 order = p->order;
> > struct hyp_page *buddy;
> >
> > + WARN_ON(p->order > pool->max_order);
> > +
>
> Could you add a brief comment? It took me a minute to figure out what this
> catches. IIUC it's not attach's own input, it's a stale p->order from way back
> when an external page was popped from a memcache (today only via
> guest_s2_zalloc_page()). Right?
I think it'd be self explanatory if that was next the page_add_to_list, but that
wouldn't protect the memset (that's really best-effort though).
How about?
/*
* A page with an order bigger than the pool's max is an 'external' page
* whose order hasn't been reset before being added to the pool.
*/
But now I am thinking I can do way better: we can easily identify external
pages, so I could just force the order to 0 in that case.
WDYS?
>
> With that.
>
> Reviewed-by: Fuad Tabba <tabba@google.com>
> Tested-by: Fuad Tabba <tabba@google.com>
>
> Cheers,
> /fuad
>
>
>
>
> > memset(hyp_page_to_virt(p), 0, PAGE_SIZE << p->order);
> >
> > /* Skip coalescing for 'external' pages being freed into the pool. */
> > @@ -237,8 +239,10 @@ int hyp_pool_init(struct hyp_pool *pool, u64 pfn, unsigned int nr_pages,
> >
> > /* Init the vmemmap portion */
> > p = hyp_phys_to_page(phys);
> > - for (i = 0; i < nr_pages; i++)
> > + for (i = 0; i < nr_pages; i++) {
> > hyp_set_page_refcounted(&p[i]);
> > + p[i].order = 0;
> > + }
> >
> > /* Attach the unused pages to the buddy tree */
> > for (i = reserved_pages; i < nr_pages; i++)
> > --
> > 2.54.0.746.g67dd491aae-goog
> >
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox