* [PATCH 5/8] KVM: arm64: Add host and hypervisor vCPU lookup primitives
From: Fuad Tabba @ 2026-06-19 7:05 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
linux-kernel
Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260619070508.802802-1-tabba@google.com>
From: Marc Zyngier <maz@kernel.org>
The nVHE hypervisor repeatedly resolves a host vCPU into the EL2
address space and validates that the loaded hyp vCPU matches it, with
that logic open-coded in each handler.
Add __get_host_hyp_vcpus() and the get_host_hyp_vcpus() macro, which
translate the host vCPU into the hypervisor's address space and, when
pKVM is enabled, also return the loaded hyp vCPU if it matches. If pKVM
is enabled but the loaded hyp vCPU does not correspond to the requested
host vCPU, both the host and hyp vCPU are returned as NULL. Convert
handle___kvm_vcpu_run() to use it.
No functional change intended.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Co-developed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 52 ++++++++++++++++++++++--------
1 file changed, 38 insertions(+), 14 deletions(-)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 1d01c6e547f5..8923f594c264 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -212,14 +212,45 @@ static void handle___pkvm_vcpu_put(struct kvm_cpu_context *host_ctxt)
pkvm_put_hyp_vcpu(hyp_vcpu);
}
-static void handle___kvm_vcpu_run(struct kvm_cpu_context *host_ctxt)
+static struct kvm_vcpu *__get_host_hyp_vcpus(struct kvm_vcpu *arg,
+ struct pkvm_hyp_vcpu **hyp_vcpup)
{
- DECLARE_REG(struct kvm_vcpu *, host_vcpu, host_ctxt, 1);
- int ret;
+ struct kvm_vcpu *host_vcpu = kern_hyp_va(arg);
+ struct pkvm_hyp_vcpu *hyp_vcpu = NULL;
if (unlikely(is_protected_kvm_enabled())) {
- struct pkvm_hyp_vcpu *hyp_vcpu = pkvm_get_loaded_hyp_vcpu();
+ hyp_vcpu = pkvm_get_loaded_hyp_vcpu();
+ if (!hyp_vcpu || hyp_vcpu->host_vcpu != host_vcpu) {
+ hyp_vcpu = NULL;
+ host_vcpu = NULL;
+ }
+ }
+
+ *hyp_vcpup = hyp_vcpu;
+ return host_vcpu;
+}
+
+#define get_host_hyp_vcpus(ctxt, regnr, hyp_vcpup) \
+ ({ \
+ DECLARE_REG(struct kvm_vcpu *, __vcpu, ctxt, regnr); \
+ __get_host_hyp_vcpus(__vcpu, hyp_vcpup); \
+ })
+
+static void handle___kvm_vcpu_run(struct kvm_cpu_context *host_ctxt)
+{
+ struct pkvm_hyp_vcpu *hyp_vcpu;
+ struct kvm_vcpu *host_vcpu;
+ int ret;
+
+ host_vcpu = get_host_hyp_vcpus(host_ctxt, 1, &hyp_vcpu);
+
+ if (!host_vcpu) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ if (unlikely(hyp_vcpu)) {
/*
* KVM (and pKVM) doesn't support SME guests for now, and
* ensures that SME features aren't enabled in pstate when
@@ -231,23 +262,16 @@ static void handle___kvm_vcpu_run(struct kvm_cpu_context *host_ctxt)
goto out;
}
- if (!hyp_vcpu) {
- ret = -EINVAL;
- goto out;
- }
-
flush_hyp_vcpu(hyp_vcpu);
ret = __kvm_vcpu_run(&hyp_vcpu->vcpu);
sync_hyp_vcpu(hyp_vcpu);
} else {
- struct kvm_vcpu *vcpu = kern_hyp_va(host_vcpu);
-
/* The host is fully trusted, run its vCPU directly. */
- fpsimd_lazy_switch_to_guest(vcpu);
- ret = __kvm_vcpu_run(vcpu);
- fpsimd_lazy_switch_to_host(vcpu);
+ fpsimd_lazy_switch_to_guest(host_vcpu);
+ ret = __kvm_vcpu_run(host_vcpu);
+ fpsimd_lazy_switch_to_host(host_vcpu);
}
out:
cpu_reg(host_ctxt, 1) = ret;
--
2.55.0.rc0.738.g0c8ab3ebcc-goog
^ permalink raw reply related
* [PATCH 7/8] KVM: arm64: Add primitives to flush/sync the VGIC state at EL2
From: Fuad Tabba @ 2026-06-19 7:05 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
linux-kernel
Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260619070508.802802-1-tabba@google.com>
From: Marc Zyngier <maz@kernel.org>
pKVM performs its own world switch for protected VMs but has no
primitives to move the per-vCPU VGIC state between the host and
hypervisor vCPU contexts.
Add flush_hyp_vgic_state() and sync_hyp_vgic_state(). Flush copies
vgic_hcr, the in-use list registers and used_lrs from the host into the
hyp vCPU and pins vgic_sre to a fixed value; sync copies vgic_hcr,
vgic_vmcr and the in-use list registers back. The active priority
registers are handled separately by the save/restore-aprs path.
Bound used_lrs by hyp_gicv3_nr_lr, the cached implemented-LR count,
instead of reading ICH_VTR_EL2 on each entry. That clamps the
host-supplied value and avoids a per-entry sysreg read that is costly
under NV.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Co-developed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 55 ++++++++++++++++++++++--------
1 file changed, 41 insertions(+), 14 deletions(-)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index f25ee3971528..0194965930e6 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -102,6 +102,45 @@ static void fpsimd_sve_sync(struct kvm_vcpu *vcpu)
*host_data_ptr(fp_owner) = FP_STATE_HOST_OWNED;
}
+static void flush_hyp_vgic_state(struct pkvm_hyp_vcpu *hyp_vcpu)
+{
+ struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
+ struct vgic_v3_cpu_if *host_cpu_if, *hyp_cpu_if;
+ unsigned int used_lrs, i;
+
+ host_cpu_if = &host_vcpu->arch.vgic_cpu.vgic_v3;
+ hyp_cpu_if = &hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3;
+
+ used_lrs = host_cpu_if->used_lrs;
+ used_lrs = min(used_lrs, hyp_gicv3_nr_lr);
+
+ hyp_cpu_if->vgic_hcr = host_cpu_if->vgic_hcr;
+ /* Should be a one-off */
+ hyp_cpu_if->vgic_sre = (ICC_SRE_EL1_DIB |
+ ICC_SRE_EL1_DFB |
+ ICC_SRE_EL1_SRE);
+ hyp_cpu_if->used_lrs = used_lrs;
+
+ for (i = 0; i < used_lrs; i++)
+ hyp_cpu_if->vgic_lr[i] = host_cpu_if->vgic_lr[i];
+}
+
+static void sync_hyp_vgic_state(struct pkvm_hyp_vcpu *hyp_vcpu)
+{
+ struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
+ struct vgic_v3_cpu_if *host_cpu_if, *hyp_cpu_if;
+ unsigned int i;
+
+ host_cpu_if = &host_vcpu->arch.vgic_cpu.vgic_v3;
+ hyp_cpu_if = &hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3;
+
+ host_cpu_if->vgic_hcr = hyp_cpu_if->vgic_hcr;
+ host_cpu_if->vgic_vmcr = hyp_cpu_if->vgic_vmcr;
+
+ for (i = 0; i < hyp_cpu_if->used_lrs; i++)
+ host_cpu_if->vgic_lr[i] = hyp_cpu_if->vgic_lr[i];
+}
+
static void flush_debug_state(struct pkvm_hyp_vcpu *hyp_vcpu)
{
struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
@@ -150,13 +189,7 @@ static void flush_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
hyp_vcpu->vcpu.arch.vsesr_el2 = host_vcpu->arch.vsesr_el2;
- hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3 = host_vcpu->arch.vgic_cpu.vgic_v3;
-
- /* Bound used_lrs by the number of implemented list registers. */
- hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3.used_lrs =
- min_t(unsigned int,
- hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3.used_lrs,
- hyp_gicv3_nr_lr);
+ flush_hyp_vgic_state(hyp_vcpu);
hyp_vcpu->vcpu.arch.pid = host_vcpu->arch.pid;
}
@@ -164,9 +197,6 @@ static void flush_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
static void sync_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
{
struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
- struct vgic_v3_cpu_if *hyp_cpu_if = &hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3;
- struct vgic_v3_cpu_if *host_cpu_if = &host_vcpu->arch.vgic_cpu.vgic_v3;
- unsigned int i;
fpsimd_sve_sync(&hyp_vcpu->vcpu);
sync_debug_state(hyp_vcpu);
@@ -179,10 +209,7 @@ static void sync_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
host_vcpu->arch.iflags = hyp_vcpu->vcpu.arch.iflags;
- host_cpu_if->vgic_hcr = hyp_cpu_if->vgic_hcr;
- host_cpu_if->vgic_vmcr = hyp_cpu_if->vgic_vmcr;
- for (i = 0; i < hyp_cpu_if->used_lrs; ++i)
- host_cpu_if->vgic_lr[i] = hyp_cpu_if->vgic_lr[i];
+ sync_hyp_vgic_state(hyp_vcpu);
}
static void handle___pkvm_vcpu_load(struct kvm_cpu_context *host_ctxt)
--
2.55.0.rc0.738.g0c8ab3ebcc-goog
^ permalink raw reply related
* [PATCH 4/8] KVM: arm64: Move PSCI helper functions to a shared header
From: Fuad Tabba @ 2026-06-19 7:05 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
linux-kernel
Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260619070508.802802-1-tabba@google.com>
Move kvm_psci_valid_affinity() and kvm_psci_narrow_to_32bit() from
psci.c to include/kvm/arm_psci.h, and move psci_affinity_mask() there
too, renaming it kvm_psci_affinity_mask() now that it is no longer
file-local. A follow-up series handles some protected-guest PSCI calls
at EL2 using these helpers.
No functional change intended.
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/kvm/psci.c | 30 +-----------------------------
include/kvm/arm_psci.h | 27 +++++++++++++++++++++++++++
2 files changed, 28 insertions(+), 29 deletions(-)
diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
index 3b5dbe9a0a0e..e3db84400d1f 100644
--- a/arch/arm64/kvm/psci.c
+++ b/arch/arm64/kvm/psci.c
@@ -21,16 +21,6 @@
* as described in ARM document number ARM DEN 0022A.
*/
-#define AFFINITY_MASK(level) ~((0x1UL << ((level) * MPIDR_LEVEL_BITS)) - 1)
-
-static unsigned long psci_affinity_mask(unsigned long affinity_level)
-{
- if (affinity_level <= 3)
- return MPIDR_HWID_BITMASK & AFFINITY_MASK(affinity_level);
-
- return 0;
-}
-
static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
{
/*
@@ -51,12 +41,6 @@ static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
return PSCI_RET_SUCCESS;
}
-static inline bool kvm_psci_valid_affinity(struct kvm_vcpu *vcpu,
- unsigned long affinity)
-{
- return !(affinity & ~MPIDR_HWID_BITMASK);
-}
-
static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
{
struct vcpu_reset_state *reset_state;
@@ -135,7 +119,7 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
return PSCI_RET_INVALID_PARAMS;
/* Determine target affinity mask */
- target_affinity_mask = psci_affinity_mask(lowest_affinity_level);
+ target_affinity_mask = kvm_psci_affinity_mask(lowest_affinity_level);
if (!target_affinity_mask)
return PSCI_RET_INVALID_PARAMS;
@@ -220,18 +204,6 @@ static void kvm_psci_system_suspend(struct kvm_vcpu *vcpu)
run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
}
-static void kvm_psci_narrow_to_32bit(struct kvm_vcpu *vcpu)
-{
- int i;
-
- /*
- * Zero the input registers' upper 32 bits. They will be fully
- * zeroed on exit, so we're fine changing them in place.
- */
- for (i = 1; i < 4; i++)
- vcpu_set_reg(vcpu, i, lower_32_bits(vcpu_get_reg(vcpu, i)));
-}
-
static unsigned long kvm_psci_check_allowed_function(struct kvm_vcpu *vcpu, u32 fn)
{
/*
diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
index cbaec804eb83..f86a006d6713 100644
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -38,6 +38,33 @@ static inline int kvm_psci_version(struct kvm_vcpu *vcpu)
return KVM_ARM_PSCI_0_1;
}
+/* Narrow the PSCI register arguments (r1 to r3) to 32 bits. */
+static inline void kvm_psci_narrow_to_32bit(struct kvm_vcpu *vcpu)
+{
+ int i;
+
+ /*
+ * Zero the input registers' upper 32 bits. They will be fully
+ * zeroed on exit, so we're fine changing them in place.
+ */
+ for (i = 1; i < 4; i++)
+ vcpu_set_reg(vcpu, i, lower_32_bits(vcpu_get_reg(vcpu, i)));
+}
+
+static inline bool kvm_psci_valid_affinity(struct kvm_vcpu *vcpu,
+ unsigned long affinity)
+{
+ return !(affinity & ~MPIDR_HWID_BITMASK);
+}
+
+static inline unsigned long kvm_psci_affinity_mask(unsigned long affinity_level)
+{
+ if (affinity_level <= 3)
+ return MPIDR_HWID_BITMASK &
+ ~((0x1UL << (affinity_level * MPIDR_LEVEL_BITS)) - 1);
+
+ return 0;
+}
int kvm_psci_call(struct kvm_vcpu *vcpu);
--
2.55.0.rc0.738.g0c8ab3ebcc-goog
^ permalink raw reply related
* [PATCH 6/8] KVM: arm64: Minimise EL2's exposure of host VGIC state during world switch
From: Fuad Tabba @ 2026-06-19 7:05 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
linux-kernel
Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260619070508.802802-1-tabba@google.com>
From: Marc Zyngier <maz@kernel.org>
The host passes a vgic_v3_cpu_if pointer to the __vgic_v3_save_aprs and
__vgic_v3_restore_vmcr_aprs hypercalls, which EL2 dereferences
wholesale. That exposes the host's full VGIC emulation state to the
hypervisor, against pKVM's isolation goals.
Recover the host vCPU from the supplied cpu_if via container_of() and
copy only vgic_vmcr and the active priority registers between EL2's
hyp-side state and the host vCPU, so EL2 no longer dereferences the
host's vgic_v3_cpu_if directly.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Co-developed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 67 ++++++++++++++++++++++++++++--
1 file changed, 63 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 8923f594c264..f25ee3971528 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -7,6 +7,8 @@
#include <hyp/adjust_pc.h>
#include <hyp/switch.h>
+#include <linux/irqchip/arm-gic-v3.h>
+
#include <asm/pgtable-types.h>
#include <asm/kvm_asm.h>
#include <asm/kvm_emulate.h>
@@ -237,6 +239,16 @@ static struct kvm_vcpu *__get_host_hyp_vcpus(struct kvm_vcpu *arg,
__get_host_hyp_vcpus(__vcpu, hyp_vcpup); \
})
+#define get_host_hyp_vcpus_from_vgic_v3_cpu_if(ctxt, regnr, hyp_vcpup) \
+ ({ \
+ DECLARE_REG(struct vgic_v3_cpu_if *, cif, ctxt, regnr);\
+ struct kvm_vcpu *__vcpu = container_of(cif, \
+ struct kvm_vcpu, \
+ arch.vgic_cpu.vgic_v3); \
+ \
+ __get_host_hyp_vcpus(__vcpu, hyp_vcpup); \
+ })
+
static void handle___kvm_vcpu_run(struct kvm_cpu_context *host_ctxt)
{
struct pkvm_hyp_vcpu *hyp_vcpu;
@@ -506,16 +518,63 @@ static void handle___vgic_v3_init_lrs(struct kvm_cpu_context *host_ctxt)
static void handle___vgic_v3_save_aprs(struct kvm_cpu_context *host_ctxt)
{
- DECLARE_REG(struct vgic_v3_cpu_if *, cpu_if, host_ctxt, 1);
+ struct pkvm_hyp_vcpu *hyp_vcpu;
+ struct kvm_vcpu *host_vcpu;
- __vgic_v3_save_aprs(kern_hyp_va(cpu_if));
+ host_vcpu = get_host_hyp_vcpus_from_vgic_v3_cpu_if(host_ctxt, 1,
+ &hyp_vcpu);
+ if (!host_vcpu)
+ return;
+
+ if (unlikely(hyp_vcpu)) {
+ struct vgic_v3_cpu_if *hyp_cpu_if, *host_cpu_if;
+ int i;
+
+ hyp_cpu_if = &hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3;
+ __vgic_v3_save_aprs(hyp_cpu_if);
+
+ host_cpu_if = &host_vcpu->arch.vgic_cpu.vgic_v3;
+ host_cpu_if->vgic_vmcr = hyp_cpu_if->vgic_vmcr;
+ for (i = 0; i < ARRAY_SIZE(host_cpu_if->vgic_ap0r); i++) {
+ host_cpu_if->vgic_ap0r[i] = hyp_cpu_if->vgic_ap0r[i];
+ host_cpu_if->vgic_ap1r[i] = hyp_cpu_if->vgic_ap1r[i];
+ }
+ } else {
+ __vgic_v3_save_aprs(&host_vcpu->arch.vgic_cpu.vgic_v3);
+ }
}
static void handle___vgic_v3_restore_vmcr_aprs(struct kvm_cpu_context *host_ctxt)
{
- DECLARE_REG(struct vgic_v3_cpu_if *, cpu_if, host_ctxt, 1);
+ struct pkvm_hyp_vcpu *hyp_vcpu;
+ struct kvm_vcpu *host_vcpu;
- __vgic_v3_restore_vmcr_aprs(kern_hyp_va(cpu_if));
+ host_vcpu = get_host_hyp_vcpus_from_vgic_v3_cpu_if(host_ctxt, 1,
+ &hyp_vcpu);
+ if (!host_vcpu)
+ return;
+
+ if (unlikely(hyp_vcpu)) {
+ struct vgic_v3_cpu_if *hyp_cpu_if, *host_cpu_if;
+ int i;
+
+ hyp_cpu_if = &hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3;
+ host_cpu_if = &host_vcpu->arch.vgic_cpu.vgic_v3;
+
+ hyp_cpu_if->vgic_vmcr = host_cpu_if->vgic_vmcr;
+ /* Should be a one-off */
+ hyp_cpu_if->vgic_sre = (ICC_SRE_EL1_DIB |
+ ICC_SRE_EL1_DFB |
+ ICC_SRE_EL1_SRE);
+ for (i = 0; i < ARRAY_SIZE(host_cpu_if->vgic_ap0r); i++) {
+ hyp_cpu_if->vgic_ap0r[i] = host_cpu_if->vgic_ap0r[i];
+ hyp_cpu_if->vgic_ap1r[i] = host_cpu_if->vgic_ap1r[i];
+ }
+
+ __vgic_v3_restore_vmcr_aprs(hyp_cpu_if);
+ } else {
+ __vgic_v3_restore_vmcr_aprs(&host_vcpu->arch.vgic_cpu.vgic_v3);
+ }
}
static void handle___pkvm_init(struct kvm_cpu_context *host_ctxt)
--
2.55.0.rc0.738.g0c8ab3ebcc-goog
^ permalink raw reply related
* [PATCH 8/8] KVM: arm64: Implement lazy vCPU state sync for non-protected guests
From: Fuad Tabba @ 2026-06-19 7:05 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
linux-kernel
Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260619070508.802802-1-tabba@google.com>
pKVM copies a non-protected guest's register context between the host
and the hypervisor on every world switch, even when the host never
inspects it. Defer the copy: on entry, flush the host context into the
hyp vCPU only when the host marked it dirty (PKVM_HOST_STATE_DIRTY); on
exit, leave it in the hyp vCPU and copy it back only when the host needs
it, via a __pkvm_vcpu_sync_state hypercall on trap handling or at vcpu
put. A protected guest's context is copied as before, since lazy sync
only helps where the host is trusted to see the guest's registers.
PC and PSTATE are the exception: they are copied back on every exit so
the kvm_exit tracepoint reports the guest's real exit PC, and the run
loop's vcpu_mode_is_bad_32bit() and SError-masking checks evaluate the
guest's current PSTATE rather than the value left by the previous sync.
handle_exit_early() can also inject an SError, which writes the guest
context (ESR_EL1) outside the trap-handling path. For a non-protected
guest it therefore syncs the context from the hyp vCPU and marks it
dirty, as handle_trap_exceptions() does, so the injection reaches the
hyp vCPU on re-entry rather than being dropped.
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/include/asm/kvm_asm.h | 1 +
arch/arm64/include/asm/kvm_host.h | 2 +
arch/arm64/kvm/arm.c | 7 +++
arch/arm64/kvm/handle_exit.c | 30 +++++++++++
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 86 ++++++++++++++++++++++++++++--
5 files changed, 121 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 043495f7fc78..6e1135b3ded4 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -113,6 +113,7 @@ enum __kvm_host_smccc_func {
__KVM_HOST_SMCCC_FUNC___pkvm_finalize_teardown_vm,
__KVM_HOST_SMCCC_FUNC___pkvm_vcpu_load,
__KVM_HOST_SMCCC_FUNC___pkvm_vcpu_put,
+ __KVM_HOST_SMCCC_FUNC___pkvm_vcpu_sync_state,
__KVM_HOST_SMCCC_FUNC___pkvm_tlb_flush_vmid,
MARKER(__KVM_HOST_SMCCC_FUNC_MAX)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2faa60df847d..caa39ee5125f 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1068,6 +1068,8 @@ struct kvm_vcpu_arch {
#define INCREMENT_PC __vcpu_single_flag(iflags, BIT(1))
/* Target EL/MODE (not a single flag, but let's abuse the macro) */
#define EXCEPT_MASK __vcpu_single_flag(iflags, GENMASK(3, 1))
+/* Host-set: the hyp flushes the non-protected vCPU state in on entry */
+#define PKVM_HOST_STATE_DIRTY __vcpu_single_flag(iflags, BIT(4))
/* Helpers to encode exceptions with minimum fuss */
#define __EXCEPT_MASK_VAL unpack_vcpu_flag(EXCEPT_MASK)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 3732ee9eb0d4..4e89558d8027 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -733,6 +733,10 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
if (is_protected_kvm_enabled()) {
kvm_call_hyp(__vgic_v3_save_aprs, &vcpu->arch.vgic_cpu.vgic_v3);
kvm_call_hyp_nvhe(__pkvm_vcpu_put);
+
+ /* __pkvm_vcpu_put implies a sync of the state */
+ if (!kvm_vm_is_protected(vcpu->kvm))
+ vcpu_set_flag(vcpu, PKVM_HOST_STATE_DIRTY);
}
kvm_vcpu_put_debug(vcpu);
@@ -964,6 +968,9 @@ int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
return ret;
if (is_protected_kvm_enabled()) {
+ /* Start with the vcpu in a dirty state */
+ if (!kvm_vm_is_protected(vcpu->kvm))
+ vcpu_set_flag(vcpu, PKVM_HOST_STATE_DIRTY);
ret = pkvm_create_hyp_vm(kvm);
if (ret)
return ret;
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 54aedf93c78b..8963621bcdd1 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -422,6 +422,20 @@ static int handle_trap_exceptions(struct kvm_vcpu *vcpu)
{
int handled;
+ /*
+ * If we run a non-protected VM when protection is enabled
+ * system-wide, resync the state from the hypervisor and mark
+ * it as dirty on the host side if it wasn't dirty already
+ * (which could happen if preemption has taken place).
+ */
+ if (is_protected_kvm_enabled() && !kvm_vm_is_protected(vcpu->kvm)) {
+ guard(preempt)();
+ if (!(vcpu_get_flag(vcpu, PKVM_HOST_STATE_DIRTY))) {
+ kvm_call_hyp_nvhe(__pkvm_vcpu_sync_state);
+ vcpu_set_flag(vcpu, PKVM_HOST_STATE_DIRTY);
+ }
+ }
+
/*
* See ARM ARM B1.14.1: "Hyp traps on instructions
* that fail their condition code check"
@@ -489,6 +503,22 @@ int handle_exit(struct kvm_vcpu *vcpu, int exception_index)
/* For exit types that need handling before we can be preempted */
void handle_exit_early(struct kvm_vcpu *vcpu, int exception_index)
{
+ bool inject_serror = ARM_SERROR_PENDING(exception_index) ||
+ ARM_EXCEPTION_CODE(exception_index) == ARM_EXCEPTION_EL1_SERROR;
+
+ /*
+ * An SError injected below writes the host ctxt; for a non-protected
+ * guest, sync from the hyp vCPU and keep it dirty so it isn't dropped.
+ */
+ if (is_protected_kvm_enabled()) {
+ vcpu_clear_flag(vcpu, PKVM_HOST_STATE_DIRTY);
+
+ if (inject_serror && !kvm_vm_is_protected(vcpu->kvm)) {
+ kvm_call_hyp_nvhe(__pkvm_vcpu_sync_state);
+ vcpu_set_flag(vcpu, PKVM_HOST_STATE_DIRTY);
+ }
+ }
+
if (ARM_SERROR_PENDING(exception_index)) {
if (this_cpu_has_cap(ARM64_HAS_RAS_EXTN)) {
u64 disr = kvm_vcpu_get_disr(vcpu);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 0194965930e6..acf53aae4fe4 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -141,6 +141,48 @@ static void sync_hyp_vgic_state(struct pkvm_hyp_vcpu *hyp_vcpu)
host_cpu_if->vgic_lr[i] = hyp_cpu_if->vgic_lr[i];
}
+static void __copy_vcpu_state(const struct kvm_vcpu *from_vcpu,
+ struct kvm_vcpu *to_vcpu)
+{
+ int i;
+
+ to_vcpu->arch.ctxt.regs = from_vcpu->arch.ctxt.regs;
+ to_vcpu->arch.ctxt.spsr_abt = from_vcpu->arch.ctxt.spsr_abt;
+ to_vcpu->arch.ctxt.spsr_und = from_vcpu->arch.ctxt.spsr_und;
+ to_vcpu->arch.ctxt.spsr_irq = from_vcpu->arch.ctxt.spsr_irq;
+ to_vcpu->arch.ctxt.spsr_fiq = from_vcpu->arch.ctxt.spsr_fiq;
+ to_vcpu->arch.ctxt.fp_regs = from_vcpu->arch.ctxt.fp_regs;
+
+ /*
+ * Copy the sysregs, but don't mess with the timer state which
+ * is directly handled by EL1 and is expected to be preserved.
+ * enum vcpu_sysreg is sparse: VNCR-mapped registers take values
+ * derived from their VNCR page offset, so the timer registers do
+ * not form a contiguous numeric range and must be skipped by name.
+ */
+ for (i = 1; i < NR_SYS_REGS; i++) {
+ switch (i) {
+ case CNTVOFF_EL2:
+ case CNTV_CVAL_EL0:
+ case CNTV_CTL_EL0:
+ case CNTP_CVAL_EL0:
+ case CNTP_CTL_EL0:
+ continue;
+ }
+ to_vcpu->arch.ctxt.sys_regs[i] = from_vcpu->arch.ctxt.sys_regs[i];
+ }
+}
+
+static void sync_hyp_vcpu_state(struct pkvm_hyp_vcpu *hyp_vcpu)
+{
+ __copy_vcpu_state(&hyp_vcpu->vcpu, hyp_vcpu->host_vcpu);
+}
+
+static void flush_hyp_vcpu_state(struct pkvm_hyp_vcpu *hyp_vcpu)
+{
+ __copy_vcpu_state(hyp_vcpu->host_vcpu, &hyp_vcpu->vcpu);
+}
+
static void flush_debug_state(struct pkvm_hyp_vcpu *hyp_vcpu)
{
struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
@@ -170,7 +212,17 @@ static void flush_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
fpsimd_sve_flush();
flush_debug_state(hyp_vcpu);
- hyp_vcpu->vcpu.arch.ctxt = host_vcpu->arch.ctxt;
+ /*
+ * If we deal with a non-protected guest and the state is potentially
+ * dirty (from a host perspective), copy the state back into the hyp
+ * vcpu.
+ */
+ if (!pkvm_hyp_vcpu_is_protected(hyp_vcpu)) {
+ if (vcpu_get_flag(host_vcpu, PKVM_HOST_STATE_DIRTY))
+ flush_hyp_vcpu_state(hyp_vcpu);
+ } else {
+ hyp_vcpu->vcpu.arch.ctxt = host_vcpu->arch.ctxt;
+ }
/* __hyp_running_vcpu must be NULL in a guest context. */
hyp_vcpu->vcpu.arch.ctxt.__hyp_running_vcpu = NULL;
@@ -201,9 +253,13 @@ static void sync_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
fpsimd_sve_sync(&hyp_vcpu->vcpu);
sync_debug_state(hyp_vcpu);
- host_vcpu->arch.ctxt = hyp_vcpu->vcpu.arch.ctxt;
-
- host_vcpu->arch.hcr_el2 = hyp_vcpu->vcpu.arch.hcr_el2;
+ if (pkvm_hyp_vcpu_is_protected(hyp_vcpu)) {
+ host_vcpu->arch.ctxt = hyp_vcpu->vcpu.arch.ctxt;
+ } else {
+ /* Keep PC (tracepoint) and PSTATE (vcpu_mode_is_bad_32bit) current. */
+ host_vcpu->arch.ctxt.regs.pc = hyp_vcpu->vcpu.arch.ctxt.regs.pc;
+ host_vcpu->arch.ctxt.regs.pstate = hyp_vcpu->vcpu.arch.ctxt.regs.pstate;
+ }
host_vcpu->arch.fault = hyp_vcpu->vcpu.arch.fault;
@@ -237,8 +293,27 @@ static void handle___pkvm_vcpu_put(struct kvm_cpu_context *host_ctxt)
{
struct pkvm_hyp_vcpu *hyp_vcpu = pkvm_get_loaded_hyp_vcpu();
- if (hyp_vcpu)
+ if (hyp_vcpu) {
+ struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
+
+ if (!pkvm_hyp_vcpu_is_protected(hyp_vcpu) &&
+ !vcpu_get_flag(host_vcpu, PKVM_HOST_STATE_DIRTY)) {
+ sync_hyp_vcpu_state(hyp_vcpu);
+ }
+
pkvm_put_hyp_vcpu(hyp_vcpu);
+ }
+}
+
+static void handle___pkvm_vcpu_sync_state(struct kvm_cpu_context *host_ctxt)
+{
+ struct pkvm_hyp_vcpu *hyp_vcpu;
+
+ hyp_vcpu = pkvm_get_loaded_hyp_vcpu();
+ if (!hyp_vcpu || pkvm_hyp_vcpu_is_protected(hyp_vcpu))
+ return;
+
+ sync_hyp_vcpu_state(hyp_vcpu);
}
static struct kvm_vcpu *__get_host_hyp_vcpus(struct kvm_vcpu *arg,
@@ -869,6 +944,7 @@ static const hcall_t host_hcall[] = {
HANDLE_FUNC(__pkvm_finalize_teardown_vm),
HANDLE_FUNC(__pkvm_vcpu_load),
HANDLE_FUNC(__pkvm_vcpu_put),
+ HANDLE_FUNC(__pkvm_vcpu_sync_state),
HANDLE_FUNC(__pkvm_tlb_flush_vmid),
};
--
2.55.0.rc0.738.g0c8ab3ebcc-goog
^ permalink raw reply related
* Re: [PATCH 0/8] KVM: arm64: Rework pKVM vCPU state synchronisation
From: Fuad Tabba @ 2026-06-19 7:06 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
linux-kernel
Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
Sebastian Ene, Hyunwoo Kim
In-Reply-To: <20260619070508.802802-1-tabba@google.com>
Sorry, as the emails were being sent, I realized I forgot the V2 in
the subject. Resending now. Sorry for the noise.
Cheers,
/fuad
On Fri, 19 Jun 2026 at 08:05, Fuad Tabba <tabba@google.com> wrote:
>
> Hi folks,
>
> Changes since v1 [2]:
> - Dropped the guard()/scoped_guard() conversion patches: standalone churn
> on code this series does not otherwise rework. (Marc)
> - Rebased onto kvmarm/next. The VGIC flush primitive now bounds used_lrs
> using the cached hyp_gicv3_nr_lr instead of reading ICH_VTR_EL2 on every
> entry. (Marc)
> - Grouped the PKVM_HOST_STATE_DIRTY flag with the other iflags and
> clarified its comment. (Marc)
> - Sync PSTATE alongside PC on every non-protected exit, and sync+dirty
> before host-side SError injection so the syndrome is not dropped. (sashiko)
> - Various cleanups and tidying up. (Vincent)
>
> Building on Will's pKVM infrastructure series [1], this series reworks
> how pKVM moves vCPU state between the host and EL2, and stops copying a
> non-protected guest's state on every world switch.
>
> EL2 gains proper primitives for the state it transfers: vCPU lookup
> helpers, and VGIC flush/sync that reduces how much host state EL2
> dereferences. The series also moves some preparatory code (such as sys
> reg access and PSCI helpers) to shared headers and HYP, and implements
> lazy copying of a non-protected guest's register state back to the host
> until the host actually needs it, instead of on every exit.
>
> This is the first of two series moving pKVM vCPU state management to
> EL2. The follow-up completes the job for protected VMs: state
> isolation, PSCI handling at EL2, and the resulting API behaviour.
>
> The series is structured as follows:
>
> 01-04: Preparatory refactoring (MPIDR, sys reg access, vCPU reset, PSCI
> helpers) to shared headers and HYP.
> 05: Host and hypervisor vCPU lookup primitives.
> 06-07: VGIC: reduce EL2's exposure to host state, add flush/sync primitives.
> 08: Lazy state sync for non-protected guests.
>
> Based on kvmarm/next.
>
> [1] https://lore.kernel.org/all/20260105154939.11041-1-will@kernel.org/
> [2] https://lore.kernel.org/all/20260612065925.755562-1-tabba@google.com/
>
> Cheers,
> /fuad
>
> Fuad Tabba (5):
> KVM: arm64: Extract MPIDR computation into a shared header
> KVM: arm64: Make vcpu_{read,write}_sys_reg available to HYP code
> KVM: arm64: Factor out reusable vCPU reset helpers
> KVM: arm64: Move PSCI helper functions to a shared header
> KVM: arm64: Implement lazy vCPU state sync for non-protected guests
>
> Marc Zyngier (3):
> KVM: arm64: Add host and hypervisor vCPU lookup primitives
> KVM: arm64: Minimise EL2's exposure of host VGIC state during world
> switch
> KVM: arm64: Add primitives to flush/sync the VGIC state at EL2
>
> arch/arm64/include/asm/kvm_arm.h | 12 ++
> arch/arm64/include/asm/kvm_asm.h | 1 +
> arch/arm64/include/asm/kvm_emulate.h | 79 +++++++-
> arch/arm64/include/asm/kvm_host.h | 2 +
> arch/arm64/kvm/arm.c | 7 +
> arch/arm64/kvm/handle_exit.c | 30 ++++
> arch/arm64/kvm/hyp/exception.c | 34 +---
> arch/arm64/kvm/hyp/nvhe/hyp-main.c | 258 +++++++++++++++++++++++----
> arch/arm64/kvm/psci.c | 30 +---
> arch/arm64/kvm/reset.c | 60 +------
> arch/arm64/kvm/sys_regs.c | 14 +-
> arch/arm64/kvm/sys_regs.h | 19 ++
> include/kvm/arm_psci.h | 27 +++
> 13 files changed, 410 insertions(+), 163 deletions(-)
>
> --
> 2.55.0.rc0.738.g0c8ab3ebcc-goog
>
^ permalink raw reply
* [PATCH v2 0/8] KVM: arm64: Rework pKVM vCPU state synchronisation
From: Fuad Tabba @ 2026-06-19 7:07 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
linux-kernel
Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
Sebastian Ene, Hyunwoo Kim, Fuad Tabba
Hi folks,
Changes since v1 [2]:
- Dropped the guard()/scoped_guard() conversion patches: standalone churn
on code this series does not otherwise rework. (Marc)
- Rebased onto kvmarm/next. The VGIC flush primitive now bounds used_lrs
using the cached hyp_gicv3_nr_lr instead of reading ICH_VTR_EL2 on every
entry. (Marc)
- Grouped the PKVM_HOST_STATE_DIRTY flag with the other iflags and
clarified its comment. (Marc)
- Sync PSTATE alongside PC on every non-protected exit, and sync+dirty
before host-side SError injection so the syndrome is not dropped. (sashiko)
- Various cleanups and tidying up. (Vincent)
Building on Will's pKVM infrastructure series [1], this series reworks
how pKVM moves vCPU state between the host and EL2, and stops copying a
non-protected guest's state on every world switch.
EL2 gains proper primitives for the state it transfers: vCPU lookup
helpers, and VGIC flush/sync that reduces how much host state EL2
dereferences. The series also moves some preparatory code (such as sys
reg access and PSCI helpers) to shared headers and HYP, and implements
lazy copying of a non-protected guest's register state back to the host
until the host actually needs it, instead of on every exit.
This is the first of two series moving pKVM vCPU state management to
EL2. The follow-up completes the job for protected VMs: state
isolation, PSCI handling at EL2, and the resulting API behaviour.
The series is structured as follows:
01-04: Preparatory refactoring (MPIDR, sys reg access, vCPU reset, PSCI
helpers) to shared headers and HYP.
05: Host and hypervisor vCPU lookup primitives.
06-07: VGIC: reduce EL2's exposure to host state, add flush/sync primitives.
08: Lazy state sync for non-protected guests.
Based on kvmarm/next.
[1] https://lore.kernel.org/all/20260105154939.11041-1-will@kernel.org/
[2] https://lore.kernel.org/all/20260612065925.755562-1-tabba@google.com/
Cheers,
/fuad
Fuad Tabba (5):
KVM: arm64: Extract MPIDR computation into a shared header
KVM: arm64: Make vcpu_{read,write}_sys_reg available to HYP code
KVM: arm64: Factor out reusable vCPU reset helpers
KVM: arm64: Move PSCI helper functions to a shared header
KVM: arm64: Implement lazy vCPU state sync for non-protected guests
Marc Zyngier (3):
KVM: arm64: Add host and hypervisor vCPU lookup primitives
KVM: arm64: Minimise EL2's exposure of host VGIC state during world
switch
KVM: arm64: Add primitives to flush/sync the VGIC state at EL2
arch/arm64/include/asm/kvm_arm.h | 12 ++
arch/arm64/include/asm/kvm_asm.h | 1 +
arch/arm64/include/asm/kvm_emulate.h | 79 +++++++-
arch/arm64/include/asm/kvm_host.h | 2 +
arch/arm64/kvm/arm.c | 7 +
arch/arm64/kvm/handle_exit.c | 30 ++++
arch/arm64/kvm/hyp/exception.c | 34 +---
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 258 +++++++++++++++++++++++----
arch/arm64/kvm/psci.c | 30 +---
arch/arm64/kvm/reset.c | 60 +------
arch/arm64/kvm/sys_regs.c | 14 +-
arch/arm64/kvm/sys_regs.h | 19 ++
include/kvm/arm_psci.h | 27 +++
13 files changed, 410 insertions(+), 163 deletions(-)
--
2.55.0.rc0.738.g0c8ab3ebcc-goog
^ permalink raw reply
* [PATCH v2 1/8] KVM: arm64: Extract MPIDR computation into a shared header
From: Fuad Tabba @ 2026-06-19 7:07 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
linux-kernel
Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260619070719.812227-1-tabba@google.com>
Extract the vCPU MPIDR computation embedded in reset_mpidr() into a
kvm_calculate_mpidr() inline in sys_regs.h, so it can be computed
without duplicating the logic. A follow-up series reuses it to reset
protected vCPUs at EL2.
No functional change intended.
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/kvm/sys_regs.c | 14 +-------------
arch/arm64/kvm/sys_regs.h | 19 +++++++++++++++++++
2 files changed, 20 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 33c921df19b5..674fabe1d40d 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -976,21 +976,9 @@ static u64 reset_actlr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
static u64 reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
{
- u64 mpidr;
+ u64 mpidr = kvm_calculate_mpidr(vcpu);
- /*
- * Map the vcpu_id into the first three affinity level fields of
- * the MPIDR. We limit the number of VCPUs in level 0 due to a
- * limitation to 16 CPUs in that level in the ICC_SGIxR registers
- * of the GICv3 to be able to address each CPU directly when
- * sending IPIs.
- */
- mpidr = (vcpu->vcpu_id & 0x0f) << MPIDR_LEVEL_SHIFT(0);
- mpidr |= ((vcpu->vcpu_id >> 4) & 0xff) << MPIDR_LEVEL_SHIFT(1);
- mpidr |= ((vcpu->vcpu_id >> 12) & 0xff) << MPIDR_LEVEL_SHIFT(2);
- mpidr |= (1ULL << 31);
vcpu_write_sys_reg(vcpu, mpidr, MPIDR_EL1);
-
return mpidr;
}
diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index 2a983664220c..bd56a45abbf9 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -222,6 +222,25 @@ find_reg(const struct sys_reg_params *params, const struct sys_reg_desc table[],
return __inline_bsearch((void *)pval, table, num, sizeof(table[0]), match_sys_reg);
}
+static inline u64 kvm_calculate_mpidr(const struct kvm_vcpu *vcpu)
+{
+ u64 mpidr;
+
+ /*
+ * Map the vcpu_id into the first three affinity level fields of
+ * the MPIDR. We limit the number of VCPUs in level 0 due to a
+ * limitation to 16 CPUs in that level in the ICC_SGIxR registers
+ * of the GICv3 to be able to address each CPU directly when
+ * sending IPIs.
+ */
+ mpidr = (vcpu->vcpu_id & 0x0f) << MPIDR_LEVEL_SHIFT(0);
+ mpidr |= ((vcpu->vcpu_id >> 4) & 0xff) << MPIDR_LEVEL_SHIFT(1);
+ mpidr |= ((vcpu->vcpu_id >> 12) & 0xff) << MPIDR_LEVEL_SHIFT(2);
+ mpidr |= (1ULL << 31);
+
+ return mpidr;
+}
+
const struct sys_reg_desc *get_reg_by_id(u64 id,
const struct sys_reg_desc table[],
unsigned int num);
--
2.55.0.rc0.738.g0c8ab3ebcc-goog
^ permalink raw reply related
* [PATCH v2 2/8] KVM: arm64: Make vcpu_{read,write}_sys_reg available to HYP code
From: Fuad Tabba @ 2026-06-19 7:07 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
linux-kernel
Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260619070719.812227-1-tabba@google.com>
The vcpu_{read,write}_sys_reg() accessors are host-only, so helpers
built on them such as kvm_vcpu_set_be()/kvm_vcpu_is_be() cannot be
shared with hyp code. exception.c already wraps them in
__vcpu_{read,write}_sys_reg(), which pick the host- or hyp-side accessor
via has_vhe() and so are valid in any context.
Move those wrappers to kvm_emulate.h as kvm_vcpu_{read,write}_sys_reg()
and switch the callers over, so a follow-up series can share that
emulation code at EL2.
No functional change intended.
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/include/asm/kvm_emulate.h | 22 +++++++++++++++---
arch/arm64/kvm/hyp/exception.c | 34 ++++++++--------------------
2 files changed, 28 insertions(+), 28 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 5bf3d7e1d92c..80b30fead3d1 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -506,6 +506,22 @@ static inline unsigned long kvm_vcpu_get_mpidr_aff(struct kvm_vcpu *vcpu)
return __vcpu_sys_reg(vcpu, MPIDR_EL1) & MPIDR_HWID_BITMASK;
}
+static inline u64 kvm_vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg)
+{
+ if (has_vhe())
+ return vcpu_read_sys_reg(vcpu, reg);
+
+ return __vcpu_sys_reg(vcpu, reg);
+}
+
+static inline void kvm_vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
+{
+ if (has_vhe())
+ vcpu_write_sys_reg(vcpu, val, reg);
+ else
+ __vcpu_assign_sys_reg(vcpu, reg, val);
+}
+
static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
{
if (vcpu_mode_is_32bit(vcpu)) {
@@ -516,9 +532,9 @@ static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
r = vcpu_has_nv(vcpu) ? SCTLR_EL2 : SCTLR_EL1;
- sctlr = vcpu_read_sys_reg(vcpu, r);
+ sctlr = kvm_vcpu_read_sys_reg(vcpu, r);
sctlr |= SCTLR_ELx_EE;
- vcpu_write_sys_reg(vcpu, sctlr, r);
+ kvm_vcpu_write_sys_reg(vcpu, sctlr, r);
}
}
@@ -533,7 +549,7 @@ static inline bool kvm_vcpu_is_be(struct kvm_vcpu *vcpu)
r = is_hyp_ctxt(vcpu) ? SCTLR_EL2 : SCTLR_EL1;
bit = vcpu_mode_priv(vcpu) ? SCTLR_ELx_EE : SCTLR_EL1_E0E;
- return vcpu_read_sys_reg(vcpu, r) & bit;
+ return kvm_vcpu_read_sys_reg(vcpu, r) & bit;
}
static inline unsigned long vcpu_data_guest_to_host(struct kvm_vcpu *vcpu,
diff --git a/arch/arm64/kvm/hyp/exception.c b/arch/arm64/kvm/hyp/exception.c
index bef40ddb16db..2cb68dc7d441 100644
--- a/arch/arm64/kvm/hyp/exception.c
+++ b/arch/arm64/kvm/hyp/exception.c
@@ -20,22 +20,6 @@
#error Hypervisor code only!
#endif
-static inline u64 __vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg)
-{
- if (has_vhe())
- return vcpu_read_sys_reg(vcpu, reg);
-
- return __vcpu_sys_reg(vcpu, reg);
-}
-
-static inline void __vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
-{
- if (has_vhe())
- vcpu_write_sys_reg(vcpu, val, reg);
- else
- __vcpu_assign_sys_reg(vcpu, reg, val);
-}
-
static void __vcpu_write_spsr(struct kvm_vcpu *vcpu, unsigned long target_mode,
u64 val)
{
@@ -101,14 +85,14 @@ static void enter_exception64(struct kvm_vcpu *vcpu, unsigned long target_mode,
switch (target_mode) {
case PSR_MODE_EL1h:
- vbar = __vcpu_read_sys_reg(vcpu, VBAR_EL1);
- sctlr = __vcpu_read_sys_reg(vcpu, SCTLR_EL1);
- __vcpu_write_sys_reg(vcpu, *vcpu_pc(vcpu), ELR_EL1);
+ vbar = kvm_vcpu_read_sys_reg(vcpu, VBAR_EL1);
+ sctlr = kvm_vcpu_read_sys_reg(vcpu, SCTLR_EL1);
+ kvm_vcpu_write_sys_reg(vcpu, *vcpu_pc(vcpu), ELR_EL1);
break;
case PSR_MODE_EL2h:
- vbar = __vcpu_read_sys_reg(vcpu, VBAR_EL2);
- sctlr = __vcpu_read_sys_reg(vcpu, SCTLR_EL2);
- __vcpu_write_sys_reg(vcpu, *vcpu_pc(vcpu), ELR_EL2);
+ vbar = kvm_vcpu_read_sys_reg(vcpu, VBAR_EL2);
+ sctlr = kvm_vcpu_read_sys_reg(vcpu, SCTLR_EL2);
+ kvm_vcpu_write_sys_reg(vcpu, *vcpu_pc(vcpu), ELR_EL2);
break;
default:
/* Don't do that */
@@ -185,7 +169,7 @@ static void enter_exception64(struct kvm_vcpu *vcpu, unsigned long target_mode,
*/
static unsigned long get_except32_cpsr(struct kvm_vcpu *vcpu, u32 mode)
{
- u32 sctlr = __vcpu_read_sys_reg(vcpu, SCTLR_EL1);
+ u32 sctlr = kvm_vcpu_read_sys_reg(vcpu, SCTLR_EL1);
unsigned long old, new;
old = *vcpu_cpsr(vcpu);
@@ -281,7 +265,7 @@ static void enter_exception32(struct kvm_vcpu *vcpu, u32 mode, u32 vect_offset)
{
unsigned long spsr = *vcpu_cpsr(vcpu);
bool is_thumb = (spsr & PSR_AA32_T_BIT);
- u32 sctlr = __vcpu_read_sys_reg(vcpu, SCTLR_EL1);
+ u32 sctlr = kvm_vcpu_read_sys_reg(vcpu, SCTLR_EL1);
u32 return_address;
*vcpu_cpsr(vcpu) = get_except32_cpsr(vcpu, mode);
@@ -305,7 +289,7 @@ static void enter_exception32(struct kvm_vcpu *vcpu, u32 mode, u32 vect_offset)
if (sctlr & (1 << 13))
vect_offset += 0xffff0000;
else /* always have security exceptions */
- vect_offset += __vcpu_read_sys_reg(vcpu, VBAR_EL1);
+ vect_offset += kvm_vcpu_read_sys_reg(vcpu, VBAR_EL1);
*vcpu_pc(vcpu) = vect_offset;
}
--
2.55.0.rc0.738.g0c8ab3ebcc-goog
^ permalink raw reply related
* [PATCH v2 4/8] KVM: arm64: Move PSCI helper functions to a shared header
From: Fuad Tabba @ 2026-06-19 7:07 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
linux-kernel
Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260619070719.812227-1-tabba@google.com>
Move kvm_psci_valid_affinity() and kvm_psci_narrow_to_32bit() from
psci.c to include/kvm/arm_psci.h, and move psci_affinity_mask() there
too, renaming it kvm_psci_affinity_mask() now that it is no longer
file-local. A follow-up series handles some protected-guest PSCI calls
at EL2 using these helpers.
No functional change intended.
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/kvm/psci.c | 30 +-----------------------------
include/kvm/arm_psci.h | 27 +++++++++++++++++++++++++++
2 files changed, 28 insertions(+), 29 deletions(-)
diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
index 3b5dbe9a0a0e..e3db84400d1f 100644
--- a/arch/arm64/kvm/psci.c
+++ b/arch/arm64/kvm/psci.c
@@ -21,16 +21,6 @@
* as described in ARM document number ARM DEN 0022A.
*/
-#define AFFINITY_MASK(level) ~((0x1UL << ((level) * MPIDR_LEVEL_BITS)) - 1)
-
-static unsigned long psci_affinity_mask(unsigned long affinity_level)
-{
- if (affinity_level <= 3)
- return MPIDR_HWID_BITMASK & AFFINITY_MASK(affinity_level);
-
- return 0;
-}
-
static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
{
/*
@@ -51,12 +41,6 @@ static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
return PSCI_RET_SUCCESS;
}
-static inline bool kvm_psci_valid_affinity(struct kvm_vcpu *vcpu,
- unsigned long affinity)
-{
- return !(affinity & ~MPIDR_HWID_BITMASK);
-}
-
static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
{
struct vcpu_reset_state *reset_state;
@@ -135,7 +119,7 @@ static unsigned long kvm_psci_vcpu_affinity_info(struct kvm_vcpu *vcpu)
return PSCI_RET_INVALID_PARAMS;
/* Determine target affinity mask */
- target_affinity_mask = psci_affinity_mask(lowest_affinity_level);
+ target_affinity_mask = kvm_psci_affinity_mask(lowest_affinity_level);
if (!target_affinity_mask)
return PSCI_RET_INVALID_PARAMS;
@@ -220,18 +204,6 @@ static void kvm_psci_system_suspend(struct kvm_vcpu *vcpu)
run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
}
-static void kvm_psci_narrow_to_32bit(struct kvm_vcpu *vcpu)
-{
- int i;
-
- /*
- * Zero the input registers' upper 32 bits. They will be fully
- * zeroed on exit, so we're fine changing them in place.
- */
- for (i = 1; i < 4; i++)
- vcpu_set_reg(vcpu, i, lower_32_bits(vcpu_get_reg(vcpu, i)));
-}
-
static unsigned long kvm_psci_check_allowed_function(struct kvm_vcpu *vcpu, u32 fn)
{
/*
diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
index cbaec804eb83..f86a006d6713 100644
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -38,6 +38,33 @@ static inline int kvm_psci_version(struct kvm_vcpu *vcpu)
return KVM_ARM_PSCI_0_1;
}
+/* Narrow the PSCI register arguments (r1 to r3) to 32 bits. */
+static inline void kvm_psci_narrow_to_32bit(struct kvm_vcpu *vcpu)
+{
+ int i;
+
+ /*
+ * Zero the input registers' upper 32 bits. They will be fully
+ * zeroed on exit, so we're fine changing them in place.
+ */
+ for (i = 1; i < 4; i++)
+ vcpu_set_reg(vcpu, i, lower_32_bits(vcpu_get_reg(vcpu, i)));
+}
+
+static inline bool kvm_psci_valid_affinity(struct kvm_vcpu *vcpu,
+ unsigned long affinity)
+{
+ return !(affinity & ~MPIDR_HWID_BITMASK);
+}
+
+static inline unsigned long kvm_psci_affinity_mask(unsigned long affinity_level)
+{
+ if (affinity_level <= 3)
+ return MPIDR_HWID_BITMASK &
+ ~((0x1UL << (affinity_level * MPIDR_LEVEL_BITS)) - 1);
+
+ return 0;
+}
int kvm_psci_call(struct kvm_vcpu *vcpu);
--
2.55.0.rc0.738.g0c8ab3ebcc-goog
^ permalink raw reply related
* [PATCH v2 3/8] KVM: arm64: Factor out reusable vCPU reset helpers
From: Fuad Tabba @ 2026-06-19 7:07 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
linux-kernel
Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260619070719.812227-1-tabba@google.com>
Pull the reusable pieces out of kvm_reset_vcpu(): expose the reset
PSTATE values in kvm_arm.h, and split the core register reset and the
PSCI-driven reset into kvm_reset_vcpu_core() and kvm_reset_vcpu_psci().
A follow-up series reuses these to reset protected vCPUs at EL2.
No functional change intended.
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/include/asm/kvm_arm.h | 12 ++++++
arch/arm64/include/asm/kvm_emulate.h | 57 ++++++++++++++++++++++++++
arch/arm64/kvm/reset.c | 60 ++--------------------------
3 files changed, 72 insertions(+), 57 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 3f9233b5a130..aba4ec09acd2 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -348,4 +348,16 @@
{ PSR_AA32_MODE_UND, "32-bit UND" }, \
{ PSR_AA32_MODE_SYS, "32-bit SYS" }
+/*
+ * ARMv8 Reset Values
+ */
+#define VCPU_RESET_PSTATE_EL1 (PSR_MODE_EL1h | PSR_A_BIT | PSR_I_BIT | \
+ PSR_F_BIT | PSR_D_BIT)
+
+#define VCPU_RESET_PSTATE_EL2 (PSR_MODE_EL2h | PSR_A_BIT | PSR_I_BIT | \
+ PSR_F_BIT | PSR_D_BIT)
+
+#define VCPU_RESET_PSTATE_SVC (PSR_AA32_MODE_SVC | PSR_AA32_A_BIT | \
+ PSR_AA32_I_BIT | PSR_AA32_F_BIT)
+
#endif /* __ARM64_KVM_ARM_H__ */
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 80b30fead3d1..2385d8855fcf 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -704,4 +704,61 @@ static inline void vcpu_set_hcrx(struct kvm_vcpu *vcpu)
vcpu->arch.hcrx_el2 |= HCRX_EL2_EnASR;
}
}
+
+/* Reset a vcpu's core registers. */
+static inline void kvm_reset_vcpu_core(struct kvm_vcpu *vcpu)
+{
+ u32 pstate;
+
+ if (vcpu_el1_is_32bit(vcpu))
+ pstate = VCPU_RESET_PSTATE_SVC;
+ else if (vcpu_has_nv(vcpu))
+ pstate = VCPU_RESET_PSTATE_EL2;
+ else
+ pstate = VCPU_RESET_PSTATE_EL1;
+
+ /* Reset core registers */
+ memset(vcpu_gp_regs(vcpu), 0, sizeof(*vcpu_gp_regs(vcpu)));
+ memset(&vcpu->arch.ctxt.fp_regs, 0, sizeof(vcpu->arch.ctxt.fp_regs));
+ vcpu->arch.ctxt.spsr_abt = 0;
+ vcpu->arch.ctxt.spsr_und = 0;
+ vcpu->arch.ctxt.spsr_irq = 0;
+ vcpu->arch.ctxt.spsr_fiq = 0;
+ vcpu_gp_regs(vcpu)->pstate = pstate;
+}
+
+/* PSCI reset handling for a vcpu. */
+static inline void kvm_reset_vcpu_psci(struct kvm_vcpu *vcpu,
+ struct vcpu_reset_state *reset_state)
+{
+ unsigned long target_pc = reset_state->pc;
+
+ /* Gracefully handle Thumb2 entry point */
+ if (vcpu_mode_is_32bit(vcpu) && (target_pc & 1)) {
+ target_pc &= ~1UL;
+ vcpu_set_thumb(vcpu);
+ }
+
+ /* Propagate caller endianness */
+ if (reset_state->be)
+ kvm_vcpu_set_be(vcpu);
+
+ *vcpu_pc(vcpu) = target_pc;
+
+ /*
+ * We may come from a state where either a PC update was
+ * pending (SMC call resulting in PC being increpented to
+ * skip the SMC) or a pending exception. Make sure we get
+ * rid of all that, as this cannot be valid out of reset.
+ *
+ * Note that clearing the exception mask also clears PC
+ * updates, but that's an implementation detail, and we
+ * really want to make it explicit.
+ */
+ vcpu_clear_flag(vcpu, PENDING_EXCEPTION);
+ vcpu_clear_flag(vcpu, EXCEPT_MASK);
+ vcpu_clear_flag(vcpu, INCREMENT_PC);
+ vcpu_set_reg(vcpu, 0, reset_state->r0);
+}
+
#endif /* __ARM64_KVM_EMULATE_H__ */
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index b963fd975aac..10eb7249aa9e 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -34,18 +34,6 @@
static u32 __ro_after_init kvm_ipa_limit;
unsigned int __ro_after_init kvm_host_sve_max_vl;
-/*
- * ARMv8 Reset Values
- */
-#define VCPU_RESET_PSTATE_EL1 (PSR_MODE_EL1h | PSR_A_BIT | PSR_I_BIT | \
- PSR_F_BIT | PSR_D_BIT)
-
-#define VCPU_RESET_PSTATE_EL2 (PSR_MODE_EL2h | PSR_A_BIT | PSR_I_BIT | \
- PSR_F_BIT | PSR_D_BIT)
-
-#define VCPU_RESET_PSTATE_SVC (PSR_AA32_MODE_SVC | PSR_AA32_A_BIT | \
- PSR_AA32_I_BIT | PSR_AA32_F_BIT)
-
unsigned int __ro_after_init kvm_sve_max_vl;
int __init kvm_arm_init_sve(void)
@@ -191,7 +179,6 @@ void kvm_reset_vcpu(struct kvm_vcpu *vcpu)
{
struct vcpu_reset_state reset_state;
bool loaded;
- u32 pstate;
spin_lock(&vcpu->arch.mp_state_lock);
reset_state = vcpu->arch.reset_state;
@@ -210,21 +197,8 @@ void kvm_reset_vcpu(struct kvm_vcpu *vcpu)
kvm_vcpu_reset_sve(vcpu);
}
- if (vcpu_el1_is_32bit(vcpu))
- pstate = VCPU_RESET_PSTATE_SVC;
- else if (vcpu_has_nv(vcpu))
- pstate = VCPU_RESET_PSTATE_EL2;
- else
- pstate = VCPU_RESET_PSTATE_EL1;
-
/* Reset core registers */
- memset(vcpu_gp_regs(vcpu), 0, sizeof(*vcpu_gp_regs(vcpu)));
- memset(&vcpu->arch.ctxt.fp_regs, 0, sizeof(vcpu->arch.ctxt.fp_regs));
- vcpu->arch.ctxt.spsr_abt = 0;
- vcpu->arch.ctxt.spsr_und = 0;
- vcpu->arch.ctxt.spsr_irq = 0;
- vcpu->arch.ctxt.spsr_fiq = 0;
- vcpu_gp_regs(vcpu)->pstate = pstate;
+ kvm_reset_vcpu_core(vcpu);
/* Reset system registers */
kvm_reset_sys_regs(vcpu);
@@ -233,36 +207,8 @@ void kvm_reset_vcpu(struct kvm_vcpu *vcpu)
* Additional reset state handling that PSCI may have imposed on us.
* Must be done after all the sys_reg reset.
*/
- if (reset_state.reset) {
- unsigned long target_pc = reset_state.pc;
-
- /* Gracefully handle Thumb2 entry point */
- if (vcpu_mode_is_32bit(vcpu) && (target_pc & 1)) {
- target_pc &= ~1UL;
- vcpu_set_thumb(vcpu);
- }
-
- /* Propagate caller endianness */
- if (reset_state.be)
- kvm_vcpu_set_be(vcpu);
-
- *vcpu_pc(vcpu) = target_pc;
-
- /*
- * We may come from a state where either a PC update was
- * pending (SMC call resulting in PC being increpented to
- * skip the SMC) or a pending exception. Make sure we get
- * rid of all that, as this cannot be valid out of reset.
- *
- * Note that clearing the exception mask also clears PC
- * updates, but that's an implementation detail, and we
- * really want to make it explicit.
- */
- vcpu_clear_flag(vcpu, PENDING_EXCEPTION);
- vcpu_clear_flag(vcpu, EXCEPT_MASK);
- vcpu_clear_flag(vcpu, INCREMENT_PC);
- vcpu_set_reg(vcpu, 0, reset_state.r0);
- }
+ if (reset_state.reset)
+ kvm_reset_vcpu_psci(vcpu, &reset_state);
/* Reset timer */
kvm_timer_vcpu_reset(vcpu);
--
2.55.0.rc0.738.g0c8ab3ebcc-goog
^ permalink raw reply related
* [PATCH v2 6/8] KVM: arm64: Minimise EL2's exposure of host VGIC state during world switch
From: Fuad Tabba @ 2026-06-19 7:07 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
linux-kernel
Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260619070719.812227-1-tabba@google.com>
From: Marc Zyngier <maz@kernel.org>
The host passes a vgic_v3_cpu_if pointer to the __vgic_v3_save_aprs and
__vgic_v3_restore_vmcr_aprs hypercalls, which EL2 dereferences
wholesale. That exposes the host's full VGIC emulation state to the
hypervisor, against pKVM's isolation goals.
Recover the host vCPU from the supplied cpu_if via container_of() and
copy only vgic_vmcr and the active priority registers between EL2's
hyp-side state and the host vCPU, so EL2 no longer dereferences the
host's vgic_v3_cpu_if directly.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Co-developed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 67 ++++++++++++++++++++++++++++--
1 file changed, 63 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 8923f594c264..f25ee3971528 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -7,6 +7,8 @@
#include <hyp/adjust_pc.h>
#include <hyp/switch.h>
+#include <linux/irqchip/arm-gic-v3.h>
+
#include <asm/pgtable-types.h>
#include <asm/kvm_asm.h>
#include <asm/kvm_emulate.h>
@@ -237,6 +239,16 @@ static struct kvm_vcpu *__get_host_hyp_vcpus(struct kvm_vcpu *arg,
__get_host_hyp_vcpus(__vcpu, hyp_vcpup); \
})
+#define get_host_hyp_vcpus_from_vgic_v3_cpu_if(ctxt, regnr, hyp_vcpup) \
+ ({ \
+ DECLARE_REG(struct vgic_v3_cpu_if *, cif, ctxt, regnr);\
+ struct kvm_vcpu *__vcpu = container_of(cif, \
+ struct kvm_vcpu, \
+ arch.vgic_cpu.vgic_v3); \
+ \
+ __get_host_hyp_vcpus(__vcpu, hyp_vcpup); \
+ })
+
static void handle___kvm_vcpu_run(struct kvm_cpu_context *host_ctxt)
{
struct pkvm_hyp_vcpu *hyp_vcpu;
@@ -506,16 +518,63 @@ static void handle___vgic_v3_init_lrs(struct kvm_cpu_context *host_ctxt)
static void handle___vgic_v3_save_aprs(struct kvm_cpu_context *host_ctxt)
{
- DECLARE_REG(struct vgic_v3_cpu_if *, cpu_if, host_ctxt, 1);
+ struct pkvm_hyp_vcpu *hyp_vcpu;
+ struct kvm_vcpu *host_vcpu;
- __vgic_v3_save_aprs(kern_hyp_va(cpu_if));
+ host_vcpu = get_host_hyp_vcpus_from_vgic_v3_cpu_if(host_ctxt, 1,
+ &hyp_vcpu);
+ if (!host_vcpu)
+ return;
+
+ if (unlikely(hyp_vcpu)) {
+ struct vgic_v3_cpu_if *hyp_cpu_if, *host_cpu_if;
+ int i;
+
+ hyp_cpu_if = &hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3;
+ __vgic_v3_save_aprs(hyp_cpu_if);
+
+ host_cpu_if = &host_vcpu->arch.vgic_cpu.vgic_v3;
+ host_cpu_if->vgic_vmcr = hyp_cpu_if->vgic_vmcr;
+ for (i = 0; i < ARRAY_SIZE(host_cpu_if->vgic_ap0r); i++) {
+ host_cpu_if->vgic_ap0r[i] = hyp_cpu_if->vgic_ap0r[i];
+ host_cpu_if->vgic_ap1r[i] = hyp_cpu_if->vgic_ap1r[i];
+ }
+ } else {
+ __vgic_v3_save_aprs(&host_vcpu->arch.vgic_cpu.vgic_v3);
+ }
}
static void handle___vgic_v3_restore_vmcr_aprs(struct kvm_cpu_context *host_ctxt)
{
- DECLARE_REG(struct vgic_v3_cpu_if *, cpu_if, host_ctxt, 1);
+ struct pkvm_hyp_vcpu *hyp_vcpu;
+ struct kvm_vcpu *host_vcpu;
- __vgic_v3_restore_vmcr_aprs(kern_hyp_va(cpu_if));
+ host_vcpu = get_host_hyp_vcpus_from_vgic_v3_cpu_if(host_ctxt, 1,
+ &hyp_vcpu);
+ if (!host_vcpu)
+ return;
+
+ if (unlikely(hyp_vcpu)) {
+ struct vgic_v3_cpu_if *hyp_cpu_if, *host_cpu_if;
+ int i;
+
+ hyp_cpu_if = &hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3;
+ host_cpu_if = &host_vcpu->arch.vgic_cpu.vgic_v3;
+
+ hyp_cpu_if->vgic_vmcr = host_cpu_if->vgic_vmcr;
+ /* Should be a one-off */
+ hyp_cpu_if->vgic_sre = (ICC_SRE_EL1_DIB |
+ ICC_SRE_EL1_DFB |
+ ICC_SRE_EL1_SRE);
+ for (i = 0; i < ARRAY_SIZE(host_cpu_if->vgic_ap0r); i++) {
+ hyp_cpu_if->vgic_ap0r[i] = host_cpu_if->vgic_ap0r[i];
+ hyp_cpu_if->vgic_ap1r[i] = host_cpu_if->vgic_ap1r[i];
+ }
+
+ __vgic_v3_restore_vmcr_aprs(hyp_cpu_if);
+ } else {
+ __vgic_v3_restore_vmcr_aprs(&host_vcpu->arch.vgic_cpu.vgic_v3);
+ }
}
static void handle___pkvm_init(struct kvm_cpu_context *host_ctxt)
--
2.55.0.rc0.738.g0c8ab3ebcc-goog
^ permalink raw reply related
* [PATCH v2 5/8] KVM: arm64: Add host and hypervisor vCPU lookup primitives
From: Fuad Tabba @ 2026-06-19 7:07 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
linux-kernel
Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260619070719.812227-1-tabba@google.com>
From: Marc Zyngier <maz@kernel.org>
The nVHE hypervisor repeatedly resolves a host vCPU into the EL2
address space and validates that the loaded hyp vCPU matches it, with
that logic open-coded in each handler.
Add __get_host_hyp_vcpus() and the get_host_hyp_vcpus() macro, which
translate the host vCPU into the hypervisor's address space and, when
pKVM is enabled, also return the loaded hyp vCPU if it matches. If pKVM
is enabled but the loaded hyp vCPU does not correspond to the requested
host vCPU, both the host and hyp vCPU are returned as NULL. Convert
handle___kvm_vcpu_run() to use it.
No functional change intended.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Co-developed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 52 ++++++++++++++++++++++--------
1 file changed, 38 insertions(+), 14 deletions(-)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 1d01c6e547f5..8923f594c264 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -212,14 +212,45 @@ static void handle___pkvm_vcpu_put(struct kvm_cpu_context *host_ctxt)
pkvm_put_hyp_vcpu(hyp_vcpu);
}
-static void handle___kvm_vcpu_run(struct kvm_cpu_context *host_ctxt)
+static struct kvm_vcpu *__get_host_hyp_vcpus(struct kvm_vcpu *arg,
+ struct pkvm_hyp_vcpu **hyp_vcpup)
{
- DECLARE_REG(struct kvm_vcpu *, host_vcpu, host_ctxt, 1);
- int ret;
+ struct kvm_vcpu *host_vcpu = kern_hyp_va(arg);
+ struct pkvm_hyp_vcpu *hyp_vcpu = NULL;
if (unlikely(is_protected_kvm_enabled())) {
- struct pkvm_hyp_vcpu *hyp_vcpu = pkvm_get_loaded_hyp_vcpu();
+ hyp_vcpu = pkvm_get_loaded_hyp_vcpu();
+ if (!hyp_vcpu || hyp_vcpu->host_vcpu != host_vcpu) {
+ hyp_vcpu = NULL;
+ host_vcpu = NULL;
+ }
+ }
+
+ *hyp_vcpup = hyp_vcpu;
+ return host_vcpu;
+}
+
+#define get_host_hyp_vcpus(ctxt, regnr, hyp_vcpup) \
+ ({ \
+ DECLARE_REG(struct kvm_vcpu *, __vcpu, ctxt, regnr); \
+ __get_host_hyp_vcpus(__vcpu, hyp_vcpup); \
+ })
+
+static void handle___kvm_vcpu_run(struct kvm_cpu_context *host_ctxt)
+{
+ struct pkvm_hyp_vcpu *hyp_vcpu;
+ struct kvm_vcpu *host_vcpu;
+ int ret;
+
+ host_vcpu = get_host_hyp_vcpus(host_ctxt, 1, &hyp_vcpu);
+
+ if (!host_vcpu) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ if (unlikely(hyp_vcpu)) {
/*
* KVM (and pKVM) doesn't support SME guests for now, and
* ensures that SME features aren't enabled in pstate when
@@ -231,23 +262,16 @@ static void handle___kvm_vcpu_run(struct kvm_cpu_context *host_ctxt)
goto out;
}
- if (!hyp_vcpu) {
- ret = -EINVAL;
- goto out;
- }
-
flush_hyp_vcpu(hyp_vcpu);
ret = __kvm_vcpu_run(&hyp_vcpu->vcpu);
sync_hyp_vcpu(hyp_vcpu);
} else {
- struct kvm_vcpu *vcpu = kern_hyp_va(host_vcpu);
-
/* The host is fully trusted, run its vCPU directly. */
- fpsimd_lazy_switch_to_guest(vcpu);
- ret = __kvm_vcpu_run(vcpu);
- fpsimd_lazy_switch_to_host(vcpu);
+ fpsimd_lazy_switch_to_guest(host_vcpu);
+ ret = __kvm_vcpu_run(host_vcpu);
+ fpsimd_lazy_switch_to_host(host_vcpu);
}
out:
cpu_reg(host_ctxt, 1) = ret;
--
2.55.0.rc0.738.g0c8ab3ebcc-goog
^ permalink raw reply related
* [PATCH v2 7/8] KVM: arm64: Add primitives to flush/sync the VGIC state at EL2
From: Fuad Tabba @ 2026-06-19 7:07 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
linux-kernel
Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260619070719.812227-1-tabba@google.com>
From: Marc Zyngier <maz@kernel.org>
pKVM performs its own world switch for protected VMs but has no
primitives to move the per-vCPU VGIC state between the host and
hypervisor vCPU contexts.
Add flush_hyp_vgic_state() and sync_hyp_vgic_state(). Flush copies
vgic_hcr, the in-use list registers and used_lrs from the host into the
hyp vCPU and pins vgic_sre to a fixed value; sync copies vgic_hcr,
vgic_vmcr and the in-use list registers back. The active priority
registers are handled separately by the save/restore-aprs path.
Bound used_lrs by hyp_gicv3_nr_lr, the cached implemented-LR count,
instead of reading ICH_VTR_EL2 on each entry. That clamps the
host-supplied value and avoids a per-entry sysreg read that is costly
under NV.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Co-developed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 55 ++++++++++++++++++++++--------
1 file changed, 41 insertions(+), 14 deletions(-)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index f25ee3971528..0194965930e6 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -102,6 +102,45 @@ static void fpsimd_sve_sync(struct kvm_vcpu *vcpu)
*host_data_ptr(fp_owner) = FP_STATE_HOST_OWNED;
}
+static void flush_hyp_vgic_state(struct pkvm_hyp_vcpu *hyp_vcpu)
+{
+ struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
+ struct vgic_v3_cpu_if *host_cpu_if, *hyp_cpu_if;
+ unsigned int used_lrs, i;
+
+ host_cpu_if = &host_vcpu->arch.vgic_cpu.vgic_v3;
+ hyp_cpu_if = &hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3;
+
+ used_lrs = host_cpu_if->used_lrs;
+ used_lrs = min(used_lrs, hyp_gicv3_nr_lr);
+
+ hyp_cpu_if->vgic_hcr = host_cpu_if->vgic_hcr;
+ /* Should be a one-off */
+ hyp_cpu_if->vgic_sre = (ICC_SRE_EL1_DIB |
+ ICC_SRE_EL1_DFB |
+ ICC_SRE_EL1_SRE);
+ hyp_cpu_if->used_lrs = used_lrs;
+
+ for (i = 0; i < used_lrs; i++)
+ hyp_cpu_if->vgic_lr[i] = host_cpu_if->vgic_lr[i];
+}
+
+static void sync_hyp_vgic_state(struct pkvm_hyp_vcpu *hyp_vcpu)
+{
+ struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
+ struct vgic_v3_cpu_if *host_cpu_if, *hyp_cpu_if;
+ unsigned int i;
+
+ host_cpu_if = &host_vcpu->arch.vgic_cpu.vgic_v3;
+ hyp_cpu_if = &hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3;
+
+ host_cpu_if->vgic_hcr = hyp_cpu_if->vgic_hcr;
+ host_cpu_if->vgic_vmcr = hyp_cpu_if->vgic_vmcr;
+
+ for (i = 0; i < hyp_cpu_if->used_lrs; i++)
+ host_cpu_if->vgic_lr[i] = hyp_cpu_if->vgic_lr[i];
+}
+
static void flush_debug_state(struct pkvm_hyp_vcpu *hyp_vcpu)
{
struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
@@ -150,13 +189,7 @@ static void flush_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
hyp_vcpu->vcpu.arch.vsesr_el2 = host_vcpu->arch.vsesr_el2;
- hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3 = host_vcpu->arch.vgic_cpu.vgic_v3;
-
- /* Bound used_lrs by the number of implemented list registers. */
- hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3.used_lrs =
- min_t(unsigned int,
- hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3.used_lrs,
- hyp_gicv3_nr_lr);
+ flush_hyp_vgic_state(hyp_vcpu);
hyp_vcpu->vcpu.arch.pid = host_vcpu->arch.pid;
}
@@ -164,9 +197,6 @@ static void flush_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
static void sync_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
{
struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
- struct vgic_v3_cpu_if *hyp_cpu_if = &hyp_vcpu->vcpu.arch.vgic_cpu.vgic_v3;
- struct vgic_v3_cpu_if *host_cpu_if = &host_vcpu->arch.vgic_cpu.vgic_v3;
- unsigned int i;
fpsimd_sve_sync(&hyp_vcpu->vcpu);
sync_debug_state(hyp_vcpu);
@@ -179,10 +209,7 @@ static void sync_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
host_vcpu->arch.iflags = hyp_vcpu->vcpu.arch.iflags;
- host_cpu_if->vgic_hcr = hyp_cpu_if->vgic_hcr;
- host_cpu_if->vgic_vmcr = hyp_cpu_if->vgic_vmcr;
- for (i = 0; i < hyp_cpu_if->used_lrs; ++i)
- host_cpu_if->vgic_lr[i] = hyp_cpu_if->vgic_lr[i];
+ sync_hyp_vgic_state(hyp_vcpu);
}
static void handle___pkvm_vcpu_load(struct kvm_cpu_context *host_ctxt)
--
2.55.0.rc0.738.g0c8ab3ebcc-goog
^ permalink raw reply related
* [PATCH v2 8/8] KVM: arm64: Implement lazy vCPU state sync for non-protected guests
From: Fuad Tabba @ 2026-06-19 7:07 UTC (permalink / raw)
To: Marc Zyngier, Oliver Upton, kvmarm, linux-arm-kernel,
linux-kernel
Cc: Catalin Marinas, Will Deacon, Joey Gouly, Steffen Eiden,
Suzuki K Poulose, Zenghui Yu, Vincent Donnefort, Quentin Perret,
Sebastian Ene, Hyunwoo Kim, Fuad Tabba
In-Reply-To: <20260619070719.812227-1-tabba@google.com>
pKVM copies a non-protected guest's register context between the host
and the hypervisor on every world switch, even when the host never
inspects it. Defer the copy: on entry, flush the host context into the
hyp vCPU only when the host marked it dirty (PKVM_HOST_STATE_DIRTY); on
exit, leave it in the hyp vCPU and copy it back only when the host needs
it, via a __pkvm_vcpu_sync_state hypercall on trap handling or at vcpu
put. A protected guest's context is copied as before, since lazy sync
only helps where the host is trusted to see the guest's registers.
PC and PSTATE are the exception: they are copied back on every exit so
the kvm_exit tracepoint reports the guest's real exit PC, and the run
loop's vcpu_mode_is_bad_32bit() and SError-masking checks evaluate the
guest's current PSTATE rather than the value left by the previous sync.
handle_exit_early() can also inject an SError, which writes the guest
context (ESR_EL1) outside the trap-handling path. For a non-protected
guest it therefore syncs the context from the hyp vCPU and marks it
dirty, as handle_trap_exceptions() does, so the injection reaches the
hyp vCPU on re-entry rather than being dropped.
Signed-off-by: Fuad Tabba <tabba@google.com>
---
arch/arm64/include/asm/kvm_asm.h | 1 +
arch/arm64/include/asm/kvm_host.h | 2 +
arch/arm64/kvm/arm.c | 7 +++
arch/arm64/kvm/handle_exit.c | 30 +++++++++++
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 86 ++++++++++++++++++++++++++++--
5 files changed, 121 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 043495f7fc78..6e1135b3ded4 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -113,6 +113,7 @@ enum __kvm_host_smccc_func {
__KVM_HOST_SMCCC_FUNC___pkvm_finalize_teardown_vm,
__KVM_HOST_SMCCC_FUNC___pkvm_vcpu_load,
__KVM_HOST_SMCCC_FUNC___pkvm_vcpu_put,
+ __KVM_HOST_SMCCC_FUNC___pkvm_vcpu_sync_state,
__KVM_HOST_SMCCC_FUNC___pkvm_tlb_flush_vmid,
MARKER(__KVM_HOST_SMCCC_FUNC_MAX)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2faa60df847d..caa39ee5125f 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1068,6 +1068,8 @@ struct kvm_vcpu_arch {
#define INCREMENT_PC __vcpu_single_flag(iflags, BIT(1))
/* Target EL/MODE (not a single flag, but let's abuse the macro) */
#define EXCEPT_MASK __vcpu_single_flag(iflags, GENMASK(3, 1))
+/* Host-set: the hyp flushes the non-protected vCPU state in on entry */
+#define PKVM_HOST_STATE_DIRTY __vcpu_single_flag(iflags, BIT(4))
/* Helpers to encode exceptions with minimum fuss */
#define __EXCEPT_MASK_VAL unpack_vcpu_flag(EXCEPT_MASK)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 3732ee9eb0d4..4e89558d8027 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -733,6 +733,10 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
if (is_protected_kvm_enabled()) {
kvm_call_hyp(__vgic_v3_save_aprs, &vcpu->arch.vgic_cpu.vgic_v3);
kvm_call_hyp_nvhe(__pkvm_vcpu_put);
+
+ /* __pkvm_vcpu_put implies a sync of the state */
+ if (!kvm_vm_is_protected(vcpu->kvm))
+ vcpu_set_flag(vcpu, PKVM_HOST_STATE_DIRTY);
}
kvm_vcpu_put_debug(vcpu);
@@ -964,6 +968,9 @@ int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
return ret;
if (is_protected_kvm_enabled()) {
+ /* Start with the vcpu in a dirty state */
+ if (!kvm_vm_is_protected(vcpu->kvm))
+ vcpu_set_flag(vcpu, PKVM_HOST_STATE_DIRTY);
ret = pkvm_create_hyp_vm(kvm);
if (ret)
return ret;
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 54aedf93c78b..8963621bcdd1 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -422,6 +422,20 @@ static int handle_trap_exceptions(struct kvm_vcpu *vcpu)
{
int handled;
+ /*
+ * If we run a non-protected VM when protection is enabled
+ * system-wide, resync the state from the hypervisor and mark
+ * it as dirty on the host side if it wasn't dirty already
+ * (which could happen if preemption has taken place).
+ */
+ if (is_protected_kvm_enabled() && !kvm_vm_is_protected(vcpu->kvm)) {
+ guard(preempt)();
+ if (!(vcpu_get_flag(vcpu, PKVM_HOST_STATE_DIRTY))) {
+ kvm_call_hyp_nvhe(__pkvm_vcpu_sync_state);
+ vcpu_set_flag(vcpu, PKVM_HOST_STATE_DIRTY);
+ }
+ }
+
/*
* See ARM ARM B1.14.1: "Hyp traps on instructions
* that fail their condition code check"
@@ -489,6 +503,22 @@ int handle_exit(struct kvm_vcpu *vcpu, int exception_index)
/* For exit types that need handling before we can be preempted */
void handle_exit_early(struct kvm_vcpu *vcpu, int exception_index)
{
+ bool inject_serror = ARM_SERROR_PENDING(exception_index) ||
+ ARM_EXCEPTION_CODE(exception_index) == ARM_EXCEPTION_EL1_SERROR;
+
+ /*
+ * An SError injected below writes the host ctxt; for a non-protected
+ * guest, sync from the hyp vCPU and keep it dirty so it isn't dropped.
+ */
+ if (is_protected_kvm_enabled()) {
+ vcpu_clear_flag(vcpu, PKVM_HOST_STATE_DIRTY);
+
+ if (inject_serror && !kvm_vm_is_protected(vcpu->kvm)) {
+ kvm_call_hyp_nvhe(__pkvm_vcpu_sync_state);
+ vcpu_set_flag(vcpu, PKVM_HOST_STATE_DIRTY);
+ }
+ }
+
if (ARM_SERROR_PENDING(exception_index)) {
if (this_cpu_has_cap(ARM64_HAS_RAS_EXTN)) {
u64 disr = kvm_vcpu_get_disr(vcpu);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 0194965930e6..acf53aae4fe4 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -141,6 +141,48 @@ static void sync_hyp_vgic_state(struct pkvm_hyp_vcpu *hyp_vcpu)
host_cpu_if->vgic_lr[i] = hyp_cpu_if->vgic_lr[i];
}
+static void __copy_vcpu_state(const struct kvm_vcpu *from_vcpu,
+ struct kvm_vcpu *to_vcpu)
+{
+ int i;
+
+ to_vcpu->arch.ctxt.regs = from_vcpu->arch.ctxt.regs;
+ to_vcpu->arch.ctxt.spsr_abt = from_vcpu->arch.ctxt.spsr_abt;
+ to_vcpu->arch.ctxt.spsr_und = from_vcpu->arch.ctxt.spsr_und;
+ to_vcpu->arch.ctxt.spsr_irq = from_vcpu->arch.ctxt.spsr_irq;
+ to_vcpu->arch.ctxt.spsr_fiq = from_vcpu->arch.ctxt.spsr_fiq;
+ to_vcpu->arch.ctxt.fp_regs = from_vcpu->arch.ctxt.fp_regs;
+
+ /*
+ * Copy the sysregs, but don't mess with the timer state which
+ * is directly handled by EL1 and is expected to be preserved.
+ * enum vcpu_sysreg is sparse: VNCR-mapped registers take values
+ * derived from their VNCR page offset, so the timer registers do
+ * not form a contiguous numeric range and must be skipped by name.
+ */
+ for (i = 1; i < NR_SYS_REGS; i++) {
+ switch (i) {
+ case CNTVOFF_EL2:
+ case CNTV_CVAL_EL0:
+ case CNTV_CTL_EL0:
+ case CNTP_CVAL_EL0:
+ case CNTP_CTL_EL0:
+ continue;
+ }
+ to_vcpu->arch.ctxt.sys_regs[i] = from_vcpu->arch.ctxt.sys_regs[i];
+ }
+}
+
+static void sync_hyp_vcpu_state(struct pkvm_hyp_vcpu *hyp_vcpu)
+{
+ __copy_vcpu_state(&hyp_vcpu->vcpu, hyp_vcpu->host_vcpu);
+}
+
+static void flush_hyp_vcpu_state(struct pkvm_hyp_vcpu *hyp_vcpu)
+{
+ __copy_vcpu_state(hyp_vcpu->host_vcpu, &hyp_vcpu->vcpu);
+}
+
static void flush_debug_state(struct pkvm_hyp_vcpu *hyp_vcpu)
{
struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
@@ -170,7 +212,17 @@ static void flush_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
fpsimd_sve_flush();
flush_debug_state(hyp_vcpu);
- hyp_vcpu->vcpu.arch.ctxt = host_vcpu->arch.ctxt;
+ /*
+ * If we deal with a non-protected guest and the state is potentially
+ * dirty (from a host perspective), copy the state back into the hyp
+ * vcpu.
+ */
+ if (!pkvm_hyp_vcpu_is_protected(hyp_vcpu)) {
+ if (vcpu_get_flag(host_vcpu, PKVM_HOST_STATE_DIRTY))
+ flush_hyp_vcpu_state(hyp_vcpu);
+ } else {
+ hyp_vcpu->vcpu.arch.ctxt = host_vcpu->arch.ctxt;
+ }
/* __hyp_running_vcpu must be NULL in a guest context. */
hyp_vcpu->vcpu.arch.ctxt.__hyp_running_vcpu = NULL;
@@ -201,9 +253,13 @@ static void sync_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
fpsimd_sve_sync(&hyp_vcpu->vcpu);
sync_debug_state(hyp_vcpu);
- host_vcpu->arch.ctxt = hyp_vcpu->vcpu.arch.ctxt;
-
- host_vcpu->arch.hcr_el2 = hyp_vcpu->vcpu.arch.hcr_el2;
+ if (pkvm_hyp_vcpu_is_protected(hyp_vcpu)) {
+ host_vcpu->arch.ctxt = hyp_vcpu->vcpu.arch.ctxt;
+ } else {
+ /* Keep PC (tracepoint) and PSTATE (vcpu_mode_is_bad_32bit) current. */
+ host_vcpu->arch.ctxt.regs.pc = hyp_vcpu->vcpu.arch.ctxt.regs.pc;
+ host_vcpu->arch.ctxt.regs.pstate = hyp_vcpu->vcpu.arch.ctxt.regs.pstate;
+ }
host_vcpu->arch.fault = hyp_vcpu->vcpu.arch.fault;
@@ -237,8 +293,27 @@ static void handle___pkvm_vcpu_put(struct kvm_cpu_context *host_ctxt)
{
struct pkvm_hyp_vcpu *hyp_vcpu = pkvm_get_loaded_hyp_vcpu();
- if (hyp_vcpu)
+ if (hyp_vcpu) {
+ struct kvm_vcpu *host_vcpu = hyp_vcpu->host_vcpu;
+
+ if (!pkvm_hyp_vcpu_is_protected(hyp_vcpu) &&
+ !vcpu_get_flag(host_vcpu, PKVM_HOST_STATE_DIRTY)) {
+ sync_hyp_vcpu_state(hyp_vcpu);
+ }
+
pkvm_put_hyp_vcpu(hyp_vcpu);
+ }
+}
+
+static void handle___pkvm_vcpu_sync_state(struct kvm_cpu_context *host_ctxt)
+{
+ struct pkvm_hyp_vcpu *hyp_vcpu;
+
+ hyp_vcpu = pkvm_get_loaded_hyp_vcpu();
+ if (!hyp_vcpu || pkvm_hyp_vcpu_is_protected(hyp_vcpu))
+ return;
+
+ sync_hyp_vcpu_state(hyp_vcpu);
}
static struct kvm_vcpu *__get_host_hyp_vcpus(struct kvm_vcpu *arg,
@@ -869,6 +944,7 @@ static const hcall_t host_hcall[] = {
HANDLE_FUNC(__pkvm_finalize_teardown_vm),
HANDLE_FUNC(__pkvm_vcpu_load),
HANDLE_FUNC(__pkvm_vcpu_put),
+ HANDLE_FUNC(__pkvm_vcpu_sync_state),
HANDLE_FUNC(__pkvm_tlb_flush_vmid),
};
--
2.55.0.rc0.738.g0c8ab3ebcc-goog
^ permalink raw reply related
* Re: [PATCH v2] net: macb: add TX stall timeout callback to recover from lost TSTART write
From: Andrea della Porta @ 2026-06-19 7:17 UTC (permalink / raw)
To: Théo Lebrun
Cc: Andrea della Porta, netdev, Nicolas Ferre, Claudiu Beznea,
Andrew Lunn, David S . Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, linux-kernel, linux-arm-kernel, linux-rpi-kernel,
Nicolai Buchwitz, Lukasz Raczylo, Steffen Jaeckel
In-Reply-To: <DJAKGHKZGUI2.29AHYNFX4LGBJ@bootlin.com>
Hi Theo,
On 17:07 Tue 16 Jun , Théo Lebrun wrote:
> Hello Andrea,
>
> On Tue Jun 16, 2026 at 3:23 PM CEST, Andrea della Porta wrote:
> > From: Lukasz Raczylo <lukasz@raczylo.com>
> >
> > The MACB found in the Raspberry Pi RP1 suffers from sporadic stalls on
> > the TX queue.
> > While the exact root cause is not yet fully understood, it is likely
> > related to a hardware issue where a TSTART write to the NCR register
> > is missed, preventing the transmission from being kicked off.
> >
> > Implement a timeout callback to handle TX queue stalls, triggering the
> > existing restart mechanism to recover.
> >
> > Link: https://lore.kernel.org/all/20260514215459.36109-1-lukasz@raczylo.com/
> > Fixes: dc110d1b23564 ("net: cadence: macb: Add support for Raspberry Pi RP1 ethernet controller")
> > Signed-off-by: Lukasz Raczylo <lukasz@raczylo.com>
> > Co-developed-by: Steffen Jaeckel <sjaeckel@suse.de>
> > Signed-off-by: Steffen Jaeckel <sjaeckel@suse.de>
> > Co-developed-by: Andrea della Porta <andrea.porta@suse.com>
> > Signed-off-by: Andrea della Porta <andrea.porta@suse.com>
>
> Thanks for this V2.
>
> Reviewed-by: Théo Lebrun <theo.lebrun@bootlin.com>
>
> Any news from the Raspberry Pi community about this bug investigation?
Not from my side, unfortunately.
Regards,
Andrea
>
> Thanks,
>
> --
> Théo Lebrun, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com
>
^ permalink raw reply
* Re: [PATCH v4 2/2] clk: amlogic: Add A9 AO clock controller driver
From: Jerome Brunet @ 2026-06-19 7:29 UTC (permalink / raw)
To: Julian Braha
Cc: jian.hu, Neil Armstrong, Michael Turquette, Stephen Boyd,
Rob Herring, Krzysztof Kozlowski, Conor Dooley, Xianwei Zhao,
Kevin Hilman, Martin Blumenstingl, linux-amlogic, linux-clk,
devicetree, linux-kernel, linux-arm-kernel
In-Reply-To: <79b1a519-5723-4e0c-904c-b7fdf9564ee1@gmail.com>
On jeu. 18 juin 2026 at 19:56, Julian Braha <julianbraha@gmail.com> wrote:
> Hi Jian,
>
> On 6/18/26 10:49, Jian Hu via B4 Relay wrote:
>
>> +config COMMON_CLK_A9_AO
>> + tristate "Amlogic A9 SoC AO clock controller support"
>> + depends on ARM64 || COMPILE_TEST
>> + default ARCH_MESON
>> + select COMMON_CLK_MESON_REGMAP
>> + select COMMON_CLK_MESON_CLKC_UTILS
>> + select COMMON_CLK_MESON_DUALDIV
>
> Selecting COMMON_CLK_MESON_REGMAP is unnecessary since you're already
> selecting COMMON_CLK_MESON_DUALDIV here.
No, regmap clock are directly used so this is necessary.
Relying on other module dependencies is not enough
>
> - Julian Braha
--
Jerome
^ permalink raw reply
* Re: [PATCH v1 1/1] i2c: davinci: Use generic definitions for bus frequencies
From: Bartosz Golaszewski @ 2026-06-19 7:33 UTC (permalink / raw)
To: Andy Shevchenko
Cc: Bartosz Golaszewski, Andi Shyti, linux-arm-kernel, linux-i2c,
linux-kernel
In-Reply-To: <20260618133429.3214475-1-andriy.shevchenko@linux.intel.com>
On Thu, 18 Jun 2026 15:34:29 +0200, Andy Shevchenko
<andriy.shevchenko@linux.intel.com> said:
> Since we have generic definitions for bus frequencies, let's use them.
>
> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> ---
> drivers/i2c/busses/i2c-davinci.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/i2c/busses/i2c-davinci.c b/drivers/i2c/busses/i2c-davinci.c
> index 66c23535656b..67ce39cb000d 100644
> --- a/drivers/i2c/busses/i2c-davinci.c
> +++ b/drivers/i2c/busses/i2c-davinci.c
> @@ -117,8 +117,6 @@
> /* timeout for pm runtime autosuspend */
> #define DAVINCI_I2C_PM_TIMEOUT 1000 /* ms */
>
> -#define DAVINCI_I2C_DEFAULT_BUS_FREQ 100000
> -
> struct davinci_i2c_dev {
> struct device *dev;
> void __iomem *base;
> @@ -760,7 +758,7 @@ static int davinci_i2c_probe(struct platform_device *pdev)
>
> r = device_property_read_u32(&pdev->dev, "clock-frequency", &prop);
> if (r)
> - prop = DAVINCI_I2C_DEFAULT_BUS_FREQ;
> + prop = I2C_MAX_STANDARD_MODE_FREQ;
>
> dev->bus_freq = prop / 1000;
>
> --
> 2.50.1
>
>
Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
^ permalink raw reply
* Re: [PATCH 0/8] Add PCIe M.2 Key E connector support for NXP i.MX boards
From: Bartosz Golaszewski @ 2026-06-19 7:37 UTC (permalink / raw)
To: Sherry Sun (OSS)
Cc: imx, linux-pci, linux-arm-kernel, devicetree, linux-kernel,
linux-bluetooth, linux-pm, sherry.sun, robh, krzk+dt, conor+dt,
Frank.Li, s.hauer, kernel, festevam, amitkumar.karwar,
neeraj.sanjaykale, marcel, luiz.dentz, hongxing.zhu, l.stach,
lpieralisi, kwilczynski, mani, bhelgaas, brgl
In-Reply-To: <20260618101047.4185497-1-sherry.sun@oss.nxp.com>
On Thu, 18 Jun 2026 12:10:39 +0200, "Sherry Sun (OSS)"
<sherry.sun@oss.nxp.com> said:
> From: Sherry Sun <sherry.sun@nxp.com>
>
> This series adds support for NXP Wi-Fi/BT combo chips (88W9098, AW693)
> inserted into PCIe M.2 Key E connectors on several i.MX EVK/MEK boards.
>
> For M.2 cards that rely on PCIe L2 link state and wake-up mechanisms, the
> card must remain powered during suspend. Patch 1 uses the existing
> dw_pcie_rp::skip_pwrctrl_off flag to skip power-off during suspend and skip
> power-on during the init path.
>
> Alsp the btnxpuart driver is extended to obtain a pwrseq descriptor via the
> OF graph on the UART controller device in patch 2.
>
> Sherry Sun (8):
> PCI: imx6: Add skip_pwrctrl_off flag support
> power: sequencing: pcie-m2: Add PCI ID for NXP 88W9098 and AW693
> Bluetooth
Can this be applied independently without build-time issues?
Bart
> Bluetooth: btnxpuart: Add M.2 Bluetooth device support using pwrseq
> arm64: dts: imx8mq-evk: Describe the PCIe M.2 Key E connector
> arm64: dts: imx95-19x19-evk: Describe the PCIe M.2 Key E connector
> arm64: dts: imx8dxl-evk: Describe the PCIe M.2 Key E connector
> arm64: dts: imx8qm-mek: Describe the PCIe M.2 Key E connector
> arm64: dts: imx8qxp-mek: Describe the PCIe M.2 Key E connector
>
> arch/arm64/boot/dts/freescale/imx8dxl-evk.dts | 56 +++++++++++++-----
> arch/arm64/boot/dts/freescale/imx8mq-evk.dts | 44 ++++++++++++--
> arch/arm64/boot/dts/freescale/imx8qm-mek.dts | 58 ++++++++++++++-----
> arch/arm64/boot/dts/freescale/imx8qxp-mek.dts | 54 ++++++++++++-----
> .../boot/dts/freescale/imx95-19x19-evk.dts | 55 +++++++++++++-----
> drivers/bluetooth/btnxpuart.c | 33 ++++++++++-
> drivers/pci/controller/dwc/pci-imx6.c | 36 +++++++-----
> drivers/power/sequencing/pwrseq-pcie-m2.c | 4 ++
> 8 files changed, 264 insertions(+), 76 deletions(-)
>
> --
> 2.50.1
>
>
^ permalink raw reply
* Re: [PATCH v2] net: macb: add TX stall timeout callback to recover from lost TSTART write
From: Nicolai Buchwitz @ 2026-06-19 7:39 UTC (permalink / raw)
To: Andrea della Porta
Cc: Théo Lebrun, netdev, Nicolas Ferre, Claudiu Beznea,
Andrew Lunn, David S . Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, linux-kernel, linux-arm-kernel, linux-rpi-kernel,
Lukasz Raczylo, Steffen Jaeckel
In-Reply-To: <ajTtF2xy3WS6NfBi@apocalypse>
On 19.6.2026 09:17, Andrea della Porta wrote:
> [...]
>> Any news from the Raspberry Pi community about this bug investigation?
>
> Not from my side, unfortunately.
If I remember it correctly, the downstream kernel carries earlier
versions of Lukasz patches,
which he also submitted there. If time permits, I will run some tests
with mainline kernel
on Pi5 + downstream kernel with reverted patches + only the upstream
patches.
But realistically this won't happen before end of next week.
BR
Nicolai
> [...]
^ permalink raw reply
* Re: [PATCH v4 0/2] clk: amlogic: Add A9 AO clock controller
From: Jerome Brunet @ 2026-06-19 7:40 UTC (permalink / raw)
To: Jian Hu via B4 Relay
Cc: Neil Armstrong, Michael Turquette, Stephen Boyd, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Xianwei Zhao, Kevin Hilman,
Martin Blumenstingl, jian.hu, linux-amlogic, linux-clk,
devicetree, linux-kernel, linux-arm-kernel, Conor Dooley
In-Reply-To: <20260618-a9_aoclk-v4-0-569d0425e50c@amlogic.com>
On jeu. 18 juin 2026 at 17:49, Jian Hu via B4 Relay <devnull+jian.hu.amlogic.com@kernel.org> wrote:
> This series adds Amlogic A9 AO clock support, including dt-binding and AO clock driver.
>
> Signed-off-by: Jian Hu <jian.hu@amlogic.com>
A part from the 2 minor problems found by sashiko, This looks good.
> ---
> Changes in v4:
> - Drop CLK_IS_CRITICAL for ao_xtal_in clock.
> - Drop CLK_HW_INIT* and revert to explicit clock declarations.
> - Link to v3: https://lore.kernel.org/r/20260610-a9_aoclk-v3-0-b7592d6c31e2@amlogic.com
>
> Changes in v3:
> - Move COMPILE_TEST after 'depends on ARM64' reported by sashiko-bot.
> - Rename i2c3 to i3c reported by sashiko-bot.
> - Reword the comment describing ao_xtal_in's flags.
> - Use struct clk_init_data to describe ao_xtal_in's hw.init.
> - Link to v2: https://lore.kernel.org/r/20260603-a9_aoclk-v2-0-f47ea616ee78@amlogic.com
>
> Changes in v2:
> - Split the A9 clock driver and send the AO clock separately.
> - Rename aobus to soc.
> - Use CLK_HW_INIT_FW_NAME to describe clk_init_data.
> - Use CLK_HW_INIT_PARENTS_DATA to describe clk_init_data.
> - Use a9_ao prefix for MESON_COMP_SEL.
> - Correct duandiv name.
> - Fix pwm b reg.
> - Link to v1: https://lore.kernel.org/all/20260511-b4-a9_clk-v1-0-41cb4071b7c9@amlogic.com/
>
> ---
> Jian Hu (2):
> dt-bindings: clock: Add Amlogic A9 AO clock controller
> clk: amlogic: Add A9 AO clock controller driver
>
> .../bindings/clock/amlogic,a9-aoclkc.yaml | 76 ++++
> drivers/clk/meson/Kconfig | 13 +
> drivers/clk/meson/Makefile | 1 +
> drivers/clk/meson/a9-aoclk.c | 488 +++++++++++++++++++++
> include/dt-bindings/clock/amlogic,a9-aoclkc.h | 76 ++++
> 5 files changed, 654 insertions(+)
> ---
> base-commit: ca89c88bcf69daca829044c638a8163d5ce47af0
> change-id: 20260603-a9_aoclk-bbf531badc63
>
> Best regards,
--
Jerome
^ permalink raw reply
* Re: Question: pinctrl-backed GPIO set_config and gpio_chip::can_sleep
From: Linus Walleij @ 2026-06-19 7:43 UTC (permalink / raw)
To: Runyu Xiao
Cc: Linus Walleij, Bartosz Golaszewski, Ludovic Desroches,
Nicolas Ferre, Alexandre Belloni, Claudiu Beznea, Antonio Borneo,
Maxime Coquelin, Alexandre Torgue, Chen-Yu Tsai, Jernej Skrabec,
Samuel Holland, linux-arm-kernel, linux-gpio, linux-stm32,
linux-sunxi, linux-kernel, jianhao.xu
In-Reply-To: <20260618151052.3984665-1-runyu.xiao@seu.edu.cn>
Hi Runyu,
On Thu, Jun 18, 2026 at 5:11 PM Runyu Xiao <runyu.xiao@seu.edu.cn> wrote:
> I agree that marking these memory-mapped controllers as can_sleep is too
> broad if the only sleepable part is the pinctrl range lookup. That would
> make consumers treat otherwise MMIO-backed get/set paths as sleepable,
> which is not the contract I want to change.
>
> I will hold back the at91-pio4/stm32/sunxi can_sleep series and look at
> the pinctrl core direction instead, specifically whether
> pinctrldev_list_mutex can be replaced by a non-sleeping lock for
> pinctrl_get_device_gpio_range(). That should also line up with the GPIO
> direction callback case discussed in the other thread.
Your replies look like those one would get from an AI agent,
because they are repeating information (context) that I have just
provided, just with other words.
If this is the case you need to instruct your agent to be terse in
mailing list replies: it needs to quote what I just said with >
markers in the margin and just add the word "Agreed" after
it.
Yours,
Linus Walleij
^ permalink raw reply
* Re: [PATCH v4 2/2] clk: amlogic: Add A9 AO clock controller driver
From: Julian Braha @ 2026-06-19 7:48 UTC (permalink / raw)
To: Jerome Brunet
Cc: jian.hu, Neil Armstrong, Michael Turquette, Stephen Boyd,
Rob Herring, Krzysztof Kozlowski, Conor Dooley, Xianwei Zhao,
Kevin Hilman, Martin Blumenstingl, linux-amlogic, linux-clk,
devicetree, linux-kernel, linux-arm-kernel
In-Reply-To: <1jbjd7c7ip.fsf@starbuckisacylon.baylibre.com>
Hi Jerome,
On 6/19/26 08:29, Jerome Brunet wrote:
> No, regmap clock are directly used so this is necessary.
> Relying on other module dependencies is not enough
What do you mean it's "not enough"?
Functionally, any user of COMMON_CLK_MESON_DUALDIV can also use
COMMON_CLK_MESON_REGMAP.
Unless you mean for documentation purposes?
- Julian Braha
^ permalink raw reply
* [PATCH v3 1/1] reset: imx7: Correct polarity of MIPI CSI resets on i.MX8MQ
From: robby.cai @ 2026-06-19 7:31 UTC (permalink / raw)
To: p.zabel, Frank.Li, s.hauer, festevam
Cc: krzk+dt, andrew.smirnov, kernel, imx, linux-arm-kernel,
linux-kernel, aisheng.dong, guoniu.zhou
From: Robby Cai <robby.cai@nxp.com>
On i.MX8MQ, the MIPI CSI reset lines are active-low and not self-clearing.
Writing '0' asserts reset and it remains asserted until explicitly
deasserted by software.
This driver previously treated the MIPI CSI reset signals as active-high,
which led to incorrect reset assert/deassert sequencing. This issue was
exposed by commit 6d79bb8fd2aa ("media: imx8mq-mipi-csi2: Explicitly
release reset").
Fix this by reflecting the correct reset polarity and ensuring proper
reset handling.
Fixes: c979dbf59987 ("reset: imx7: Add support for i.MX8MQ IP block variant")
Cc: <stable@vger.kernel.org> # 6d79bb8fd2aa: media: imx8mq-mipi-csi2: Explicitly release reset
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Robby Cai <robby.cai@nxp.com>
---
Changes in v3:
- Add Cc tag as suggested by Philipp Zabel
- Add R-b tag from Philipp Zabel
Link to v2: https://lore.kernel.org/imx/20260417080851.489303-1-robby.cai@nxp.com/
Changes in v2:
- Drop the naming change in response to feedback from Krzysztof Kozlowski
- Refine the patch subject and commit message
Link to v1: https://lore.kernel.org/imx/20260331101331.1405588-1-robby.cai@nxp.com/
---
drivers/reset/reset-imx7.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/reset/reset-imx7.c b/drivers/reset/reset-imx7.c
index dd01fe11c5cb..a3cb8244d76a 100644
--- a/drivers/reset/reset-imx7.c
+++ b/drivers/reset/reset-imx7.c
@@ -236,6 +236,12 @@ static int imx8mq_reset_set(struct reset_controller_dev *rcdev,
case IMX8MQ_RESET_PCIE_CTRL_APPS_EN:
case IMX8MQ_RESET_PCIE2_CTRL_APPS_EN:
+ case IMX8MQ_RESET_MIPI_CSI1_CORE_RESET:
+ case IMX8MQ_RESET_MIPI_CSI1_PHY_REF_RESET:
+ case IMX8MQ_RESET_MIPI_CSI1_ESC_RESET:
+ case IMX8MQ_RESET_MIPI_CSI2_CORE_RESET:
+ case IMX8MQ_RESET_MIPI_CSI2_PHY_REF_RESET:
+ case IMX8MQ_RESET_MIPI_CSI2_ESC_RESET:
case IMX8MQ_RESET_MIPI_DSI_PCLK_RESET_N:
case IMX8MQ_RESET_MIPI_DSI_ESC_RESET_N:
case IMX8MQ_RESET_MIPI_DSI_DPI_RESET_N:
--
2.50.1
^ permalink raw reply related
* [PATCH net v4] net: airoha: Fix skb->priority underflow in airoha_dev_select_queue()
From: Wayen Yan @ 2026-06-19 7:50 UTC (permalink / raw)
To: netdev
Cc: lorenzo, horms, pabeni, kuba, edumazet, andrew+netdev,
angelogioacchino.delregno, matthias.bgg, linux-arm-kernel,
linux-mediatek
In airoha_dev_select_queue(), the expression:
queue = (skb->priority - 1) % AIROHA_NUM_QOS_QUEUES;
implicitly converts to unsigned arithmetic: when skb->priority is 0
(the default for unclassified traffic), (0u - 1u) wraps to UINT_MAX,
and UINT_MAX % 8 = 7, routing default best-effort packets to the
highest-priority QoS queue. This causes QoS inversion where the
majority of traffic on a PON gateway starves actual high-priority
flows (VoIP, gaming, etc.).
The "- 1" offset was a leftover from the ETS offload implementation
that has since been removed. The correct mapping is a direct modulo:
queue = skb->priority % AIROHA_NUM_QOS_QUEUES;
This maps priority 0 → queue 0 (lowest), priority 7 → queue 7
(highest), with higher priorities wrapping around. This is the
standard Linux sk_prio → HW queue mapping used by other drivers.
Fixes: 2b288b81560b ("net: airoha: Introduce ndo_select_queue callback")
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Joe Damato <joe@dama.to>
Signed-off-by: Wayen Yan <win847@gmail.com>
---
Changes in v4:
- Remove the "- 1" offset entirely as suggested by Lorenzo and Jakub.
The offset was an ETS offload leftover; the correct mapping is a
direct modulo of skb->priority to AIROHA_NUM_QOS_QUEUES, matching
the standard Linux sk_prio → HW queue convention. (v3 used a
ternary guard which addressed the underflow but kept the unneeded
offset.)
Link: https://lore.kernel.org/netdev/20260617164448.31e189bc@kernel.org/
Link: https://lore.kernel.org/netdev/ajPCgH7E_ke6Fdur@lore-desk/
drivers/net/ethernet/airoha/airoha_eth.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
index d0c0c0ec8a..9ec3f22754 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.c
+++ b/drivers/net/ethernet/airoha/airoha_eth.c
@@ -1928,7 +1928,7 @@ static u16 airoha_dev_select_queue(struct net_device *dev, struct sk_buff *skb,
*/
channel = netdev_uses_dsa(dev) ? skb_get_queue_mapping(skb) : port->id;
channel = channel % AIROHA_NUM_QOS_CHANNELS;
- queue = (skb->priority - 1) % AIROHA_NUM_QOS_QUEUES; /* QoS queue */
+ queue = skb->priority % AIROHA_NUM_QOS_QUEUES;
queue = channel * AIROHA_NUM_QOS_QUEUES + queue;
return queue < dev->num_tx_queues ? queue : 0;
--
2.51.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox