[PATCH v1 0/3] KVM: arm64: Fix guest feature sanitization and pKVM state synchronization

public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH v1 0/3] KVM: arm64: Fix guest feature sanitization and pKVM state synchronization
@ 2026-02-12  9:02 Fuad Tabba
  2026-02-12  9:02 ` [PATCH v1 1/3] KVM: arm64: Hide S1POE from guests when not supported by the host Fuad Tabba
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Fuad Tabba @ 2026-02-12  9:02 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel
  Cc: maz, oliver.upton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will, tabba, stable

This series addresses state management and feature synchronization
vulnerabilities in both standard KVM and pKVM implementations on arm64.
The primary focus is ensuring that the hypervisor correctly handles
architectural extensions during context switches to prevent state
corruption.

The series is structured as follows:

* Patch 1: Addresses an issue in KVM/arm64 in general where FEAT_S1POE
  is exposed to guests based solely on hardware capability. If the host
  kernel is built without CONFIG_ARM64_POE, it will not context-switch
  POR_EL1. Masking the S1POE bit in ID_AA64MMFR3_EL1 when
  system_supports_poe() is false prevents state corruption.

* Patch 2: Fixes a bug in pKVM non-protected guest initialization.
  Previously, pkvm_init_features_from_host() copied the initialized flag
  without copying the actual id_regs array. This caused EL2 feature
  checks (such as ctxt_has_tcrx()) to silently fail, breaking the
  save/restore logic for system registers like TCR2_EL1, PIR_EL1, and
  POR_EL1 during world switches. The fix initializes the ID registers.

* Patch 3: Removes a redundant kern_hyp_va() macro invocation in
  unpin_host_sve_state(). The sve_state pointer is already initialized
  as a hypervisor virtual address. While idempotent, the macro is
  unnecessary here.

Based on Linux 6.19.

Cheers,
/fuad

Cc: stable@vger.kernel.org

Fuad Tabba (3):
  KVM: arm64: Hide S1POE from guests when not supported by the host
  KVM: arm64: Fix ID register initialization for non-protected pKVM
    guests
  KVM: arm64: Remove redundant kern_hyp_va() in unpin_host_sve_state()

 arch/arm64/include/asm/kvm_host.h |  3 ++-
 arch/arm64/kvm/hyp/nvhe/pkvm.c    | 39 ++++++++++++++++++++++++++++---
 arch/arm64/kvm/sys_regs.c         |  3 +++
 3 files changed, 41 insertions(+), 4 deletions(-)

base-commit: 05f7e89ab9731565d8a62e3b5d1ec206485eeb0b
-- 
2.53.0.239.g8d8fc8a987-goog

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v1 1/3] KVM: arm64: Hide S1POE from guests when not supported by the host
  2026-02-12  9:02 [PATCH v1 0/3] KVM: arm64: Fix guest feature sanitization and pKVM state synchronization Fuad Tabba
@ 2026-02-12  9:02 ` Fuad Tabba
  2026-02-12  9:29   ` Marc Zyngier
  2026-02-12  9:02 ` [PATCH v1 2/3] KVM: arm64: Fix ID register initialization for non-protected pKVM guests Fuad Tabba
  2026-02-12  9:02 ` [PATCH v1 3/3] KVM: arm64: Remove redundant kern_hyp_va() in unpin_host_sve_state() Fuad Tabba
  2 siblings, 1 reply; 11+ messages in thread
From: Fuad Tabba @ 2026-02-12  9:02 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel
  Cc: maz, oliver.upton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will, tabba, stable

When CONFIG_ARM64_POE is disabled, KVM does not save/restore POR_EL1.
However, ID_AA64MMFR3_EL1 sanitisation currently exposes the feature to
guests whenever the hardware supports it, ignoring the host kernel
configuration.

If a guest detects this feature and attempts to use it, the host will
fail to context-switch POR_EL1, potentially leading to state corruption.

Fix this by masking ID_AA64MMFR3_EL1.S1POE and preventing KVM from
advertising the feature when the host does not support it, i.e.,
system_supports_poe() is false.

Fixes: 70ed7238297f ("KVM: arm64: Sanitise ID_AA64MMFR3_EL1")
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_host.h | 3 ++-
 arch/arm64/kvm/sys_regs.c         | 3 +++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index ac7f970c7883..7af72ca749a6 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1592,7 +1592,8 @@ void kvm_set_vm_id_reg(struct kvm *kvm, u32 reg, u64 val);
 	(kvm_has_feat((k), ID_AA64MMFR3_EL1, S1PIE, IMP))
 
 #define kvm_has_s1poe(k)				\
-	(kvm_has_feat((k), ID_AA64MMFR3_EL1, S1POE, IMP))
+	(system_supports_poe() &&			\
+	 kvm_has_feat((k), ID_AA64MMFR3_EL1, S1POE, IMP))
 
 #define kvm_has_ras(k)					\
 	(kvm_has_feat((k), ID_AA64PFR0_EL1, RAS, IMP))
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 88a57ca36d96..237e8bd1cf29 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1816,6 +1816,9 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu,
 		       ID_AA64MMFR3_EL1_SCTLRX |
 		       ID_AA64MMFR3_EL1_S1POE |
 		       ID_AA64MMFR3_EL1_S1PIE;
+
+		if (!system_supports_poe())
+			val &= ~ID_AA64MMFR3_EL1_S1POE;
 		break;
 	case SYS_ID_MMFR4_EL1:
 		val &= ~ID_MMFR4_EL1_CCIDX;
-- 
2.53.0.239.g8d8fc8a987-goog


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v1 2/3] KVM: arm64: Fix ID register initialization for non-protected pKVM guests
  2026-02-12  9:02 [PATCH v1 0/3] KVM: arm64: Fix guest feature sanitization and pKVM state synchronization Fuad Tabba
  2026-02-12  9:02 ` [PATCH v1 1/3] KVM: arm64: Hide S1POE from guests when not supported by the host Fuad Tabba
@ 2026-02-12  9:02 ` Fuad Tabba
  2026-02-13 11:03   ` Marc Zyngier
  2026-02-12  9:02 ` [PATCH v1 3/3] KVM: arm64: Remove redundant kern_hyp_va() in unpin_host_sve_state() Fuad Tabba
  2 siblings, 1 reply; 11+ messages in thread
From: Fuad Tabba @ 2026-02-12  9:02 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel
  Cc: maz, oliver.upton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will, tabba, stable

In protected mode, the hypervisor maintains a separate instance of
the `kvm` structure for each VM. For non-protected VMs, this structure is
initialized from the host's `kvm` state.

Currently, `pkvm_init_features_from_host()` copies the
`KVM_ARCH_FLAG_ID_REGS_INITIALIZED` flag from the host without the
underlying `id_regs` data being initialized. This results in the
hypervisor seeing the flag as set while the ID registers remain zeroed.

Consequently, `kvm_has_feat()` checks at EL2 fail (return 0) for
non-protected VMs. This breaks logic that relies on feature detection,
such as `ctxt_has_tcrx()` for TCR2_EL1 support. As a result, certain
system registers (e.g., TCR2_EL1, PIR_EL1, POR_EL1) are not
saved/restored during the world switch, which could lead to state
corruption.

Fix this by explicitly copying the ID registers from the host `kvm` to
the hypervisor `kvm` for non-protected VMs during vCPU initialization,
since we trust the host with its non-protected guests' features. Also
ensure `KVM_ARCH_FLAG_ID_REGS_INITIALIZED` is cleared initially in
`pkvm_init_features_from_host` so that `vm_copy_id_regs` can properly
initialize them and set the flag once done.

Fixes: 41d6028e28bd ("KVM: arm64: Convert the SVE guest vcpu flag to a vm flag")
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/pkvm.c | 37 ++++++++++++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
index 12b2acfbcfd1..267854ed29c8 100644
--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
+++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
@@ -344,6 +344,8 @@ static void pkvm_init_features_from_host(struct pkvm_hyp_vm *hyp_vm, const struc
 
 	/* No restrictions for non-protected VMs. */
 	if (!kvm_vm_is_protected(kvm)) {
+		clear_bit(KVM_ARCH_FLAG_ID_REGS_INITIALIZED, &host_arch_flags);
+
 		hyp_vm->kvm.arch.flags = host_arch_flags;
 
 		bitmap_copy(kvm->arch.vcpu_features,
@@ -471,6 +473,36 @@ static int pkvm_vcpu_init_sve(struct pkvm_hyp_vcpu *hyp_vcpu, struct kvm_vcpu *h
 	return ret;
 }
 
+static int vm_copy_id_regs(struct pkvm_hyp_vcpu *hyp_vcpu)
+{
+	struct pkvm_hyp_vm *hyp_vm = pkvm_hyp_vcpu_to_hyp_vm(hyp_vcpu);
+	const struct kvm *host_kvm = hyp_vm->host_kvm;
+	struct kvm *kvm = &hyp_vm->kvm;
+
+	if (!test_bit(KVM_ARCH_FLAG_ID_REGS_INITIALIZED, &host_kvm->arch.flags))
+		return -EINVAL;
+
+	if (test_bit(KVM_ARCH_FLAG_ID_REGS_INITIALIZED, &kvm->arch.flags))
+		return 0;
+
+	memcpy(kvm->arch.id_regs, host_kvm->arch.id_regs, sizeof(kvm->arch.id_regs));
+	set_bit(KVM_ARCH_FLAG_ID_REGS_INITIALIZED, &kvm->arch.flags);
+
+	return 0;
+}
+
+static int pkvm_vcpu_init_sysregs(struct pkvm_hyp_vcpu *hyp_vcpu)
+{
+	int ret = 0;
+
+	if (pkvm_hyp_vcpu_is_protected(hyp_vcpu))
+		kvm_init_pvm_id_regs(&hyp_vcpu->vcpu);
+	else
+		ret = vm_copy_id_regs(hyp_vcpu);
+
+	return ret;
+}
+
 static int init_pkvm_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu,
 			      struct pkvm_hyp_vm *hyp_vm,
 			      struct kvm_vcpu *host_vcpu)
@@ -490,8 +522,9 @@ static int init_pkvm_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu,
 	hyp_vcpu->vcpu.arch.cflags = READ_ONCE(host_vcpu->arch.cflags);
 	hyp_vcpu->vcpu.arch.mp_state.mp_state = KVM_MP_STATE_STOPPED;
 
-	if (pkvm_hyp_vcpu_is_protected(hyp_vcpu))
-		kvm_init_pvm_id_regs(&hyp_vcpu->vcpu);
+	ret = pkvm_vcpu_init_sysregs(hyp_vcpu);
+	if (ret)
+		goto done;
 
 	ret = pkvm_vcpu_init_traps(hyp_vcpu);
 	if (ret)
-- 
2.53.0.239.g8d8fc8a987-goog


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v1 3/3] KVM: arm64: Remove redundant kern_hyp_va() in unpin_host_sve_state()
  2026-02-12  9:02 [PATCH v1 0/3] KVM: arm64: Fix guest feature sanitization and pKVM state synchronization Fuad Tabba
  2026-02-12  9:02 ` [PATCH v1 1/3] KVM: arm64: Hide S1POE from guests when not supported by the host Fuad Tabba
  2026-02-12  9:02 ` [PATCH v1 2/3] KVM: arm64: Fix ID register initialization for non-protected pKVM guests Fuad Tabba
@ 2026-02-12  9:02 ` Fuad Tabba
  2 siblings, 0 replies; 11+ messages in thread
From: Fuad Tabba @ 2026-02-12  9:02 UTC (permalink / raw)
  To: kvm, kvmarm, linux-arm-kernel
  Cc: maz, oliver.upton, joey.gouly, suzuki.poulose, yuzenghui,
	catalin.marinas, will, tabba, stable

The `sve_state` pointer in `hyp_vcpu->vcpu.arch` is initialized as a
hypervisor virtual address during vCPU initialization in
`pkvm_vcpu_init_sve()`.

`unpin_host_sve_state()` calls `kern_hyp_va()` on this address. Since
`kern_hyp_va()` is idempotent, it's not a bug. However, it is
unnecessary and potentially confusing. Remove the redundant conversion.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/pkvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
index 267854ed29c8..8b9e027ec86a 100644
--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
+++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
@@ -393,7 +393,7 @@ static void unpin_host_sve_state(struct pkvm_hyp_vcpu *hyp_vcpu)
 	if (!vcpu_has_feature(&hyp_vcpu->vcpu, KVM_ARM_VCPU_SVE))
 		return;
 
-	sve_state = kern_hyp_va(hyp_vcpu->vcpu.arch.sve_state);
+	sve_state = hyp_vcpu->vcpu.arch.sve_state;
 	hyp_unpin_shared_mem(sve_state,
 			     sve_state + vcpu_sve_state_size(&hyp_vcpu->vcpu));
 }
-- 
2.53.0.239.g8d8fc8a987-goog


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 1/3] KVM: arm64: Hide S1POE from guests when not supported by the host
  2026-02-12  9:02 ` [PATCH v1 1/3] KVM: arm64: Hide S1POE from guests when not supported by the host Fuad Tabba
@ 2026-02-12  9:29   ` Marc Zyngier
  2026-02-12  9:41     ` Fuad Tabba
  0 siblings, 1 reply; 11+ messages in thread
From: Marc Zyngier @ 2026-02-12  9:29 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvm, kvmarm, linux-arm-kernel, oliver.upton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, will, stable

Hi Fuad,

On Thu, 12 Feb 2026 09:02:50 +0000,
Fuad Tabba <tabba@google.com> wrote:
> 
> When CONFIG_ARM64_POE is disabled, KVM does not save/restore POR_EL1.
> However, ID_AA64MMFR3_EL1 sanitisation currently exposes the feature to
> guests whenever the hardware supports it, ignoring the host kernel
> configuration.

This is the umpteenth time we get caught by this. PAN was the latest
instance until this one. Maybe an approach would be to have a default
override when a config option is not enabled, so that KVM is
consistent with the rest of the kernel?

> 
> If a guest detects this feature and attempts to use it, the host will
> fail to context-switch POR_EL1, potentially leading to state corruption.
> 
> Fix this by masking ID_AA64MMFR3_EL1.S1POE and preventing KVM from
> advertising the feature when the host does not support it, i.e.,
> system_supports_poe() is false.
> 
> Fixes: 70ed7238297f ("KVM: arm64: Sanitise ID_AA64MMFR3_EL1")
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/include/asm/kvm_host.h | 3 ++-
>  arch/arm64/kvm/sys_regs.c         | 3 +++
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index ac7f970c7883..7af72ca749a6 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -1592,7 +1592,8 @@ void kvm_set_vm_id_reg(struct kvm *kvm, u32 reg, u64 val);
>  	(kvm_has_feat((k), ID_AA64MMFR3_EL1, S1PIE, IMP))
>  
>  #define kvm_has_s1poe(k)				\
> -	(kvm_has_feat((k), ID_AA64MMFR3_EL1, S1POE, IMP))
> +	(system_supports_poe() &&			\
> +	 kvm_has_feat((k), ID_AA64MMFR3_EL1, S1POE, IMP))

Why do we need to further key this on system_supports_poe()? I can see
this is a potential optimisation, but I don't think this is part of
the minimal fix.

>  
>  #define kvm_has_ras(k)					\
>  	(kvm_has_feat((k), ID_AA64PFR0_EL1, RAS, IMP))
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 88a57ca36d96..237e8bd1cf29 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1816,6 +1816,9 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu,
>  		       ID_AA64MMFR3_EL1_SCTLRX |
>  		       ID_AA64MMFR3_EL1_S1POE |
>  		       ID_AA64MMFR3_EL1_S1PIE;
> +
> +		if (!system_supports_poe())
> +			val &= ~ID_AA64MMFR3_EL1_S1POE;

How about S1PIE? It seems to have a similar problem, in the sense that
it has extra state. But I guess because we don't put it behind a
config option, we context-switch it anyway and all is good?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 1/3] KVM: arm64: Hide S1POE from guests when not supported by the host
  2026-02-12  9:29   ` Marc Zyngier
@ 2026-02-12  9:41     ` Fuad Tabba
  2026-02-12 15:35       ` Marc Zyngier
  0 siblings, 1 reply; 11+ messages in thread
From: Fuad Tabba @ 2026-02-12  9:41 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, kvmarm, linux-arm-kernel, oliver.upton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, will, stable

Hi Marc,

On Thu, 12 Feb 2026 at 09:29, Marc Zyngier <maz@kernel.org> wrote:
>
> Hi Fuad,
>
> On Thu, 12 Feb 2026 09:02:50 +0000,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > When CONFIG_ARM64_POE is disabled, KVM does not save/restore POR_EL1.
> > However, ID_AA64MMFR3_EL1 sanitisation currently exposes the feature to
> > guests whenever the hardware supports it, ignoring the host kernel
> > configuration.
>
> This is the umpteenth time we get caught by this. PAN was the latest
> instance until this one. Maybe an approach would be to have a default
> override when a config option is not enabled, so that KVM is
> consistent with the rest of the kernel?

I spoke to Will about this, and one thing he'll look into is whether
this value in `struct arm64_ftr_reg` can be made consistent with the
cpu configuration itself (in cpufeature.c itself) . This would avoid
the problem altogether if possible. The question is whether the kernel
needs to somehow know that a certain feature exists even if it's
disabled in the config...

If he thinks it's not doable at that level, I'll look into
alternatives to make it correct by construction.

> >
> > If a guest detects this feature and attempts to use it, the host will
> > fail to context-switch POR_EL1, potentially leading to state corruption.
> >
> > Fix this by masking ID_AA64MMFR3_EL1.S1POE and preventing KVM from
> > advertising the feature when the host does not support it, i.e.,
> > system_supports_poe() is false.
> >
> > Fixes: 70ed7238297f ("KVM: arm64: Sanitise ID_AA64MMFR3_EL1")
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/include/asm/kvm_host.h | 3 ++-
> >  arch/arm64/kvm/sys_regs.c         | 3 +++
> >  2 files changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index ac7f970c7883..7af72ca749a6 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -1592,7 +1592,8 @@ void kvm_set_vm_id_reg(struct kvm *kvm, u32 reg, u64 val);
> >       (kvm_has_feat((k), ID_AA64MMFR3_EL1, S1PIE, IMP))
> >
> >  #define kvm_has_s1poe(k)                             \
> > -     (kvm_has_feat((k), ID_AA64MMFR3_EL1, S1POE, IMP))
> > +     (system_supports_poe() &&                       \
> > +      kvm_has_feat((k), ID_AA64MMFR3_EL1, S1POE, IMP))
>
> Why do we need to further key this on system_supports_poe()? I can see
> this is a potential optimisation, but I don't think this is part of
> the minimal fix.

I did this to be consistent with similar checks, e.g., kvm_has_fpmr().
I can remove it or put it in a separate patch, whichever you prefer.

>
> >
> >  #define kvm_has_ras(k)                                       \
> >       (kvm_has_feat((k), ID_AA64PFR0_EL1, RAS, IMP))
> > diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> > index 88a57ca36d96..237e8bd1cf29 100644
> > --- a/arch/arm64/kvm/sys_regs.c
> > +++ b/arch/arm64/kvm/sys_regs.c
> > @@ -1816,6 +1816,9 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu,
> >                      ID_AA64MMFR3_EL1_SCTLRX |
> >                      ID_AA64MMFR3_EL1_S1POE |
> >                      ID_AA64MMFR3_EL1_S1PIE;
> > +
> > +             if (!system_supports_poe())
> > +                     val &= ~ID_AA64MMFR3_EL1_S1POE;
>
> How about S1PIE? It seems to have a similar problem, in the sense that
> it has extra state. But I guess because we don't put it behind a
> config option, we context-switch it anyway and all is good?

Like you said, S1PIE doesn't have a config option. I thought if I were
to add one to S1PIE, then why not add one to the other features that
don't have config options... It is conceivable that a config option
might be added in the future, but like you noted above, I think we
need a deeper fix that nips this problem in the bud.

Let's see if Will thinks that we can fix this at the `arm64`
cpufeature layer, rather than somewhere in KVM.

Cheers,
/fuad

> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 1/3] KVM: arm64: Hide S1POE from guests when not supported by the host
  2026-02-12  9:41     ` Fuad Tabba
@ 2026-02-12 15:35       ` Marc Zyngier
  2026-02-12 18:53         ` Fuad Tabba
  0 siblings, 1 reply; 11+ messages in thread
From: Marc Zyngier @ 2026-02-12 15:35 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvm, kvmarm, linux-arm-kernel, oliver.upton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, will, stable

On Thu, 12 Feb 2026 09:41:22 +0000,
Fuad Tabba <tabba@google.com> wrote:
> 
> Hi Marc,
> 
> On Thu, 12 Feb 2026 at 09:29, Marc Zyngier <maz@kernel.org> wrote:
> >
> > Hi Fuad,
> >
> > On Thu, 12 Feb 2026 09:02:50 +0000,
> > Fuad Tabba <tabba@google.com> wrote:
> > >
> > > When CONFIG_ARM64_POE is disabled, KVM does not save/restore POR_EL1.
> > > However, ID_AA64MMFR3_EL1 sanitisation currently exposes the feature to
> > > guests whenever the hardware supports it, ignoring the host kernel
> > > configuration.
> >
> > This is the umpteenth time we get caught by this. PAN was the latest
> > instance until this one. Maybe an approach would be to have a default
> > override when a config option is not enabled, so that KVM is
> > consistent with the rest of the kernel?
> 
> I spoke to Will about this, and one thing he'll look into is whether
> this value in `struct arm64_ftr_reg` can be made consistent with the
> cpu configuration itself (in cpufeature.c itself) . This would avoid
> the problem altogether if possible. The question is whether the kernel
> needs to somehow know that a certain feature exists even if it's
> disabled in the config...
> 
> If he thinks it's not doable at that level, I'll look into
> alternatives to make it correct by construction.

What I currently have for that is rather ugly:

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 72f39cecce93a..3bde0ad5ea972 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -971,6 +971,7 @@ struct arm64_ftr_reg *get_arm64_ftr_reg(u32 sys_id);
 extern struct arm64_ftr_override id_aa64mmfr0_override;
 extern struct arm64_ftr_override id_aa64mmfr1_override;
 extern struct arm64_ftr_override id_aa64mmfr2_override;
+extern struct arm64_ftr_override id_aa64mmfr3_override;
 extern struct arm64_ftr_override id_aa64pfr0_override;
 extern struct arm64_ftr_override id_aa64pfr1_override;
 extern struct arm64_ftr_override id_aa64zfr0_override;
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 1a7eec542675b..32069da9651bf 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -778,6 +778,7 @@ static const struct arm64_ftr_bits ftr_raz[] = {
 struct arm64_ftr_override __read_mostly id_aa64mmfr0_override;
 struct arm64_ftr_override __read_mostly id_aa64mmfr1_override;
 struct arm64_ftr_override __read_mostly id_aa64mmfr2_override;
+struct arm64_ftr_override __read_mostly id_aa64mmfr3_override;
 struct arm64_ftr_override __read_mostly id_aa64pfr0_override;
 struct arm64_ftr_override __read_mostly id_aa64pfr1_override;
 struct arm64_ftr_override __read_mostly id_aa64zfr0_override;
@@ -850,7 +851,8 @@ static const struct __ftr_reg_entry {
 			       &id_aa64mmfr1_override),
 	ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64MMFR2_EL1, ftr_id_aa64mmfr2,
 			       &id_aa64mmfr2_override),
-	ARM64_FTR_REG(SYS_ID_AA64MMFR3_EL1, ftr_id_aa64mmfr3),
+	ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64MMFR3_EL1, ftr_id_aa64mmfr3,
+			       &id_aa64mmfr3_override),
 	ARM64_FTR_REG(SYS_ID_AA64MMFR4_EL1, ftr_id_aa64mmfr4),
 
 	/* Op1 = 0, CRn = 10, CRm = 4 */
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 85bc629270bd9..202e165a4680c 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -51,6 +51,7 @@ PI_EXPORT_SYM(id_aa64isar2_override);
 PI_EXPORT_SYM(id_aa64mmfr0_override);
 PI_EXPORT_SYM(id_aa64mmfr1_override);
 PI_EXPORT_SYM(id_aa64mmfr2_override);
+PI_EXPORT_SYM(id_aa64mmfr3_override);
 PI_EXPORT_SYM(id_aa64pfr0_override);
 PI_EXPORT_SYM(id_aa64pfr1_override);
 PI_EXPORT_SYM(id_aa64smfr0_override);
diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
index e5ea280452c3b..b8dbe02e53171 100644
--- a/arch/arm64/kernel/pi/idreg-override.c
+++ b/arch/arm64/kernel/pi/idreg-override.c
@@ -24,10 +24,12 @@
 static u64 __boot_status __initdata;
 
 typedef bool filter_t(u64 val);
+typedef void cfg_override_t(struct arm64_ftr_override *);
 
 struct ftr_set_desc {
 	char 				name[FTR_DESC_NAME_LEN];
 	PREL64(struct arm64_ftr_override, override);
+	PREL64(cfg_override_t,		cfg_override);
 	struct {
 		char			name[FTR_DESC_FIELD_LEN];
 		u8			shift;
@@ -106,6 +108,22 @@ static const struct ftr_set_desc mmfr2 __prel64_initconst = {
 	},
 };
 
+static void __init cfg_mmfr3_override(struct arm64_ftr_override *override)
+{
+#ifndef CONFIG_ARM64_POE
+	override->mask |= ID_AA64MMFR3_EL1_S1POE_MASK;
+#endif
+}
+
+static const struct ftr_set_desc mmfr3 __prel64_initconst = {
+	.name		= "id_aa64mmfr3",
+	.override	= &id_aa64mmfr3_override,
+	.cfg_override	= cfg_mmfr3_override,
+	.fields		= {
+		{}
+	},
+};
+
 static bool __init pfr0_sve_filter(u64 val)
 {
 	/*
@@ -221,6 +239,7 @@ PREL64(const struct ftr_set_desc, reg) regs[] __prel64_initconst = {
 	{ &mmfr0	},
 	{ &mmfr1	},
 	{ &mmfr2	},
+	{ &mmfr3	},
 	{ &pfr0 	},
 	{ &pfr1 	},
 	{ &isar1	},
@@ -398,14 +417,19 @@ void __init init_feature_override(u64 boot_status, const void *fdt,
 {
 	struct arm64_ftr_override *override;
 	const struct ftr_set_desc *reg;
+	cfg_override_t *cfg_override;
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(regs); i++) {
 		reg = prel64_pointer(regs[i].reg);
 		override = prel64_pointer(reg->override);
+		cfg_override = prel64_pointer(reg->cfg_override);
 
 		override->val  = 0;
 		override->mask = 0;
+
+		if (cfg_override)
+			cfg_override(override);
 	}
 
 	__boot_status = boot_status;


which works, but is not super friendly.

Looking at the arm64_ftr_reg structure, this could work if
FTR_VISIBLE_IF_IS_ENABLED() didn't simply put "HIDDEN" when the
feature is not present, but forced things to be disabled
altogether. The problem is that "HIDDEN" means not shown to userspace,
and that we have plenty of HIDDEN features that must make it into KVM.

I'll have a think.

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 1/3] KVM: arm64: Hide S1POE from guests when not supported by the host
  2026-02-12 15:35       ` Marc Zyngier
@ 2026-02-12 18:53         ` Fuad Tabba
  2026-02-13 10:40           ` Marc Zyngier
  0 siblings, 1 reply; 11+ messages in thread
From: Fuad Tabba @ 2026-02-12 18:53 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, kvmarm, linux-arm-kernel, oliver.upton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, will, stable

/fuad

On Thu, 12 Feb 2026 at 15:35, Marc Zyngier <maz@kernel.org> wrote:
>
> On Thu, 12 Feb 2026 09:41:22 +0000,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > Hi Marc,
> >
> > On Thu, 12 Feb 2026 at 09:29, Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > Hi Fuad,
> > >
> > > On Thu, 12 Feb 2026 09:02:50 +0000,
> > > Fuad Tabba <tabba@google.com> wrote:
> > > >
> > > > When CONFIG_ARM64_POE is disabled, KVM does not save/restore POR_EL1.
> > > > However, ID_AA64MMFR3_EL1 sanitisation currently exposes the feature to
> > > > guests whenever the hardware supports it, ignoring the host kernel
> > > > configuration.
> > >
> > > This is the umpteenth time we get caught by this. PAN was the latest
> > > instance until this one. Maybe an approach would be to have a default
> > > override when a config option is not enabled, so that KVM is
> > > consistent with the rest of the kernel?
> >
> > I spoke to Will about this, and one thing he'll look into is whether
> > this value in `struct arm64_ftr_reg` can be made consistent with the
> > cpu configuration itself (in cpufeature.c itself) . This would avoid
> > the problem altogether if possible. The question is whether the kernel
> > needs to somehow know that a certain feature exists even if it's
> > disabled in the config...
> >
> > If he thinks it's not doable at that level, I'll look into
> > alternatives to make it correct by construction.
>
> What I currently have for that is rather ugly:
>
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index 72f39cecce93a..3bde0ad5ea972 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -971,6 +971,7 @@ struct arm64_ftr_reg *get_arm64_ftr_reg(u32 sys_id);
>  extern struct arm64_ftr_override id_aa64mmfr0_override;
>  extern struct arm64_ftr_override id_aa64mmfr1_override;
>  extern struct arm64_ftr_override id_aa64mmfr2_override;
> +extern struct arm64_ftr_override id_aa64mmfr3_override;
>  extern struct arm64_ftr_override id_aa64pfr0_override;
>  extern struct arm64_ftr_override id_aa64pfr1_override;
>  extern struct arm64_ftr_override id_aa64zfr0_override;
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 1a7eec542675b..32069da9651bf 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -778,6 +778,7 @@ static const struct arm64_ftr_bits ftr_raz[] = {
>  struct arm64_ftr_override __read_mostly id_aa64mmfr0_override;
>  struct arm64_ftr_override __read_mostly id_aa64mmfr1_override;
>  struct arm64_ftr_override __read_mostly id_aa64mmfr2_override;
> +struct arm64_ftr_override __read_mostly id_aa64mmfr3_override;
>  struct arm64_ftr_override __read_mostly id_aa64pfr0_override;
>  struct arm64_ftr_override __read_mostly id_aa64pfr1_override;
>  struct arm64_ftr_override __read_mostly id_aa64zfr0_override;
> @@ -850,7 +851,8 @@ static const struct __ftr_reg_entry {
>                                &id_aa64mmfr1_override),
>         ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64MMFR2_EL1, ftr_id_aa64mmfr2,
>                                &id_aa64mmfr2_override),
> -       ARM64_FTR_REG(SYS_ID_AA64MMFR3_EL1, ftr_id_aa64mmfr3),
> +       ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64MMFR3_EL1, ftr_id_aa64mmfr3,
> +                              &id_aa64mmfr3_override),
>         ARM64_FTR_REG(SYS_ID_AA64MMFR4_EL1, ftr_id_aa64mmfr4),
>
>         /* Op1 = 0, CRn = 10, CRm = 4 */
> diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
> index 85bc629270bd9..202e165a4680c 100644
> --- a/arch/arm64/kernel/image-vars.h
> +++ b/arch/arm64/kernel/image-vars.h
> @@ -51,6 +51,7 @@ PI_EXPORT_SYM(id_aa64isar2_override);
>  PI_EXPORT_SYM(id_aa64mmfr0_override);
>  PI_EXPORT_SYM(id_aa64mmfr1_override);
>  PI_EXPORT_SYM(id_aa64mmfr2_override);
> +PI_EXPORT_SYM(id_aa64mmfr3_override);
>  PI_EXPORT_SYM(id_aa64pfr0_override);
>  PI_EXPORT_SYM(id_aa64pfr1_override);
>  PI_EXPORT_SYM(id_aa64smfr0_override);
> diff --git a/arch/arm64/kernel/pi/idreg-override.c b/arch/arm64/kernel/pi/idreg-override.c
> index e5ea280452c3b..b8dbe02e53171 100644
> --- a/arch/arm64/kernel/pi/idreg-override.c
> +++ b/arch/arm64/kernel/pi/idreg-override.c
> @@ -24,10 +24,12 @@
>  static u64 __boot_status __initdata;
>
>  typedef bool filter_t(u64 val);
> +typedef void cfg_override_t(struct arm64_ftr_override *);
>
>  struct ftr_set_desc {
>         char                            name[FTR_DESC_NAME_LEN];
>         PREL64(struct arm64_ftr_override, override);
> +       PREL64(cfg_override_t,          cfg_override);
>         struct {
>                 char                    name[FTR_DESC_FIELD_LEN];
>                 u8                      shift;
> @@ -106,6 +108,22 @@ static const struct ftr_set_desc mmfr2 __prel64_initconst = {
>         },
>  };
>
> +static void __init cfg_mmfr3_override(struct arm64_ftr_override *override)
> +{
> +#ifndef CONFIG_ARM64_POE
> +       override->mask |= ID_AA64MMFR3_EL1_S1POE_MASK;
> +#endif
> +}
> +
> +static const struct ftr_set_desc mmfr3 __prel64_initconst = {
> +       .name           = "id_aa64mmfr3",
> +       .override       = &id_aa64mmfr3_override,
> +       .cfg_override   = cfg_mmfr3_override,
> +       .fields         = {
> +               {}
> +       },
> +};
> +
>  static bool __init pfr0_sve_filter(u64 val)
>  {
>         /*
> @@ -221,6 +239,7 @@ PREL64(const struct ftr_set_desc, reg) regs[] __prel64_initconst = {
>         { &mmfr0        },
>         { &mmfr1        },
>         { &mmfr2        },
> +       { &mmfr3        },
>         { &pfr0         },
>         { &pfr1         },
>         { &isar1        },
> @@ -398,14 +417,19 @@ void __init init_feature_override(u64 boot_status, const void *fdt,
>  {
>         struct arm64_ftr_override *override;
>         const struct ftr_set_desc *reg;
> +       cfg_override_t *cfg_override;
>         int i;
>
>         for (i = 0; i < ARRAY_SIZE(regs); i++) {
>                 reg = prel64_pointer(regs[i].reg);
>                 override = prel64_pointer(reg->override);
> +               cfg_override = prel64_pointer(reg->cfg_override);
>
>                 override->val  = 0;
>                 override->mask = 0;
> +
> +               if (cfg_override)
> +                       cfg_override(override);
>         }
>
>         __boot_status = boot_status;
>
>
> which works, but is not super friendly.

Ouch! Yeah, no easy solutions here. For now, as a fix to be able to
backport, would you like me to respin this without the change to
kvm_has_s1poe(), or with that change as a separate patch?

Cheers,
/fuad
> Looking at the arm64_ftr_reg structure, this could work if
> FTR_VISIBLE_IF_IS_ENABLED() didn't simply put "HIDDEN" when the
> feature is not present, but forced things to be disabled
> altogether. The problem is that "HIDDEN" means not shown to userspace,
> and that we have plenty of HIDDEN features that must make it into KVM.
>
> I'll have a think.
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 1/3] KVM: arm64: Hide S1POE from guests when not supported by the host
  2026-02-12 18:53         ` Fuad Tabba
@ 2026-02-13 10:40           ` Marc Zyngier
  0 siblings, 0 replies; 11+ messages in thread
From: Marc Zyngier @ 2026-02-13 10:40 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvm, kvmarm, linux-arm-kernel, oliver.upton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, will, stable

On Thu, 12 Feb 2026 18:53:05 +0000,
Fuad Tabba <tabba@google.com> wrote:

> Ouch! Yeah, no easy solutions here. For now, as a fix to be able to
> backport, would you like me to respin this without the change to
> kvm_has_s1poe(), or with that change as a separate patch?

I'll remove the hunk myself when applying the series, no need to
resend (unless I find another issue in the rest of the series, but it
looks good so far).

FWIW, I have an alternative set of hacks at [1], which appear to work,
but we need to collectively convince ourselves that this is the
correct thing to do (i.e. the required traps are always set,
irrespective of the configuration).

This affects PAuth, SVE, MTE and S1POE. SME and GCS are not supported
in KVM, and BTI has no additional state, so these shouldn't be
affected.

Thanks,

	M.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/log/?h=arm64/ftr_config

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 2/3] KVM: arm64: Fix ID register initialization for non-protected pKVM guests
  2026-02-12  9:02 ` [PATCH v1 2/3] KVM: arm64: Fix ID register initialization for non-protected pKVM guests Fuad Tabba
@ 2026-02-13 11:03   ` Marc Zyngier
  2026-02-13 11:07     ` Fuad Tabba
  0 siblings, 1 reply; 11+ messages in thread
From: Marc Zyngier @ 2026-02-13 11:03 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvm, kvmarm, linux-arm-kernel, oliver.upton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, will, stable

On Thu, 12 Feb 2026 09:02:51 +0000,
Fuad Tabba <tabba@google.com> wrote:
> 
> In protected mode, the hypervisor maintains a separate instance of
> the `kvm` structure for each VM. For non-protected VMs, this structure is
> initialized from the host's `kvm` state.
> 
> Currently, `pkvm_init_features_from_host()` copies the
> `KVM_ARCH_FLAG_ID_REGS_INITIALIZED` flag from the host without the
> underlying `id_regs` data being initialized. This results in the
> hypervisor seeing the flag as set while the ID registers remain zeroed.
> 
> Consequently, `kvm_has_feat()` checks at EL2 fail (return 0) for
> non-protected VMs. This breaks logic that relies on feature detection,
> such as `ctxt_has_tcrx()` for TCR2_EL1 support. As a result, certain
> system registers (e.g., TCR2_EL1, PIR_EL1, POR_EL1) are not
> saved/restored during the world switch, which could lead to state
> corruption.
> 
> Fix this by explicitly copying the ID registers from the host `kvm` to
> the hypervisor `kvm` for non-protected VMs during vCPU initialization,
> since we trust the host with its non-protected guests' features. Also
> ensure `KVM_ARCH_FLAG_ID_REGS_INITIALIZED` is cleared initially in
> `pkvm_init_features_from_host` so that `vm_copy_id_regs` can properly
> initialize them and set the flag once done.
> 
> Fixes: 41d6028e28bd ("KVM: arm64: Convert the SVE guest vcpu flag to a vm flag")
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/kvm/hyp/nvhe/pkvm.c | 37 ++++++++++++++++++++++++++++++++--
>  1 file changed, 35 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
> index 12b2acfbcfd1..267854ed29c8 100644
> --- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
> +++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
> @@ -344,6 +344,8 @@ static void pkvm_init_features_from_host(struct pkvm_hyp_vm *hyp_vm, const struc
>  
>  	/* No restrictions for non-protected VMs. */
>  	if (!kvm_vm_is_protected(kvm)) {
> +		clear_bit(KVM_ARCH_FLAG_ID_REGS_INITIALIZED, &host_arch_flags);
> +
>  		hyp_vm->kvm.arch.flags = host_arch_flags;

Can't you just have

  		hyp_vm->kvm.arch.flags &= ~BIT_ULL(KVM_ARCH_FLAG_ID_REGS_INITIALIZED);

since there are no atomicity requirements here?

>  
>  		bitmap_copy(kvm->arch.vcpu_features,
> @@ -471,6 +473,36 @@ static int pkvm_vcpu_init_sve(struct pkvm_hyp_vcpu *hyp_vcpu, struct kvm_vcpu *h
>  	return ret;
>  }
>  
> +static int vm_copy_id_regs(struct pkvm_hyp_vcpu *hyp_vcpu)
> +{
> +	struct pkvm_hyp_vm *hyp_vm = pkvm_hyp_vcpu_to_hyp_vm(hyp_vcpu);
> +	const struct kvm *host_kvm = hyp_vm->host_kvm;
> +	struct kvm *kvm = &hyp_vm->kvm;
> +
> +	if (!test_bit(KVM_ARCH_FLAG_ID_REGS_INITIALIZED, &host_kvm->arch.flags))
> +		return -EINVAL;
> +
> +	if (test_bit(KVM_ARCH_FLAG_ID_REGS_INITIALIZED, &kvm->arch.flags))
> +		return 0;
> +
> +	memcpy(kvm->arch.id_regs, host_kvm->arch.id_regs, sizeof(kvm->arch.id_regs));
> +	set_bit(KVM_ARCH_FLAG_ID_REGS_INITIALIZED, &kvm->arch.flags);

This looks a bit odd. Can you have another vcpu doing this in
parallel? You seem to be holding vm_table_lock at this stage, so
that's probably OK,  but I'd have expected something like:

	if (test_and_set_bit(KVM_ARCH_FLAG_ID_REGS_INITIALIZED, &kvm->arch.flags))
		return 0;

	memcpy(kvm->arch.id_regs, host_kvm->arch.id_regs, sizeof(kvm->arch.id_regs));

which makes the intent slightly clearer.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v1 2/3] KVM: arm64: Fix ID register initialization for non-protected pKVM guests
  2026-02-13 11:03   ` Marc Zyngier
@ 2026-02-13 11:07     ` Fuad Tabba
  0 siblings, 0 replies; 11+ messages in thread
From: Fuad Tabba @ 2026-02-13 11:07 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, kvmarm, linux-arm-kernel, oliver.upton, joey.gouly,
	suzuki.poulose, yuzenghui, catalin.marinas, will, stable

On Fri, 13 Feb 2026 at 11:03, Marc Zyngier <maz@kernel.org> wrote:
>
> On Thu, 12 Feb 2026 09:02:51 +0000,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > In protected mode, the hypervisor maintains a separate instance of
> > the `kvm` structure for each VM. For non-protected VMs, this structure is
> > initialized from the host's `kvm` state.
> >
> > Currently, `pkvm_init_features_from_host()` copies the
> > `KVM_ARCH_FLAG_ID_REGS_INITIALIZED` flag from the host without the
> > underlying `id_regs` data being initialized. This results in the
> > hypervisor seeing the flag as set while the ID registers remain zeroed.
> >
> > Consequently, `kvm_has_feat()` checks at EL2 fail (return 0) for
> > non-protected VMs. This breaks logic that relies on feature detection,
> > such as `ctxt_has_tcrx()` for TCR2_EL1 support. As a result, certain
> > system registers (e.g., TCR2_EL1, PIR_EL1, POR_EL1) are not
> > saved/restored during the world switch, which could lead to state
> > corruption.
> >
> > Fix this by explicitly copying the ID registers from the host `kvm` to
> > the hypervisor `kvm` for non-protected VMs during vCPU initialization,
> > since we trust the host with its non-protected guests' features. Also
> > ensure `KVM_ARCH_FLAG_ID_REGS_INITIALIZED` is cleared initially in
> > `pkvm_init_features_from_host` so that `vm_copy_id_regs` can properly
> > initialize them and set the flag once done.
> >
> > Fixes: 41d6028e28bd ("KVM: arm64: Convert the SVE guest vcpu flag to a vm flag")
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/kvm/hyp/nvhe/pkvm.c | 37 ++++++++++++++++++++++++++++++++--
> >  1 file changed, 35 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
> > index 12b2acfbcfd1..267854ed29c8 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
> > @@ -344,6 +344,8 @@ static void pkvm_init_features_from_host(struct pkvm_hyp_vm *hyp_vm, const struc
> >
> >       /* No restrictions for non-protected VMs. */
> >       if (!kvm_vm_is_protected(kvm)) {
> > +             clear_bit(KVM_ARCH_FLAG_ID_REGS_INITIALIZED, &host_arch_flags);
> > +
> >               hyp_vm->kvm.arch.flags = host_arch_flags;
>
> Can't you just have
>
>                 hyp_vm->kvm.arch.flags &= ~BIT_ULL(KVM_ARCH_FLAG_ID_REGS_INITIALIZED);
>
> since there are no atomicity requirements here?

I'll fix this.

> >
> >               bitmap_copy(kvm->arch.vcpu_features,
> > @@ -471,6 +473,36 @@ static int pkvm_vcpu_init_sve(struct pkvm_hyp_vcpu *hyp_vcpu, struct kvm_vcpu *h
> >       return ret;
> >  }
> >
> > +static int vm_copy_id_regs(struct pkvm_hyp_vcpu *hyp_vcpu)
> > +{
> > +     struct pkvm_hyp_vm *hyp_vm = pkvm_hyp_vcpu_to_hyp_vm(hyp_vcpu);
> > +     const struct kvm *host_kvm = hyp_vm->host_kvm;
> > +     struct kvm *kvm = &hyp_vm->kvm;
> > +
> > +     if (!test_bit(KVM_ARCH_FLAG_ID_REGS_INITIALIZED, &host_kvm->arch.flags))
> > +             return -EINVAL;
> > +
> > +     if (test_bit(KVM_ARCH_FLAG_ID_REGS_INITIALIZED, &kvm->arch.flags))
> > +             return 0;
> > +
> > +     memcpy(kvm->arch.id_regs, host_kvm->arch.id_regs, sizeof(kvm->arch.id_regs));
> > +     set_bit(KVM_ARCH_FLAG_ID_REGS_INITIALIZED, &kvm->arch.flags);
>
> This looks a bit odd. Can you have another vcpu doing this in
> parallel? You seem to be holding vm_table_lock at this stage, so
> that's probably OK,  but I'd have expected something like:

You're right, it cannot happen in parallel, but another vCPU could
have beaten this one to it.

>
>         if (test_and_set_bit(KVM_ARCH_FLAG_ID_REGS_INITIALIZED, &kvm->arch.flags))
>                 return 0;
>
>         memcpy(kvm->arch.id_regs, host_kvm->arch.id_regs, sizeof(kvm->arch.id_regs));
>
> which makes the intent slightly clearer.

I agree. I'll fix this too.

Thanks,
/fuad

>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2026-02-13 11:07 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-12  9:02 [PATCH v1 0/3] KVM: arm64: Fix guest feature sanitization and pKVM state synchronization Fuad Tabba
2026-02-12  9:02 ` [PATCH v1 1/3] KVM: arm64: Hide S1POE from guests when not supported by the host Fuad Tabba
2026-02-12  9:29   ` Marc Zyngier
2026-02-12  9:41     ` Fuad Tabba
2026-02-12 15:35       ` Marc Zyngier
2026-02-12 18:53         ` Fuad Tabba
2026-02-13 10:40           ` Marc Zyngier
2026-02-12  9:02 ` [PATCH v1 2/3] KVM: arm64: Fix ID register initialization for non-protected pKVM guests Fuad Tabba
2026-02-13 11:03   ` Marc Zyngier
2026-02-13 11:07     ` Fuad Tabba
2026-02-12  9:02 ` [PATCH v1 3/3] KVM: arm64: Remove redundant kern_hyp_va() in unpin_host_sve_state() Fuad Tabba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox