* [PATCH] KVM: arm64: Hold kvm->mmu_lock while initialising vcpu->arch.vncr_tlb
@ 2026-06-08 8:11 Marc Zyngier
2026-06-08 8:26 ` sashiko-bot
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Marc Zyngier @ 2026-06-08 8:11 UTC (permalink / raw)
To: kvmarm, kvm, linux-arm-kernel
Cc: Steffen Eiden, Joey Gouly, Suzuki K Poulose, Oliver Upton,
Zenghui Yu
Sashiko reports that there is a race between initialising vncr_tlb
and making use of it, as we don't hold the mmu_lock at this point.

Additionally, it identifies a memory leak, should userspace repeatedly
invoke the KVM_RUN ioctl after a failure of kvm_arch_vcpu_run_pid_change(),
as we assign vncr_tlb blindly on first run, irrespective of prior
allocations.

Slap the two bugs in one go by taking the kvm->mmu_lock on assigning
vncr_tlb, preventing the race for good, and by checking that vncr_tlb
is indeed NULL prior to allocation.
Reported-by: Sashiko <sashiko-bot@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20260607180815.85FBC1F00893@smtp.kernel.org
---
arch/arm64/kvm/nested.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index 690b8e8564166..d11e36b3cfcc2 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -1253,8 +1253,14 @@ int kvm_vcpu_allocate_vncr_tlb(struct kvm_vcpu *vcpu)
 	if (!kvm_has_feat(vcpu->kvm, ID_AA64MMFR4_EL1, NV_frac, NV2_ONLY))
 		return 0;
 
-	vcpu->arch.vncr_tlb = kzalloc_obj(*vcpu->arch.vncr_tlb,
-					  GFP_KERNEL_ACCOUNT);
+	if (!vcpu->arch.vncr_tlb) {
+		struct vncr_tlb *vt = kzalloc_obj(*vcpu->arch.vncr_tlb,
+						  GFP_KERNEL_ACCOUNT);
+
+		scoped_guard(write_lock, &vcpu->kvm->mmu_lock)
+			vcpu->arch.vncr_tlb = vt;
+	}
+
 	if (!vcpu->arch.vncr_tlb)
 		return -ENOMEM;
 
--
2.47.3
* Re: [PATCH] KVM: arm64: Hold kvm->mmu_lock while initialising vcpu->arch.vncr_tlb
2026-06-08 8:11 [PATCH] KVM: arm64: Hold kvm->mmu_lock while initialising vcpu->arch.vncr_tlb Marc Zyngier
@ 2026-06-08 8:26 ` sashiko-bot
2026-06-08 9:41 ` Marc Zyngier
2026-06-08 16:34 ` Oliver Upton
2026-06-08 20:55 ` Yosry Ahmed
2 siblings, 1 reply; 5+ messages in thread
From: sashiko-bot @ 2026-06-08 8:26 UTC (permalink / raw)
To: Marc Zyngier; +Cc: kvm
Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
Pre-existing issues:
- [Critical] Missing NULL check for `vcpu->arch.vncr_tlb` in `invalidate_vncr_va()` causes a host kernel panic when iterating over uninitialized vCPUs.
- [High] The lazily allocated `vncr_tlb` structure leaves the `cpu` field zero-initialized, causing it to inadvertently unmap CPU 0's VNCR fixmap on the first fault.
- [High] Resource leak of `struct page` references on error paths in `kvm_translate_vncr()`.
--
commit cf6d236759b9eb5ba69620ec32a536af7c15a9a2
Author: Marc Zyngier <maz@kernel.org>
KVM: arm64: Hold kvm->mmu_lock while initialising vcpu->arch.vncr_tlb
This commit fixes a race condition and memory leak in the initialization
of vncr_tlb by taking the mmu_lock during assignment and verifying it is
NULL beforehand.
Link: https://lore.kernel.org/r/20260607180815.85FBC1F00893@smtp.kernel.org
> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> index 883b6c1008fbb..21c9d1f07d7a8 100644
> --- a/arch/arm64/kvm/nested.c
> +++ b/arch/arm64/kvm/nested.c
> @@ -1248,8 +1248,14 @@ int kvm_vcpu_allocate_vncr_tlb(struct kvm_vcpu *vcpu)
> if (!kvm_has_feat(vcpu->kvm, ID_AA64MMFR4_EL1, NV_frac, NV2_ONLY))
> return 0;
>
> - vcpu->arch.vncr_tlb = kzalloc_obj(*vcpu->arch.vncr_tlb,
> - GFP_KERNEL_ACCOUNT);
> + if (!vcpu->arch.vncr_tlb) {
> + struct vncr_tlb *vt = kzalloc_obj(*vcpu->arch.vncr_tlb,
> + GFP_KERNEL_ACCOUNT);
[Severity: High]
This is a pre-existing issue, but since we are allocating the vncr_tlb here
using kzalloc_obj(), vt->cpu will be zero-initialized.
Looking at invalidate_vncr(), if a vCPU encounters its first VNCR fault,
kvm_translate_vncr() calls invalidate_vncr(vt). Because vt->cpu is 0
instead of -1, the following check passes:
arch/arm64/kvm/nested.c:invalidate_vncr() {
    ...
    if (vt->cpu != -1)
        clear_fixmap(vncr_fixmap(vt->cpu));
    ...
}
Does this inadvertently unmap CPU 0's fixmap, potentially corrupting the
translation state if another vCPU is actively mapped on CPU 0? Could we
initialize vt->cpu to -1 right after this allocation?
> +
> + scoped_guard(write_lock, &vcpu->kvm->mmu_lock)
> + vcpu->arch.vncr_tlb = vt;
> + }
> +
> if (!vcpu->arch.vncr_tlb)
> return -ENOMEM;
[Severity: Critical]
This isn't a bug introduced by this patch, but while reviewing the vncr_tlb
lifecycle, I noticed a potential issue in invalidate_vncr_va() when
iterating over uninitialized vCPUs.
If an online vCPU triggers an EL2 TLBI instruction before another vCPU in
the same VM has run, could this lead to a NULL pointer dereference?
arch/arm64/kvm/nested.c:invalidate_vncr_va() {
    ...
    kvm_for_each_vcpu(i, vcpu, kvm) {
        struct vncr_tlb *vt = vcpu->arch.vncr_tlb;

        if (!vt->valid)
            continue;
    ...
}
Since vncr_tlb is allocated lazily on the first run, vt could be NULL here.
Does this path need a NULL check before accessing vt->valid?
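
As an illustration of the question only, the guard being asked about could look like the standalone C sketch below. It is not the actual fix and not kernel code: the structures, the array walk that replaces kvm_for_each_vcpu(), and all _sketch names are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the per-vCPU state. */
struct vncr_tlb_sketch {
	bool valid;
};

struct vcpu_sketch {
	struct vncr_tlb_sketch *vncr_tlb;	/* NULL until the vCPU first runs */
};

/*
 * Walk all vCPUs and invalidate every valid entry, skipping vCPUs whose
 * TLB has not been allocated yet. Returns the number of invalidations.
 */
static int invalidate_all_sketch(struct vcpu_sketch *vcpus, size_t nr)
{
	int invalidated = 0;

	for (size_t i = 0; i < nr; i++) {
		struct vncr_tlb_sketch *vt = vcpus[i].vncr_tlb;

		/* The NULL test must come before any dereference of vt. */
		if (!vt || !vt->valid)
			continue;

		vt->valid = false;
		invalidated++;
	}

	return invalidated;
}
```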
[Severity: High]
This is also a pre-existing issue, but I noticed a potential page reference
leak in kvm_translate_vncr() on the error paths.
When __kvm_faultin_pfn() successfully pins a page, the early return for a
write fault without a writable page bypasses kvm_release_faultin_page():
arch/arm64/kvm/nested.c:kvm_translate_vncr() {
    ...
    if (is_error_noslot_pfn(pfn) || (write_fault && !writable))
        return -EFAULT;
    ...
}
Similarly, the MMU retry check also returns directly:
arch/arm64/kvm/nested.c:kvm_translate_vncr() {
    ...
    if (mmu_invalidate_retry(vcpu->kvm, mmu_seq))
        return -EAGAIN;
    ...
}
Do these error paths leave the page pinned and permanently leak the struct
page reference count?
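
The usual shape for avoiding this kind of leak is a single release label that every error path after the pin jumps to. The sketch below models that in plain C under loud assumptions: a counter replaces the struct page reference, pin_sketch() and release_sketch() are hypothetical stand-ins for __kvm_faultin_pfn() and kvm_release_faultin_page(), and the success path simply keeps the pin so the paths can be told apart. What the real function does on success is not shown in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

static int pinned_pages;	/* hypothetical stand-in for a page refcount */

static void pin_sketch(void)
{
	pinned_pages++;
}

static void release_sketch(void)
{
	pinned_pages--;
}

/* Returns 0 on success or a negative errno; errors never leave a pin behind. */
static int translate_sketch(bool write_fault, bool writable, bool retry)
{
	int ret;

	pin_sketch();

	if (write_fault && !writable) {
		ret = -EFAULT;
		goto out_release;
	}

	if (retry) {
		ret = -EAGAIN;
		goto out_release;
	}

	/* Success: in this sketch the caller now owns the pin. */
	return 0;

out_release:
	release_sketch();
	return ret;
}
```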
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260608081108.2244133-1-maz@kernel.org?part=1
* Re: [PATCH] KVM: arm64: Hold kvm->mmu_lock while initialising vcpu->arch.vncr_tlb
2026-06-08 8:26 ` sashiko-bot
@ 2026-06-08 9:41 ` Marc Zyngier
0 siblings, 0 replies; 5+ messages in thread
From: Marc Zyngier @ 2026-06-08 9:41 UTC (permalink / raw)
To: sashiko-reviews; +Cc: kvm
On Mon, 08 Jun 2026 09:26:02 +0100,
sashiko-bot@kernel.org wrote:
>
> Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
>
> Pre-existing issues:
> - [Critical] Missing NULL check for `vcpu->arch.vncr_tlb` in `invalidate_vncr_va()` causes a host kernel panic when iterating over uninitialized vCPUs.
> - [High] The lazily allocated `vncr_tlb` structure leaves the `cpu` field zero-initialized, causing it to inadvertently unmap CPU 0's VNCR fixmap on the first fault.
> - [High] Resource leak of `struct page` references on error paths in `kvm_translate_vncr()`.
> --
>
> commit cf6d236759b9eb5ba69620ec32a536af7c15a9a2
> Author: Marc Zyngier <maz@kernel.org>
>
> KVM: arm64: Hold kvm->mmu_lock while initialising vcpu->arch.vncr_tlb
>
> This commit fixes a race condition and memory leak in the initialization
> of vncr_tlb by taking the mmu_lock during assignment and verifying it is
> NULL beforehand.
>
> Link: https://lore.kernel.org/r/20260607180815.85FBC1F00893@smtp.kernel.org
>
> > diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> > index 883b6c1008fbb..21c9d1f07d7a8 100644
> > --- a/arch/arm64/kvm/nested.c
> > +++ b/arch/arm64/kvm/nested.c
> > @@ -1248,8 +1248,14 @@ int kvm_vcpu_allocate_vncr_tlb(struct kvm_vcpu *vcpu)
> > if (!kvm_has_feat(vcpu->kvm, ID_AA64MMFR4_EL1, NV_frac, NV2_ONLY))
> > return 0;
> >
> > - vcpu->arch.vncr_tlb = kzalloc_obj(*vcpu->arch.vncr_tlb,
> > - GFP_KERNEL_ACCOUNT);
> > + if (!vcpu->arch.vncr_tlb) {
> > + struct vncr_tlb *vt = kzalloc_obj(*vcpu->arch.vncr_tlb,
> > + GFP_KERNEL_ACCOUNT);
>
> [Severity: High]
> This is a pre-existing issue, but since we are allocating the vncr_tlb here
> using kzalloc_obj(), vt->cpu will be zero-initialized.
>
> Looking at invalidate_vncr(), if a vCPU encounters its first VNCR fault,
> kvm_translate_vncr() calls invalidate_vncr(vt). Because vt->cpu is 0
> instead of -1, the following check passes:
>
> arch/arm64/kvm/nested.c:invalidate_vncr() {
> ...
> if (vt->cpu != -1)
> clear_fixmap(vncr_fixmap(vt->cpu));
> ...
> }
>
> Does this inadvertently unmap CPU 0's fixmap, potentially corrupting the
> translation state if another vCPU is actively mapped on CPU 0? Could we
> initialize vt->cpu to -1 right after this allocation?
This is only evaluated when valid is true, and this is checked in the
calling context. Since valid is initialised to false by virtue of the
structure being 0-initialised, this path doesn't look possible.
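
The argument can be modelled in a few lines of standalone C. This is a sketch of the reasoning above, not the code in nested.c: the field names follow the thread, while handle_fault_sketch(), the fixmap_clears counter, and the exact order of operations are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of the entry; only the two fields discussed here. */
struct vncr_tlb_sketch {
	bool valid;
	int cpu;
};

static int fixmap_clears;	/* counts what would be clear_fixmap() calls */

static void invalidate_sketch(struct vncr_tlb_sketch *vt)
{
	if (vt->cpu != -1)
		fixmap_clears++;
	vt->valid = false;
	vt->cpu = -1;
}

/* Assumed caller: the cpu test is only reached when the entry is valid. */
static void handle_fault_sketch(struct vncr_tlb_sketch *vt, int cpu)
{
	if (vt->valid)
		invalidate_sketch(vt);

	vt->cpu = cpu;
	vt->valid = true;
}
```

With a zero-initialised entry, the first call to handle_fault_sketch() never reaches the cpu comparison, so the zero value in cpu is not interpreted as CPU 0.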
>
> > +
> > + scoped_guard(write_lock, &vcpu->kvm->mmu_lock)
> > + vcpu->arch.vncr_tlb = vt;
> > + }
> > +
> > if (!vcpu->arch.vncr_tlb)
> > return -ENOMEM;
>
> [Severity: Critical]
> This isn't a bug introduced by this patch, but while reviewing the vncr_tlb
> lifecycle, I noticed a potential issue in invalidate_vncr_va() when
> iterating over uninitialized vCPUs.
>
> If an online vCPU triggers an EL2 TLBI instruction before another vCPU in
> the same VM has run, could this lead to a NULL pointer dereference?
>
> arch/arm64/kvm/nested.c:invalidate_vncr_va() {
> ...
> kvm_for_each_vcpu(i, vcpu, kvm) {
> struct vncr_tlb *vt = vcpu->arch.vncr_tlb;
>
> if (!vt->valid)
> continue;
> ...
> }
>
> Since vncr_tlb is allocated lazily on the first run, vt could be NULL here.
> Does this path need a NULL check before accessing vt->valid?
>
This is already fixed by a separate patch
(https://lore.kernel.org/all/20260607175745.297793-1-maz@kernel.org/)
> [Severity: High]
> This is also a pre-existing issue, but I noticed a potential page reference
> leak in kvm_translate_vncr() on the error paths.
>
> When __kvm_faultin_pfn() successfully pins a page, the early return for a
> write fault without a writable page bypasses kvm_release_faultin_page():
>
> arch/arm64/kvm/nested.c:kvm_translate_vncr() {
> ...
> if (is_error_noslot_pfn(pfn) || (write_fault && !writable))
> return -EFAULT;
> ...
> }
This looks like a real issue indeed.
> Similarly, the MMU retry check also returns directly:
>
> arch/arm64/kvm/nested.c:kvm_translate_vncr() {
> ...
> if (mmu_invalidate_retry(vcpu->kvm, mmu_seq))
> return -EAGAIN;
> ...
> }
This one has been fixed already:
https://patch.msgid.link/20260602235450.103057-2-oupton@kernel.org
M.
--
Without deviation from the norm, progress is not possible.
* Re: [PATCH] KVM: arm64: Hold kvm->mmu_lock while initialising vcpu->arch.vncr_tlb
2026-06-08 8:11 [PATCH] KVM: arm64: Hold kvm->mmu_lock while initialising vcpu->arch.vncr_tlb Marc Zyngier
2026-06-08 8:26 ` sashiko-bot
@ 2026-06-08 16:34 ` Oliver Upton
2026-06-08 20:55 ` Yosry Ahmed
2 siblings, 0 replies; 5+ messages in thread
From: Oliver Upton @ 2026-06-08 16:34 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, kvm, linux-arm-kernel, Steffen Eiden, Joey Gouly,
Suzuki K Poulose, Zenghui Yu
On Mon, Jun 08, 2026 at 09:11:08AM +0100, Marc Zyngier wrote:
> Sashiko reports that there is a race between initialising vncr_tlb
> and making use of it, as we don't hold the mmu_lock at this point.
>
> Additionally, it identifies a memory leak, should userspace repeatedly
> invokes the KVM_RUN ioctl after a failure of kvm_arch_vcpu_run_pid_change(),
> as we assign vncr_tlb blindly on first run, irrespective of prior
> allocations.
>
> Slap the two bugs in one go by taking the kvm->mmu_lock on assigning
> vncr_tlb, preventing the race for good, and by checking that vncr_tlb
> is indeed NULL prior to allocation.
>
> Reported-by: Sashiko <sashiko-bot@kernel.org>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Link: https://lore.kernel.org/r/20260607180815.85FBC1F00893@smtp.kernel.org
Reviewed-by: Oliver Upton <oupton@kernel.org>
Thanks,
Oliver
> ---
> arch/arm64/kvm/nested.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> index 690b8e8564166..d11e36b3cfcc2 100644
> --- a/arch/arm64/kvm/nested.c
> +++ b/arch/arm64/kvm/nested.c
> @@ -1253,8 +1253,14 @@ int kvm_vcpu_allocate_vncr_tlb(struct kvm_vcpu *vcpu)
> if (!kvm_has_feat(vcpu->kvm, ID_AA64MMFR4_EL1, NV_frac, NV2_ONLY))
> return 0;
>
> - vcpu->arch.vncr_tlb = kzalloc_obj(*vcpu->arch.vncr_tlb,
> - GFP_KERNEL_ACCOUNT);
> + if (!vcpu->arch.vncr_tlb) {
> + struct vncr_tlb *vt = kzalloc_obj(*vcpu->arch.vncr_tlb,
> + GFP_KERNEL_ACCOUNT);
> +
> + scoped_guard(write_lock, &vcpu->kvm->mmu_lock)
> + vcpu->arch.vncr_tlb = vt;
> + }
> +
> if (!vcpu->arch.vncr_tlb)
> return -ENOMEM;
>
> --
> 2.47.3
>
* Re: [PATCH] KVM: arm64: Hold kvm->mmu_lock while initialising vcpu->arch.vncr_tlb
2026-06-08 8:11 [PATCH] KVM: arm64: Hold kvm->mmu_lock while initialising vcpu->arch.vncr_tlb Marc Zyngier
2026-06-08 8:26 ` sashiko-bot
2026-06-08 16:34 ` Oliver Upton
@ 2026-06-08 20:55 ` Yosry Ahmed
2 siblings, 0 replies; 5+ messages in thread
From: Yosry Ahmed @ 2026-06-08 20:55 UTC (permalink / raw)
To: Marc Zyngier
Cc: kvmarm, kvm, linux-arm-kernel, Steffen Eiden, Joey Gouly,
Suzuki K Poulose, Oliver Upton, Zenghui Yu
On Mon, Jun 08, 2026 at 09:11:08AM +0100, Marc Zyngier wrote:
> Sashiko reports that there is a race between initialising vncr_tlb
> and making use of it, as we don't hold the mmu_lock at this point.
>
> Additionally, it identifies a memory leak, should userspace repeatedly
> invokes the KVM_RUN ioctl after a failure of kvm_arch_vcpu_run_pid_change(),
> as we assign vncr_tlb blindly on first run, irrespective of prior
> allocations.
>
> Slap the two bugs in one go by taking the kvm->mmu_lock on assigning
> vncr_tlb, preventing the race for good, and by checking that vncr_tlb
> is indeed NULL prior to allocation.
>
> Reported-by: Sashiko <sashiko-bot@kernel.org>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Link: https://lore.kernel.org/r/20260607180815.85FBC1F00893@smtp.kernel.org
> ---
> arch/arm64/kvm/nested.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> index 690b8e8564166..d11e36b3cfcc2 100644
> --- a/arch/arm64/kvm/nested.c
> +++ b/arch/arm64/kvm/nested.c
> @@ -1253,8 +1253,14 @@ int kvm_vcpu_allocate_vncr_tlb(struct kvm_vcpu *vcpu)
> if (!kvm_has_feat(vcpu->kvm, ID_AA64MMFR4_EL1, NV_frac, NV2_ONLY))
> return 0;
>
> - vcpu->arch.vncr_tlb = kzalloc_obj(*vcpu->arch.vncr_tlb,
> - GFP_KERNEL_ACCOUNT);
> + if (!vcpu->arch.vncr_tlb) {
> + struct vncr_tlb *vt = kzalloc_obj(*vcpu->arch.vncr_tlb,
> + GFP_KERNEL_ACCOUNT);
> +
> + scoped_guard(write_lock, &vcpu->kvm->mmu_lock)
> + vcpu->arch.vncr_tlb = vt;
> + }
(I am not familiar with this code at all, so apologies in advance if I
am making an idiot out of myself here)
IIUC, the point of holding the lock here is *not* to protect against
concurrent initialization, as in this case the NULL check needs to be
done under the lock.
Rather, the goal is to prevent re-ordering of zeroing from kzalloc and
the assignment to vcpu->arch.vncr_tlb, by depending on the barriers
provided by the lock. The lock is held by the readers so holding it here
conveniently means we do not need to add any barriers to the readers.
Is my understanding correct?
If yes, I think the code looks confusing, at least to a layman like
myself. It initially seems like the lock protects against concurrent
initializations, but then the NULL check is not done again under the
lock. The goal of the lock is not clear without the original report.
Maybe it's clearer to explicitly use barriers if the goal is preventing
reordering?
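
To make the alternative concrete, here is a userspace sketch of publication with explicit ordering, using C11 release/acquire atomics as an analogue of the kernel's smp_store_release() and smp_load_acquire(). Everything here is an assumption for illustration: the single global pointer, the _sketch names, and the field layout are not from nested.c. Note the trade-off mentioned above: with this approach the readers need the acquire load, whereas readers that already take the lock need nothing extra.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the structure being published. */
struct vncr_tlb_sketch {
	int valid;
	int cpu;
};

/* Shared pointer; NULL until the writer publishes an entry. */
static _Atomic(struct vncr_tlb_sketch *) published;

/* Writer: zero the object, then publish it with release ordering. */
static int publish_sketch(void)
{
	struct vncr_tlb_sketch *vt = calloc(1, sizeof(*vt));

	if (!vt)
		return -1;

	/* The zeroing above is ordered before this store becomes visible. */
	atomic_store_explicit(&published, vt, memory_order_release);
	return 0;
}

/* Reader: the acquire load pairs with the release store in the writer. */
static struct vncr_tlb_sketch *lookup_sketch(void)
{
	return atomic_load_explicit(&published, memory_order_acquire);
}
```

A reader that sees a non-NULL pointer from lookup_sketch() is guaranteed to see the zeroed fields as well. The sketch does nothing about two writers racing to initialise the same pointer, which is the separate concern about the unlocked NULL check.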