Kernel KVM virtualization development
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Marc Zyngier" <maz@kernel.org>
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH] KVM: arm64: Hold kvm->mmu_lock while initialising vcpu->arch.vncr_tlb
Date: Mon, 08 Jun 2026 08:26:02 +0000	[thread overview]
Message-ID: <20260608082603.16AEC1F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260608081108.2244133-1-maz@kernel.org>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

Pre-existing issues:
- [Critical] Missing NULL check for `vcpu->arch.vncr_tlb` in `invalidate_vncr_va()` causes a host kernel panic when iterating over uninitialized vCPUs.
- [High] The lazily allocated `vncr_tlb` structure leaves the `cpu` field zero-initialized, causing it to inadvertently unmap CPU 0's VNCR fixmap on the first fault.
- [High] Resource leak of `struct page` references on error paths in `kvm_translate_vncr()`.
--

commit cf6d236759b9eb5ba69620ec32a536af7c15a9a2
Author: Marc Zyngier <maz@kernel.org>

KVM: arm64: Hold kvm->mmu_lock while initialising vcpu->arch.vncr_tlb

This commit fixes a race condition and memory leak in the initialization
of vncr_tlb by taking the mmu_lock during assignment and verifying it is
NULL beforehand.

Link: https://lore.kernel.org/r/20260607180815.85FBC1F00893@smtp.kernel.org

> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> index 883b6c1008fbb..21c9d1f07d7a8 100644
> --- a/arch/arm64/kvm/nested.c
> +++ b/arch/arm64/kvm/nested.c
> @@ -1248,8 +1248,14 @@ int kvm_vcpu_allocate_vncr_tlb(struct kvm_vcpu *vcpu)
>  	if (!kvm_has_feat(vcpu->kvm, ID_AA64MMFR4_EL1, NV_frac, NV2_ONLY))
>  		return 0;
>  
> -	vcpu->arch.vncr_tlb = kzalloc_obj(*vcpu->arch.vncr_tlb,
> -					  GFP_KERNEL_ACCOUNT);
> +	if (!vcpu->arch.vncr_tlb) {
> +		struct vncr_tlb *vt = kzalloc_obj(*vcpu->arch.vncr_tlb,
> +						  GFP_KERNEL_ACCOUNT);

[Severity: High]
This is a pre-existing issue, but since we are allocating the vncr_tlb here
using kzalloc_obj(), vt->cpu will be zero-initialized.

Looking at invalidate_vncr(), if a vCPU encounters its first VNCR fault,
kvm_translate_vncr() calls invalidate_vncr(vt). Because vt->cpu is 0
instead of -1, the following check passes:

arch/arm64/kvm/nested.c:invalidate_vncr() {
        ...
        if (vt->cpu != -1)
                clear_fixmap(vncr_fixmap(vt->cpu));
        ...
}

Does this inadvertently unmap CPU 0's fixmap, potentially corrupting the
translation state if another vCPU is actively mapped on CPU 0? Could we
initialize vt->cpu to -1 right after this allocation?

> +
> +		scoped_guard(write_lock, &vcpu->kvm->mmu_lock)
> +			vcpu->arch.vncr_tlb = vt;
> +	}
> +
>  	if (!vcpu->arch.vncr_tlb)
>  		return -ENOMEM;

[Severity: Critical]
This isn't a bug introduced by this patch, but while reviewing the vncr_tlb
lifecycle, I noticed a potential issue in invalidate_vncr_va() when
iterating over uninitialized vCPUs.

If an online vCPU triggers an EL2 TLBI instruction before another vCPU in
the same VM has run, could this lead to a NULL pointer dereference?

arch/arm64/kvm/nested.c:invalidate_vncr_va() {
        ...
        kvm_for_each_vcpu(i, vcpu, kvm) {
                struct vncr_tlb *vt = vcpu->arch.vncr_tlb;

                if (!vt->valid)
                        continue;
        ...
}

Since vncr_tlb is allocated lazily on the first run, vt could be NULL here.
Does this path need a NULL check before accessing vt->valid?

[Severity: High]
This is also a pre-existing issue, but I noticed a potential page reference
leak in kvm_translate_vncr() on the error paths.

When __kvm_faultin_pfn() successfully pins a page, the early return for a
write fault without a writable page bypasses kvm_release_faultin_page():

arch/arm64/kvm/nested.c:kvm_translate_vncr() {
        ...
        if (is_error_noslot_pfn(pfn) || (write_fault && !writable))
                return -EFAULT;
        ...
}

Similarly, the MMU retry check also returns directly:

arch/arm64/kvm/nested.c:kvm_translate_vncr() {
        ...
        if (mmu_invalidate_retry(vcpu->kvm, mmu_seq))
                return -EAGAIN;
        ...
}

Do these error paths leave the page pinned and permanently leak the struct
page reference count?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260608081108.2244133-1-maz@kernel.org?part=1

  reply	other threads:[~2026-06-08  8:26 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-08  8:11 [PATCH] KVM: arm64: Hold kvm->mmu_lock while initialising vcpu->arch.vncr_tlb Marc Zyngier
2026-06-08  8:26 ` sashiko-bot [this message]
2026-06-08  9:41   ` Marc Zyngier
2026-06-08 16:34 ` Oliver Upton
2026-06-08 20:55 ` Yosry Ahmed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260608082603.16AEC1F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox