All of lore.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Marc Zyngier" <maz@kernel.org>
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH] KVM: arm64: Hold kvm->mmu_lock while initialising vcpu->arch.vncr_tlb
Date: Mon, 08 Jun 2026 08:26:02 +0000	[thread overview]
Message-ID: <20260608082603.16AEC1F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260608081108.2244133-1-maz@kernel.org>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:

Pre-existing issues:
- [Critical] Missing NULL check for `vcpu->arch.vncr_tlb` in `invalidate_vncr_va()` causes a host kernel panic when iterating over uninitialized vCPUs.
- [High] The lazily allocated `vncr_tlb` structure leaves the `cpu` field zero-initialized, causing it to inadvertently unmap CPU 0's VNCR fixmap on the first fault.
- [High] Resource leak of `struct page` references on error paths in `kvm_translate_vncr()`.
--

commit cf6d236759b9eb5ba69620ec32a536af7c15a9a2
Author: Marc Zyngier <maz@kernel.org>

KVM: arm64: Hold kvm->mmu_lock while initialising vcpu->arch.vncr_tlb

This commit fixes a race condition and memory leak in the initialization
of vncr_tlb by taking the mmu_lock during assignment and verifying it is
NULL beforehand.

Link: https://lore.kernel.org/r/20260607180815.85FBC1F00893@smtp.kernel.org

> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> index 883b6c1008fbb..21c9d1f07d7a8 100644
> --- a/arch/arm64/kvm/nested.c
> +++ b/arch/arm64/kvm/nested.c
> @@ -1248,8 +1248,14 @@ int kvm_vcpu_allocate_vncr_tlb(struct kvm_vcpu *vcpu)
>  	if (!kvm_has_feat(vcpu->kvm, ID_AA64MMFR4_EL1, NV_frac, NV2_ONLY))
>  		return 0;
>  
> -	vcpu->arch.vncr_tlb = kzalloc_obj(*vcpu->arch.vncr_tlb,
> -					  GFP_KERNEL_ACCOUNT);
> +	if (!vcpu->arch.vncr_tlb) {
> +		struct vncr_tlb *vt = kzalloc_obj(*vcpu->arch.vncr_tlb,
> +						  GFP_KERNEL_ACCOUNT);

[Severity: High]
This is a pre-existing issue, but since we are allocating the vncr_tlb here
using kzalloc_obj(), vt->cpu will be zero-initialized.

Looking at invalidate_vncr(), if a vCPU encounters its first VNCR fault,
kvm_translate_vncr() calls invalidate_vncr(vt). Because vt->cpu is 0
instead of -1, the following check passes:

arch/arm64/kvm/nested.c:invalidate_vncr() {
        ...
        if (vt->cpu != -1)
                clear_fixmap(vncr_fixmap(vt->cpu));
        ...
}

Does this inadvertently unmap CPU 0's fixmap, potentially corrupting the
translation state if another vCPU is actively mapped on CPU 0? Could we
initialize vt->cpu to -1 right after this allocation?

> +
> +		scoped_guard(write_lock, &vcpu->kvm->mmu_lock)
> +			vcpu->arch.vncr_tlb = vt;
> +	}
> +
>  	if (!vcpu->arch.vncr_tlb)
>  		return -ENOMEM;

[Severity: Critical]
This isn't a bug introduced by this patch, but while reviewing the vncr_tlb
lifecycle, I noticed a potential issue in invalidate_vncr_va() when
iterating over uninitialized vCPUs.

If an online vCPU triggers an EL2 TLBI instruction before another vCPU in
the same VM has run, could this lead to a NULL pointer dereference?

arch/arm64/kvm/nested.c:invalidate_vncr_va() {
        ...
        kvm_for_each_vcpu(i, vcpu, kvm) {
                struct vncr_tlb *vt = vcpu->arch.vncr_tlb;

                if (!vt->valid)
                        continue;
        ...
}

Since vncr_tlb is allocated lazily on the first run, vt could be NULL here.
Does this path need a NULL check before accessing vt->valid?

[Severity: High]
This is also a pre-existing issue, but I noticed a potential page reference
leak in kvm_translate_vncr() on the error paths.

When __kvm_faultin_pfn() successfully pins a page, the early return for a
write fault without a writable page bypasses kvm_release_faultin_page():

arch/arm64/kvm/nested.c:kvm_translate_vncr() {
        ...
        if (is_error_noslot_pfn(pfn) || (write_fault && !writable))
                return -EFAULT;
        ...
}

Similarly, the MMU retry check also returns directly:

arch/arm64/kvm/nested.c:kvm_translate_vncr() {
        ...
        if (mmu_invalidate_retry(vcpu->kvm, mmu_seq))
                return -EAGAIN;
        ...
}

Do these error paths leave the page pinned and permanently leak the struct
page reference count?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260608081108.2244133-1-maz@kernel.org?part=1

  reply	other threads:[~2026-06-08  8:26 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-08  8:11 [PATCH] KVM: arm64: Hold kvm->mmu_lock while initialising vcpu->arch.vncr_tlb Marc Zyngier
2026-06-08  8:26 ` sashiko-bot [this message]
2026-06-08  9:41   ` Marc Zyngier
2026-06-08 16:34 ` Oliver Upton
2026-06-08 20:55 ` Yosry Ahmed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260608082603.16AEC1F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=maz@kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.