From: James Houghton <jthoughton@google.com>
To: jthoughton@google.com
Cc: chenhuacai@kernel.org, gshan@redhat.com, jhogan@kernel.org,
joey.gouly@arm.com, kvm@vger.kernel.org, kvmarm@lists.linux.dev,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-mips@vger.kernel.org,
loongarch@lists.linux.dev, maobibo@loongson.cn, maz@kernel.org,
oupton@kernel.org, pbonzini@redhat.com, ricarkol@google.com,
seanjc@google.com, shahuang@redhat.com, stable@vger.kernel.org,
suzuki.poulose@arm.com, yuzenghui@huawei.com,
zhaotianrui@loongson.cn
Subject: Re: [PATCH 1/5] KVM: arm64: Grab KVM MMU write lock in kvm_arch_flush_shadow_all()
Date: Mon, 4 May 2026 23:10:47 +0000
Message-ID: <20260504231048.1184273-1-jthoughton@google.com>
In-Reply-To: <20260504224213.1049426-2-jthoughton@google.com>
On Mon, May 4, 2026 at 3:42 PM James Houghton <jthoughton@google.com> wrote:
>
> kvm_arch_flush_shadow_all() can be called concurrently on the same `kvm`
> if the VM's `mm` goes through __mmput() at the same time that the last
> reference to the VM is being dropped.
>
> T1                      T2
> KVM_CREATE_VM
>                         Get VM file from T1
> close VM
> exit_mm()               close VM
>
> T1: exit_mm() -> kvm_mmu_notifier_release() -> kvm_flush_shadow_all(),
> with only the KVM srcu read lock held.
>
> T2: kvm_vm_release() -> mmu_notifier_unregister() ->
> kvm_mmu_notifier_release() -> kvm_flush_shadow_all(),
> again, with only the KVM srcu read lock held.
>
> This leads to a potential double-free of
> kvm->arch.mmu.split_page_cache (via kvm_mmu_free_memory_cache()) and,
> now with NV, kvm->arch.nested_mmus.
>
> Cc: stable@vger.kernel.org
> Fixes: e7bf7a490c68 ("KVM: arm64: Split huge pages when dirty logging is enabled")
> Signed-off-by: James Houghton <jthoughton@google.com>
> ---
> arch/arm64/include/asm/kvm_mmu.h | 1 +
> arch/arm64/kvm/mmu.c | 23 +++++++++++++++++++----
> arch/arm64/kvm/nested.c | 4 +++-
> 3 files changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 01e9c72d6aa7..30d5c24fcebb 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -178,6 +178,7 @@ void stage2_unmap_vm(struct kvm *kvm);
>  int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long type);
>  void kvm_uninit_stage2_mmu(struct kvm *kvm);
>  void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu);
> +void kvm_free_stage2_pgd_locked(struct kvm_s2_mmu *mmu);
>  int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
>  			  phys_addr_t pa, unsigned long size, bool writable);
>
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index d089c107d9b7..4bab407d43bb 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1021,7 +1021,9 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t
>
>  void kvm_uninit_stage2_mmu(struct kvm *kvm)
>  {
> -	kvm_free_stage2_pgd(&kvm->arch.mmu);
> +	lockdep_assert_held_write(&kvm->mmu_lock);

*facepalm*... this doesn't account for the other callers of
kvm_uninit_stage2_mmu(), which don't hold mmu_lock for write. They will
trip this lockdep assertion.

I've attached a diff at the bottom of this reply that *does* deal with them.
:( Sorry.

I'm guessing Marc or Oliver will probably want this patch to look quite
different, so I'll wait to hear from them before actually sending a v2.
In the meantime, I'll properly retest with lockdep enabled.

> +
> +	kvm_free_stage2_pgd_locked(&kvm->arch.mmu);
>  	kvm_mmu_free_memory_cache(&kvm->arch.mmu.split_page_cache);
>  }
>
> @@ -1095,12 +1097,14 @@ void stage2_unmap_vm(struct kvm *kvm)
>  	srcu_read_unlock(&kvm->srcu, idx);
>  }
>
> -void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
> +static void __kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu, bool locked)
>  {
>  	struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
>  	struct kvm_pgtable *pgt = NULL;
>
> -	write_lock(&kvm->mmu_lock);
> +	if (!locked)
> +		write_lock(&kvm->mmu_lock);
> +
>  	pgt = mmu->pgt;
>  	if (pgt) {
>  		mmu->pgd_phys = 0;
> @@ -1111,7 +1115,8 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
>  	if (kvm_is_nested_s2_mmu(kvm, mmu))
>  		kvm_init_nested_s2_mmu(mmu);
>
> -	write_unlock(&kvm->mmu_lock);
> +	if (!locked)
> +		write_unlock(&kvm->mmu_lock);
>
>  	if (pgt) {
>  		kvm_stage2_destroy(pgt);
> @@ -1119,6 +1124,16 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
>  	}
>  }
>
> +void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
> +{
> +	__kvm_free_stage2_pgd(mmu, false);
> +}
> +
> +void kvm_free_stage2_pgd_locked(struct kvm_s2_mmu *mmu)
> +{
> +	__kvm_free_stage2_pgd(mmu, true);
> +}
> +
>  static void hyp_mc_free_fn(void *addr, void *mc)
>  {
>  	struct kvm_hyp_memcache *memcache = mc;
> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> index 883b6c1008fb..977598bff5e6 100644
> --- a/arch/arm64/kvm/nested.c
> +++ b/arch/arm64/kvm/nested.c
> @@ -1190,11 +1190,13 @@ void kvm_arch_flush_shadow_all(struct kvm *kvm)
>  {
>  	int i;
>
> +	guard(write_lock)(&kvm->mmu_lock);
> +
>  	for (i = 0; i < kvm->arch.nested_mmus_size; i++) {
>  		struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i];
>
>  		if (!WARN_ON(atomic_read(&mmu->refcnt)))
> -			kvm_free_stage2_pgd(mmu);
> +			kvm_free_stage2_pgd_locked(mmu);
>  	}
>  	kvfree(kvm->arch.nested_mmus);
>  	kvm->arch.nested_mmus = NULL;
> --
> 2.54.0.545.g6539524ca2-goog

And here is the diff that should fix this patch. (Sorry!!)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 30d5c24fcebb..e32e844943be 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -177,6 +177,7 @@ void kvm_stage2_wp_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t e
 void stage2_unmap_vm(struct kvm *kvm);
 int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long type);
 void kvm_uninit_stage2_mmu(struct kvm *kvm);
+void kvm_uninit_stage2_mmu_locked(struct kvm *kvm);
 void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu);
 void kvm_free_stage2_pgd_locked(struct kvm_s2_mmu *mmu);
 int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 4bab407d43bb..98ba8116676c 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1019,14 +1019,6 @@ int kvm_init_stage2_mmu(struct kvm *kvm, struct kvm_s2_mmu *mmu, unsigned long t
 	return err;
 }
 
-void kvm_uninit_stage2_mmu(struct kvm *kvm)
-{
-	lockdep_assert_held_write(&kvm->mmu_lock);
-
-	kvm_free_stage2_pgd_locked(&kvm->arch.mmu);
-	kvm_mmu_free_memory_cache(&kvm->arch.mmu.split_page_cache);
-}
-
 static void stage2_unmap_memslot(struct kvm *kvm,
 				 struct kvm_memory_slot *memslot)
 {
@@ -1134,6 +1126,24 @@ void kvm_free_stage2_pgd_locked(struct kvm_s2_mmu *mmu)
 	__kvm_free_stage2_pgd(mmu, true);
 }
 
+static void __kvm_uninit_stage2_mmu(struct kvm *kvm, bool locked)
+{
+	__kvm_free_stage2_pgd(&kvm->arch.mmu, locked);
+	kvm_mmu_free_memory_cache(&kvm->arch.mmu.split_page_cache);
+}
+
+void kvm_uninit_stage2_mmu(struct kvm *kvm)
+{
+	__kvm_uninit_stage2_mmu(kvm, false);
+}
+
+void kvm_uninit_stage2_mmu_locked(struct kvm *kvm)
+{
+	lockdep_assert_held_write(&kvm->mmu_lock);
+
+	__kvm_uninit_stage2_mmu(kvm, true);
+}
+
 static void hyp_mc_free_fn(void *addr, void *mc)
 {
 	struct kvm_hyp_memcache *memcache = mc;
diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
index 977598bff5e6..f61f0244f0fb 100644
--- a/arch/arm64/kvm/nested.c
+++ b/arch/arm64/kvm/nested.c
@@ -1201,7 +1201,7 @@ void kvm_arch_flush_shadow_all(struct kvm *kvm)
 	kvfree(kvm->arch.nested_mmus);
 	kvm->arch.nested_mmus = NULL;
 	kvm->arch.nested_mmus_size = 0;
-	kvm_uninit_stage2_mmu(kvm);
+	kvm_uninit_stage2_mmu_locked(kvm);
 }
 
 /*

Thread overview: 11+ messages
2026-05-04 22:42 [PATCH 0/5] KVM: Fix race conditions in kvm_arch_flush_shadow_all() James Houghton
2026-05-04 22:42 ` [PATCH 1/5] KVM: arm64: Grab KVM MMU write lock " James Houghton
2026-05-04 23:10 ` James Houghton [this message]
2026-05-05 17:05 ` Sean Christopherson
2026-05-05 18:01 ` James Houghton
2026-05-05 18:16 ` Sean Christopherson
2026-05-04 22:42 ` [PATCH 2/5] KVM: loongarch: Grab MMU " James Houghton
2026-05-04 22:42 ` [PATCH 3/5] KVM: mips: " James Houghton
2026-05-04 22:42 ` [PATCH 4/5] KVM: Hold MMU lock exclusively when calling kvm_arch_flush_shadow_all() James Houghton
2026-05-04 22:42 ` [PATCH 5/5] DO NOT MERGE: KVM: selftests: Reproducer for arm64 double-free James Houghton
2026-05-04 22:44 ` [PATCH 0/5] KVM: Fix race conditions in kvm_arch_flush_shadow_all() James Houghton