Re: [RFC PATCH] KVM: Ignore MMU notifiers for guest_memfd-only memslots

From: Sean Christopherson <seanjc@google.com>
To: sashiko-reviews@lists.linux.dev
Cc: Alexandru Elisei <alexandru.elisei@arm.com>,
	Marc Zyngier <maz@kernel.org>,
	kvm@vger.kernel.org,  kvmarm@lists.linux.dev,
	Oliver Upton <oupton@kernel.org>
Subject: Re: [RFC PATCH] KVM: Ignore MMU notifiers for guest_memfd-only memslots
Date: Mon, 15 Jun 2026 11:09:21 -0700
Message-ID: <ajA_0TQCQLSz7qcU@google.com>
In-Reply-To: <ajA6ogaNiI3KDv5i@google.com>

On Mon, Jun 15, 2026, Sean Christopherson wrote:
> On Mon, Jun 15, 2026, sashiko-bot@kernel.org wrote:
> > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > > --- a/virt/kvm/kvm_main.c
> > > +++ b/virt/kvm/kvm_main.c
> > [ ... ]
> > > @@ -592,6 +592,10 @@ static __always_inline kvm_mn_ret_t kvm_handle_hva_range(struct kvm *kvm,
> > >  			unsigned long hva_start, hva_end;
> > >  
> > >  			slot = container_of(node, struct kvm_memory_slot, hva_node[slots->node_idx]);
> > > +
> > > +			if (kvm_slot_has_gmem(slot) && kvm_memslot_is_gmem_only(slot))
> > > +				continue;
> > > +
> > 
> > [Severity: Critical]
> > Does this change inadvertently introduce a regression in the pfncache retry
> > protocol?
> > 
> > Looking at the pfncache framework, it maps guest memory into kernel space and
> > explicitly drops the page reference after mapping it:
> > 
> > virt/kvm/pfncache.c:hva_to_pfn_retry() {
> >     ...
> >     kvm_release_page_clean(page);
> >     ...
> > }
> > 
> > It appears to rely entirely on KVM's MMU notifiers (kvm->mmu_invalidate_seq)
> > to invalidate the cache when the page is unmapped by the host.
> > 
> > If a VMM defines a guest_memfd-backed memslot with KVM_MEMSLOT_GMEM_ONLY
> > but still provides a valid anonymous user mapping as its userspace_addr,
> > could this regression lead to a use-after-free?
> 
> Sadly, yes.  To land this, we would need to first teach the gfn_to_pfn_cache code
> to be able to pull directly from guest_memfd.  I forget if anyone is working on
> that.

Actually, we just need to ensure the invalidation tracking is updated; the MMU
itself can be left as-is.

Compile tested only, but this?

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 27498e990dff..690ab707816b 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -260,6 +260,7 @@ union kvm_mmu_notifier_arg {
 enum kvm_gfn_range_filter {
        KVM_FILTER_SHARED               = BIT(0),
        KVM_FILTER_PRIVATE              = BIT(1),
+       KVM_FILTER_USERSPACE_MAPPINGS   = BIT(2),
 };
 
 struct kvm_gfn_range {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e44c20c04961..84b693de7e35 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -608,7 +608,8 @@ static __always_inline kvm_mn_ret_t kvm_handle_hva_range(struct kvm *kvm,
                         * HVA-based notifications aren't relevant to private
                         * mappings as they don't have a userspace mapping.
                         */
-                       gfn_range.attr_filter = KVM_FILTER_SHARED;
+                       gfn_range.attr_filter = KVM_FILTER_SHARED |
+                                               KVM_FILTER_USERSPACE_MAPPINGS;
 
                        /*
                         * {gfn(page) | page intersects with [hva_start, hva_end)} =
@@ -715,6 +716,21 @@ void kvm_mmu_invalidate_range_add(struct kvm *kvm, gfn_t start, gfn_t end)
 bool kvm_mmu_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
 {
        kvm_mmu_invalidate_range_add(kvm, range->start, range->end);
+
+       /*
+        * When reacting to changes in userspace mappings, don't unmap memslots
+        * that are guest_memfd-only, in which case KVM's MMU mappings are
+        * pulled directly from guest_memfd, i.e. don't depend on the userspace
+        * mappings.
+        *
+        * TODO: Skip gmem-only memslots on mmu_notifier events entirely, once
+        * gfn_to_pfn_cache is also wired up to directly pull from guest_memfd.
+        */
+       if (range->attr_filter & KVM_FILTER_USERSPACE_MAPPINGS &&
+           kvm_slot_has_gmem(range->slot) &&
+           kvm_memslot_is_gmem_only(range->slot))
+               return false;
+
        return kvm_unmap_gfn_range(kvm, range);
 }
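
For readers skimming the diff, the intended control flow can be modeled in
plain userspace C.  This is only a sketch under stated assumptions, not kernel
code: every toy_* name is hypothetical, the two counters merely stand in for
KVM's invalidation tracking and for the arch MMU unmap, and the return value
loosely mimics "something was unmapped".  The point it illustrates is that the
invalidation is always recorded, while the unmap is skipped only for
guest_memfd-only slots when the event came from a userspace mapping change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the KVM_FILTER_* bits in the diff above. */
enum toy_filter {
	TOY_FILTER_SHARED             = 1u << 0,
	TOY_FILTER_PRIVATE            = 1u << 1,
	TOY_FILTER_USERSPACE_MAPPINGS = 1u << 2,
};

/* Hypothetical memslot model: only the two properties the check reads. */
struct toy_slot {
	bool has_gmem;   /* models kvm_slot_has_gmem() */
	bool gmem_only;  /* models kvm_memslot_is_gmem_only() */
};

struct toy_range {
	const struct toy_slot *slot;
	unsigned int attr_filter;
};

struct toy_vm {
	unsigned long invalidations_recorded; /* models invalidation tracking */
	unsigned long unmaps_done;            /* models the arch MMU unmap */
};

/* Models the patched kvm_mmu_unmap_gfn_range() flow. */
static bool toy_mmu_unmap_gfn_range(struct toy_vm *vm,
				    const struct toy_range *range)
{
	assert(vm != NULL && range != NULL && range->slot != NULL);

	/* Always record the invalidation so cached translations get refreshed. */
	vm->invalidations_recorded++;

	/*
	 * Userspace-mapping events don't affect MMU mappings that are pulled
	 * straight from guest_memfd, so skip the unmap for gmem-only slots.
	 */
	if ((range->attr_filter & TOY_FILTER_USERSPACE_MAPPINGS) &&
	    range->slot->has_gmem && range->slot->gmem_only)
		return false;

	vm->unmaps_done++;
	return true;
}
```

In this model, a gmem-only slot hit by a userspace-mapping event bumps only
invalidations_recorded, whereas any other slot, or the same slot without the
TOY_FILTER_USERSPACE_MAPPINGS bit, bumps both counters.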

Thread overview: 5+ messages
2026-06-15 15:52 [RFC PATCH] KVM: Ignore MMU notifiers for guest_memfd-only memslots Alexandru Elisei
2026-06-15 16:09 ` sashiko-bot
2026-06-15 17:47   ` Sean Christopherson
2026-06-15 18:09     ` Sean Christopherson [this message]
2026-06-15 19:07 ` David Hildenbrand