Kernel KVM virtualization development
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Yan Zhao <yan.y.zhao@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	 Vishal Annapurve <vannapurve@google.com>
Subject: Re: [PATCH] KVM: guest_memfd: Elaborate on how release() vs. get_pfn() is safe against UAF
Date: Wed, 3 Jun 2026 17:10:59 -0700	[thread overview]
Message-ID: <aiDCkz9V2aS5Qcuf@google.com> (raw)
In-Reply-To: <aRsIstNXu0Bl0oID@yzhao56-desk.sh.intel.com>

On Mon, Nov 17, 2025, Yan Zhao wrote:
> On Thu, Nov 13, 2025 at 03:22:29PM -0800, Sean Christopherson wrote:
> > Add more context and information to the comment in kvm_gmem_release() that
> > explains why there's no synchronization on RCU _or_ kvm->srcu.  Point (b)
> > from commit 67b43038ce14 ("KVM: guest_memfd: Remove RCU-protected attribute
> > from slot->gmem.file")
> > 
> >       b) kvm->srcu ensures that kvm_gmem_unbind() and freeing of a memslot
> >          occur after the memslot is no longer visible to kvm_gmem_get_pfn().
> > 
> > is especially difficult to fully grok, particularly in light of commit
> > ae431059e75d ("KVM: guest_memfd: Remove bindings on memslot deletion when
> > gmem is dying"), which addressed a race between unbind() and release().
> As mentioned in commit ae431059e75d ("KVM: guest_memfd: Remove bindings on
> memslot deletion when gmem is dying"), unbind() and release() are mutually
> exclusive, i.e., both protected by slots_lock, mentioning "race" here is
> confusing. So, that commit addressed the mishandling in unbind() after
> kvm_gmem_get_file() returning NULL?

Yes, that's the race.  Just because two chunks of code are mutually exclusive
doesn't mean there's no race, it just changes the granularity of the race, and/or
how it manifests.

I consider it a race because in the failing case, there's simply no way for
release() and unbind() to run sequentially.  I.e. the only way to encounter the
bug was to run two operations concurrently, *and* for a specific ordering to
occur.

> > No functional change intended.
> > 
> > Cc: Yan Zhao <yan.y.zhao@intel.com>
> > Cc: Vishal Annapurve <vannapurve@google.com>
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  virt/kvm/guest_memfd.c | 20 ++++++++++++++------
> >  1 file changed, 14 insertions(+), 6 deletions(-)
> > 
> > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> > index fdaea3422c30..2e09d7ec0cfc 100644
> > --- a/virt/kvm/guest_memfd.c
> > +++ b/virt/kvm/guest_memfd.c
> > @@ -338,17 +338,25 @@ static int kvm_gmem_release(struct inode *inode, struct file *file)
> >  	 * dereferencing the slot for existing bindings needs to be protected
> >  	 * against memslot updates, specifically so that unbind doesn't race
> >  	 * and free the memslot (kvm_gmem_get_file() will return NULL).
> > -	 *
> > -	 * Since .release is called only when the reference count is zero,
> > -	 * after which file_ref_get() and get_file_active() fail,
> > -	 * kvm_gmem_get_pfn() cannot be using the file concurrently.
> > -	 * file_ref_put() provides a full barrier, and get_file_active() the
> > -	 * matching acquire barrier.
> >  	 */
> 
> >  	mutex_lock(&kvm->slots_lock);
> >  
> >  	filemap_invalidate_lock(inode->i_mapping);
> > 
> > +	/*
> > +	 * Note!  synchronize_srcu() is _not_ needed after nullifying memslot
> > +	 * bindings as slot->gmem.file cannot be set back to a non-null value
> > +	 * without the memslot first being deleted.  I.e. this relies on the
> > +	 * synchronize_srcu_expedited() in kvm_swap_active_memslots() to ensure
> > +	 * kvm_gmem_get_pfn() (which runs with kvm->srcu held for read) can't
> > +	 * grab a reference to slot->gmem.file even if the struct file object
> > +	 * is reallocated.
> Do you mean that as kvm_gmem_get_pfn() can't find a stale slot, it can't grab
> reference to stale slot->gmem.file, even if slot->gmem.file has been set to a
> different value, i.e., after invoking unbind(), bind() ?

I was specifically trying to say this:

 1. slot->gmem.file == old  ==
    synchronize_srcu()       |===> window #1 for readers
 2. slot->gmem.file == NULL ==
    synchronize_srcu()       |===> window #2 for readers
 3. slot->gmem.file == new  ==

kvm_gmem_get_pfn() can run in two "windows".  In the window #1, it can see @old
or NULL.  In window #2, it can see NULL or @new.  The synchronization ensures
it can't grab a reference to @old once window #2 has been opened.

> But I'm not sure why to put the kvm_gmem_get_pfn() relying on
> synchronize_srcu_expedited() in kvm_swap_active_memslots() in the comment of
> release().  Without the guard of kvm_gmem_get_file(), kvm_gmem_get_pfn() may
> need to provide some RCU read-critial section too for release() to wait by
> synchronize_srcu()?

No, what I'm saying is that without synchronize_srcu() to create distinct
window #1 and window #2 above, KVM would need this:

diff --git virt/kvm/guest_memfd.c virt/kvm/guest_memfd.c
index a1cb72e66288..e66e431bdb98 100644
--- virt/kvm/guest_memfd.c
+++ virt/kvm/guest_memfd.c
@@ -347,6 +347,8 @@ static int kvm_gmem_release(struct inode *inode, struct file *file)
        xa_for_each(&f->bindings, index, slot)
                WRITE_ONCE(slot->gmem.file, NULL);
 
+       synchronize_srcu();
+
        /*
         * All in-flight operations are gone and new bindings can be created.
         * Zap all SPTEs pointed at by this file.  Do not free the backing

Otherwise there is no guarantee in-flight readers can't reach the old
slot->gmem.file.

  parent reply	other threads:[~2026-06-04  0:11 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-13 23:22 [PATCH] KVM: guest_memfd: Elaborate on how release() vs. get_pfn() is safe against UAF Sean Christopherson
2025-11-17 11:36 ` Yan Zhao
2026-06-03  0:40   ` Ackerley Tng
2026-06-03 11:42     ` Yan Zhao
2026-06-03 15:11       ` Ackerley Tng
2026-06-04  0:10   ` Sean Christopherson [this message]
2026-06-03 15:37 ` Ackerley Tng
2026-06-03 23:29   ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aiDCkz9V2aS5Qcuf@google.com \
    --to=seanjc@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=vannapurve@google.com \
    --cc=yan.y.zhao@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox