From: Sean Christopherson <seanjc@google.com>
To: Ackerley Tng <ackerleytng@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
Yan Zhao <yan.y.zhao@intel.com>,
Vishal Annapurve <vannapurve@google.com>
Subject: Re: [PATCH] KVM: guest_memfd: Elaborate on how release() vs. get_pfn() is safe against UAF
Date: Wed, 3 Jun 2026 16:29:14 -0700 [thread overview]
Message-ID: <aiC4yp5dIpFXi89X@google.com> (raw)
In-Reply-To: <CAEvNRgGmyd1yqQXsnz5hWRZpBZUs=piEWbEaqP9+cz9ZqEMQ6g@mail.gmail.com>
On Wed, Jun 03, 2026, Ackerley Tng wrote:
> Sean Christopherson <seanjc@google.com> writes:
>
> > Add more context and information to the comment in kvm_gmem_release() that
> > explains why there's no synchronization on RCU _or_ kvm->srcu. Point (b)
> > from commit 67b43038ce14 ("KVM: guest_memfd: Remove RCU-protected attribute
> > from slot->gmem.file")
> >
> > b) kvm->srcu ensures that kvm_gmem_unbind() and freeing of a memslot
> > occur after the memslot is no longer visible to kvm_gmem_get_pfn().
> >
> > is especially difficult to fully grok, particularly in light of commit
> > ae431059e75d ("KVM: guest_memfd: Remove bindings on memslot deletion when
> > gmem is dying"), which addressed a race between unbind() and release().
> >
> > No functional change intended.
> >
> > Cc: Yan Zhao <yan.y.zhao@intel.com>
> > Cc: Vishal Annapurve <vannapurve@google.com>
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> > virt/kvm/guest_memfd.c | 20 ++++++++++++++------
> > 1 file changed, 14 insertions(+), 6 deletions(-)
> >
> > diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> > index fdaea3422c30..2e09d7ec0cfc 100644
> > --- a/virt/kvm/guest_memfd.c
> > +++ b/virt/kvm/guest_memfd.c
> > @@ -338,17 +338,25 @@ static int kvm_gmem_release(struct inode *inode, struct file *file)
> > * dereferencing the slot for existing bindings needs to be protected
> > * against memslot updates, specifically so that unbind doesn't race
> > * and free the memslot (kvm_gmem_get_file() will return NULL).
> > - *
> > - * Since .release is called only when the reference count is zero,
> > - * after which file_ref_get() and get_file_active() fail,
> > - * kvm_gmem_get_pfn() cannot be using the file concurrently.
> > - * file_ref_put() provides a full barrier, and get_file_active() the
> > - * matching acquire barrier.
> > */
> > mutex_lock(&kvm->slots_lock);
> >
> > filemap_invalidate_lock(inode->i_mapping);
> >
> > + /*
> > + * Note! synchronize_srcu() is _not_ needed after nullifying memslot
> > + * bindings as slot->gmem.file cannot be set back to a non-null value
> > + * without the memslot first being deleted. I.e. this relies on the
> > + * synchronize_srcu_expedited() in kvm_swap_active_memslots() to ensure
>
> Is this kvm_swap_active_memslots() referring not to swapping out of the
> deleted memslot, but swapping in of a new memslot?
No, it's the deletion that matters. When doing an (S)RCU-protect pointer swap,
readers can see _either_ pointer (the old or the new). I.e. when creating a new
memslot, synchronize_srcu_expedited() would only guarantee readers would see the
new slot *after* synchronization completes.
But that's not what we care about. What we care about is that an in-flight
kvm_gmem_get_pfn() _can't_ reach the file for new, re-allocated memslot. That's
guaranteed because deleting the old memslot waiting for readers to away, i.e.
guarantees that any readers that saw a non-NULL slot->gmem.file saw the one
associated with the old memslot.
> > + * kvm_gmem_get_pfn() (which runs with kvm->srcu held for read) can't
> > + * grab a reference to slot->gmem.file even if the struct file object
> > + * is reallocated.
> > + *
>
> I think the part that's missing for me is: why is synchronize_srcu()
> required after NULLifying memslot bindings? Normally, synchronize_srcu()
> will be needed after memslot updates to make sure the next user of
> kvm->srcu gets the new state of the memslots?
>
> The "new state" here could be
>
> 1. slot->gmem.file == NULL
> 2. slot->gmem.file == some new file, set after the old memslot was
> replaced
Only because there's a synchronize_srcu(). Without that, it could be:
1. slot->gmem.file == old ==
2. slot->gmem.file == NULL |==> window for readers
3. slot->gmem.file == new ==
versus
1. slot->gmem.file == old ==
synchronize_srcu() |===> window for readers
2. slot->gmem.file == NULL ==
synchronize_srcu() |===> window #2 for readers
3. slot->gmem.file == new ==
prev parent reply other threads:[~2026-06-03 23:29 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-13 23:22 [PATCH] KVM: guest_memfd: Elaborate on how release() vs. get_pfn() is safe against UAF Sean Christopherson
2025-11-17 11:36 ` Yan Zhao
2026-06-03 0:40 ` Ackerley Tng
2026-06-03 11:42 ` Yan Zhao
2026-06-03 15:11 ` Ackerley Tng
2026-06-05 8:52 ` Yan Zhao
2026-06-04 0:10 ` Sean Christopherson
2026-06-05 8:32 ` Yan Zhao
2026-06-09 1:41 ` Sean Christopherson
2026-06-09 8:31 ` Yan Zhao
2026-06-03 15:37 ` Ackerley Tng
2026-06-03 23:29 ` Sean Christopherson [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aiC4yp5dIpFXi89X@google.com \
--to=seanjc@google.com \
--cc=ackerleytng@google.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=vannapurve@google.com \
--cc=yan.y.zhao@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.