From: Sean Christopherson <seanjc@google.com>
To: Ackerley Tng <ackerleytng@google.com>
Cc: sashiko-reviews@lists.linux.dev,
Ackerley Tng via B4 Relay
<devnull+ackerleytng.google.com@kernel.org>,
kvm@vger.kernel.org
Subject: Re: [PATCH v7 16/42] KVM: guest_memfd: Determine invalidation filter from memory attributes
Date: Fri, 29 May 2026 13:44:27 -0700 [thread overview]
Message-ID: <ahn6qysOGfa74A2E@google.com> (raw)
In-Reply-To: <CAEvNRgH21BoKT1mOQzgmKHKpDi4xbwtbMuenGv5U1ZUSENrJmg@mail.gmail.com>
On Wed, May 27, 2026, Ackerley Tng wrote:
> > In __kvm_gmem_invalidate_begin(), the code iterates over f->bindings using
> > xa_for_each_range(). The retrieved slot pointers are protected by the kvm
> > srcu, but the read lock doesn't appear to be held during the iteration:
> >
> > static void __kvm_gmem_invalidate_begin(...)
> > {
> > ...
> > xa_for_each_range(&f->bindings, index, slot, start, end - 1) {
> > pgoff_t pgoff = slot->gmem.pgoff;
> > ...
> >
> > When a memory slot is deleted, the code calls kvm_gmem_unbind() and
> > kfree(slot) after synchronize_srcu(&kvm->srcu). If the read lock isn't held,
> > could a concurrent memslot deletion free the slot while it is still being
> > accessed in this loop?
> >
>
> The issue here is between __kvm_gmem_unbind() and
> __kvm_gmem_invalidate_begin(). Since f->bindings is protected by
> filemap_invalidate_lock(), I'm dividing the analysis into two cases
> where accessing slot through f->bindings with and without holding
> filemap_invalidate_lock().
>
> ## Holding filemap_invalidate_lock()
>
> If unbind happens before invalidate, we have
>
> filemap_invalidate_lock()
> __kvm_gmem_unbind(), would have removed slot from f->bindings.
> filemap_invalidate_unlock()
>
> After this, slot can be freed.
>
> filemap_invalidate_lock()
> __kvm_gmem_invalidate_begin(), which iterates f->bindings and doesn't
> see bound slot.
> filemap_invalidate_unlock()
>
> ## Not holding filemap_invalidate_lock()
>
> In kvm_gmem_unbind(), __kvm_gmem_unbind() can be called without taking
> the filemap_invalidate_lock() if !file. The only places where NULL is
> written to slot->gmem.file is kvm_gmem_release() and __kvm_gmem_unbind()
> itself.
>
> kvm_gmem_release() is prevented from racing with kvm_gmem_unbind() since
> kvm_gmem_release() holds slots_lock, so since kvm_gmem_unbind() is
> called while holding slots_lock, it should either see a guest_memfd
> memslot with a file or no slot at all? Perhaps I'm missing something
> about get_file_active().
The comments in the code pretty much say it all:
kvm_gmem_release():
/*
* Prevent concurrent attempts to *unbind* a memslot. This is the last
* reference to the file and thus no new bindings can be created, but
* dereferencing the slot for existing bindings needs to be protected
* against memslot updates, specifically so that unbind doesn't race
* and free the memslot (kvm_gmem_get_file() will return NULL).
*
* Since .release is called only when the reference count is zero,
* after which file_ref_get() and get_file_active() fail,
* kvm_gmem_get_pfn() cannot be using the file concurrently.
* file_ref_put() provides a full barrier, and get_file_active() the
* matching acquire barrier.
*/
kvm_gmem_unbind():
/*
* However, if the file is _being_ closed, then the bindings need to be
* removed as kvm_gmem_release() might not run until after the memslot
* is freed. Note, modifying the bindings is safe even though the file
* is dying as kvm_gmem_release() nullifies slot->gmem.file under
* slots_lock, and only puts its reference to KVM after destroying all
* bindings. I.e. reaching this point means kvm_gmem_release() hasn't
* yet destroyed the bindings or freed the gmem_file, and can't do so
* until the caller drops slots_lock.
*/
On a related topic, I posted a patch a while back to clarify exactly why the
release() code is safe. I need to get back to that, but in the meantime, more
eyeballs would be appreciated.
https://lore.kernel.org/all/20251113232229.1698886-1-seanjc@google.com
> Would like Sean to check my understanding.
You're not missing anything, Sashiko struggles with scenarios where thing X
makes thing Y impossible.
next prev parent reply other threads:[~2026-05-29 20:44 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-23 0:17 [PATCH v7 00/42] guest_memfd: In-place conversion support Ackerley Tng via B4 Relay
2026-05-23 0:17 ` [PATCH v7 01/42] KVM: guest_memfd: Introduce per-gmem attributes, use to guard user mappings Ackerley Tng via B4 Relay
2026-05-23 0:17 ` [PATCH v7 02/42] KVM: Rename KVM_GENERIC_MEMORY_ATTRIBUTES to KVM_VM_MEMORY_ATTRIBUTES Ackerley Tng via B4 Relay
2026-05-23 0:17 ` [PATCH v7 03/42] KVM: Enumerate support for PRIVATE memory iff kvm_arch_has_private_mem is defined Ackerley Tng via B4 Relay
2026-05-23 0:17 ` [PATCH v7 04/42] KVM: Stub in ability to disable per-VM memory attribute tracking Ackerley Tng via B4 Relay
2026-05-23 0:17 ` [PATCH v7 05/42] KVM: guest_memfd: Wire up kvm_get_memory_attributes() to per-gmem attributes Ackerley Tng via B4 Relay
2026-05-23 0:17 ` [PATCH v7 06/42] KVM: guest_memfd: Update kvm_gmem_populate() to use gmem attributes Ackerley Tng via B4 Relay
2026-05-23 0:59 ` sashiko-bot
2026-05-23 0:17 ` [PATCH v7 07/42] KVM: guest_memfd: Only prepare folios for private pages Ackerley Tng via B4 Relay
2026-05-23 0:52 ` sashiko-bot
2026-05-27 21:22 ` Ackerley Tng
2026-05-23 0:17 ` [PATCH v7 08/42] KVM: Move kvm_supported_mem_attributes() to kvm_host.h Ackerley Tng via B4 Relay
2026-05-23 0:17 ` [PATCH v7 09/42] KVM: guest_memfd: Add base support for KVM_SET_MEMORY_ATTRIBUTES2 Ackerley Tng via B4 Relay
2026-05-23 1:01 ` sashiko-bot
2026-05-27 21:27 ` Ackerley Tng
2026-05-23 0:17 ` [PATCH v7 10/42] KVM: guest_memfd: Ensure pages are not in use before conversion Ackerley Tng via B4 Relay
2026-05-23 0:55 ` sashiko-bot
2026-05-23 0:17 ` [PATCH v7 11/42] KVM: guest_memfd: Call arch invalidate hooks on conversion Ackerley Tng via B4 Relay
2026-05-23 0:17 ` [PATCH v7 12/42] KVM: guest_memfd: Return early if range already has requested attributes Ackerley Tng via B4 Relay
2026-05-23 0:17 ` [PATCH v7 13/42] KVM: guest_memfd: Advertise KVM_SET_MEMORY_ATTRIBUTES2 ioctl Ackerley Tng via B4 Relay
2026-05-23 0:17 ` [PATCH v7 14/42] KVM: guest_memfd: Handle lru_add fbatch refcounts during conversion safety check Ackerley Tng via B4 Relay
2026-05-23 0:17 ` [PATCH v7 15/42] KVM: guest_memfd: Use actual size for invalidation in kvm_gmem_release() Ackerley Tng via B4 Relay
2026-05-23 0:17 ` [PATCH v7 16/42] KVM: guest_memfd: Determine invalidation filter from memory attributes Ackerley Tng via B4 Relay
2026-05-23 1:06 ` sashiko-bot
[not found] ` <CAEvNRgH21BoKT1mOQzgmKHKpDi4xbwtbMuenGv5U1ZUSENrJmg@mail.gmail.com>
2026-05-29 20:44 ` Sean Christopherson [this message]
2026-05-23 0:17 ` [PATCH v7 17/42] KVM: Move KVM_VM_MEMORY_ATTRIBUTES config definition to x86 Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 18/42] KVM: Let userspace disable per-VM mem attributes, enable per-gmem attributes Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 19/42] KVM: guest_memfd: Enable INIT_SHARED on guest_memfd for x86 Coco VMs Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 20/42] KVM: SEV: Make 'uaddr' parameter optional for KVM_SEV_SNP_LAUNCH_UPDATE Ackerley Tng via B4 Relay
2026-05-23 0:55 ` sashiko-bot
2026-05-27 23:31 ` Ackerley Tng
2026-05-23 0:18 ` [PATCH v7 21/42] KVM: TDX: Make source page optional for KVM_TDX_INIT_MEM_REGION Ackerley Tng via B4 Relay
2026-05-23 1:07 ` sashiko-bot
2026-05-23 0:18 ` [PATCH v7 22/42] KVM: selftests: Create gmem fd before "regular" fd when adding memslot Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 23/42] KVM: selftests: Rename guest_memfd{,_offset} to gmem_{fd,offset} Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 24/42] KVM: selftests: Add support for mmap() on guest_memfd in core library Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 25/42] KVM: selftests: Add selftests global for guest memory attributes capability Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 26/42] KVM: selftests: Add helpers for calling ioctls on guest_memfd Ackerley Tng via B4 Relay
2026-05-23 0:42 ` sashiko-bot
2026-05-23 0:18 ` [PATCH v7 27/42] KVM: selftests: Test basic single-page conversion flow Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 28/42] KVM: selftests: Test conversion flow when INIT_SHARED Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 29/42] KVM: selftests: Test conversion precision in guest_memfd Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 30/42] KVM: selftests: Test conversion before allocation Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 31/42] KVM: selftests: Convert with allocated folios in different layouts Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 32/42] KVM: selftests: Test that truncation does not change shared/private status Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 33/42] KVM: selftests: Test that shared/private status is consistent across processes Ackerley Tng via B4 Relay
2026-05-23 1:11 ` sashiko-bot
2026-05-23 0:18 ` [PATCH v7 34/42] KVM: selftests: Test conversion with elevated page refcount Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 35/42] KVM: selftests: Reset shared memory after hole-punching Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 36/42] KVM: selftests: Provide function to look up guest_memfd details from gpa Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 37/42] KVM: selftests: Provide common function to set memory attributes Ackerley Tng via B4 Relay
2026-05-23 1:35 ` sashiko-bot
2026-05-23 0:18 ` [PATCH v7 38/42] KVM: selftests: Check fd/flags provided to mmap() when setting up memslot Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 39/42] KVM: selftests: Make TEST_EXPECT_SIGBUS thread-safe Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 40/42] KVM: selftests: Update private_mem_conversions_test to mmap() guest_memfd Ackerley Tng via B4 Relay
2026-05-23 0:18 ` [PATCH v7 41/42] KVM: selftests: Add script to exercise private_mem_conversions_test Ackerley Tng via B4 Relay
2026-05-23 1:15 ` sashiko-bot
2026-05-23 0:18 ` [PATCH v7 42/42] KVM: selftests: Update private memory exits test to work with per-gmem attributes Ackerley Tng via B4 Relay
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ahn6qysOGfa74A2E@google.com \
--to=seanjc@google.com \
--cc=ackerleytng@google.com \
--cc=devnull+ackerleytng.google.com@kernel.org \
--cc=kvm@vger.kernel.org \
--cc=sashiko-reviews@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox