Kernel KVM virtualization development
From: Sean Christopherson <seanjc@google.com>
To: Ackerley Tng <ackerleytng@google.com>
Cc: sashiko-reviews@lists.linux.dev,
	 Ackerley Tng via B4 Relay
	<devnull+ackerleytng.google.com@kernel.org>,
	kvm@vger.kernel.org
Subject: Re: [PATCH v7 16/42] KVM: guest_memfd: Determine invalidation filter from memory attributes
Date: Fri, 29 May 2026 13:44:27 -0700
Message-ID: <ahn6qysOGfa74A2E@google.com>
In-Reply-To: <CAEvNRgH21BoKT1mOQzgmKHKpDi4xbwtbMuenGv5U1ZUSENrJmg@mail.gmail.com>

On Wed, May 27, 2026, Ackerley Tng wrote:
> > In __kvm_gmem_invalidate_begin(), the code iterates over f->bindings using
> > xa_for_each_range(). The retrieved slot pointers are protected by the kvm
> > srcu, but the read lock doesn't appear to be held during the iteration:
> >
> > static void __kvm_gmem_invalidate_begin(...)
> > {
> > 	...
> > 	xa_for_each_range(&f->bindings, index, slot, start, end - 1) {
> > 		pgoff_t pgoff = slot->gmem.pgoff;
> > 	...
> >
> > When a memory slot is deleted, the code calls kvm_gmem_unbind() and
> > kfree(slot) after synchronize_srcu(&kvm->srcu). If the read lock isn't held,
> > could a concurrent memslot deletion free the slot while it is still being
> > accessed in this loop?
> >
> 
> The issue here is between __kvm_gmem_unbind() and
> __kvm_gmem_invalidate_begin(). Since f->bindings is protected by
> filemap_invalidate_lock(), I'm dividing the analysis into two cases:
> accessing slot through f->bindings with and without holding
> filemap_invalidate_lock().
> 
> ## Holding filemap_invalidate_lock()
> 
> If unbind happens before invalidate, we have
> 
>     filemap_invalidate_lock()
>     __kvm_gmem_unbind(), would have removed slot from f->bindings.
>     filemap_invalidate_unlock()
> 
>     After this, slot can be freed.
> 
>     filemap_invalidate_lock()
>     __kvm_gmem_invalidate_begin(), which iterates f->bindings and doesn't
>        see bound slot.
>     filemap_invalidate_unlock()
> 
> ## Not holding filemap_invalidate_lock()
> 
> In kvm_gmem_unbind(), __kvm_gmem_unbind() can be called without taking
> the filemap_invalidate_lock() if !file. The only places where NULL is
> written to slot->gmem.file are kvm_gmem_release() and __kvm_gmem_unbind()
> itself.
> 
> kvm_gmem_release() cannot race with kvm_gmem_unbind(): kvm_gmem_release()
> holds slots_lock, and kvm_gmem_unbind() is also called while holding
> slots_lock, so kvm_gmem_unbind() should see either a guest_memfd
> memslot with a file or no slot at all? Perhaps I'm missing something
> about get_file_active().

The comments in the code pretty much say it all:

kvm_gmem_release():

	/*
	 * Prevent concurrent attempts to *unbind* a memslot.  This is the last
	 * reference to the file and thus no new bindings can be created, but
	 * dereferencing the slot for existing bindings needs to be protected
	 * against memslot updates, specifically so that unbind doesn't race
	 * and free the memslot (kvm_gmem_get_file() will return NULL).
	 *
	 * Since .release is called only when the reference count is zero,
	 * after which file_ref_get() and get_file_active() fail,
	 * kvm_gmem_get_pfn() cannot be using the file concurrently.
	 * file_ref_put() provides a full barrier, and get_file_active() the
	 * matching acquire barrier.
	 */

kvm_gmem_unbind():

	/*
	 * However, if the file is _being_ closed, then the bindings need to be
	 * removed as kvm_gmem_release() might not run until after the memslot
	 * is freed.  Note, modifying the bindings is safe even though the file
	 * is dying as kvm_gmem_release() nullifies slot->gmem.file under
	 * slots_lock, and only puts its reference to KVM after destroying all
	 * bindings.  I.e. reaching this point means kvm_gmem_release() hasn't
	 * yet destroyed the bindings or freed the gmem_file, and can't do so
	 * until the caller drops slots_lock.
	 */
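
As a rough illustration of what the two comments describe, here is a toy
userspace model, not kernel code.  Everything named toy_* is hypothetical: a
pthread mutex stands in for slots_lock, toy_slot.file stands in for
slot->gmem.file, and file reference counting (get_file_active() and friends)
is deliberately not modeled.  The only point is that because both paths clear
the pointer under the same lock, each binding is torn down by exactly one of
them.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the gmem file; only tracks live bindings. */
struct toy_file {
	int nr_bindings;
};

/* Hypothetical stand-in for a memslot; 'file' models slot->gmem.file. */
struct toy_slot {
	struct toy_file *file;
};

/* Plays the role of slots_lock in this model. */
static pthread_mutex_t toy_slots_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Models the release path: under the lock, nullify the file pointer of
 * every slot still bound to 'f'.
 */
static void toy_release(struct toy_file *f, struct toy_slot **slots, size_t n)
{
	pthread_mutex_lock(&toy_slots_lock);
	for (size_t i = 0; i < n; i++) {
		if (slots[i] && slots[i]->file == f) {
			slots[i]->file = NULL;
			f->nr_bindings--;
		}
	}
	pthread_mutex_unlock(&toy_slots_lock);
}

/*
 * Models the unbind path.  In this sketch the function takes the lock
 * itself.  Returns true if this call removed the binding, false if the
 * release path had already done so.
 */
static bool toy_unbind(struct toy_slot *slot)
{
	bool removed = false;

	pthread_mutex_lock(&toy_slots_lock);
	if (slot->file) {
		slot->file->nr_bindings--;
		slot->file = NULL;
		removed = true;
	}
	pthread_mutex_unlock(&toy_slots_lock);
	return removed;
}
```

With two slots bound to one toy_file, calling toy_unbind() on the first and
then toy_release() on the file leaves nr_bindings at zero, and a later
toy_unbind() on the second slot reports that there was nothing left to do.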

On a related topic, I posted a patch a while back to clarify exactly why the
release() code is safe.  I need to get back to that, but in the meantime, more
eyeballs would be appreciated.

https://lore.kernel.org/all/20251113232229.1698886-1-seanjc@google.com

> Would like Sean to check my understanding.

You're not missing anything.  Sashiko struggles with scenarios where thing X
makes thing Y impossible.
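
For the "holding filemap_invalidate_lock()" case, the X-makes-Y-impossible
argument can be sketched as a small userspace model.  This is only a sketch
under stated assumptions: every model_* name is hypothetical, a pthread mutex
stands in for filemap_invalidate_lock(), a fixed array stands in for the
f->bindings xarray, and SRCU is not modeled at all.  Unbind unpublishes the
slot under the lock and frees it only after dropping the lock, so a walker
that holds the lock either sees a live slot or sees nothing.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_MAX_BINDINGS 8

/* Hypothetical stand-in for a memslot. */
struct model_slot {
	unsigned long pgoff;
};

/* Hypothetical stand-in for the gmem file: a lock plus a binding table. */
struct model_gmem {
	pthread_mutex_t invalidate_lock;
	struct model_slot *bindings[MODEL_MAX_BINDINGS];
};

static void model_init(struct model_gmem *f)
{
	pthread_mutex_init(&f->invalidate_lock, NULL);
	for (size_t i = 0; i < MODEL_MAX_BINDINGS; i++)
		f->bindings[i] = NULL;
}

/* Publish a new slot at 'index'; fails if the index is taken or invalid. */
static bool model_bind(struct model_gmem *f, size_t index, unsigned long pgoff)
{
	struct model_slot *slot;
	bool ok = false;

	if (index >= MODEL_MAX_BINDINGS)
		return false;
	slot = malloc(sizeof(*slot));
	if (!slot)
		return false;
	slot->pgoff = pgoff;

	pthread_mutex_lock(&f->invalidate_lock);
	if (!f->bindings[index]) {
		f->bindings[index] = slot;
		ok = true;
	}
	pthread_mutex_unlock(&f->invalidate_lock);

	if (!ok)
		free(slot);
	return ok;
}

/* Unpublish under the lock, free only after the lock is dropped. */
static bool model_unbind(struct model_gmem *f, size_t index)
{
	struct model_slot *slot;
	bool was_bound;

	if (index >= MODEL_MAX_BINDINGS)
		return false;

	pthread_mutex_lock(&f->invalidate_lock);
	slot = f->bindings[index];
	f->bindings[index] = NULL;
	pthread_mutex_unlock(&f->invalidate_lock);

	was_bound = slot != NULL;
	free(slot);
	return was_bound;
}

/* Walk [start, end) under the lock; returns how many slots were visited. */
static size_t model_invalidate_begin(struct model_gmem *f, size_t start,
				     size_t end)
{
	size_t visited = 0;

	assert(start <= end);
	if (end > MODEL_MAX_BINDINGS)
		end = MODEL_MAX_BINDINGS;

	pthread_mutex_lock(&f->invalidate_lock);
	for (size_t i = start; i < end; i++) {
		struct model_slot *slot = f->bindings[i];
		unsigned long pgoff;

		if (!slot)
			continue;
		/* Safe: the slot cannot be unpublished while the lock is held. */
		pgoff = slot->pgoff;
		(void)pgoff;
		visited++;
	}
	pthread_mutex_unlock(&f->invalidate_lock);
	return visited;
}
```

In this model, once model_unbind() has returned for a given index, a later
model_invalidate_begin() cannot reach the freed slot, because the only path to
it was the table entry that was cleared under the lock.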
