The Linux Kernel Mailing List
 help / color / mirror / Atom feed
* [PATCH 0/2] KVM: selftests: Fixes for guest_memfd_test and FD double-close
@ 2026-05-11 11:37 Fuad Tabba
  2026-05-11 11:37 ` [PATCH 1/2] KVM: selftests: Fix MADV_COLLAPSE build failure on older toolchains Fuad Tabba
  2026-05-11 11:37 ` [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release() Fuad Tabba
  0 siblings, 2 replies; 11+ messages in thread
From: Fuad Tabba @ 2026-05-11 11:37 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Shuah Khan
  Cc: Marc Zyngier, Oliver Upton, Will Deacon, Ackerley Tng,
	David Matlack, kvm, linux-kselftest, linux-kernel

Hi folks,

Two more patches, standalone KVM selftests fixes that I'd like to
land before posting a pKVM selftests series, a follow-up to Will's
pKVM infrastructure series [1].

The first patch fixes a build failure I ran into on Debian Bookworm
(glibc 2.36): guest_memfd_test.c uses MADV_COLLAPSE unconditionally,
but the constant was only exposed in glibc 2.37. The fix follows the
established selftest idiom of providing a compile-time fallback to
the kernel-ABI value when the libc header does not define it.

The second patch fixes an FD double-close in kvm_vm_release().
kvm_vm_free() calls kvm_vm_release() internally, so a test that
calls kvm_vm_release() and then kvm_vm_free() without reopening the
VM in between double-closes vmp->fd and vmp->kvm_fd. Existing
in-tree callers reopen via vm_recreate_with_one_vcpu() and do not
hit this today. The double-close becomes a hard test failure for
the pKVM selftests I will submit, which release the VM and then
inspect host-mapped memory before freeing.

Based on Linux 7.1-rc3.

Cheers,
/fuad

[1] https://lore.kernel.org/all/20260105154939.11041-1-will@kernel.org/

Fuad Tabba (2):
  KVM: selftests: Fix MADV_COLLAPSE build failure on older toolchains
  KVM: selftests: Fix FD double-close in kvm_vm_release()

 tools/testing/selftests/kvm/guest_memfd_test.c |  4 ++++
 tools/testing/selftests/kvm/lib/kvm_util.c     | 10 ++++++++--
 2 files changed, 12 insertions(+), 2 deletions(-)

-- 
2.54.0.563.g4f69b47b94-goog


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/2] KVM: selftests: Fix MADV_COLLAPSE build failure on older toolchains
  2026-05-11 11:37 [PATCH 0/2] KVM: selftests: Fixes for guest_memfd_test and FD double-close Fuad Tabba
@ 2026-05-11 11:37 ` Fuad Tabba
  2026-05-11 14:59   ` Sean Christopherson
  2026-05-11 11:37 ` [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release() Fuad Tabba
  1 sibling, 1 reply; 11+ messages in thread
From: Fuad Tabba @ 2026-05-11 11:37 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Shuah Khan
  Cc: Marc Zyngier, Oliver Upton, Will Deacon, Ackerley Tng,
	David Matlack, kvm, linux-kselftest, linux-kernel

guest_memfd_test.c uses MADV_COLLAPSE unconditionally, but the
constant was only exposed in glibc 2.37. Distros still on older glibc
(e.g. Debian Bookworm, glibc 2.36) cannot build the test.

Add a compile-time fallback that defines MADV_COLLAPSE to its
kernel-ABI value when the libc header does not, matching the
established selftest pattern for sometimes-missing libc constants
(see selftests/bpf/uprobe_multi.c, selftests/mm/hugetlb-soft-offline.c).

Fixes: 9830209b4ae8 ("KVM: selftests: Test MADV_COLLAPSE on guest_memfd")
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 tools/testing/selftests/kvm/guest_memfd_test.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
index d6528c6f5e03..45990ae5f1e5 100644
--- a/tools/testing/selftests/kvm/guest_memfd_test.c
+++ b/tools/testing/selftests/kvm/guest_memfd_test.c
@@ -23,6 +23,10 @@
 #include "test_util.h"
 #include "ucall_common.h"
 
+#ifndef MADV_COLLAPSE
+#define MADV_COLLAPSE	25
+#endif
+
 static size_t page_size;
 
 static void test_file_read_write(int fd, size_t total_size)
-- 
2.54.0.563.g4f69b47b94-goog


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release()
  2026-05-11 11:37 [PATCH 0/2] KVM: selftests: Fixes for guest_memfd_test and FD double-close Fuad Tabba
  2026-05-11 11:37 ` [PATCH 1/2] KVM: selftests: Fix MADV_COLLAPSE build failure on older toolchains Fuad Tabba
@ 2026-05-11 11:37 ` Fuad Tabba
  2026-05-11 14:58   ` Sean Christopherson
  1 sibling, 1 reply; 11+ messages in thread
From: Fuad Tabba @ 2026-05-11 11:37 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson, Shuah Khan
  Cc: Marc Zyngier, Oliver Upton, Will Deacon, Ackerley Tng,
	David Matlack, kvm, linux-kselftest, linux-kernel

kvm_vm_release() closes vmp->fd and vmp->kvm_fd unconditionally, and
kvm_vm_free() calls kvm_vm_release() at teardown.  A test that calls
kvm_vm_release() and then kvm_vm_free() without a
vm_recreate_with_one_vcpu() in between double-closes both FDs.  Since
kvm_close() asserts on close() failure, the second close trips
TEST_ASSERT and aborts the test, or, if the FD was recycled, silently
closes an unrelated file.

Guard the two closes in kvm_vm_release() by checking each FD against
-1 and resetting it to -1 after closing, matching the existing
kvm_stats_release() idiom.  Existing in-tree callers all pass through
vm_recreate_with_one_vcpu() before teardown, so they reassign the FDs
and do not hit the bug today.

Fixes: fa3899add105 ("kvm: selftests: add basic test for state save and restore")
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 tools/testing/selftests/kvm/lib/kvm_util.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 2a76eca7029d..e44223714fd4 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -793,8 +793,14 @@ void kvm_vm_release(struct kvm_vm *vmp)
 	list_for_each_entry_safe(vcpu, tmp, &vmp->vcpus, list)
 		vm_vcpu_rm(vmp, vcpu);
 
-	kvm_close(vmp->fd);
-	kvm_close(vmp->kvm_fd);
+	if (vmp->fd >= 0) {
+		kvm_close(vmp->fd);
+		vmp->fd = -1;
+	}
+	if (vmp->kvm_fd >= 0) {
+		kvm_close(vmp->kvm_fd);
+		vmp->kvm_fd = -1;
+	}
 
 	/* Free cached stats metadata and close FD */
 	kvm_stats_release(&vmp->stats);
-- 
2.54.0.563.g4f69b47b94-goog


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release()
  2026-05-11 11:37 ` [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release() Fuad Tabba
@ 2026-05-11 14:58   ` Sean Christopherson
  2026-05-11 15:19     ` Fuad Tabba
  0 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2026-05-11 14:58 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton,
	Will Deacon, Ackerley Tng, David Matlack, kvm, linux-kselftest,
	linux-kernel

On Mon, May 11, 2026, Fuad Tabba wrote:
> kvm_vm_release() closes vmp->fd and vmp->kvm_fd unconditionally, and
> kvm_vm_free() calls kvm_vm_release() at teardown.  A test that calls
> kvm_vm_release() and then kvm_vm_free() without a
> vm_recreate_with_one_vcpu() in between double-closes both FDs.

That's a test bug, no?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/2] KVM: selftests: Fix MADV_COLLAPSE build failure on older toolchains
  2026-05-11 11:37 ` [PATCH 1/2] KVM: selftests: Fix MADV_COLLAPSE build failure on older toolchains Fuad Tabba
@ 2026-05-11 14:59   ` Sean Christopherson
  0 siblings, 0 replies; 11+ messages in thread
From: Sean Christopherson @ 2026-05-11 14:59 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton,
	Will Deacon, Ackerley Tng, David Matlack, kvm, linux-kselftest,
	linux-kernel

On Mon, May 11, 2026, Fuad Tabba wrote:
> guest_memfd_test.c uses MADV_COLLAPSE unconditionally, but the
> constant was only exposed in glibc 2.37. Distros still on older glibc
> (e.g. Debian Bookworm, glibc 2.36) cannot build the test.
> 
> Add a compile-time fallback that defines MADV_COLLAPSE to its
> kernel-ABI value when the libc header does not, matching the
> established selftest pattern for sometimes-missing libc constants
> (see selftests/bpf/uprobe_multi.c, selftests/mm/hugetlb-soft-offline.c).
> 
> Fixes: 9830209b4ae8 ("KVM: selftests: Test MADV_COLLAPSE on guest_memfd")
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  tools/testing/selftests/kvm/guest_memfd_test.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
> index d6528c6f5e03..45990ae5f1e5 100644
> --- a/tools/testing/selftests/kvm/guest_memfd_test.c
> +++ b/tools/testing/selftests/kvm/guest_memfd_test.c
> @@ -23,6 +23,10 @@
>  #include "test_util.h"
>  #include "ucall_common.h"
>  
> +#ifndef MADV_COLLAPSE
> +#define MADV_COLLAPSE	25
> +#endif

Fixed by:

  https://lore.kernel.org/all/20260428012503.1213654-1-seanjc@google.com

I'll get that applied and on its way to Paolo this week.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release()
  2026-05-11 14:58   ` Sean Christopherson
@ 2026-05-11 15:19     ` Fuad Tabba
  2026-05-11 20:25       ` Sean Christopherson
  0 siblings, 1 reply; 11+ messages in thread
From: Fuad Tabba @ 2026-05-11 15:19 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton,
	Will Deacon, Ackerley Tng, David Matlack, kvm, linux-kselftest,
	linux-kernel

On Mon, 11 May 2026 at 15:58, Sean Christopherson <seanjc@google.com> wrote:
>
> On Mon, May 11, 2026, Fuad Tabba wrote:
> > kvm_vm_release() closes vmp->fd and vmp->kvm_fd unconditionally, and
> > kvm_vm_free() calls kvm_vm_release() at teardown.  A test that calls
> > kvm_vm_release() and then kvm_vm_free() without a
> > vm_recreate_with_one_vcpu() in between double-closes both FDs.
>
> That's a test bug, no?

Fair point.  Let me describe the actual use case and then I'd like
your guidance on which way to fix it.

The pKVM selftests I'm preparing need to verify that protected guest
memory is zeroed when the VM is destroyed before the host regains
visibility of those pages.  The test pattern is:

  vm_userspace_mem_region_add(vm, ..., GPA_BASE, 1, GPA_PAGES, 0);
  /* run protected guest, which writes/shares/unshares its memory */
  ...
  TEST_ASSERT(get_proc_locked_vm_size() > GPA_SIZE);   /* still pinned */

  kvm_vm_release(vm);                                  /* trigger destroy */

  /* host_mem is still mmap'd in the test process; pKVM should have
   * zeroed it before unpinning */
  TEST_ASSERT(is_zero(region->host_mem, GPA_PAGES * PAGE_SIZE));

  kvm_vm_free(vm);                                     /* tidy up */
  TEST_ASSERT(get_proc_locked_vm_size() == 0);

I used kvm_vm_release() because it's the only public API that closes
vm->fd to trigger kernel-side destruction.  But the existing callers
follow it with vm_recreate_with_one_vcpu(), so the "release + later
kvm_vm_free()" path isn't exercised today.

I see three ways to make this clean:
  a) This patch: kvm_vm_release() becomes idempotent for its three
     FDs, matching the kvm_stats_release() idiom it already invokes.
  b) Leave kvm_vm_release() as-is and add a dedicated helper, e.g.
     kvm_vm_destroy_kernel(), that closes vm->fd to trigger kernel
     destruction while leaving the kvm_vm struct intact for
     post-destruction inspection.  kvm_vm_free() learns to handle the
     half-released state.
  c) Something else entirely, e.g., the test should manage vm->fd
     directly and not rely on library helpers for this pattern.

I lean (a). (b) doesn't really avoid the change, as kvm_vm_free()
still calls kvm_vm_release() internally, so either kvm_vm_release()
gains these guards, or kvm_vm_free() grows new "already half-released"
state, or kvm_vm_destroy_kernel() open-codes the same guarded closes
on every FD it touches.  (a) is the minimal form, and it lines up
vmp->fd / vmp->kvm_fd with vmp->stats.fd, which kvm_vm_release()
already handles this way via kvm_stats_release().  But happy to take
(b) or (c) if you'd rather not broaden kvm_vm_release()'s contract.

Cheers,
/fuad

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release()
  2026-05-11 15:19     ` Fuad Tabba
@ 2026-05-11 20:25       ` Sean Christopherson
  2026-05-12  8:06         ` Fuad Tabba
  0 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2026-05-11 20:25 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton,
	Will Deacon, Ackerley Tng, David Matlack, kvm, linux-kselftest,
	linux-kernel

On Mon, May 11, 2026, Fuad Tabba wrote:
> On Mon, 11 May 2026 at 15:58, Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Mon, May 11, 2026, Fuad Tabba wrote:
> > > kvm_vm_release() closes vmp->fd and vmp->kvm_fd unconditionally, and
> > > kvm_vm_free() calls kvm_vm_release() at teardown.  A test that calls
> > > kvm_vm_release() and then kvm_vm_free() without a
> > > vm_recreate_with_one_vcpu() in between double-closes both FDs.
> >
> > That's a test bug, no?
> 
> Fair point.  Let me describe the actual use case and then I'd like
> your guidance on which way to fix it.
> 
> The pKVM selftests I'm preparing need to verify that protected guest
> memory is zeroed when the VM is destroyed before the host regains
> visibility of those pages.  The test pattern is:
> 
>   vm_userspace_mem_region_add(vm, ..., GPA_BASE, 1, GPA_PAGES, 0);
>   /* run protected guest, which writes/shares/unshares its memory */
>   ...
>   TEST_ASSERT(get_proc_locked_vm_size() > GPA_SIZE);   /* still pinned */
> 
>   kvm_vm_release(vm);                                  /* trigger destroy */
> 
>   /* host_mem is still mmap'd in the test process; pKVM should have
>    * zeroed it before unpinning */
>   TEST_ASSERT(is_zero(region->host_mem, GPA_PAGES * PAGE_SIZE));
> 
>   kvm_vm_free(vm);                                     /* tidy up */
>   TEST_ASSERT(get_proc_locked_vm_size() == 0);
> 
> I used kvm_vm_release() because it's the only public API that closes
> vm->fd to trigger kernel-side destruction.  But the existing callers
> follow it with vm_recreate_with_one_vcpu(), so the "release + later
> kvm_vm_free()" path isn't exercised today.
> 
> I see three ways to make this clean:
>   a) This patch: kvm_vm_release() becomes idempotent for its three
>      FDs, matching the kvm_stats_release() idiom it already invokes.
>   b) Leave kvm_vm_release() as-is and add a dedicated helper, e.g.
>      kvm_vm_destroy_kernel(), that closes vm->fd to trigger kernel
>      destruction while leaving the kvm_vm struct intact for
>      post-destruction inspection.  kvm_vm_free() learns to handle the
>      half-released state.
>   c) Something else entirely, e.g., the test should manage vm->fd
>      directly and not rely on library helpers for this pattern.

    d) Fully kill the VM; validate the semantics with an explict mmap().

The entire point of the test you are writing is to verfiy that a guest_memfd VMA
doesn't somehow cause KVM to leak state.  So, make that obvious instead of abusing
APIs that kinda sorta do what you want, but not really.

	mem = kvm_mmap(region->mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED,
		       region->guest_memfd);

	...

	kvm_vm_free(vm);

	TEST_ASSERT(is_zero(mem, ...));

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release()
  2026-05-11 20:25       ` Sean Christopherson
@ 2026-05-12  8:06         ` Fuad Tabba
  2026-05-12 13:30           ` Sean Christopherson
  0 siblings, 1 reply; 11+ messages in thread
From: Fuad Tabba @ 2026-05-12  8:06 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton,
	Will Deacon, Ackerley Tng, David Matlack, kvm, linux-kselftest,
	linux-kernel

On Mon, 11 May 2026 at 21:25, Sean Christopherson <seanjc@google.com> wrote:
>
> On Mon, May 11, 2026, Fuad Tabba wrote:
> > On Mon, 11 May 2026 at 15:58, Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Mon, May 11, 2026, Fuad Tabba wrote:
> > > > kvm_vm_release() closes vmp->fd and vmp->kvm_fd unconditionally, and
> > > > kvm_vm_free() calls kvm_vm_release() at teardown.  A test that calls
> > > > kvm_vm_release() and then kvm_vm_free() without a
> > > > vm_recreate_with_one_vcpu() in between double-closes both FDs.
> > >
> > > That's a test bug, no?
> >
> > Fair point.  Let me describe the actual use case and then I'd like
> > your guidance on which way to fix it.
> >
> > The pKVM selftests I'm preparing need to verify that protected guest
> > memory is zeroed when the VM is destroyed before the host regains
> > visibility of those pages.  The test pattern is:
> >
> >   vm_userspace_mem_region_add(vm, ..., GPA_BASE, 1, GPA_PAGES, 0);
> >   /* run protected guest, which writes/shares/unshares its memory */
> >   ...
> >   TEST_ASSERT(get_proc_locked_vm_size() > GPA_SIZE);   /* still pinned */
> >
> >   kvm_vm_release(vm);                                  /* trigger destroy */
> >
> >   /* host_mem is still mmap'd in the test process; pKVM should have
> >    * zeroed it before unpinning */
> >   TEST_ASSERT(is_zero(region->host_mem, GPA_PAGES * PAGE_SIZE));
> >
> >   kvm_vm_free(vm);                                     /* tidy up */
> >   TEST_ASSERT(get_proc_locked_vm_size() == 0);
> >
> > I used kvm_vm_release() because it's the only public API that closes
> > vm->fd to trigger kernel-side destruction.  But the existing callers
> > follow it with vm_recreate_with_one_vcpu(), so the "release + later
> > kvm_vm_free()" path isn't exercised today.
> >
> > I see three ways to make this clean:
> >   a) This patch: kvm_vm_release() becomes idempotent for its three
> >      FDs, matching the kvm_stats_release() idiom it already invokes.
> >   b) Leave kvm_vm_release() as-is and add a dedicated helper, e.g.
> >      kvm_vm_destroy_kernel(), that closes vm->fd to trigger kernel
> >      destruction while leaving the kvm_vm struct intact for
> >      post-destruction inspection.  kvm_vm_free() learns to handle the
> >      half-released state.
> >   c) Something else entirely, e.g., the test should manage vm->fd
> >      directly and not rely on library helpers for this pattern.
>
>     d) Fully kill the VM; validate the semantics with an explict mmap().
>
> The entire point of the test you are writing is to verfiy that a guest_memfd VMA
> doesn't somehow cause KVM to leak state.  So, make that obvious instead of abusing
> APIs that kinda sorta do what you want, but not really.
>
>         mem = kvm_mmap(region->mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED,
>                        region->guest_memfd);
>
>         ...
>
>         kvm_vm_free(vm);
>
>         TEST_ASSERT(is_zero(mem, ...));

The test isn't about guest_memfd.  The pKVM support that just landed
via Will's series [1] is built on anonymous host memory, and that's
the only backing this selftest exercises (VM_MEM_SRC_ANONYMOUS and
VM_MEM_SRC_ANONYMOUS_THP).  The property under test is the one
documented by the same series in Documentation/virt/kvm/arm/pkvm.rst:

  Status: Isolation of anonymous memory and metadata pages.
  ...
  Reclaimed memory is zeroed by the hypervisor ...

The test mirrors that contract: assert VmLck > donated size before
teardown (pages pinned), trigger kernel-side destruction, assert the
host VMA reads as zero (hypervisor scrubbed on reclaim), then assert
VmLck == 0 (pages unpinned).

kvm_mmap() + kvm_vm_free() + is_zero() doesn't translate here.  The only
host view of the donated pages is the memslot mmap, and kvm_vm_free()
munmaps it on the way out, so inspection has to happen between
kernel-side destruction and userspace free.  kvm_vm_release() is the
only library primitive that does that today.

What do you suggest?

Cheers,
/fuad

[1] https://lore.kernel.org/all/20260105154939.11041-1-will@kernel.org/

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release()
  2026-05-12  8:06         ` Fuad Tabba
@ 2026-05-12 13:30           ` Sean Christopherson
  2026-05-12 15:04             ` Will Deacon
  2026-05-12 15:06             ` Fuad Tabba
  0 siblings, 2 replies; 11+ messages in thread
From: Sean Christopherson @ 2026-05-12 13:30 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton,
	Will Deacon, Ackerley Tng, David Matlack, kvm, linux-kselftest,
	linux-kernel

On Tue, May 12, 2026, Fuad Tabba wrote:
> On Mon, 11 May 2026 at 21:25, Sean Christopherson <seanjc@google.com> wrote:
> > > I used kvm_vm_release() because it's the only public API that closes
> > > vm->fd to trigger kernel-side destruction.  But the existing callers
> > > follow it with vm_recreate_with_one_vcpu(), so the "release + later
> > > kvm_vm_free()" path isn't exercised today.
> > >
> > > I see three ways to make this clean:
> > >   a) This patch: kvm_vm_release() becomes idempotent for its three
> > >      FDs, matching the kvm_stats_release() idiom it already invokes.
> > >   b) Leave kvm_vm_release() as-is and add a dedicated helper, e.g.
> > >      kvm_vm_destroy_kernel(), that closes vm->fd to trigger kernel
> > >      destruction while leaving the kvm_vm struct intact for
> > >      post-destruction inspection.  kvm_vm_free() learns to handle the
> > >      half-released state.
> > >   c) Something else entirely, e.g., the test should manage vm->fd
> > >      directly and not rely on library helpers for this pattern.
> >
> >     d) Fully kill the VM; validate the semantics with an explict mmap().
> >
> > The entire point of the test you are writing is to verfiy that a guest_memfd VMA
> > doesn't somehow cause KVM to leak state.  So, make that obvious instead of abusing
> > APIs that kinda sorta do what you want, but not really.
> >
> >         mem = kvm_mmap(region->mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED,
> >                        region->guest_memfd);
> >
> >         ...
> >
> >         kvm_vm_free(vm);
> >
> >         TEST_ASSERT(is_zero(mem, ...));
> 
> The test isn't about guest_memfd.  The pKVM support that just landed
> via Will's series [1] 

Landed where?  Is pKVM actually going upstream with anonymous memory?  I thought
the inability to protect against page faults in the untrusted kernel was a
non-starter?

> kvm_mmap() + kvm_vm_free() + is_zero() doesn't translate here.  The only
> host view of the donated pages is the memslot mmap, and kvm_vm_free()
> munmaps it on the way out, so inspection has to happen between
> kernel-side destruction and userspace free.  kvm_vm_release() is the
> only library primitive that does that today.
> 
> What do you suggest?

Manually allocate the memory and expose it to the guest via
vm_set_user_memory_region2() vm_set_user_memory_region().


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release()
  2026-05-12 13:30           ` Sean Christopherson
@ 2026-05-12 15:04             ` Will Deacon
  2026-05-12 15:06             ` Fuad Tabba
  1 sibling, 0 replies; 11+ messages in thread
From: Will Deacon @ 2026-05-12 15:04 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Fuad Tabba, Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton,
	Ackerley Tng, David Matlack, kvm, linux-kselftest, linux-kernel

On Tue, May 12, 2026 at 06:30:00AM -0700, Sean Christopherson wrote:
> On Tue, May 12, 2026, Fuad Tabba wrote:
> > On Mon, 11 May 2026 at 21:25, Sean Christopherson <seanjc@google.com> wrote:
> > > > I used kvm_vm_release() because it's the only public API that closes
> > > > vm->fd to trigger kernel-side destruction.  But the existing callers
> > > > follow it with vm_recreate_with_one_vcpu(), so the "release + later
> > > > kvm_vm_free()" path isn't exercised today.
> > > >
> > > > I see three ways to make this clean:
> > > >   a) This patch: kvm_vm_release() becomes idempotent for its three
> > > >      FDs, matching the kvm_stats_release() idiom it already invokes.
> > > >   b) Leave kvm_vm_release() as-is and add a dedicated helper, e.g.
> > > >      kvm_vm_destroy_kernel(), that closes vm->fd to trigger kernel
> > > >      destruction while leaving the kvm_vm struct intact for
> > > >      post-destruction inspection.  kvm_vm_free() learns to handle the
> > > >      half-released state.
> > > >   c) Something else entirely, e.g., the test should manage vm->fd
> > > >      directly and not rely on library helpers for this pattern.
> > >
> > >     d) Fully kill the VM; validate the semantics with an explict mmap().
> > >
> > > The entire point of the test you are writing is to verfiy that a guest_memfd VMA
> > > doesn't somehow cause KVM to leak state.  So, make that obvious instead of abusing
> > > APIs that kinda sorta do what you want, but not really.
> > >
> > >         mem = kvm_mmap(region->mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED,
> > >                        region->guest_memfd);
> > >
> > >         ...
> > >
> > >         kvm_vm_free(vm);
> > >
> > >         TEST_ASSERT(is_zero(mem, ...));
> > 
> > The test isn't about guest_memfd.  The pKVM support that just landed
> > via Will's series [1] 
> 
> Landed where?  Is pKVM actually going upstream with anonymous memory?

The basic memory protection code for pKVM using anonymous memory is now
upstream but we still have a tonne to do for CPU state, DMA, firmware
loading etc. It obviously doesn't preclude support for guest_memfd in
future and, as you know, Fuad has been actively involved in that
discussion over the years. It does, however, mean that we'd be offering
guest_memfd alongside anonymous memory, which matches what we'd have to
do in Android at this point anyway (given the millions of devices using
anonymous memory in the field). Of course, if that means we should have
an additional set of tests, then that's fine.

> I thought the inability to protect against page faults in the untrusted
> kernel was a non-starter?

We ended up solving that by forcefully reclaiming the target page from
the guest. So the flow looks something like:

1. The host kernel accesses a private (faulting) page via a kernel mapping
2. An exception is taken to the hypervisor
3. The hypervisor injects an exception back into the kernel, with
   sufficient syndrome information to triage it as an access to a private
   page.
4. The kernel searches its usual exception handlers (for things like
   load_unaligned_zeropad())
5. If no handlers are found, the kernel translates the faulting virtual
   address into a physical address using the 'AT' instruction and issues
   a hypercall to forcefully reclaim the page.
6. The hypervisor unmaps the page from the guest and replaces the guest
   pte with an immutable "poison" (faulting) entry. If a vCPU later
   faults on this entry, it returns to userspace with -EFAULT.
7. The hypervisor clears the unmapped page, maps it back into the host
   and returns from the hypercall.
8. The kernel treats the fault as "spurious" and retries the faulting
   instruction.

If the faulting access in (1) instead came from userspace, the kernel
injects a SEGV at (4). If the faulting access came from a uaccess
routine (e.g. copy_from_user()), then the fixup handler would run like
normal in (4).

Will

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release()
  2026-05-12 13:30           ` Sean Christopherson
  2026-05-12 15:04             ` Will Deacon
@ 2026-05-12 15:06             ` Fuad Tabba
  1 sibling, 0 replies; 11+ messages in thread
From: Fuad Tabba @ 2026-05-12 15:06 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton,
	Will Deacon, Ackerley Tng, David Matlack, kvm, linux-kselftest,
	linux-kernel

On Tue, 12 May 2026 at 14:30, Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, May 12, 2026, Fuad Tabba wrote:
> > On Mon, 11 May 2026 at 21:25, Sean Christopherson <seanjc@google.com> wrote:
> > > > I used kvm_vm_release() because it's the only public API that closes
> > > > vm->fd to trigger kernel-side destruction.  But the existing callers
> > > > follow it with vm_recreate_with_one_vcpu(), so the "release + later
> > > > kvm_vm_free()" path isn't exercised today.
> > > >
> > > > I see three ways to make this clean:
> > > >   a) This patch: kvm_vm_release() becomes idempotent for its three
> > > >      FDs, matching the kvm_stats_release() idiom it already invokes.
> > > >   b) Leave kvm_vm_release() as-is and add a dedicated helper, e.g.
> > > >      kvm_vm_destroy_kernel(), that closes vm->fd to trigger kernel
> > > >      destruction while leaving the kvm_vm struct intact for
> > > >      post-destruction inspection.  kvm_vm_free() learns to handle the
> > > >      half-released state.
> > > >   c) Something else entirely, e.g., the test should manage vm->fd
> > > >      directly and not rely on library helpers for this pattern.
> > >
> > >     d) Fully kill the VM; validate the semantics with an explict mmap().
> > >
> > > The entire point of the test you are writing is to verfiy that a guest_memfd VMA
> > > doesn't somehow cause KVM to leak state.  So, make that obvious instead of abusing
> > > APIs that kinda sorta do what you want, but not really.
> > >
> > >         mem = kvm_mmap(region->mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED,
> > >                        region->guest_memfd);
> > >
> > >         ...
> > >
> > >         kvm_vm_free(vm);
> > >
> > >         TEST_ASSERT(is_zero(mem, ...));
> >
> > The test isn't about guest_memfd.  The pKVM support that just landed
> > via Will's series [1]
>
> Landed where?  Is pKVM actually going upstream with anonymous memory?  I thought
> the inability to protect against page faults in the untrusted kernel was a
> non-starter?
>
> > kvm_mmap() + kvm_vm_free() + is_zero() doesn't translate here.  The only
> > host view of the donated pages is the memslot mmap, and kvm_vm_free()
> > munmaps it on the way out, so inspection has to happen between
> > kernel-side destruction and userspace free.  kvm_vm_release() is the
> > only library primitive that does that today.
> >
> > What do you suggest?
>
> Manually allocate the memory and expose it to the guest via
> vm_set_user_memory_region2() vm_set_user_memory_region().
>

That works for this test and is cleaner.  I'll restructure: mmap the
backing in the test, vm_set_user_memory_region2() it in, then
kvm_vm_free() -> is_zero() through the test-owned VMA -> munmap().
That drops the motivation for the kvm_vm_release() fd-guard patch
in this series; happy to drop it, or to keep it as a standalone
"match the kvm_stats_release() idiom" cleanup, whichever you prefer.

Thanks,
/fuad

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2026-05-12 15:06 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-11 11:37 [PATCH 0/2] KVM: selftests: Fixes for guest_memfd_test and FD double-close Fuad Tabba
2026-05-11 11:37 ` [PATCH 1/2] KVM: selftests: Fix MADV_COLLAPSE build failure on older toolchains Fuad Tabba
2026-05-11 14:59   ` Sean Christopherson
2026-05-11 11:37 ` [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release() Fuad Tabba
2026-05-11 14:58   ` Sean Christopherson
2026-05-11 15:19     ` Fuad Tabba
2026-05-11 20:25       ` Sean Christopherson
2026-05-12  8:06         ` Fuad Tabba
2026-05-12 13:30           ` Sean Christopherson
2026-05-12 15:04             ` Will Deacon
2026-05-12 15:06             ` Fuad Tabba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox