* [PATCH 0/2] KVM: selftests: Fixes for guest_memfd_test and FD double-close @ 2026-05-11 11:37 Fuad Tabba 2026-05-11 11:37 ` [PATCH 1/2] KVM: selftests: Fix MADV_COLLAPSE build failure on older toolchains Fuad Tabba 2026-05-11 11:37 ` [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release() Fuad Tabba 0 siblings, 2 replies; 11+ messages in thread From: Fuad Tabba @ 2026-05-11 11:37 UTC (permalink / raw) To: Paolo Bonzini, Sean Christopherson, Shuah Khan Cc: Marc Zyngier, Oliver Upton, Will Deacon, Ackerley Tng, David Matlack, kvm, linux-kselftest, linux-kernel Hi folks, Two more patches, standalone KVM selftests fixes that I'd like to land before posting a pKVM selftests series, a follow-up to Will's pKVM infrastructure series [1]. The first patch fixes a build failure I ran into on Debian Bookworm (glibc 2.36): guest_memfd_test.c uses MADV_COLLAPSE unconditionally, but the constant was only exposed in glibc 2.37. The fix follows the established selftest idiom of providing a compile-time fallback to the kernel-ABI value when the libc header does not define it. The second patch fixes an FD double-close in kvm_vm_release(). kvm_vm_free() calls kvm_vm_release() internally, so a test that calls kvm_vm_release() and then kvm_vm_free() without reopening the VM in between double-closes vmp->fd and vmp->kvm_fd. Existing in-tree callers reopen via vm_recreate_with_one_vcpu() and do not hit this today. The double-close becomes a hard test failure for the pKVM selftests I will submit, which release the VM and then inspect host-mapped memory before freeing. Based on Linux 7.1-rc3. Cheers, /fuad [1] https://lore.kernel.org/all/20260105154939.11041-1-will@kernel.org/ Fuad Tabba (2): KVM: selftests: Fix MADV_COLLAPSE build failure on older toolchains KVM: selftests: Fix FD double-close in kvm_vm_release() tools/testing/selftests/kvm/guest_memfd_test.c | 4 ++++ tools/testing/selftests/kvm/lib/kvm_util.c | 10 ++++++++-- 2 files changed, 12 insertions(+), 2 deletions(-) -- 2.54.0.563.g4f69b47b94-goog ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/2] KVM: selftests: Fix MADV_COLLAPSE build failure on older toolchains 2026-05-11 11:37 [PATCH 0/2] KVM: selftests: Fixes for guest_memfd_test and FD double-close Fuad Tabba @ 2026-05-11 11:37 ` Fuad Tabba 2026-05-11 14:59 ` Sean Christopherson 2026-05-11 11:37 ` [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release() Fuad Tabba 1 sibling, 1 reply; 11+ messages in thread From: Fuad Tabba @ 2026-05-11 11:37 UTC (permalink / raw) To: Paolo Bonzini, Sean Christopherson, Shuah Khan Cc: Marc Zyngier, Oliver Upton, Will Deacon, Ackerley Tng, David Matlack, kvm, linux-kselftest, linux-kernel guest_memfd_test.c uses MADV_COLLAPSE unconditionally, but the constant was only exposed in glibc 2.37. Distros still on older glibc (e.g. Debian Bookworm, glibc 2.36) cannot build the test. Add a compile-time fallback that defines MADV_COLLAPSE to its kernel-ABI value when the libc header does not, matching the established selftest pattern for sometimes-missing libc constants (see selftests/bpf/uprobe_multi.c, selftests/mm/hugetlb-soft-offline.c). Fixes: 9830209b4ae8 ("KVM: selftests: Test MADV_COLLAPSE on guest_memfd") Signed-off-by: Fuad Tabba <tabba@google.com> --- tools/testing/selftests/kvm/guest_memfd_test.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c index d6528c6f5e03..45990ae5f1e5 100644 --- a/tools/testing/selftests/kvm/guest_memfd_test.c +++ b/tools/testing/selftests/kvm/guest_memfd_test.c @@ -23,6 +23,10 @@ #include "test_util.h" #include "ucall_common.h" +#ifndef MADV_COLLAPSE +#define MADV_COLLAPSE 25 +#endif + static size_t page_size; static void test_file_read_write(int fd, size_t total_size) -- 2.54.0.563.g4f69b47b94-goog ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/2] KVM: selftests: Fix MADV_COLLAPSE build failure on older toolchains 2026-05-11 11:37 ` [PATCH 1/2] KVM: selftests: Fix MADV_COLLAPSE build failure on older toolchains Fuad Tabba @ 2026-05-11 14:59 ` Sean Christopherson 0 siblings, 0 replies; 11+ messages in thread From: Sean Christopherson @ 2026-05-11 14:59 UTC (permalink / raw) To: Fuad Tabba Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton, Will Deacon, Ackerley Tng, David Matlack, kvm, linux-kselftest, linux-kernel On Mon, May 11, 2026, Fuad Tabba wrote: > guest_memfd_test.c uses MADV_COLLAPSE unconditionally, but the > constant was only exposed in glibc 2.37. Distros still on older glibc > (e.g. Debian Bookworm, glibc 2.36) cannot build the test. > > Add a compile-time fallback that defines MADV_COLLAPSE to its > kernel-ABI value when the libc header does not, matching the > established selftest pattern for sometimes-missing libc constants > (see selftests/bpf/uprobe_multi.c, selftests/mm/hugetlb-soft-offline.c). > > Fixes: 9830209b4ae8 ("KVM: selftests: Test MADV_COLLAPSE on guest_memfd") > Signed-off-by: Fuad Tabba <tabba@google.com> > --- > tools/testing/selftests/kvm/guest_memfd_test.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c > index d6528c6f5e03..45990ae5f1e5 100644 > --- a/tools/testing/selftests/kvm/guest_memfd_test.c > +++ b/tools/testing/selftests/kvm/guest_memfd_test.c > @@ -23,6 +23,10 @@ > #include "test_util.h" > #include "ucall_common.h" > > +#ifndef MADV_COLLAPSE > +#define MADV_COLLAPSE 25 > +#endif Fixed by: https://lore.kernel.org/all/20260428012503.1213654-1-seanjc@google.com I'll get that applied and on its way to Paolo this week. ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release() 2026-05-11 11:37 [PATCH 0/2] KVM: selftests: Fixes for guest_memfd_test and FD double-close Fuad Tabba 2026-05-11 11:37 ` [PATCH 1/2] KVM: selftests: Fix MADV_COLLAPSE build failure on older toolchains Fuad Tabba @ 2026-05-11 11:37 ` Fuad Tabba 2026-05-11 14:58 ` Sean Christopherson 1 sibling, 1 reply; 11+ messages in thread From: Fuad Tabba @ 2026-05-11 11:37 UTC (permalink / raw) To: Paolo Bonzini, Sean Christopherson, Shuah Khan Cc: Marc Zyngier, Oliver Upton, Will Deacon, Ackerley Tng, David Matlack, kvm, linux-kselftest, linux-kernel kvm_vm_release() closes vmp->fd and vmp->kvm_fd unconditionally, and kvm_vm_free() calls kvm_vm_release() at teardown. A test that calls kvm_vm_release() and then kvm_vm_free() without a vm_recreate_with_one_vcpu() in between double-closes both FDs. Since kvm_close() asserts on close() failure, the second close trips TEST_ASSERT and aborts the test, or, if the FD was recycled, silently closes an unrelated file. Guard the two closes in kvm_vm_release() by checking each FD against -1 and resetting it to -1 after closing, matching the existing kvm_stats_release() idiom. Existing in-tree callers all pass through vm_recreate_with_one_vcpu() before teardown, so they reassign the FDs and do not hit the bug today. Fixes: fa3899add105 ("kvm: selftests: add basic test for state save and restore") Signed-off-by: Fuad Tabba <tabba@google.com> --- tools/testing/selftests/kvm/lib/kvm_util.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index 2a76eca7029d..e44223714fd4 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -793,8 +793,14 @@ void kvm_vm_release(struct kvm_vm *vmp) list_for_each_entry_safe(vcpu, tmp, &vmp->vcpus, list) vm_vcpu_rm(vmp, vcpu); - kvm_close(vmp->fd); - kvm_close(vmp->kvm_fd); + if (vmp->fd >= 0) { + kvm_close(vmp->fd); + vmp->fd = -1; + } + if (vmp->kvm_fd >= 0) { + kvm_close(vmp->kvm_fd); + vmp->kvm_fd = -1; + } /* Free cached stats metadata and close FD */ kvm_stats_release(&vmp->stats); -- 2.54.0.563.g4f69b47b94-goog ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release() 2026-05-11 11:37 ` [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release() Fuad Tabba @ 2026-05-11 14:58 ` Sean Christopherson 2026-05-11 15:19 ` Fuad Tabba 0 siblings, 1 reply; 11+ messages in thread From: Sean Christopherson @ 2026-05-11 14:58 UTC (permalink / raw) To: Fuad Tabba Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton, Will Deacon, Ackerley Tng, David Matlack, kvm, linux-kselftest, linux-kernel On Mon, May 11, 2026, Fuad Tabba wrote: > kvm_vm_release() closes vmp->fd and vmp->kvm_fd unconditionally, and > kvm_vm_free() calls kvm_vm_release() at teardown. A test that calls > kvm_vm_release() and then kvm_vm_free() without a > vm_recreate_with_one_vcpu() in between double-closes both FDs. That's a test bug, no? ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release() 2026-05-11 14:58 ` Sean Christopherson @ 2026-05-11 15:19 ` Fuad Tabba 2026-05-11 20:25 ` Sean Christopherson 0 siblings, 1 reply; 11+ messages in thread From: Fuad Tabba @ 2026-05-11 15:19 UTC (permalink / raw) To: Sean Christopherson Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton, Will Deacon, Ackerley Tng, David Matlack, kvm, linux-kselftest, linux-kernel On Mon, 11 May 2026 at 15:58, Sean Christopherson <seanjc@google.com> wrote: > > On Mon, May 11, 2026, Fuad Tabba wrote: > > kvm_vm_release() closes vmp->fd and vmp->kvm_fd unconditionally, and > > kvm_vm_free() calls kvm_vm_release() at teardown. A test that calls > > kvm_vm_release() and then kvm_vm_free() without a > > vm_recreate_with_one_vcpu() in between double-closes both FDs. > > That's a test bug, no? Fair point. Let me describe the actual use case and then I'd like your guidance on which way to fix it. The pKVM selftests I'm preparing need to verify that protected guest memory is zeroed when the VM is destroyed before the host regains visibility of those pages. The test pattern is: vm_userspace_mem_region_add(vm, ..., GPA_BASE, 1, GPA_PAGES, 0); /* run protected guest, which writes/shares/unshares its memory */ ... TEST_ASSERT(get_proc_locked_vm_size() > GPA_SIZE); /* still pinned */ kvm_vm_release(vm); /* trigger destroy */ /* host_mem is still mmap'd in the test process; pKVM should have * zeroed it before unpinning */ TEST_ASSERT(is_zero(region->host_mem, GPA_PAGES * PAGE_SIZE)); kvm_vm_free(vm); /* tidy up */ TEST_ASSERT(get_proc_locked_vm_size() == 0); I used kvm_vm_release() because it's the only public API that closes vm->fd to trigger kernel-side destruction. But the existing callers follow it with vm_recreate_with_one_vcpu(), so the "release + later kvm_vm_free()" path isn't exercised today. I see three ways to make this clean: a) This patch: kvm_vm_release() becomes idempotent for its three FDs, matching the kvm_stats_release() idiom it already invokes. b) Leave kvm_vm_release() as-is and add a dedicated helper, e.g. kvm_vm_destroy_kernel(), that closes vm->fd to trigger kernel destruction while leaving the kvm_vm struct intact for post-destruction inspection. kvm_vm_free() learns to handle the half-released state. c) Something else entirely, e.g., the test should manage vm->fd directly and not rely on library helpers for this pattern. I lean (a). (b) doesn't really avoid the change, as kvm_vm_free() still calls kvm_vm_release() internally, so either kvm_vm_release() gains these guards, or kvm_vm_free() grows new "already half-released" state, or kvm_vm_destroy_kernel() open-codes the same guarded closes on every FD it touches. (a) is the minimal form, and it lines up vmp->fd / vmp->kvm_fd with vmp->stats.fd, which kvm_vm_release() already handles this way via kvm_stats_release(). But happy to take (b) or (c) if you'd rather not broaden kvm_vm_release()'s contract. Cheers, /fuad ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release() 2026-05-11 15:19 ` Fuad Tabba @ 2026-05-11 20:25 ` Sean Christopherson 2026-05-12 8:06 ` Fuad Tabba 0 siblings, 1 reply; 11+ messages in thread From: Sean Christopherson @ 2026-05-11 20:25 UTC (permalink / raw) To: Fuad Tabba Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton, Will Deacon, Ackerley Tng, David Matlack, kvm, linux-kselftest, linux-kernel On Mon, May 11, 2026, Fuad Tabba wrote: > On Mon, 11 May 2026 at 15:58, Sean Christopherson <seanjc@google.com> wrote: > > > > On Mon, May 11, 2026, Fuad Tabba wrote: > > > kvm_vm_release() closes vmp->fd and vmp->kvm_fd unconditionally, and > > > kvm_vm_free() calls kvm_vm_release() at teardown. A test that calls > > > kvm_vm_release() and then kvm_vm_free() without a > > > vm_recreate_with_one_vcpu() in between double-closes both FDs. > > > > That's a test bug, no? > > Fair point. Let me describe the actual use case and then I'd like > your guidance on which way to fix it. > > The pKVM selftests I'm preparing need to verify that protected guest > memory is zeroed when the VM is destroyed before the host regains > visibility of those pages. The test pattern is: > > vm_userspace_mem_region_add(vm, ..., GPA_BASE, 1, GPA_PAGES, 0); > /* run protected guest, which writes/shares/unshares its memory */ > ... > TEST_ASSERT(get_proc_locked_vm_size() > GPA_SIZE); /* still pinned */ > > kvm_vm_release(vm); /* trigger destroy */ > > /* host_mem is still mmap'd in the test process; pKVM should have > * zeroed it before unpinning */ > TEST_ASSERT(is_zero(region->host_mem, GPA_PAGES * PAGE_SIZE)); > > kvm_vm_free(vm); /* tidy up */ > TEST_ASSERT(get_proc_locked_vm_size() == 0); > > I used kvm_vm_release() because it's the only public API that closes > vm->fd to trigger kernel-side destruction. But the existing callers > follow it with vm_recreate_with_one_vcpu(), so the "release + later > kvm_vm_free()" path isn't exercised today. > > I see three ways to make this clean: > a) This patch: kvm_vm_release() becomes idempotent for its three > FDs, matching the kvm_stats_release() idiom it already invokes. > b) Leave kvm_vm_release() as-is and add a dedicated helper, e.g. > kvm_vm_destroy_kernel(), that closes vm->fd to trigger kernel > destruction while leaving the kvm_vm struct intact for > post-destruction inspection. kvm_vm_free() learns to handle the > half-released state. > c) Something else entirely, e.g., the test should manage vm->fd > directly and not rely on library helpers for this pattern. d) Fully kill the VM; validate the semantics with an explict mmap(). The entire point of the test you are writing is to verfiy that a guest_memfd VMA doesn't somehow cause KVM to leak state. So, make that obvious instead of abusing APIs that kinda sorta do what you want, but not really. mem = kvm_mmap(region->mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED, region->guest_memfd); ... kvm_vm_free(vm); TEST_ASSERT(is_zero(mem, ...)); ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release() 2026-05-11 20:25 ` Sean Christopherson @ 2026-05-12 8:06 ` Fuad Tabba 2026-05-12 13:30 ` Sean Christopherson 0 siblings, 1 reply; 11+ messages in thread From: Fuad Tabba @ 2026-05-12 8:06 UTC (permalink / raw) To: Sean Christopherson Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton, Will Deacon, Ackerley Tng, David Matlack, kvm, linux-kselftest, linux-kernel On Mon, 11 May 2026 at 21:25, Sean Christopherson <seanjc@google.com> wrote: > > On Mon, May 11, 2026, Fuad Tabba wrote: > > On Mon, 11 May 2026 at 15:58, Sean Christopherson <seanjc@google.com> wrote: > > > > > > On Mon, May 11, 2026, Fuad Tabba wrote: > > > > kvm_vm_release() closes vmp->fd and vmp->kvm_fd unconditionally, and > > > > kvm_vm_free() calls kvm_vm_release() at teardown. A test that calls > > > > kvm_vm_release() and then kvm_vm_free() without a > > > > vm_recreate_with_one_vcpu() in between double-closes both FDs. > > > > > > That's a test bug, no? > > > > Fair point. Let me describe the actual use case and then I'd like > > your guidance on which way to fix it. > > > > The pKVM selftests I'm preparing need to verify that protected guest > > memory is zeroed when the VM is destroyed before the host regains > > visibility of those pages. The test pattern is: > > > > vm_userspace_mem_region_add(vm, ..., GPA_BASE, 1, GPA_PAGES, 0); > > /* run protected guest, which writes/shares/unshares its memory */ > > ... > > TEST_ASSERT(get_proc_locked_vm_size() > GPA_SIZE); /* still pinned */ > > > > kvm_vm_release(vm); /* trigger destroy */ > > > > /* host_mem is still mmap'd in the test process; pKVM should have > > * zeroed it before unpinning */ > > TEST_ASSERT(is_zero(region->host_mem, GPA_PAGES * PAGE_SIZE)); > > > > kvm_vm_free(vm); /* tidy up */ > > TEST_ASSERT(get_proc_locked_vm_size() == 0); > > > > I used kvm_vm_release() because it's the only public API that closes > > vm->fd to trigger kernel-side destruction. But the existing callers > > follow it with vm_recreate_with_one_vcpu(), so the "release + later > > kvm_vm_free()" path isn't exercised today. > > > > I see three ways to make this clean: > > a) This patch: kvm_vm_release() becomes idempotent for its three > > FDs, matching the kvm_stats_release() idiom it already invokes. > > b) Leave kvm_vm_release() as-is and add a dedicated helper, e.g. > > kvm_vm_destroy_kernel(), that closes vm->fd to trigger kernel > > destruction while leaving the kvm_vm struct intact for > > post-destruction inspection. kvm_vm_free() learns to handle the > > half-released state. > > c) Something else entirely, e.g., the test should manage vm->fd > > directly and not rely on library helpers for this pattern. > > d) Fully kill the VM; validate the semantics with an explict mmap(). > > The entire point of the test you are writing is to verfiy that a guest_memfd VMA > doesn't somehow cause KVM to leak state. So, make that obvious instead of abusing > APIs that kinda sorta do what you want, but not really. > > mem = kvm_mmap(region->mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED, > region->guest_memfd); > > ... > > kvm_vm_free(vm); > > TEST_ASSERT(is_zero(mem, ...)); The test isn't about guest_memfd. The pKVM support that just landed via Will's series [1] is built on anonymous host memory, and that's the only backing this selftest exercises (VM_MEM_SRC_ANONYMOUS and VM_MEM_SRC_ANONYMOUS_THP). The property under test is the one documented by the same series in Documentation/virt/kvm/arm/pkvm.rst: Status: Isolation of anonymous memory and metadata pages. ... Reclaimed memory is zeroed by the hypervisor ... The test mirrors that contract: assert VmLck > donated size before teardown (pages pinned), trigger kernel-side destruction, assert the host VMA reads as zero (hypervisor scrubbed on reclaim), then assert VmLck == 0 (pages unpinned). kvm_mmap() + kvm_vm_free() + is_zero() doesn't translate here. The only host view of the donated pages is the memslot mmap, and kvm_vm_free() munmaps it on the way out, so inspection has to happen between kernel-side destruction and userspace free. kvm_vm_release() is the only library primitive that does that today. What do you suggest? Cheers, /fuad [1] https://lore.kernel.org/all/20260105154939.11041-1-will@kernel.org/ ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release() 2026-05-12 8:06 ` Fuad Tabba @ 2026-05-12 13:30 ` Sean Christopherson 2026-05-12 15:04 ` Will Deacon 2026-05-12 15:06 ` Fuad Tabba 0 siblings, 2 replies; 11+ messages in thread From: Sean Christopherson @ 2026-05-12 13:30 UTC (permalink / raw) To: Fuad Tabba Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton, Will Deacon, Ackerley Tng, David Matlack, kvm, linux-kselftest, linux-kernel On Tue, May 12, 2026, Fuad Tabba wrote: > On Mon, 11 May 2026 at 21:25, Sean Christopherson <seanjc@google.com> wrote: > > > I used kvm_vm_release() because it's the only public API that closes > > > vm->fd to trigger kernel-side destruction. But the existing callers > > > follow it with vm_recreate_with_one_vcpu(), so the "release + later > > > kvm_vm_free()" path isn't exercised today. > > > > > > I see three ways to make this clean: > > > a) This patch: kvm_vm_release() becomes idempotent for its three > > > FDs, matching the kvm_stats_release() idiom it already invokes. > > > b) Leave kvm_vm_release() as-is and add a dedicated helper, e.g. > > > kvm_vm_destroy_kernel(), that closes vm->fd to trigger kernel > > > destruction while leaving the kvm_vm struct intact for > > > post-destruction inspection. kvm_vm_free() learns to handle the > > > half-released state. > > > c) Something else entirely, e.g., the test should manage vm->fd > > > directly and not rely on library helpers for this pattern. > > > > d) Fully kill the VM; validate the semantics with an explict mmap(). > > > > The entire point of the test you are writing is to verfiy that a guest_memfd VMA > > doesn't somehow cause KVM to leak state. So, make that obvious instead of abusing > > APIs that kinda sorta do what you want, but not really. > > > > mem = kvm_mmap(region->mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED, > > region->guest_memfd); > > > > ... > > > > kvm_vm_free(vm); > > > > TEST_ASSERT(is_zero(mem, ...)); > > The test isn't about guest_memfd. The pKVM support that just landed > via Will's series [1] Landed where? Is pKVM actually going upstream with anonymous memory? I thought the inability to protect against page faults in the untrusted kernel was a non-starter? > kvm_mmap() + kvm_vm_free() + is_zero() doesn't translate here. The only > host view of the donated pages is the memslot mmap, and kvm_vm_free() > munmaps it on the way out, so inspection has to happen between > kernel-side destruction and userspace free. kvm_vm_release() is the > only library primitive that does that today. > > What do you suggest? Manually allocate the memory and expose it to the guest via vm_set_user_memory_region2() vm_set_user_memory_region(). ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release() 2026-05-12 13:30 ` Sean Christopherson @ 2026-05-12 15:04 ` Will Deacon 2026-05-12 15:06 ` Fuad Tabba 1 sibling, 0 replies; 11+ messages in thread From: Will Deacon @ 2026-05-12 15:04 UTC (permalink / raw) To: Sean Christopherson Cc: Fuad Tabba, Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton, Ackerley Tng, David Matlack, kvm, linux-kselftest, linux-kernel On Tue, May 12, 2026 at 06:30:00AM -0700, Sean Christopherson wrote: > On Tue, May 12, 2026, Fuad Tabba wrote: > > On Mon, 11 May 2026 at 21:25, Sean Christopherson <seanjc@google.com> wrote: > > > > I used kvm_vm_release() because it's the only public API that closes > > > > vm->fd to trigger kernel-side destruction. But the existing callers > > > > follow it with vm_recreate_with_one_vcpu(), so the "release + later > > > > kvm_vm_free()" path isn't exercised today. > > > > > > > > I see three ways to make this clean: > > > > a) This patch: kvm_vm_release() becomes idempotent for its three > > > > FDs, matching the kvm_stats_release() idiom it already invokes. > > > > b) Leave kvm_vm_release() as-is and add a dedicated helper, e.g. > > > > kvm_vm_destroy_kernel(), that closes vm->fd to trigger kernel > > > > destruction while leaving the kvm_vm struct intact for > > > > post-destruction inspection. kvm_vm_free() learns to handle the > > > > half-released state. > > > > c) Something else entirely, e.g., the test should manage vm->fd > > > > directly and not rely on library helpers for this pattern. > > > > > > d) Fully kill the VM; validate the semantics with an explict mmap(). > > > > > > The entire point of the test you are writing is to verfiy that a guest_memfd VMA > > > doesn't somehow cause KVM to leak state. So, make that obvious instead of abusing > > > APIs that kinda sorta do what you want, but not really. > > > > > > mem = kvm_mmap(region->mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED, > > > region->guest_memfd); > > > > > > ... > > > > > > kvm_vm_free(vm); > > > > > > TEST_ASSERT(is_zero(mem, ...)); > > > > The test isn't about guest_memfd. The pKVM support that just landed > > via Will's series [1] > > Landed where? Is pKVM actually going upstream with anonymous memory? The basic memory protection code for pKVM using anonymous memory is now upstream but we still have a tonne to do for CPU state, DMA, firmware loading etc. It obviously doesn't preclude support for guest_memfd in future and, as you know, Fuad has been actively involved in that discussion over the years. It does, however, mean that we'd be offering guest_memfd alongside anonymous memory, which matches what we'd have to do in Android at this point anyway (given the millions of devices using anonymous memory in the field). Of course, if that means we should have an additional set of tests, then that's fine. > I thought the inability to protect against page faults in the untrusted > kernel was a non-starter? We ended up solving that by forcefully reclaiming the target page from the guest. So the flow looks something like: 1. The host kernel accesses a private (faulting) page via a kernel mapping 2. An exception is taken to the hypervisor 3. The hypervisor injects an exception back into the kernel, with sufficient syndrome information to triage it as an access to a private page. 4. The kernel searches its usual exception handlers (for things like load_unaligned_zeropad()) 5. If no handlers are found, the kernel translates the faulting virtual address into a physical address using the 'AT' instruction and issues a hypercall to forcefully reclaim the page. 6. The hypervisor unmaps the page from the guest and replaces the guest pte with an immutable "poison" (faulting) entry. If a vCPU later faults on this entry, it returns to userspace with -EFAULT. 7. The hypervisor clears the unmapped page, maps it back into the host and returns from the hypercall. 8. The kernel treats the fault as "spurious" and retries the faulting instruction. If the faulting access in (1) instead came from userspace, the kernel injects a SEGV at (4). If the faulting access came from a uaccess routine (e.g. copy_from_user()), then the fixup handler would run like normal in (4). Will ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release() 2026-05-12 13:30 ` Sean Christopherson 2026-05-12 15:04 ` Will Deacon @ 2026-05-12 15:06 ` Fuad Tabba 1 sibling, 0 replies; 11+ messages in thread From: Fuad Tabba @ 2026-05-12 15:06 UTC (permalink / raw) To: Sean Christopherson Cc: Paolo Bonzini, Shuah Khan, Marc Zyngier, Oliver Upton, Will Deacon, Ackerley Tng, David Matlack, kvm, linux-kselftest, linux-kernel On Tue, 12 May 2026 at 14:30, Sean Christopherson <seanjc@google.com> wrote: > > On Tue, May 12, 2026, Fuad Tabba wrote: > > On Mon, 11 May 2026 at 21:25, Sean Christopherson <seanjc@google.com> wrote: > > > > I used kvm_vm_release() because it's the only public API that closes > > > > vm->fd to trigger kernel-side destruction. But the existing callers > > > > follow it with vm_recreate_with_one_vcpu(), so the "release + later > > > > kvm_vm_free()" path isn't exercised today. > > > > > > > > I see three ways to make this clean: > > > > a) This patch: kvm_vm_release() becomes idempotent for its three > > > > FDs, matching the kvm_stats_release() idiom it already invokes. > > > > b) Leave kvm_vm_release() as-is and add a dedicated helper, e.g. > > > > kvm_vm_destroy_kernel(), that closes vm->fd to trigger kernel > > > > destruction while leaving the kvm_vm struct intact for > > > > post-destruction inspection. kvm_vm_free() learns to handle the > > > > half-released state. > > > > c) Something else entirely, e.g., the test should manage vm->fd > > > > directly and not rely on library helpers for this pattern. > > > > > > d) Fully kill the VM; validate the semantics with an explict mmap(). > > > > > > The entire point of the test you are writing is to verfiy that a guest_memfd VMA > > > doesn't somehow cause KVM to leak state. So, make that obvious instead of abusing > > > APIs that kinda sorta do what you want, but not really. > > > > > > mem = kvm_mmap(region->mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED, > > > region->guest_memfd); > > > > > > ... > > > > > > kvm_vm_free(vm); > > > > > > TEST_ASSERT(is_zero(mem, ...)); > > > > The test isn't about guest_memfd. The pKVM support that just landed > > via Will's series [1] > > Landed where? Is pKVM actually going upstream with anonymous memory? I thought > the inability to protect against page faults in the untrusted kernel was a > non-starter? > > > kvm_mmap() + kvm_vm_free() + is_zero() doesn't translate here. The only > > host view of the donated pages is the memslot mmap, and kvm_vm_free() > > munmaps it on the way out, so inspection has to happen between > > kernel-side destruction and userspace free. kvm_vm_release() is the > > only library primitive that does that today. > > > > What do you suggest? > > Manually allocate the memory and expose it to the guest via > vm_set_user_memory_region2() vm_set_user_memory_region(). > That works for this test and is cleaner. I'll restructure: mmap the backing in the test, vm_set_user_memory_region2() it in, then kvm_vm_free() -> is_zero() through the test-owned VMA -> munmap(). That drops the motivation for the kvm_vm_release() fd-guard patch in this series; happy to drop it, or to keep it as a standalone "match the kvm_stats_release() idiom" cleanup, whichever you prefer. Thanks, /fuad ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2026-05-12 15:06 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-05-11 11:37 [PATCH 0/2] KVM: selftests: Fixes for guest_memfd_test and FD double-close Fuad Tabba 2026-05-11 11:37 ` [PATCH 1/2] KVM: selftests: Fix MADV_COLLAPSE build failure on older toolchains Fuad Tabba 2026-05-11 14:59 ` Sean Christopherson 2026-05-11 11:37 ` [PATCH 2/2] KVM: selftests: Fix FD double-close in kvm_vm_release() Fuad Tabba 2026-05-11 14:58 ` Sean Christopherson 2026-05-11 15:19 ` Fuad Tabba 2026-05-11 20:25 ` Sean Christopherson 2026-05-12 8:06 ` Fuad Tabba 2026-05-12 13:30 ` Sean Christopherson 2026-05-12 15:04 ` Will Deacon 2026-05-12 15:06 ` Fuad Tabba
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox