public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	 Christian Borntraeger <borntraeger@linux.ibm.com>,
	Janosch Frank <frankja@linux.ibm.com>,
	 Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	 David Hildenbrand <david@redhat.com>,
	Fuad Tabba <tabba@google.com>,
	 Ackerley Tng <ackerleytng@google.com>
Subject: [PATCH v2 02/13] KVM: guest_memfd: Add INIT_SHARED flag, reject user page faults if not set
Date: Fri,  3 Oct 2025 16:25:55 -0700	[thread overview]
Message-ID: <20251003232606.4070510-3-seanjc@google.com> (raw)
In-Reply-To: <20251003232606.4070510-1-seanjc@google.com>

Add a guest_memfd flag to allow userspace to state that the underlying
memory should be configured to be initialized as shared, and reject user
page faults if the guest_memfd instance's memory isn't shared.  Because
KVM doesn't yet support in-place private<=>shared conversions, all
guest_memfd memory effectively follows the initial state.

Alternatively, KVM could deduce the initial state based on MMAP, which for
all intents and purposes is what KVM currently does.  However, implicitly
deriving the default state based on MMAP will result in a messy ABI when
support for in-place conversions is added.

For x86 CoCo VMs, which don't yet support MMAP, memory is currently private
by default (otherwise the memory would be unusable).  If MMAP implies
memory is shared by default, then the default state for CoCo VMs will vary
based on MMAP, and from userspace's perspective, will change when in-place
conversion support is added.  I.e. to maintain guest<=>host ABI, userspace
would need to immediately convert all memory from shared=>private, which
is both ugly and inefficient.  The inefficiency could be avoided by adding
a flag to state that memory is _private_ by default, irrespective of MMAP,
but that would lead to an equally messy and hard to document ABI.

Bite the bullet and immediately add a flag to control the default state so
that the effective behavior is explicit and straightforward.

Fixes: 3d3a04fad25a ("KVM: Allow and advertise support for host mmap() on guest_memfd files")
Cc: David Hildenbrand <david@redhat.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 Documentation/virt/kvm/api.rst                 |  5 +++++
 include/uapi/linux/kvm.h                       |  3 ++-
 tools/testing/selftests/kvm/guest_memfd_test.c | 15 ++++++++++++---
 virt/kvm/guest_memfd.c                         |  6 +++++-
 virt/kvm/kvm_main.c                            |  3 ++-
 5 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 7ba92f2ced38..754b662a453c 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6438,6 +6438,11 @@ specified via KVM_CREATE_GUEST_MEMFD.  Currently defined flags:
   ============================ ================================================
   GUEST_MEMFD_FLAG_MMAP        Enable using mmap() on the guest_memfd file
                                descriptor.
+  GUEST_MEMFD_FLAG_INIT_SHARED Make all memory in the file shared during
+                               KVM_CREATE_GUEST_MEMFD (memory files created
+                               without INIT_SHARED will be marked private).
+                               Shared memory can be faulted into host userspace
+                               page tables. Private memory cannot.
   ============================ ================================================
 
 When the KVM MMU performs a PFN lookup to service a guest fault and the backing
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index b1d52d0c56ec..52f6000ab020 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1599,7 +1599,8 @@ struct kvm_memory_attributes {
 #define KVM_MEMORY_ATTRIBUTE_PRIVATE           (1ULL << 3)
 
 #define KVM_CREATE_GUEST_MEMFD	_IOWR(KVMIO,  0xd4, struct kvm_create_guest_memfd)
-#define GUEST_MEMFD_FLAG_MMAP	(1ULL << 0)
+#define GUEST_MEMFD_FLAG_MMAP		(1ULL << 0)
+#define GUEST_MEMFD_FLAG_INIT_SHARED	(1ULL << 1)
 
 struct kvm_create_guest_memfd {
 	__u64 size;
diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
index 3e58bd496104..0de56ce3c4e2 100644
--- a/tools/testing/selftests/kvm/guest_memfd_test.c
+++ b/tools/testing/selftests/kvm/guest_memfd_test.c
@@ -239,8 +239,9 @@ static void test_create_guest_memfd_multiple(struct kvm_vm *vm)
 	close(fd1);
 }
 
-static void test_guest_memfd_flags(struct kvm_vm *vm, uint64_t valid_flags)
+static void test_guest_memfd_flags(struct kvm_vm *vm)
 {
+	uint64_t valid_flags = vm_check_cap(vm, KVM_CAP_GUEST_MEMFD_FLAGS);
 	size_t page_size = getpagesize();
 	uint64_t flag;
 	int fd;
@@ -274,6 +275,10 @@ static void test_guest_memfd(unsigned long vm_type)
 	vm = vm_create_barebones_type(vm_type);
 	flags = vm_check_cap(vm, KVM_CAP_GUEST_MEMFD_FLAGS);
 
+	/* This test doesn't yet support testing mmap() on private memory. */
+	if (!(flags & GUEST_MEMFD_FLAG_INIT_SHARED))
+		flags &= ~GUEST_MEMFD_FLAG_MMAP;
+
 	test_create_guest_memfd_multiple(vm);
 	test_create_guest_memfd_invalid_sizes(vm, flags, page_size);
 
@@ -292,7 +297,7 @@ static void test_guest_memfd(unsigned long vm_type)
 	test_fallocate(fd, page_size, total_size);
 	test_invalid_punch_hole(fd, page_size, total_size);
 
-	test_guest_memfd_flags(vm, flags);
+	test_guest_memfd_flags(vm);
 
 	close(fd);
 	kvm_vm_free(vm);
@@ -334,9 +339,13 @@ static void test_guest_memfd_guest(void)
 	TEST_ASSERT(vm_check_cap(vm, KVM_CAP_GUEST_MEMFD_FLAGS) & GUEST_MEMFD_FLAG_MMAP,
 		    "Default VM type should support MMAP, supported flags = 0x%x",
 		    vm_check_cap(vm, KVM_CAP_GUEST_MEMFD_FLAGS));
+	TEST_ASSERT(vm_check_cap(vm, KVM_CAP_GUEST_MEMFD_FLAGS) & GUEST_MEMFD_FLAG_INIT_SHARED,
+		    "Default VM type should support INIT_SHARED, supported flags = 0x%x",
+		    vm_check_cap(vm, KVM_CAP_GUEST_MEMFD_FLAGS));
 
 	size = vm->page_size;
-	fd = vm_create_guest_memfd(vm, size, GUEST_MEMFD_FLAG_MMAP);
+	fd = vm_create_guest_memfd(vm, size, GUEST_MEMFD_FLAG_MMAP |
+					     GUEST_MEMFD_FLAG_INIT_SHARED);
 	vm_set_user_memory_region2(vm, slot, KVM_MEM_GUEST_MEMFD, gpa, size, NULL, fd, 0);
 
 	mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 94bafd6c558c..cf3afba23a6b 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -328,6 +328,9 @@ static vm_fault_t kvm_gmem_fault_user_mapping(struct vm_fault *vmf)
 	if (((loff_t)vmf->pgoff << PAGE_SHIFT) >= i_size_read(inode))
 		return VM_FAULT_SIGBUS;
 
+	if (!((u64)inode->i_private & GUEST_MEMFD_FLAG_INIT_SHARED))
+		return VM_FAULT_SIGBUS;
+
 	folio = kvm_gmem_get_folio(inode, vmf->pgoff);
 	if (IS_ERR(folio)) {
 		int err = PTR_ERR(folio);
@@ -525,7 +528,8 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
 	u64 valid_flags = 0;
 
 	if (kvm_arch_supports_gmem_mmap(kvm))
-		valid_flags |= GUEST_MEMFD_FLAG_MMAP;
+		valid_flags |= GUEST_MEMFD_FLAG_MMAP |
+			       GUEST_MEMFD_FLAG_INIT_SHARED;
 
 	if (flags & ~valid_flags)
 		return -EINVAL;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e3a268757621..5f644ca54af3 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4930,7 +4930,8 @@ static int kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 		return 1;
 	case KVM_CAP_GUEST_MEMFD_FLAGS:
 		if (!kvm || kvm_arch_supports_gmem_mmap(kvm))
-			return GUEST_MEMFD_FLAG_MMAP;
+			return GUEST_MEMFD_FLAG_MMAP |
+			       GUEST_MEMFD_FLAG_INIT_SHARED;
 
 		return 0;
 #endif
-- 
2.51.0.618.g983fd99d29-goog


  parent reply	other threads:[~2025-10-03 23:26 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-03 23:25 [PATCH v2 00/13] KVM: guest_memfd: MMAP and related fixes Sean Christopherson
2025-10-03 23:25 ` [PATCH v2 01/13] KVM: Rework KVM_CAP_GUEST_MEMFD_MMAP into KVM_CAP_GUEST_MEMFD_FLAGS Sean Christopherson
2025-10-06 19:16   ` Ackerley Tng
2025-10-06 20:19     ` Sean Christopherson
2025-10-07 16:09       ` Ackerley Tng
2025-10-07 16:13         ` Sean Christopherson
2025-10-10 14:07   ` David Hildenbrand
2025-10-03 23:25 ` Sean Christopherson [this message]
2025-10-07 16:14   ` [PATCH v2 02/13] KVM: guest_memfd: Add INIT_SHARED flag, reject user page faults if not set Ackerley Tng
2025-10-10 14:08   ` David Hildenbrand
2025-10-03 23:25 ` [PATCH v2 03/13] KVM: guest_memfd: Invalidate SHARED GPAs if gmem supports INIT_SHARED Sean Christopherson
2025-10-07 16:31   ` Ackerley Tng
2025-10-10 14:09   ` David Hildenbrand
2025-10-03 23:25 ` [PATCH v2 04/13] KVM: Explicitly mark KVM_GUEST_MEMFD as depending on KVM_GENERIC_MMU_NOTIFIER Sean Christopherson
2025-10-10 14:10   ` David Hildenbrand
2025-10-03 23:25 ` [PATCH v2 05/13] KVM: guest_memfd: Allow mmap() on guest_memfd for x86 VMs with private memory Sean Christopherson
2025-10-07 16:43   ` Ackerley Tng
2025-10-10 14:11   ` David Hildenbrand
2025-10-03 23:25 ` [PATCH v2 06/13] KVM: selftests: Stash the host page size in a global in the guest_memfd test Sean Christopherson
2025-10-06 18:30   ` Ackerley Tng
2025-10-03 23:26 ` [PATCH v2 07/13] KVM: selftests: Create a new guest_memfd for each testcase Sean Christopherson
2025-10-06 18:29   ` Ackerley Tng
2025-10-07 22:54   ` Lisa Wang
2025-10-10 15:04   ` David Hildenbrand
2025-10-10 20:12     ` Sean Christopherson
2025-10-03 23:26 ` [PATCH v2 08/13] KVM: selftests: Add test coverage for guest_memfd without GUEST_MEMFD_FLAG_MMAP Sean Christopherson
2025-10-03 23:26 ` [PATCH v2 09/13] KVM: selftests: Add wrappers for mmap() and munmap() to assert success Sean Christopherson
2025-10-03 23:26 ` [PATCH v2 10/13] KVM: selftests: Isolate the guest_memfd Copy-on-Write negative testcase Sean Christopherson
2025-10-06 18:28   ` Ackerley Tng
2025-10-03 23:26 ` [PATCH v2 11/13] KVM: selftests: Add wrapper macro to handle and assert on expected SIGBUS Sean Christopherson
2025-10-06 18:21   ` Ackerley Tng
2025-10-07 21:16   ` Lisa Wang
2025-10-03 23:26 ` [PATCH v2 12/13] KVM: selftests: Verify that faulting in private guest_memfd memory fails Sean Christopherson
2025-10-06 18:26   ` Ackerley Tng
2025-10-03 23:26 ` [PATCH v2 13/13] KVM: selftests: Verify that reads to inaccessible guest_memfd VMAs SIGBUS Sean Christopherson
2025-10-06 18:22   ` Ackerley Tng
2025-10-06 19:24     ` Sean Christopherson
2025-10-07 18:06   ` Lisa Wang
2025-10-10 21:30 ` [PATCH v2 00/13] KVM: guest_memfd: MMAP and related fixes Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251003232606.4070510-3-seanjc@google.com \
    --to=seanjc@google.com \
    --cc=ackerleytng@google.com \
    --cc=borntraeger@linux.ibm.com \
    --cc=david@redhat.com \
    --cc=frankja@linux.ibm.com \
    --cc=imbrenda@linux.ibm.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=tabba@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox