* [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends
@ 2025-12-15 20:51 Peter Xu
  2025-12-15 20:51 ` [PATCH v3 01/12] kvm: Decouple memory attribute check from kvm_guest_memfd_supported Peter Xu
                   ` (12 more replies)
  0 siblings, 13 replies; 47+ messages in thread
From: Peter Xu @ 2025-12-15 20:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

v1: https://lore.kernel.org/r/20251023185913.2923322-1-peterx@redhat.com
v2: https://lore.kernel.org/r/20251119172913.577392-1-peterx@redhat.com

v3:
- Collect R-bs from Xiaoyao
- Rebased to 10.2-rc3; no dependency needed now, as those got merged
- Reorder patches, touch up commit messages or comments on in-place misuse
- Added patch "kvm: Provide explicit error for kvm_create_guest_memfd()" [Xiaoyao]
- Added one patch for renaming machine_require_guest_memfd() [Xiaoyao]
- Added one patch for renaming memory_region_init_ram_guest_memfd() [Xiaoyao]

=========8<===========

This series allows QEMU to use init-shared guest-memfd as a common
memory backend. Before this series, guest-memfd was only used in CoCo, and
the fds were created implicitly whenever a CoCo environment was detected.
When used in init-shared mode, the guest-memfd is specified directly on
the command line, just like other types of memory backends.

In the current patchset, I reused the memory-backend-memfd object, rather
than creating a new type of object.  After all, guest-memfd (at least from
the userspace POV) works much like a memfd, except that it is tailored
for the VM use case.

So far, this approach also does not involve gmem bindings to KVM instances,
hence it is not prone to issues when the same chunk of RAM is attached
to more than one KVM memslot.

Now, instead of a normal memfd backend specified with:

  -object memory-backend-memfd,id=ID,size=SIZE,share=on

One can also boot a VM with guest-memfd:

  -object memory-backend-memfd,id=ID,size=SIZE,share=on,guest-memfd=on

The init-shared guest-memfd relies on a very recent Linux kernel, as the
mmap() support only landed in v6.18-rc2.  When running on an older kernel,
we'll see errors like:

  qemu-system-x86_64: KVM does not support guest_memfd

Note that live migration is supported by default; however, postcopy is
not supported yet.  Postcopy support depends on some kernel work that
needs to be merged in Linux first.

Thanks,

Peter Xu (11):
  kvm: Detect guest-memfd flags supported
  kvm: Provide explicit error for kvm_create_guest_memfd()
  ramblock: Rename guest_memfd to guest_memfd_private
  memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE
  memory: Rename memory_region_has_guest_memfd() to *_private()
  hostmem: Rename guest_memfd to guest_memfd_private
  hostmem: Support fully shared guest memfd to back a VM
  machine: Rename machine_require_guest_memfd() to *_private()
  memory: Rename memory_region_init_ram_guest_memfd() to *_private()
  tests/migration-test: Support guest-memfd init shared mem type
  tests/migration-test: Add a precopy test for guest-memfd

Xiaoyao Li (1):
  kvm: Decouple memory attribute check from kvm_guest_memfd_supported

 qapi/qom.json                         |  6 ++-
 include/hw/boards.h                   |  2 +-
 include/system/hostmem.h              |  2 +-
 include/system/kvm.h                  |  1 +
 include/system/memory.h               | 27 ++++++------
 include/system/ram_addr.h             |  2 +-
 include/system/ramblock.h             |  7 +++-
 tests/qtest/migration/framework.h     |  4 ++
 accel/kvm/kvm-all.c                   | 33 ++++++++++++---
 accel/stubs/kvm-stub.c                |  6 +++
 backends/hostmem-file.c               |  2 +-
 backends/hostmem-memfd.c              | 55 +++++++++++++++++++++---
 backends/hostmem-ram.c                |  2 +-
 backends/hostmem-shm.c                |  2 +-
 backends/hostmem.c                    |  2 +-
 backends/igvm.c                       |  4 +-
 hw/core/machine.c                     |  2 +-
 hw/i386/pc.c                          |  6 +--
 hw/i386/pc_sysfw.c                    |  8 ++--
 hw/i386/x86-common.c                  |  8 ++--
 system/memory.c                       | 17 ++++----
 system/physmem.c                      | 37 ++++++++++-------
 target/i386/kvm/kvm.c                 |  3 +-
 tests/qtest/migration/framework.c     | 60 +++++++++++++++++++++++++++
 tests/qtest/migration/precopy-tests.c | 12 ++++++
 25 files changed, 239 insertions(+), 71 deletions(-)

-- 
2.50.1




* [PATCH v3 01/12] kvm: Decouple memory attribute check from kvm_guest_memfd_supported
  2025-12-15 20:51 [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Peter Xu
@ 2025-12-15 20:51 ` Peter Xu
  2025-12-16 12:41   ` Xiaoyao Li
                     ` (2 more replies)
  2025-12-15 20:51 ` [PATCH v3 02/12] kvm: Detect guest-memfd flags supported Peter Xu
                   ` (11 subsequent siblings)
  12 siblings, 3 replies; 47+ messages in thread
From: Peter Xu @ 2025-12-15 20:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

From: Xiaoyao Li <xiaoyao.li@intel.com>

With the mmap support of guest memfd, KVM allows userspace to create
guest memfd serving as normal non-private memory for X86 DEFAULT VM.
However, KVM doesn't support private memory attribute for X86 DEFAULT
VM.

Make kvm_guest_memfd_supported not rely on KVM_MEMORY_ATTRIBUTE_PRIVATE
and check KVM_MEMORY_ATTRIBUTE_PRIVATE separately when the machine
requires guest_memfd to serve as private memory.

This allows QEMU to create guest memfd with mmap to serve as the memory
backend for X86 DEFAULT VM.
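
As a rough illustration of the decoupling described above (a standalone
sketch, not the QEMU code itself): the struct kvm_caps, the ATTR_PRIVATE
constant and the helper names below are hypothetical stand-ins for the
capability values QEMU caches at init time.

```c
#include <assert.h>   /* only needed for the usage checks noted below */
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for KVM_MEMORY_ATTRIBUTE_PRIVATE; the value is illustrative. */
#define ATTR_PRIVATE (UINT64_C(1) << 3)

/* Hypothetical snapshot of what the kernel reported at init time. */
struct kvm_caps {
    bool cap_guest_memfd;    /* models KVM_CAP_GUEST_MEMFD */
    bool cap_user_memory2;   /* models KVM_CAP_USER_MEMORY2 */
    uint64_t memory_attrs;   /* models the KVM_CAP_MEMORY_ATTRIBUTES mask */
};

/* guest_memfd support no longer depends on the private attribute. */
static bool guest_memfd_supported(const struct kvm_caps *c)
{
    return c->cap_guest_memfd && c->cap_user_memory2;
}

/* Checked separately, only when guest_memfd must back private memory. */
static bool private_attr_supported(const struct kvm_caps *c)
{
    return (c->memory_attrs & ATTR_PRIVATE) != 0;
}

/* True if a backend with the given requirement could be set up. */
static bool can_create_backend(const struct kvm_caps *c, bool needs_private)
{
    if (!guest_memfd_supported(c)) {
        return false;
    }
    return !needs_private || private_attr_supported(c);
}
```

With a caps value modelling an X86 DEFAULT VM (both caps set, no private
attribute), can_create_backend() accepts a shared backend and rejects a
private one, which is the behaviour this patch is after.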

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/system/kvm.h   | 1 +
 accel/kvm/kvm-all.c    | 8 ++++++--
 accel/stubs/kvm-stub.c | 5 +++++
 system/physmem.c       | 8 ++++++++
 4 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/include/system/kvm.h b/include/system/kvm.h
index 8f9eecf044..b5811c90f1 100644
--- a/include/system/kvm.h
+++ b/include/system/kvm.h
@@ -561,6 +561,7 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp);
 
 int kvm_set_memory_attributes_private(hwaddr start, uint64_t size);
 int kvm_set_memory_attributes_shared(hwaddr start, uint64_t size);
+bool kvm_private_memory_attribute_supported(void);
 
 int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private);
 
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 28006d73c5..59836ebdff 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1501,6 +1501,11 @@ int kvm_set_memory_attributes_shared(hwaddr start, uint64_t size)
     return kvm_set_memory_attributes(start, size, 0);
 }
 
+bool kvm_private_memory_attribute_supported(void)
+{
+    return !!(kvm_supported_memory_attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE);
+}
+
 /* Called with KVMMemoryListener.slots_lock held */
 static void kvm_set_phys_mem(KVMMemoryListener *kml,
                              MemoryRegionSection *section, bool add)
@@ -2781,8 +2786,7 @@ static int kvm_init(AccelState *as, MachineState *ms)
     kvm_supported_memory_attributes = kvm_vm_check_extension(s, KVM_CAP_MEMORY_ATTRIBUTES);
     kvm_guest_memfd_supported =
         kvm_vm_check_extension(s, KVM_CAP_GUEST_MEMFD) &&
-        kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2) &&
-        (kvm_supported_memory_attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE);
+        kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2);
     kvm_pre_fault_memory_supported = kvm_vm_check_extension(s, KVM_CAP_PRE_FAULT_MEMORY);
 
     if (s->kernel_irqchip_split == ON_OFF_AUTO_AUTO) {
diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
index 68cd33ba97..73f04eb589 100644
--- a/accel/stubs/kvm-stub.c
+++ b/accel/stubs/kvm-stub.c
@@ -125,3 +125,8 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
 {
     return -ENOSYS;
 }
+
+bool kvm_private_memory_attribute_supported(void)
+{
+    return false;
+}
diff --git a/system/physmem.c b/system/physmem.c
index c9869e4049..3555d2f6f7 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2211,6 +2211,14 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
                        object_get_typename(OBJECT(current_machine->cgs)));
             goto out_free;
         }
+
+        if (!kvm_private_memory_attribute_supported()) {
+            error_setg(errp, "cannot set up private guest memory for %s: "
+                       "KVM does not support private memory attribute",
+                       object_get_typename(OBJECT(current_machine->cgs)));
+            goto out_free;
+        }
+
         assert(new_block->guest_memfd < 0);
 
         ret = ram_block_coordinated_discard_require(true);
-- 
2.50.1




* [PATCH v3 02/12] kvm: Detect guest-memfd flags supported
  2025-12-15 20:51 [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Peter Xu
  2025-12-15 20:51 ` [PATCH v3 01/12] kvm: Decouple memory attribute check from kvm_guest_memfd_supported Peter Xu
@ 2025-12-15 20:51 ` Peter Xu
  2025-12-16 13:54   ` Fabiano Rosas
  2026-06-02  1:29   ` Michael Roth
  2025-12-15 20:51 ` [PATCH v3 03/12] kvm: Provide explicit error for kvm_create_guest_memfd() Peter Xu
                   ` (10 subsequent siblings)
  12 siblings, 2 replies; 47+ messages in thread
From: Peter Xu @ 2025-12-15 20:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

Detect the guest-memfd flags supported by the current kernel, and reject
creation of a guest-memfd with unsupported flags.  When the cap isn't
available, no flags are supported.
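
The check can be sketched in isolation as follows. This is a minimal
model, assuming the capability query returns a positive flag mask when
available and zero or a negative value otherwise; the helper names
flags_from_cap() and unsupported_flags() are hypothetical and do not
exist in QEMU.

```c
#include <assert.h>   /* only needed for the usage checks noted below */
#include <stdint.h>

/* Turn the capability query result into a flag mask; <= 0 means none. */
static uint64_t flags_from_cap(int ret)
{
    return ret > 0 ? (uint64_t)ret : 0;
}

/* Return the requested bits that are not supported; zero means accept. */
static uint64_t unsupported_flags(uint64_t requested, uint64_t supported)
{
    return requested & ~supported;
}
```

For instance, with a supported mask of 0x1, a request for 0x3 leaves 0x2
as the offending bit, which is the value the error message would report.
A request with no flags set always passes, even when the cap is missing.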

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 accel/kvm/kvm-all.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 59836ebdff..68d57c1af0 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -108,6 +108,7 @@ static int kvm_sstep_flags;
 static bool kvm_immediate_exit;
 static uint64_t kvm_supported_memory_attributes;
 static bool kvm_guest_memfd_supported;
+static uint64_t kvm_guest_memfd_flags_supported;
 static hwaddr kvm_max_slot_size = ~0;
 
 static const KVMCapabilityInfo kvm_required_capabilites[] = {
@@ -2787,6 +2788,10 @@ static int kvm_init(AccelState *as, MachineState *ms)
     kvm_guest_memfd_supported =
         kvm_vm_check_extension(s, KVM_CAP_GUEST_MEMFD) &&
         kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2);
+
+    ret = kvm_vm_check_extension(s, KVM_CAP_GUEST_MEMFD_FLAGS);
+    kvm_guest_memfd_flags_supported = ret > 0 ? ret : 0;
+
     kvm_pre_fault_memory_supported = kvm_vm_check_extension(s, KVM_CAP_PRE_FAULT_MEMORY);
 
     if (s->kernel_irqchip_split == ON_OFF_AUTO_AUTO) {
@@ -4492,6 +4497,13 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
         return -1;
     }
 
+    if (flags & ~kvm_guest_memfd_flags_supported) {
+        error_setg(errp, "Current KVM instance does not support "
+                   "guest-memfd flag: 0x%"PRIx64,
+                   flags & ~kvm_guest_memfd_flags_supported);
+        return -1;
+    }
+
     fd = kvm_vm_ioctl(kvm_state, KVM_CREATE_GUEST_MEMFD, &guest_memfd);
     if (fd < 0) {
         error_setg_errno(errp, errno, "Error creating KVM guest_memfd");
-- 
2.50.1




* [PATCH v3 03/12] kvm: Provide explicit error for kvm_create_guest_memfd()
  2025-12-15 20:51 [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Peter Xu
  2025-12-15 20:51 ` [PATCH v3 01/12] kvm: Decouple memory attribute check from kvm_guest_memfd_supported Peter Xu
  2025-12-15 20:51 ` [PATCH v3 02/12] kvm: Detect guest-memfd flags supported Peter Xu
@ 2025-12-15 20:51 ` Peter Xu
  2025-12-16  4:03   ` Xiaoyao Li
                     ` (2 more replies)
  2025-12-15 20:51 ` [PATCH v3 04/12] ramblock: Rename guest_memfd to guest_memfd_private Peter Xu
                   ` (9 subsequent siblings)
  12 siblings, 3 replies; 47+ messages in thread
From: Peter Xu @ 2025-12-15 20:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

So that a descriptive error string is returned when KVM is not enabled, or
not compiled in.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 accel/kvm/kvm-all.c    | 5 +++++
 accel/stubs/kvm-stub.c | 1 +
 2 files changed, 6 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 68d57c1af0..c32fbcf9cc 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -4492,6 +4492,11 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
         .flags = flags,
     };
 
+    if (!kvm_enabled()) {
+        error_setg(errp, "guest-memfd requires KVM accelerator");
+        return -1;
+    }
+
     if (!kvm_guest_memfd_supported) {
         error_setg(errp, "KVM does not support guest_memfd");
         return -1;
diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
index 73f04eb589..01b1d6285e 100644
--- a/accel/stubs/kvm-stub.c
+++ b/accel/stubs/kvm-stub.c
@@ -123,6 +123,7 @@ bool kvm_hwpoisoned_mem(void)
 
 int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
 {
+    error_setg(errp, "KVM is not enabled");
     return -ENOSYS;
 }
 
-- 
2.50.1




* [PATCH v3 04/12] ramblock: Rename guest_memfd to guest_memfd_private
  2025-12-15 20:51 [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Peter Xu
                   ` (2 preceding siblings ...)
  2025-12-15 20:51 ` [PATCH v3 03/12] kvm: Provide explicit error for kvm_create_guest_memfd() Peter Xu
@ 2025-12-15 20:51 ` Peter Xu
  2026-06-02  1:37   ` Michael Roth
  2025-12-15 20:51 ` [PATCH v3 05/12] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE Peter Xu
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 47+ messages in thread
From: Peter Xu @ 2025-12-15 20:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

Rename the field to reflect the fact that the guest_memfd in this case backs
only the private portion of the ramblock rather than all of it.

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/system/memory.h   |  7 ++++---
 include/system/ramblock.h |  7 ++++++-
 accel/kvm/kvm-all.c       |  2 +-
 system/memory.c           |  2 +-
 system/physmem.c          | 21 +++++++++++----------
 5 files changed, 23 insertions(+), 16 deletions(-)

diff --git a/include/system/memory.h b/include/system/memory.h
index 3bd5ffa5e0..2384575065 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -1823,10 +1823,11 @@ static inline bool memory_region_is_romd(MemoryRegion *mr)
 bool memory_region_is_protected(MemoryRegion *mr);
 
 /**
- * memory_region_has_guest_memfd: check whether a memory region has guest_memfd
- *     associated
+ * memory_region_has_guest_memfd: check whether a memory region has
+ *     guest_memfd_private associated
  *
- * Returns %true if a memory region's ram_block has valid guest_memfd assigned.
+ * Returns %true if a memory region's ram_block has guest_memfd_private
+ * assigned.
  *
  * @mr: the memory region being queried
  */
diff --git a/include/system/ramblock.h b/include/system/ramblock.h
index 76694fe1b5..9ecf7f970c 100644
--- a/include/system/ramblock.h
+++ b/include/system/ramblock.h
@@ -40,7 +40,12 @@ struct RAMBlock {
     Error *cpr_blocker;
     int fd;
     uint64_t fd_offset;
-    int guest_memfd;
+    /*
+     * When RAM_GUEST_MEMFD_PRIVATE flag is set, this ramblock can have
+     * private pages backed by guest_memfd_private specified, while shared
+     * pages are backed by the ramblock on its own.
+     */
+    int guest_memfd_private;
     RamBlockAttributes *attributes;
     size_t page_size;
     /* dirty bitmap used during migration */
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index c32fbcf9cc..1126b6f477 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1603,7 +1603,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
         mem->ram_start_offset = ram_start_offset;
         mem->ram = ram;
         mem->flags = kvm_mem_flags(mr);
-        mem->guest_memfd = mr->ram_block->guest_memfd;
+        mem->guest_memfd = mr->ram_block->guest_memfd_private;
         mem->guest_memfd_offset = mem->guest_memfd >= 0 ?
                                   (uint8_t*)ram - mr->ram_block->host : 0;
 
diff --git a/system/memory.c b/system/memory.c
index 8b84661ae3..355b1fa26b 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -1899,7 +1899,7 @@ bool memory_region_is_protected(MemoryRegion *mr)
 
 bool memory_region_has_guest_memfd(MemoryRegion *mr)
 {
-    return mr->ram_block && mr->ram_block->guest_memfd >= 0;
+    return mr->ram_block && mr->ram_block->guest_memfd_private >= 0;
 }
 
 uint8_t memory_region_get_dirty_log_mask(MemoryRegion *mr)
diff --git a/system/physmem.c b/system/physmem.c
index 3555d2f6f7..c3c7a81310 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2219,7 +2219,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
             goto out_free;
         }
 
-        assert(new_block->guest_memfd < 0);
+        assert(new_block->guest_memfd_private < 0);
 
         ret = ram_block_coordinated_discard_require(true);
         if (ret < 0) {
@@ -2229,9 +2229,9 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
             goto out_free;
         }
 
-        new_block->guest_memfd = kvm_create_guest_memfd(new_block->max_length,
-                                                        0, errp);
-        if (new_block->guest_memfd < 0) {
+        new_block->guest_memfd_private =
+            kvm_create_guest_memfd(new_block->max_length, 0, errp);
+        if (new_block->guest_memfd_private < 0) {
             qemu_mutex_unlock_ramlist();
             goto out_free;
         }
@@ -2248,7 +2248,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
         new_block->attributes = ram_block_attributes_create(new_block);
         if (!new_block->attributes) {
             error_setg(errp, "Failed to create ram block attribute");
-            close(new_block->guest_memfd);
+            close(new_block->guest_memfd_private);
             ram_block_coordinated_discard_require(false);
             qemu_mutex_unlock_ramlist();
             goto out_free;
@@ -2385,7 +2385,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, ram_addr_t max_size,
     new_block->max_length = max_size;
     new_block->resized = resized;
     new_block->flags = ram_flags;
-    new_block->guest_memfd = -1;
+    new_block->guest_memfd_private = -1;
     new_block->host = file_ram_alloc(new_block, max_size, fd,
                                      file_size < offset + max_size,
                                      offset, errp);
@@ -2558,7 +2558,7 @@ RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
     new_block->used_length = size;
     new_block->max_length = max_size;
     new_block->fd = -1;
-    new_block->guest_memfd = -1;
+    new_block->guest_memfd_private = -1;
     new_block->page_size = qemu_real_host_page_size();
     new_block->host = host;
     new_block->flags = ram_flags;
@@ -2609,9 +2609,9 @@ static void reclaim_ramblock(RAMBlock *block)
         qemu_anon_ram_free(block->host, block->max_length);
     }
 
-    if (block->guest_memfd >= 0) {
+    if (block->guest_memfd_private >= 0) {
         ram_block_attributes_destroy(block->attributes);
-        close(block->guest_memfd);
+        close(block->guest_memfd_private);
         ram_block_coordinated_discard_require(false);
     }
 
@@ -4222,7 +4222,8 @@ int ram_block_discard_guest_memfd_range(RAMBlock *rb, uint64_t offset,
 
 #ifdef CONFIG_FALLOCATE_PUNCH_HOLE
     /* ignore fd_offset with guest_memfd */
-    ret = fallocate(rb->guest_memfd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+    ret = fallocate(rb->guest_memfd_private,
+                    FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                     offset, length);
 
     if (ret) {
-- 
2.50.1




* [PATCH v3 05/12] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE
  2025-12-15 20:51 [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Peter Xu
                   ` (3 preceding siblings ...)
  2025-12-15 20:51 ` [PATCH v3 04/12] ramblock: Rename guest_memfd to guest_memfd_private Peter Xu
@ 2025-12-15 20:51 ` Peter Xu
  2025-12-16  5:49   ` Xiaoyao Li
  2026-06-02  1:39   ` Michael Roth
  2025-12-15 20:51 ` [PATCH v3 06/12] memory: Rename memory_region_has_guest_memfd() to *_private() Peter Xu
                   ` (7 subsequent siblings)
  12 siblings, 2 replies; 47+ messages in thread
From: Peter Xu @ 2025-12-15 20:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

This name is too generic, and can conflict with in-place guest-memfd
support.  Add a _PRIVATE suffix to show what it really means: it always
silently uses an internal guest-memfd to back a shared host backend,
rather than being used in-place.

This paves the way for in-place guest-memfd, which means we can have a
ramblock that allocates pages completely from guest-memfd (private or shared).

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/system/memory.h   | 8 ++++----
 include/system/ram_addr.h | 2 +-
 backends/hostmem-file.c   | 2 +-
 backends/hostmem-memfd.c  | 2 +-
 backends/hostmem-ram.c    | 2 +-
 backends/hostmem-shm.c    | 2 +-
 system/memory.c           | 3 ++-
 system/physmem.c          | 8 ++++----
 8 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/include/system/memory.h b/include/system/memory.h
index 2384575065..1f49f9a0ff 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -263,7 +263,7 @@ typedef struct IOMMUTLBEvent {
 #define RAM_READONLY_FD (1 << 11)
 
 /* RAM can be private that has kvm guest memfd backend */
-#define RAM_GUEST_MEMFD   (1 << 12)
+#define RAM_GUEST_MEMFD_PRIVATE   (1 << 12)
 
 /*
  * In RAMBlock creation functions, if MAP_SHARED is 0 in the flags parameter,
@@ -1401,7 +1401,7 @@ bool memory_region_init_ram_nomigrate(MemoryRegion *mr,
  *        must be unique within any device
  * @size: size of the region.
  * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_NORESERVE,
- *             RAM_GUEST_MEMFD.
+ *             RAM_GUEST_MEMFD_PRIVATE.
  * @errp: pointer to Error*, to store an error if it happens.
  *
  * Note that this function does not do anything to cause the data in the
@@ -1463,7 +1463,7 @@ bool memory_region_init_resizeable_ram(MemoryRegion *mr,
  *         (getpagesize()) will be used.
  * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
  *             RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
- *             RAM_READONLY_FD, RAM_GUEST_MEMFD
+ *             RAM_READONLY_FD, RAM_GUEST_MEMFD_PRIVATE
  * @path: the path in which to allocate the RAM.
  * @offset: offset within the file referenced by path
  * @errp: pointer to Error*, to store an error if it happens.
@@ -1493,7 +1493,7 @@ bool memory_region_init_ram_from_file(MemoryRegion *mr,
  * @size: size of the region.
  * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
  *             RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
- *             RAM_READONLY_FD, RAM_GUEST_MEMFD
+ *             RAM_READONLY_FD, RAM_GUEST_MEMFD_PRIVATE
  * @fd: the fd to mmap.
  * @offset: offset within the file referenced by fd
  * @errp: pointer to Error*, to store an error if it happens.
diff --git a/include/system/ram_addr.h b/include/system/ram_addr.h
index 683485980c..930d3824d7 100644
--- a/include/system/ram_addr.h
+++ b/include/system/ram_addr.h
@@ -92,7 +92,7 @@ static inline unsigned long int ramblock_recv_bitmap_offset(void *host_addr,
  *  @resized: callback after calls to qemu_ram_resize
  *  @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
  *              RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
- *              RAM_READONLY_FD, RAM_GUEST_MEMFD
+ *              RAM_READONLY_FD, RAM_GUEST_MEMFD_PRIVATE
  *  @mem_path or @fd: specify the backing file or device
  *  @offset: Offset into target file
  *  @grow: extend file if necessary (but an empty file is always extended).
diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 8e3219c061..1f20cd8fd6 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -86,7 +86,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
     ram_flags |= fb->readonly ? RAM_READONLY_FD : 0;
     ram_flags |= fb->rom == ON_OFF_AUTO_ON ? RAM_READONLY : 0;
     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
-    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
+    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
     ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
     ram_flags |= RAM_NAMED_FILE;
     return memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c
index 923239f9cf..3f3e485709 100644
--- a/backends/hostmem-memfd.c
+++ b/backends/hostmem-memfd.c
@@ -60,7 +60,7 @@ have_fd:
     backend->aligned = true;
     ram_flags = backend->share ? RAM_SHARED : RAM_PRIVATE;
     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
-    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
+    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
     return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend), name,
                                           backend->size, ram_flags, fd, 0, errp);
 }
diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
index 062b1abb11..96ad29112d 100644
--- a/backends/hostmem-ram.c
+++ b/backends/hostmem-ram.c
@@ -30,7 +30,7 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
     name = host_memory_backend_get_name(backend);
     ram_flags = backend->share ? RAM_SHARED : RAM_PRIVATE;
     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
-    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
+    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
     return memory_region_init_ram_flags_nomigrate(&backend->mr, OBJECT(backend),
                                                   name, backend->size,
                                                   ram_flags, errp);
diff --git a/backends/hostmem-shm.c b/backends/hostmem-shm.c
index 806e2670e0..e86fb2e0aa 100644
--- a/backends/hostmem-shm.c
+++ b/backends/hostmem-shm.c
@@ -54,7 +54,7 @@ have_fd:
     /* Let's do the same as memory-backend-ram,share=on would do. */
     ram_flags = RAM_SHARED;
     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
-    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
+    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
 
     return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend),
                                               backend_name, backend->size,
diff --git a/system/memory.c b/system/memory.c
index 355b1fa26b..e8c6d484e6 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -3755,7 +3755,8 @@ bool memory_region_init_ram_guest_memfd(MemoryRegion *mr,
     DeviceState *owner_dev;
 
     if (!memory_region_init_ram_flags_nomigrate(mr, owner, name, size,
-                                                RAM_GUEST_MEMFD, errp)) {
+                                                RAM_GUEST_MEMFD_PRIVATE,
+                                                errp)) {
         return false;
     }
     /* This will assert if owner is neither NULL nor a DeviceState.
diff --git a/system/physmem.c b/system/physmem.c
index c3c7a81310..d30fd690d1 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2203,7 +2203,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
         }
     }
 
-    if (new_block->flags & RAM_GUEST_MEMFD) {
+    if (new_block->flags & RAM_GUEST_MEMFD_PRIVATE) {
         int ret;
 
         if (!kvm_enabled()) {
@@ -2341,7 +2341,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, ram_addr_t max_size,
     /* Just support these ram flags by now. */
     assert((ram_flags & ~(RAM_SHARED | RAM_PMEM | RAM_NORESERVE |
                           RAM_PROTECTED | RAM_NAMED_FILE | RAM_READONLY |
-                          RAM_READONLY_FD | RAM_GUEST_MEMFD |
+                          RAM_READONLY_FD | RAM_GUEST_MEMFD_PRIVATE |
                           RAM_RESIZEABLE)) == 0);
     assert(max_size >= size);
 
@@ -2498,7 +2498,7 @@ RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
     ram_flags &= ~RAM_PRIVATE;
 
     assert((ram_flags & ~(RAM_SHARED | RAM_RESIZEABLE | RAM_PREALLOC |
-                          RAM_NORESERVE | RAM_GUEST_MEMFD)) == 0);
+                          RAM_NORESERVE | RAM_GUEST_MEMFD_PRIVATE)) == 0);
     assert(!host ^ (ram_flags & RAM_PREALLOC));
     assert(max_size >= size);
 
@@ -2581,7 +2581,7 @@ RAMBlock *qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
 RAMBlock *qemu_ram_alloc(ram_addr_t size, uint32_t ram_flags,
                          MemoryRegion *mr, Error **errp)
 {
-    assert((ram_flags & ~(RAM_SHARED | RAM_NORESERVE | RAM_GUEST_MEMFD |
+    assert((ram_flags & ~(RAM_SHARED | RAM_NORESERVE | RAM_GUEST_MEMFD_PRIVATE |
                           RAM_PRIVATE)) == 0);
     return qemu_ram_alloc_internal(size, size, NULL, NULL, ram_flags, mr, errp);
 }
-- 
2.50.1




* [PATCH v3 06/12] memory: Rename memory_region_has_guest_memfd() to *_private()
  2025-12-15 20:51 [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Peter Xu
                   ` (4 preceding siblings ...)
  2025-12-15 20:51 ` [PATCH v3 05/12] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE Peter Xu
@ 2025-12-15 20:51 ` Peter Xu
  2026-06-02  1:40   ` Michael Roth
  2025-12-15 20:51 ` [PATCH v3 07/12] hostmem: Rename guest_memfd to guest_memfd_private Peter Xu
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 47+ messages in thread
From: Peter Xu @ 2025-12-15 20:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

Rename the function with "_private" suffix, to show that it returns true
only if it has an internal guest-memfd to back private pages (rather than
fully shared guest-memfd).

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/system/memory.h | 4 ++--
 accel/kvm/kvm-all.c     | 6 +++---
 system/memory.c         | 2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/system/memory.h b/include/system/memory.h
index 1f49f9a0ff..9b58303bb8 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -1823,7 +1823,7 @@ static inline bool memory_region_is_romd(MemoryRegion *mr)
 bool memory_region_is_protected(MemoryRegion *mr);
 
 /**
- * memory_region_has_guest_memfd: check whether a memory region has
+ * memory_region_has_guest_memfd_private: check whether a memory region has
  *     guest_memfd_private associated
  *
  * Returns %true if a memory region's ram_block has guest_memfd_private
@@ -1831,7 +1831,7 @@ bool memory_region_is_protected(MemoryRegion *mr);
  *
  * @mr: the memory region being queried
  */
-bool memory_region_has_guest_memfd(MemoryRegion *mr);
+bool memory_region_has_guest_memfd_private(MemoryRegion *mr);
 
 /**
  * memory_region_get_iommu: check whether a memory region is an iommu
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 1126b6f477..0b7ce5a9dd 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -666,7 +666,7 @@ static int kvm_mem_flags(MemoryRegion *mr)
     if (readonly && kvm_readonly_mem_allowed) {
         flags |= KVM_MEM_READONLY;
     }
-    if (memory_region_has_guest_memfd(mr)) {
+    if (memory_region_has_guest_memfd_private(mr)) {
         assert(kvm_guest_memfd_supported);
         flags |= KVM_MEM_GUEST_MEMFD;
     }
@@ -1615,7 +1615,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
             abort();
         }
 
-        if (memory_region_has_guest_memfd(mr)) {
+        if (memory_region_has_guest_memfd_private(mr)) {
             err = kvm_set_memory_attributes_private(start_addr, slot_size);
             if (err) {
                 error_report("%s: failed to set memory attribute private: %s",
@@ -3101,7 +3101,7 @@ int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private)
         return ret;
     }
 
-    if (!memory_region_has_guest_memfd(mr)) {
+    if (!memory_region_has_guest_memfd_private(mr)) {
         /*
          * Because vMMIO region must be shared, guest TD may convert vMMIO
          * region to shared explicitly.  Don't complain such case.  See
diff --git a/system/memory.c b/system/memory.c
index e8c6d484e6..d70968c966 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -1897,7 +1897,7 @@ bool memory_region_is_protected(MemoryRegion *mr)
     return mr->ram && (mr->ram_block->flags & RAM_PROTECTED);
 }
 
-bool memory_region_has_guest_memfd(MemoryRegion *mr)
+bool memory_region_has_guest_memfd_private(MemoryRegion *mr)
 {
     return mr->ram_block && mr->ram_block->guest_memfd_private >= 0;
 }
-- 
2.50.1




* [PATCH v3 07/12] hostmem: Rename guest_memfd to guest_memfd_private
  2025-12-15 20:51 [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Peter Xu
                   ` (5 preceding siblings ...)
  2025-12-15 20:51 ` [PATCH v3 06/12] memory: Rename memory_region_has_guest_memfd() to *_private() Peter Xu
@ 2025-12-15 20:51 ` Peter Xu
  2025-12-16  5:54   ` Xiaoyao Li
  2026-06-02 18:56   ` Michael Roth
  2025-12-15 20:51 ` [PATCH v3 08/12] hostmem: Support fully shared guest memfd to back a VM Peter Xu
                   ` (5 subsequent siblings)
  12 siblings, 2 replies; 47+ messages in thread
From: Peter Xu @ 2025-12-15 20:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

Rename the HostMemoryBackend.guest_memfd field to reflect what it really
means: whether the backend needs a guest_memfd to back the private portion
of its mapping.  This will help with clarity when we introduce in-place
guest_memfd for hostmem.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
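Illustration only, not part of the patch: a minimal sketch of how the
renamed field feeds into the RAM flags, following the pattern used by the
ram and memfd backends in the hunks of this patch.  The SK_* bit values
and the SketchBackend type are hypothetical; QEMU defines the real flags
and struct elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define SK_RAM_SHARED               (1u << 0)
#define SK_RAM_PRIVATE              (1u << 1)
#define SK_RAM_NORESERVE            (1u << 2)
#define SK_RAM_GUEST_MEMFD_PRIVATE  (1u << 3)

/* Hypothetical subset of HostMemoryBackend's boolean fields. */
typedef struct SketchBackend {
    bool share;
    bool reserve;
    bool guest_memfd_private;   /* needs a guest-memfd for private pages */
} SketchBackend;

/* Compose the RAM flags the same way the backends in this patch do. */
static uint32_t sketch_ram_flags(const SketchBackend *b)
{
    uint32_t flags;

    assert(b != NULL);
    flags = b->share ? SK_RAM_SHARED : SK_RAM_PRIVATE;
    flags |= b->reserve ? 0 : SK_RAM_NORESERVE;
    flags |= b->guest_memfd_private ? SK_RAM_GUEST_MEMFD_PRIVATE : 0;
    return flags;
}
```

The point of the rename is visible in the last line of the function: the
field only ever selects the "private" flag, so it says nothing about
whether the backend's main fd is itself a guest-memfd.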
 include/system/hostmem.h | 2 +-
 backends/hostmem-file.c  | 2 +-
 backends/hostmem-memfd.c | 2 +-
 backends/hostmem-ram.c   | 2 +-
 backends/hostmem-shm.c   | 2 +-
 backends/hostmem.c       | 2 +-
 6 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/system/hostmem.h b/include/system/hostmem.h
index 88fa791ac7..dcbf81aeae 100644
--- a/include/system/hostmem.h
+++ b/include/system/hostmem.h
@@ -76,7 +76,7 @@ struct HostMemoryBackend {
     uint64_t size;
     bool merge, dump, use_canonical_path;
     bool prealloc, is_mapped, share, reserve;
-    bool guest_memfd, aligned;
+    bool guest_memfd_private, aligned;
     uint32_t prealloc_threads;
     ThreadContext *prealloc_context;
     DECLARE_BITMAP(host_nodes, MAX_NODES + 1);
diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 1f20cd8fd6..0e4cfd6dc6 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -86,7 +86,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
     ram_flags |= fb->readonly ? RAM_READONLY_FD : 0;
     ram_flags |= fb->rom == ON_OFF_AUTO_ON ? RAM_READONLY : 0;
     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
-    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
+    ram_flags |= backend->guest_memfd_private ? RAM_GUEST_MEMFD_PRIVATE : 0;
     ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
     ram_flags |= RAM_NAMED_FILE;
     return memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c
index 3f3e485709..ea93f034e4 100644
--- a/backends/hostmem-memfd.c
+++ b/backends/hostmem-memfd.c
@@ -60,7 +60,7 @@ have_fd:
     backend->aligned = true;
     ram_flags = backend->share ? RAM_SHARED : RAM_PRIVATE;
     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
-    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
+    ram_flags |= backend->guest_memfd_private ? RAM_GUEST_MEMFD_PRIVATE : 0;
     return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend), name,
                                           backend->size, ram_flags, fd, 0, errp);
 }
diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
index 96ad29112d..6a507fad77 100644
--- a/backends/hostmem-ram.c
+++ b/backends/hostmem-ram.c
@@ -30,7 +30,7 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
     name = host_memory_backend_get_name(backend);
     ram_flags = backend->share ? RAM_SHARED : RAM_PRIVATE;
     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
-    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
+    ram_flags |= backend->guest_memfd_private ? RAM_GUEST_MEMFD_PRIVATE : 0;
     return memory_region_init_ram_flags_nomigrate(&backend->mr, OBJECT(backend),
                                                   name, backend->size,
                                                   ram_flags, errp);
diff --git a/backends/hostmem-shm.c b/backends/hostmem-shm.c
index e86fb2e0aa..4766db6aad 100644
--- a/backends/hostmem-shm.c
+++ b/backends/hostmem-shm.c
@@ -54,7 +54,7 @@ have_fd:
     /* Let's do the same as memory-backend-ram,share=on would do. */
     ram_flags = RAM_SHARED;
     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
-    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
+    ram_flags |= backend->guest_memfd_private ? RAM_GUEST_MEMFD_PRIVATE : 0;
 
     return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend),
                                               backend_name, backend->size,
diff --git a/backends/hostmem.c b/backends/hostmem.c
index 35734d6f4d..70450733db 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -288,7 +288,7 @@ static void host_memory_backend_init(Object *obj)
     /* TODO: convert access to globals to compat properties */
     backend->merge = machine_mem_merge(machine);
     backend->dump = machine_dump_guest_core(machine);
-    backend->guest_memfd = machine_require_guest_memfd(machine);
+    backend->guest_memfd_private = machine_require_guest_memfd(machine);
     backend->reserve = true;
     backend->prealloc_threads = machine->smp.cpus;
 }
-- 
2.50.1




* [PATCH v3 08/12] hostmem: Support fully shared guest memfd to back a VM
  2025-12-15 20:51 [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Peter Xu
                   ` (6 preceding siblings ...)
  2025-12-15 20:51 ` [PATCH v3 07/12] hostmem: Rename guest_memfd to guest_memfd_private Peter Xu
@ 2025-12-15 20:51 ` Peter Xu
  2025-12-16  6:54   ` Xiaoyao Li
                     ` (2 more replies)
  2025-12-15 20:52 ` [PATCH v3 09/12] machine: Rename machine_require_guest_memfd() to *_private() Peter Xu
                   ` (4 subsequent siblings)
  12 siblings, 3 replies; 47+ messages in thread
From: Peter Xu @ 2025-12-15 20:51 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

Host backends currently support guest-memfd only by detecting whether the
VM is confidential.  There is no way yet to choose, at the memory backend
level, to use it fully shared.  So far, using guest-memfd always implies
two layers of memory backends, where the guest-memfd provides only the
private set of pages.

This patch introduces a way for QEMU to consume guest-memfd as the only
source of memory to back the object (aka, fully shared).

To use the fully shared guest-memfd, one can add a memfd object with:

  -object memory-backend-memfd,guest-memfd=on,share=on

Note that share=on is required with fully shared guest_memfd.

PS: the failure check is also touched up to fd < 0, because the stub that
creates guest-memfd may return a negative value other than -1.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
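Illustration only, not part of the patch: a minimal sketch of the decision
that memfd_backend_memory_alloc() makes after this change.  All Sketch*
and sketch_* names are hypothetical; the sketch models only the option
validation and the widened failure check, and does not create any fd.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which kind of fd the backend would go on to create. */
typedef enum {
    SK_FD_ERROR,        /* invalid option combination */
    SK_FD_MEMFD,        /* regular memfd, as before this patch */
    SK_FD_GUEST_MEMFD,  /* fully shared (init-shared, mmap-able) guest-memfd */
} SketchFdKind;

/* Hypothetical subset of the memfd backend's properties. */
typedef struct SketchMemfdOpts {
    bool share;         /* share=on|off */
    bool guest_memfd;   /* guest-memfd=on|off */
} SketchMemfdOpts;

/*
 * Pick the fd kind.  guest-memfd=on is only accepted together with
 * share=on; otherwise *err is set to a message and SK_FD_ERROR returned.
 */
static SketchFdKind sketch_pick_fd_kind(const SketchMemfdOpts *o,
                                        const char **err)
{
    assert(o != NULL && err != NULL);
    *err = NULL;

    if (!o->guest_memfd) {
        return SK_FD_MEMFD;
    }
    if (!o->share) {
        *err = "guest-memfd=on must be used with share=on";
        return SK_FD_ERROR;
    }
    return SK_FD_GUEST_MEMFD;
}

/* Any negative value means creation failed, not only -1. */
static bool sketch_fd_is_valid(int fd)
{
    return fd >= 0;
}
```

The second helper reflects the fd < 0 touch-up mentioned above: a stub
returning some other negative value would slip through an fd == -1 test.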
 qapi/qom.json            |  6 ++++-
 backends/hostmem-memfd.c | 53 ++++++++++++++++++++++++++++++++++++----
 2 files changed, 53 insertions(+), 6 deletions(-)

diff --git a/qapi/qom.json b/qapi/qom.json
index 6f5c9de0f0..9ebf17bfc7 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -763,13 +763,17 @@
 # @seal: if true, create a sealed-file, which will block further
 #     resizing of the memory (default: true)
 #
+# @guest-memfd: if true, use guest-memfd to back the memory region.
+#     (default: false, since: 11.0)
+#
 # Since: 2.12
 ##
 { 'struct': 'MemoryBackendMemfdProperties',
   'base': 'MemoryBackendProperties',
   'data': { '*hugetlb': 'bool',
             '*hugetlbsize': 'size',
-            '*seal': 'bool' },
+            '*seal': 'bool',
+            '*guest-memfd': 'bool' },
   'if': 'CONFIG_LINUX' }
 
 ##
diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c
index ea93f034e4..9299cd7675 100644
--- a/backends/hostmem-memfd.c
+++ b/backends/hostmem-memfd.c
@@ -18,6 +18,8 @@
 #include "qapi/error.h"
 #include "qom/object.h"
 #include "migration/cpr.h"
+#include "system/kvm.h"
+#include <linux/kvm.h>
 
 OBJECT_DECLARE_SIMPLE_TYPE(HostMemoryBackendMemfd, MEMORY_BACKEND_MEMFD)
 
@@ -28,6 +30,13 @@ struct HostMemoryBackendMemfd {
     bool hugetlb;
     uint64_t hugetlbsize;
     bool seal;
+    /*
+     * NOTE: this differs from HostMemoryBackend's guest_memfd_private,
+     * which represents an internal guest-memfd that only backs private
+     * pages.  Instead, this flag marks that the memory backend will use
+     * the guest-memfd pages in-place for 100% of its memory.
+     */
+    bool guest_memfd;
 };
 
 static bool
@@ -47,11 +56,26 @@ memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
         goto have_fd;
     }
 
-    fd = qemu_memfd_create(TYPE_MEMORY_BACKEND_MEMFD, backend->size,
-                           m->hugetlb, m->hugetlbsize, m->seal ?
-                           F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL : 0,
-                           errp);
-    if (fd == -1) {
+    if (m->guest_memfd) {
+        /* User chose to use fully shared guest-memfd to back the VM */
+        if (!backend->share) {
+            error_setg(errp, "Guest-memfd=on must be used with share=on");
+            return false;
+        }
+
+        /* TODO: add huge page support */
+        fd = kvm_create_guest_memfd(backend->size,
+                                    GUEST_MEMFD_FLAG_MMAP |
+                                    GUEST_MEMFD_FLAG_INIT_SHARED,
+                                    errp);
+    } else {
+        fd = qemu_memfd_create(TYPE_MEMORY_BACKEND_MEMFD, backend->size,
+                               m->hugetlb, m->hugetlbsize, m->seal ?
+                               F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL : 0,
+                               errp);
+    }
+
+    if (fd < 0) {
         return false;
     }
     cpr_save_fd(name, 0, fd);
@@ -65,6 +89,18 @@ have_fd:
                                           backend->size, ram_flags, fd, 0, errp);
 }
 
+static bool
+memfd_backend_get_guest_memfd(Object *o, Error **errp)
+{
+    return MEMORY_BACKEND_MEMFD(o)->guest_memfd;
+}
+
+static void
+memfd_backend_set_guest_memfd(Object *o, bool value, Error **errp)
+{
+    MEMORY_BACKEND_MEMFD(o)->guest_memfd = value;
+}
+
 static bool
 memfd_backend_get_hugetlb(Object *o, Error **errp)
 {
@@ -152,6 +188,13 @@ memfd_backend_class_init(ObjectClass *oc, const void *data)
         object_class_property_set_description(oc, "hugetlbsize",
                                               "Huge pages size (ex: 2M, 1G)");
     }
+
+    object_class_property_add_bool(oc, "guest-memfd",
+                                   memfd_backend_get_guest_memfd,
+                                   memfd_backend_set_guest_memfd);
+    object_class_property_set_description(oc, "guest-memfd",
+                                          "Use guest memfd");
+
     object_class_property_add_bool(oc, "seal",
                                    memfd_backend_get_seal,
                                    memfd_backend_set_seal);
-- 
2.50.1




* [PATCH v3 09/12] machine: Rename machine_require_guest_memfd() to *_private()
  2025-12-15 20:51 [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Peter Xu
                   ` (7 preceding siblings ...)
  2025-12-15 20:51 ` [PATCH v3 08/12] hostmem: Support fully shared guest memfd to back a VM Peter Xu
@ 2025-12-15 20:52 ` Peter Xu
  2025-12-16  6:55   ` Xiaoyao Li
  2026-06-02 21:46   ` Michael Roth
  2025-12-15 20:52 ` [PATCH v3 10/12] memory: Rename memory_region_init_ram_guest_memfd() " Peter Xu
                   ` (3 subsequent siblings)
  12 siblings, 2 replies; 47+ messages in thread
From: Peter Xu @ 2025-12-15 20:52 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

Differentiate it from fully shared guest-memfd use cases.

While at it, add proper braces in kvm_handle_hc_map_gpa_range(); otherwise
checkpatch may complain.

Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/hw/boards.h   | 2 +-
 backends/hostmem.c    | 2 +-
 hw/core/machine.c     | 2 +-
 hw/i386/pc.c          | 2 +-
 hw/i386/pc_sysfw.c    | 4 ++--
 hw/i386/x86-common.c  | 4 ++--
 target/i386/kvm/kvm.c | 3 ++-
 7 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index a48ed4f86a..3a0a051d19 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -42,7 +42,7 @@ bool machine_usb(MachineState *machine);
 int machine_phandle_start(MachineState *machine);
 bool machine_dump_guest_core(MachineState *machine);
 bool machine_mem_merge(MachineState *machine);
-bool machine_require_guest_memfd(MachineState *machine);
+bool machine_require_guest_memfd_private(MachineState *machine);
 HotpluggableCPUList *machine_query_hotpluggable_cpus(MachineState *machine);
 void machine_set_cpu_numa_node(MachineState *machine,
                                const CpuInstanceProperties *props,
diff --git a/backends/hostmem.c b/backends/hostmem.c
index 70450733db..e2dcae50c4 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -288,7 +288,7 @@ static void host_memory_backend_init(Object *obj)
     /* TODO: convert access to globals to compat properties */
     backend->merge = machine_mem_merge(machine);
     backend->dump = machine_dump_guest_core(machine);
-    backend->guest_memfd_private = machine_require_guest_memfd(machine);
+    backend->guest_memfd_private = machine_require_guest_memfd_private(machine);
     backend->reserve = true;
     backend->prealloc_threads = machine->smp.cpus;
 }
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 27372bb01e..3bdce197f7 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -1376,7 +1376,7 @@ bool machine_mem_merge(MachineState *machine)
     return machine->mem_merge;
 }
 
-bool machine_require_guest_memfd(MachineState *machine)
+bool machine_require_guest_memfd_private(MachineState *machine)
 {
     return machine->cgs && machine->cgs->require_guest_memfd;
 }
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index f8b919cb6c..b2d55ceb5e 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -962,7 +962,7 @@ void pc_memory_init(PCMachineState *pcms,
 
     if (!is_tdx_vm()) {
         option_rom_mr = g_malloc(sizeof(*option_rom_mr));
-        if (machine_require_guest_memfd(machine)) {
+        if (machine_require_guest_memfd_private(machine)) {
             memory_region_init_ram_guest_memfd(option_rom_mr, NULL, "pc.rom",
                                             PC_ROM_SIZE, &error_fatal);
         } else {
diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index 1a12b635ad..1c37258654 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -52,7 +52,7 @@ static void pc_isa_bios_init(PCMachineState *pcms, MemoryRegion *isa_bios,
 
     /* map the last 128KB of the BIOS in ISA space */
     isa_bios_size = MIN(flash_size, 128 * KiB);
-    if (machine_require_guest_memfd(MACHINE(pcms))) {
+    if (machine_require_guest_memfd_private(MACHINE(pcms))) {
         memory_region_init_ram_guest_memfd(isa_bios, NULL, "isa-bios",
                                            isa_bios_size, &error_fatal);
     } else {
@@ -71,7 +71,7 @@ static void pc_isa_bios_init(PCMachineState *pcms, MemoryRegion *isa_bios,
            ((uint8_t*)flash_ptr) + (flash_size - isa_bios_size),
            isa_bios_size);
 
-    if (!machine_require_guest_memfd(current_machine)) {
+    if (!machine_require_guest_memfd_private(current_machine)) {
         memory_region_set_readonly(isa_bios, true);
     }
 }
diff --git a/hw/i386/x86-common.c b/hw/i386/x86-common.c
index c844749900..33ac7fb6e9 100644
--- a/hw/i386/x86-common.c
+++ b/hw/i386/x86-common.c
@@ -1044,7 +1044,7 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
         (bios_size % 65536) != 0) {
         goto bios_error;
     }
-    if (machine_require_guest_memfd(MACHINE(x86ms))) {
+    if (machine_require_guest_memfd_private(MACHINE(x86ms))) {
         memory_region_init_ram_guest_memfd(&x86ms->bios, NULL, "pc.bios",
                                            bios_size, &error_fatal);
         if (is_tdx_vm()) {
@@ -1074,7 +1074,7 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
     }
     g_free(filename);
 
-    if (!machine_require_guest_memfd(MACHINE(x86ms))) {
+    if (!machine_require_guest_memfd_private(MACHINE(x86ms))) {
         /* map the last 128KB of the BIOS in ISA space */
         x86_isa_bios_init(&x86ms->isa_bios, rom_memory, &x86ms->bios,
                           !isapc_ram_fw);
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 60c7981138..5d0d02bcaf 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -6050,8 +6050,9 @@ static int kvm_handle_hc_map_gpa_range(X86CPU *cpu, struct kvm_run *run)
     uint64_t gpa, size, attributes;
     int ret;
 
-    if (!machine_require_guest_memfd(current_machine))
+    if (!machine_require_guest_memfd_private(current_machine)) {
         return -EINVAL;
+    }
 
     gpa = run->hypercall.args[0];
     size = run->hypercall.args[1] * TARGET_PAGE_SIZE;
-- 
2.50.1




* [PATCH v3 10/12] memory: Rename memory_region_init_ram_guest_memfd() to *_private()
  2025-12-15 20:51 [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Peter Xu
                   ` (8 preceding siblings ...)
  2025-12-15 20:52 ` [PATCH v3 09/12] machine: Rename machine_require_guest_memfd() to *_private() Peter Xu
@ 2025-12-15 20:52 ` Peter Xu
  2025-12-16  6:56   ` Xiaoyao Li
  2026-06-02 21:49   ` Michael Roth
  2025-12-15 20:52 ` [PATCH v3 11/12] tests/migration-test: Support guest-memfd init shared mem type Peter Xu
                   ` (2 subsequent siblings)
  12 siblings, 2 replies; 47+ messages in thread
From: Peter Xu @ 2025-12-15 20:52 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

Differentiate it from fully shared guest-memfd use cases.

Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/system/memory.h | 10 +++++-----
 backends/igvm.c         |  4 ++--
 hw/i386/pc.c            |  4 ++--
 hw/i386/pc_sysfw.c      |  4 ++--
 hw/i386/x86-common.c    |  4 ++--
 system/memory.c         | 10 +++++-----
 6 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/include/system/memory.h b/include/system/memory.h
index 9b58303bb8..b3d000a563 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -1693,11 +1693,11 @@ bool memory_region_init_ram(MemoryRegion *mr,
                             uint64_t size,
                             Error **errp);
 
-bool memory_region_init_ram_guest_memfd(MemoryRegion *mr,
-                                        Object *owner,
-                                        const char *name,
-                                        uint64_t size,
-                                        Error **errp);
+bool memory_region_init_ram_guest_memfd_private(MemoryRegion *mr,
+                                                Object *owner,
+                                                const char *name,
+                                                uint64_t size,
+                                                Error **errp);
 
 /**
  * memory_region_init_rom: Initialize a ROM memory region.
diff --git a/backends/igvm.c b/backends/igvm.c
index 905bd8d989..91631829e5 100644
--- a/backends/igvm.c
+++ b/backends/igvm.c
@@ -221,8 +221,8 @@ static void *qigvm_prepare_memory(QIgvm *ctx, uint64_t addr, uint64_t size,
             g_strdup_printf("igvm.%X", region_identifier);
         igvm_pages = g_new0(MemoryRegion, 1);
         if (ctx->cgs && ctx->cgs->require_guest_memfd) {
-            if (!memory_region_init_ram_guest_memfd(igvm_pages, NULL,
-                                                    region_name, size, errp)) {
+            if (!memory_region_init_ram_guest_memfd_private(
+                    igvm_pages, NULL, region_name, size, errp)) {
                 return NULL;
             }
         } else {
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index b2d55ceb5e..41dfbbdcf0 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -963,8 +963,8 @@ void pc_memory_init(PCMachineState *pcms,
     if (!is_tdx_vm()) {
         option_rom_mr = g_malloc(sizeof(*option_rom_mr));
         if (machine_require_guest_memfd_private(machine)) {
-            memory_region_init_ram_guest_memfd(option_rom_mr, NULL, "pc.rom",
-                                            PC_ROM_SIZE, &error_fatal);
+            memory_region_init_ram_guest_memfd_private(
+                option_rom_mr, NULL, "pc.rom", PC_ROM_SIZE, &error_fatal);
         } else {
             memory_region_init_ram(option_rom_mr, NULL, "pc.rom", PC_ROM_SIZE,
                                 &error_fatal);
diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index 1c37258654..ad55d4eba6 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -53,8 +53,8 @@ static void pc_isa_bios_init(PCMachineState *pcms, MemoryRegion *isa_bios,
     /* map the last 128KB of the BIOS in ISA space */
     isa_bios_size = MIN(flash_size, 128 * KiB);
     if (machine_require_guest_memfd_private(MACHINE(pcms))) {
-        memory_region_init_ram_guest_memfd(isa_bios, NULL, "isa-bios",
-                                           isa_bios_size, &error_fatal);
+        memory_region_init_ram_guest_memfd_private(
+            isa_bios, NULL, "isa-bios", isa_bios_size, &error_fatal);
     } else {
         memory_region_init_ram(isa_bios, NULL, "isa-bios", isa_bios_size,
                                &error_fatal);
diff --git a/hw/i386/x86-common.c b/hw/i386/x86-common.c
index 33ac7fb6e9..27854a9164 100644
--- a/hw/i386/x86-common.c
+++ b/hw/i386/x86-common.c
@@ -1045,8 +1045,8 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
         goto bios_error;
     }
     if (machine_require_guest_memfd_private(MACHINE(x86ms))) {
-        memory_region_init_ram_guest_memfd(&x86ms->bios, NULL, "pc.bios",
-                                           bios_size, &error_fatal);
+        memory_region_init_ram_guest_memfd_private(
+            &x86ms->bios, NULL, "pc.bios", bios_size, &error_fatal);
         if (is_tdx_vm()) {
             tdx_set_tdvf_region(&x86ms->bios);
         }
diff --git a/system/memory.c b/system/memory.c
index d70968c966..28810dcb29 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -3746,11 +3746,11 @@ bool memory_region_init_ram(MemoryRegion *mr,
     return true;
 }
 
-bool memory_region_init_ram_guest_memfd(MemoryRegion *mr,
-                                        Object *owner,
-                                        const char *name,
-                                        uint64_t size,
-                                        Error **errp)
+bool memory_region_init_ram_guest_memfd_private(MemoryRegion *mr,
+                                                Object *owner,
+                                                const char *name,
+                                                uint64_t size,
+                                                Error **errp)
 {
     DeviceState *owner_dev;
 
-- 
2.50.1




* [PATCH v3 11/12] tests/migration-test: Support guest-memfd init shared mem type
  2025-12-15 20:51 [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Peter Xu
                   ` (9 preceding siblings ...)
  2025-12-15 20:52 ` [PATCH v3 10/12] memory: Rename memory_region_init_ram_guest_memfd() " Peter Xu
@ 2025-12-15 20:52 ` Peter Xu
  2025-12-16 14:18   ` Fabiano Rosas
  2025-12-15 20:52 ` [PATCH v3 12/12] tests/migration-test: Add a precopy test for guest-memfd Peter Xu
  2026-06-02 22:02 ` [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Michael Roth
  12 siblings, 1 reply; 47+ messages in thread
From: Peter Xu @ 2025-12-15 20:52 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

Support the guest-memfd memory type when the fd is created with the
init-shared flag, which means the guest-memfd can be used much like a
memfd.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
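Illustration only, not part of the patch: a minimal sketch of how the two
KVM_CHECK_EXTENSION results are interpreted by the probe added below.  The
function takes the ioctl return values as plain integers so it can be
exercised without /dev/kvm.  The names are hypothetical, and
SK_FLAG_INIT_SHARED is a placeholder for GUEST_MEMFD_FLAG_INIT_SHARED; the
real value comes from <linux/kvm.h>.  One small deviation: the sketch also
treats a negative first result as "unsupported", whereas the patch only
tests it for zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit for GUEST_MEMFD_FLAG_INIT_SHARED (assumed value). */
#define SK_FLAG_INIT_SHARED (1 << 1)

/*
 * gmem_cap:  result of checking KVM_CAP_GUEST_MEMFD
 * flags_cap: result of checking KVM_CAP_GUEST_MEMFD_FLAGS, a bitmask of
 *            supported creation flags, or negative on error
 *
 * Returns true when init-shared guest-memfd is usable; otherwise sets
 * *reason to a skip message and returns false.
 */
static bool sketch_init_shared_supported(int gmem_cap, int flags_cap,
                                         const char **reason)
{
    assert(reason != NULL && *reason == NULL);

    if (gmem_cap <= 0) {
        *reason = "KVM module doesn't support guest-memfd";
    } else if (flags_cap < 0) {
        *reason = "KVM doesn't support KVM_CAP_GUEST_MEMFD_FLAGS";
    } else if (!(flags_cap & SK_FLAG_INIT_SHARED)) {
        *reason = "KVM doesn't support GUEST_MEMFD_FLAG_INIT_SHARED";
    }

    return !*reason;
}
```

Keeping the reason string as the single source of truth, as the patch
does, lets the caller pass it straight to g_test_skip() when the probe
fails.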
 tests/qtest/migration/framework.h |  4 +++
 tests/qtest/migration/framework.c | 60 +++++++++++++++++++++++++++++++
 2 files changed, 64 insertions(+)

diff --git a/tests/qtest/migration/framework.h b/tests/qtest/migration/framework.h
index ed85ed502d..b4c5edcad3 100644
--- a/tests/qtest/migration/framework.h
+++ b/tests/qtest/migration/framework.h
@@ -34,6 +34,10 @@ typedef enum {
      * but only anonymously allocated.
      */
     MEM_TYPE_MEMFD,
+    /*
+     * Use guest-memfd, shared mappings.
+     */
+    MEM_TYPE_GUEST_MEMFD,
     MEM_TYPE_NUM,
 } MemType;
 
diff --git a/tests/qtest/migration/framework.c b/tests/qtest/migration/framework.c
index e35839c95f..9aa353bac6 100644
--- a/tests/qtest/migration/framework.c
+++ b/tests/qtest/migration/framework.c
@@ -26,6 +26,10 @@
 #include "qemu/range.h"
 #include "qemu/sockets.h"
 
+#ifdef CONFIG_LINUX
+#include <linux/kvm.h>
+#include <sys/ioctl.h>
+#endif
 
 #define QEMU_VM_FILE_MAGIC 0x5145564d
 #define QEMU_ENV_SRC "QTEST_QEMU_BINARY_SRC"
@@ -283,6 +287,9 @@ static char *migrate_mem_type_get_opts(MemType type, const char *memory_size)
     case MEM_TYPE_MEMFD:
         backend = g_strdup("-object memory-backend-memfd");
         break;
+    case MEM_TYPE_GUEST_MEMFD:
+        backend = g_strdup("-object memory-backend-memfd,guest-memfd=on");
+        break;
     default:
         g_assert_not_reached();
         break;
@@ -425,8 +432,55 @@ int migrate_args(char **from, char **to, const char *uri, MigrateStart *args)
     return 0;
 }
 
+static bool kvm_guest_memfd_init_shared_supported(const char **reason)
+{
+    assert(*reason == NULL);
+
+#ifdef CONFIG_LINUX
+    int ret, fd = -1;
+
+    if (!migration_get_env()->has_kvm) {
+        *reason = "KVM is not enabled in the current QEMU build";
+        goto out;
+    }
+
+    fd = open("/dev/kvm", O_RDWR);
+    if (fd < 0) {
+        *reason = "KVM module isn't available or missing permission";
+        goto out;
+    }
+
+    ret = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_GUEST_MEMFD);
+    if (!ret) {
+        *reason = "KVM module doesn't support guest-memfd";
+        goto out;
+    }
+
+    ret = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_GUEST_MEMFD_FLAGS);
+    if (ret < 0) {
+        *reason = "KVM doesn't support KVM_CAP_GUEST_MEMFD_FLAGS";
+        goto out;
+    }
+
+    if (!(ret & GUEST_MEMFD_FLAG_INIT_SHARED)) {
+        *reason = "KVM doesn't support GUEST_MEMFD_FLAG_INIT_SHARED";
+        goto out;
+    }
+out:
+    if (fd >= 0) {
+        close(fd);
+    }
+#else
+    *reason = "KVM not supported on non-Linux OS";
+#endif
+
+    return !*reason;
+}
+
 static bool migrate_mem_type_prepare(MemType type)
 {
+    const char *reason = NULL;
+
     switch (type) {
     case MEM_TYPE_SHMEM:
         if (!g_file_test("/dev/shm", G_FILE_TEST_IS_DIR)) {
@@ -434,6 +488,12 @@ static bool migrate_mem_type_prepare(MemType type)
             return false;
         }
         break;
+    case MEM_TYPE_GUEST_MEMFD:
+        if (!kvm_guest_memfd_init_shared_supported(&reason)) {
+            g_test_skip(reason);
+            return false;
+        }
+        break;
     default:
         break;
     }
-- 
2.50.1




* [PATCH v3 12/12] tests/migration-test: Add a precopy test for guest-memfd
  2025-12-15 20:51 [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Peter Xu
                   ` (10 preceding siblings ...)
  2025-12-15 20:52 ` [PATCH v3 11/12] tests/migration-test: Support guest-memfd init shared mem type Peter Xu
@ 2025-12-15 20:52 ` Peter Xu
  2025-12-16 14:20   ` Fabiano Rosas
  2026-06-02 22:02 ` [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Michael Roth
  12 siblings, 1 reply; 47+ messages in thread
From: Peter Xu @ 2025-12-15 20:52 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

Add a plain TCP precopy test for guest-memfd.  Note that the test is
skipped automatically whenever guest-memfd is not supported (e.g. QEMU
compiled without KVM, a host kernel without KVM support, old kernels, etc.).

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 tests/qtest/migration/precopy-tests.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c
index 57ca623de5..88d2627efd 100644
--- a/tests/qtest/migration/precopy-tests.c
+++ b/tests/qtest/migration/precopy-tests.c
@@ -215,6 +215,16 @@ static void test_precopy_tcp_plain(void)
     test_precopy_common(&args);
 }
 
+static void test_precopy_tcp_plain_gmemfd(void)
+{
+    MigrateCommon args = {
+        .listen_uri = "tcp:127.0.0.1:0",
+        .start.mem_type = MEM_TYPE_GUEST_MEMFD,
+    };
+
+    test_precopy_common(&args);
+}
+
 static void test_precopy_tcp_switchover_ack(void)
 {
     MigrateCommon args = {
@@ -1276,6 +1286,8 @@ void migration_test_add_precopy(MigrationTestEnv *env)
         return;
     }
 
+    migration_test_add("/migration/precopy/tcp/plain/guest-memfd",
+                       test_precopy_tcp_plain_gmemfd);
     migration_test_add("/migration/precopy/tcp/plain/switchover-ack",
                        test_precopy_tcp_switchover_ack);
 
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 03/12] kvm: Provide explicit error for kvm_create_guest_memfd()
  2025-12-15 20:51 ` [PATCH v3 03/12] kvm: Provide explicit error for kvm_create_guest_memfd() Peter Xu
@ 2025-12-16  4:03   ` Xiaoyao Li
  2025-12-16 13:55   ` Fabiano Rosas
  2026-06-02  1:31   ` Michael Roth
  2 siblings, 0 replies; 47+ messages in thread
From: Xiaoyao Li @ 2025-12-16  4:03 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	Fabiano Rosas, Alexey Kardashevskiy

On 12/16/2025 4:51 AM, Peter Xu wrote:
> So that a descriptive error string is returned when KVM is not enabled or
> not compiled in.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>

> ---
>   accel/kvm/kvm-all.c    | 5 +++++
>   accel/stubs/kvm-stub.c | 1 +
>   2 files changed, 6 insertions(+)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 68d57c1af0..c32fbcf9cc 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -4492,6 +4492,11 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
>           .flags = flags,
>       };
>   
> +    if (!kvm_enabled()) {
> +        error_setg(errp, "guest-memfd requires KVM accelerator");
> +        return -1;
> +    }
> +
>       if (!kvm_guest_memfd_supported) {
>           error_setg(errp, "KVM does not support guest_memfd");
>           return -1;
> diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
> index 73f04eb589..01b1d6285e 100644
> --- a/accel/stubs/kvm-stub.c
> +++ b/accel/stubs/kvm-stub.c
> @@ -123,6 +123,7 @@ bool kvm_hwpoisoned_mem(void)
>   
>   int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
>   {
> +    error_setg(errp, "KVM is not enabled");
>       return -ENOSYS;
>   }
>   



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 05/12] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE
  2025-12-15 20:51 ` [PATCH v3 05/12] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE Peter Xu
@ 2025-12-16  5:49   ` Xiaoyao Li
  2025-12-23 17:04     ` Peter Xu
  2026-06-02  1:39   ` Michael Roth
  1 sibling, 1 reply; 47+ messages in thread
From: Xiaoyao Li @ 2025-12-16  5:49 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	Fabiano Rosas, Alexey Kardashevskiy

On 12/16/2025 4:51 AM, Peter Xu wrote:
> This name is too generic, and can conflict with in-place guest-memfd
> support.  Add a _PRIVATE suffix to show what it really means: it is always
> silently using an internal guest-memfd to back a shared host backend,
> rather than used in-place.
> 
> This paves the way for in-place guest-memfd, which means we can have a
> ramblock that allocates pages completely from guest-memfd (private or shared).

Well, the term "in-place" needs to be changed to "init-shared".


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 07/12] hostmem: Rename guest_memfd to guest_memfd_private
  2025-12-15 20:51 ` [PATCH v3 07/12] hostmem: Rename guest_memfd to guest_memfd_private Peter Xu
@ 2025-12-16  5:54   ` Xiaoyao Li
  2026-06-02 18:56   ` Michael Roth
  1 sibling, 0 replies; 47+ messages in thread
From: Xiaoyao Li @ 2025-12-16  5:54 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	Fabiano Rosas, Alexey Kardashevskiy

On 12/16/2025 4:51 AM, Peter Xu wrote:
> Rename the HostMemoryBackend.guest_memfd field to reflect what it really
> means: whether it needs guest_memfd to back the private portion of its
> mapping.  This will help with clarity when we introduce in-place
> guest_memfd for hostmem.

Fix the term "in-place"?

Other than that,

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>

> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   include/system/hostmem.h | 2 +-
>   backends/hostmem-file.c  | 2 +-
>   backends/hostmem-memfd.c | 2 +-
>   backends/hostmem-ram.c   | 2 +-
>   backends/hostmem-shm.c   | 2 +-
>   backends/hostmem.c       | 2 +-
>   6 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/include/system/hostmem.h b/include/system/hostmem.h
> index 88fa791ac7..dcbf81aeae 100644
> --- a/include/system/hostmem.h
> +++ b/include/system/hostmem.h
> @@ -76,7 +76,7 @@ struct HostMemoryBackend {
>       uint64_t size;
>       bool merge, dump, use_canonical_path;
>       bool prealloc, is_mapped, share, reserve;
> -    bool guest_memfd, aligned;
> +    bool guest_memfd_private, aligned;
>       uint32_t prealloc_threads;
>       ThreadContext *prealloc_context;
>       DECLARE_BITMAP(host_nodes, MAX_NODES + 1);
> diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
> index 1f20cd8fd6..0e4cfd6dc6 100644
> --- a/backends/hostmem-file.c
> +++ b/backends/hostmem-file.c
> @@ -86,7 +86,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>       ram_flags |= fb->readonly ? RAM_READONLY_FD : 0;
>       ram_flags |= fb->rom == ON_OFF_AUTO_ON ? RAM_READONLY : 0;
>       ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> -    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
> +    ram_flags |= backend->guest_memfd_private ? RAM_GUEST_MEMFD_PRIVATE : 0;
>       ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
>       ram_flags |= RAM_NAMED_FILE;
>       return memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
> diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c
> index 3f3e485709..ea93f034e4 100644
> --- a/backends/hostmem-memfd.c
> +++ b/backends/hostmem-memfd.c
> @@ -60,7 +60,7 @@ have_fd:
>       backend->aligned = true;
>       ram_flags = backend->share ? RAM_SHARED : RAM_PRIVATE;
>       ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> -    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
> +    ram_flags |= backend->guest_memfd_private ? RAM_GUEST_MEMFD_PRIVATE : 0;
>       return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend), name,
>                                             backend->size, ram_flags, fd, 0, errp);
>   }
> diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
> index 96ad29112d..6a507fad77 100644
> --- a/backends/hostmem-ram.c
> +++ b/backends/hostmem-ram.c
> @@ -30,7 +30,7 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>       name = host_memory_backend_get_name(backend);
>       ram_flags = backend->share ? RAM_SHARED : RAM_PRIVATE;
>       ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> -    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
> +    ram_flags |= backend->guest_memfd_private ? RAM_GUEST_MEMFD_PRIVATE : 0;
>       return memory_region_init_ram_flags_nomigrate(&backend->mr, OBJECT(backend),
>                                                     name, backend->size,
>                                                     ram_flags, errp);
> diff --git a/backends/hostmem-shm.c b/backends/hostmem-shm.c
> index e86fb2e0aa..4766db6aad 100644
> --- a/backends/hostmem-shm.c
> +++ b/backends/hostmem-shm.c
> @@ -54,7 +54,7 @@ have_fd:
>       /* Let's do the same as memory-backend-ram,share=on would do. */
>       ram_flags = RAM_SHARED;
>       ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> -    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
> +    ram_flags |= backend->guest_memfd_private ? RAM_GUEST_MEMFD_PRIVATE : 0;
>   
>       return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend),
>                                                 backend_name, backend->size,
> diff --git a/backends/hostmem.c b/backends/hostmem.c
> index 35734d6f4d..70450733db 100644
> --- a/backends/hostmem.c
> +++ b/backends/hostmem.c
> @@ -288,7 +288,7 @@ static void host_memory_backend_init(Object *obj)
>       /* TODO: convert access to globals to compat properties */
>       backend->merge = machine_mem_merge(machine);
>       backend->dump = machine_dump_guest_core(machine);
> -    backend->guest_memfd = machine_require_guest_memfd(machine);
> +    backend->guest_memfd_private = machine_require_guest_memfd(machine);
>       backend->reserve = true;
>       backend->prealloc_threads = machine->smp.cpus;
>   }



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 08/12] hostmem: Support fully shared guest memfd to back a VM
  2025-12-15 20:51 ` [PATCH v3 08/12] hostmem: Support fully shared guest memfd to back a VM Peter Xu
@ 2025-12-16  6:54   ` Xiaoyao Li
  2025-12-16 14:02   ` Fabiano Rosas
  2026-06-02 21:40   ` Michael Roth
  2 siblings, 0 replies; 47+ messages in thread
From: Xiaoyao Li @ 2025-12-16  6:54 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	Fabiano Rosas, Alexey Kardashevskiy

On 12/16/2025 4:51 AM, Peter Xu wrote:
> Host backends currently support guest-memfd only by detecting whether the
> VM is confidential.  There is no way yet to choose, at the memory backend
> level, to use it fully shared.  So far, using guest-memfd always implies two
> layers of memory backends, where the guest-memfd provides only the private
> set of pages.
> 
> This patch introduces a way so that QEMU can consume guest memfd as the
> only source of memory to back the object (aka, fully shared).
> 
> To use the fully shared guest-memfd, one can add a memfd object with:
> 
>    -object memory-backend-memfd,guest-memfd=on,share=on
> 
> Note that share=on is required with fully shared guest_memfd.
> 
> PS: there's a trivial touch-up on the fd<0 check, because the stub to create
> guest-memfd may return a negative value other than -1.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
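
On the fd<0 touch-up mentioned in the PS: in patch 03 the stub returns
-ENOSYS while the real function returns -1 on its early error paths, so a
caller that only compares against -1 would take the stub's failure for a
valid fd.  A minimal standalone sketch of the difference (all names below
are hypothetical stand-ins, not QEMU code):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* -ENOSYS is a negative value that is not -1 (holds on Linux and POSIX). */
static_assert(ENOSYS != 1, "-ENOSYS must differ from -1 for this sketch");

/* Hypothetical stand-in for a stub that fails with a negative errno. */
static int sketch_create_guest_memfd_stub(void)
{
    return -ENOSYS;
}

/* Fragile check: only recognizes -1 as a failure. */
static bool sketch_failed_eq_minus_one(int fd)
{
    return fd == -1;
}

/* Robust check: any negative value is a failure. */
static bool sketch_failed_lt_zero(int fd)
{
    return fd < 0;
}
```

With these helpers, the fragile check lets the stub's -ENOSYS through as if
it were a usable descriptor, while the fd<0 form rejects it.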


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 09/12] machine: Rename machine_require_guest_memfd() to *_private()
  2025-12-15 20:52 ` [PATCH v3 09/12] machine: Rename machine_require_guest_memfd() to *_private() Peter Xu
@ 2025-12-16  6:55   ` Xiaoyao Li
  2026-06-02 21:46   ` Michael Roth
  1 sibling, 0 replies; 47+ messages in thread
From: Xiaoyao Li @ 2025-12-16  6:55 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	Fabiano Rosas, Alexey Kardashevskiy

On 12/16/2025 4:52 AM, Peter Xu wrote:
> Differentiate it from fully shared guest-memfd use cases.
> 
> While at it, add proper braces in kvm_handle_hc_map_gpa_range(), otherwise
> checkpatch may complain.
> 
> Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>

> ---
>   include/hw/boards.h   | 2 +-
>   backends/hostmem.c    | 2 +-
>   hw/core/machine.c     | 2 +-
>   hw/i386/pc.c          | 2 +-
>   hw/i386/pc_sysfw.c    | 4 ++--
>   hw/i386/x86-common.c  | 4 ++--
>   target/i386/kvm/kvm.c | 3 ++-
>   7 files changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index a48ed4f86a..3a0a051d19 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -42,7 +42,7 @@ bool machine_usb(MachineState *machine);
>   int machine_phandle_start(MachineState *machine);
>   bool machine_dump_guest_core(MachineState *machine);
>   bool machine_mem_merge(MachineState *machine);
> -bool machine_require_guest_memfd(MachineState *machine);
> +bool machine_require_guest_memfd_private(MachineState *machine);
>   HotpluggableCPUList *machine_query_hotpluggable_cpus(MachineState *machine);
>   void machine_set_cpu_numa_node(MachineState *machine,
>                                  const CpuInstanceProperties *props,
> diff --git a/backends/hostmem.c b/backends/hostmem.c
> index 70450733db..e2dcae50c4 100644
> --- a/backends/hostmem.c
> +++ b/backends/hostmem.c
> @@ -288,7 +288,7 @@ static void host_memory_backend_init(Object *obj)
>       /* TODO: convert access to globals to compat properties */
>       backend->merge = machine_mem_merge(machine);
>       backend->dump = machine_dump_guest_core(machine);
> -    backend->guest_memfd_private = machine_require_guest_memfd(machine);
> +    backend->guest_memfd_private = machine_require_guest_memfd_private(machine);
>       backend->reserve = true;
>       backend->prealloc_threads = machine->smp.cpus;
>   }
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 27372bb01e..3bdce197f7 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -1376,7 +1376,7 @@ bool machine_mem_merge(MachineState *machine)
>       return machine->mem_merge;
>   }
>   
> -bool machine_require_guest_memfd(MachineState *machine)
> +bool machine_require_guest_memfd_private(MachineState *machine)
>   {
>       return machine->cgs && machine->cgs->require_guest_memfd;
>   }
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index f8b919cb6c..b2d55ceb5e 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -962,7 +962,7 @@ void pc_memory_init(PCMachineState *pcms,
>   
>       if (!is_tdx_vm()) {
>           option_rom_mr = g_malloc(sizeof(*option_rom_mr));
> -        if (machine_require_guest_memfd(machine)) {
> +        if (machine_require_guest_memfd_private(machine)) {
>               memory_region_init_ram_guest_memfd(option_rom_mr, NULL, "pc.rom",
>                                               PC_ROM_SIZE, &error_fatal);
>           } else {
> diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
> index 1a12b635ad..1c37258654 100644
> --- a/hw/i386/pc_sysfw.c
> +++ b/hw/i386/pc_sysfw.c
> @@ -52,7 +52,7 @@ static void pc_isa_bios_init(PCMachineState *pcms, MemoryRegion *isa_bios,
>   
>       /* map the last 128KB of the BIOS in ISA space */
>       isa_bios_size = MIN(flash_size, 128 * KiB);
> -    if (machine_require_guest_memfd(MACHINE(pcms))) {
> +    if (machine_require_guest_memfd_private(MACHINE(pcms))) {
>           memory_region_init_ram_guest_memfd(isa_bios, NULL, "isa-bios",
>                                              isa_bios_size, &error_fatal);
>       } else {
> @@ -71,7 +71,7 @@ static void pc_isa_bios_init(PCMachineState *pcms, MemoryRegion *isa_bios,
>              ((uint8_t*)flash_ptr) + (flash_size - isa_bios_size),
>              isa_bios_size);
>   
> -    if (!machine_require_guest_memfd(current_machine)) {
> +    if (!machine_require_guest_memfd_private(current_machine)) {
>           memory_region_set_readonly(isa_bios, true);
>       }
>   }
> diff --git a/hw/i386/x86-common.c b/hw/i386/x86-common.c
> index c844749900..33ac7fb6e9 100644
> --- a/hw/i386/x86-common.c
> +++ b/hw/i386/x86-common.c
> @@ -1044,7 +1044,7 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
>           (bios_size % 65536) != 0) {
>           goto bios_error;
>       }
> -    if (machine_require_guest_memfd(MACHINE(x86ms))) {
> +    if (machine_require_guest_memfd_private(MACHINE(x86ms))) {
>           memory_region_init_ram_guest_memfd(&x86ms->bios, NULL, "pc.bios",
>                                              bios_size, &error_fatal);
>           if (is_tdx_vm()) {
> @@ -1074,7 +1074,7 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
>       }
>       g_free(filename);
>   
> -    if (!machine_require_guest_memfd(MACHINE(x86ms))) {
> +    if (!machine_require_guest_memfd_private(MACHINE(x86ms))) {
>           /* map the last 128KB of the BIOS in ISA space */
>           x86_isa_bios_init(&x86ms->isa_bios, rom_memory, &x86ms->bios,
>                             !isapc_ram_fw);
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 60c7981138..5d0d02bcaf 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -6050,8 +6050,9 @@ static int kvm_handle_hc_map_gpa_range(X86CPU *cpu, struct kvm_run *run)
>       uint64_t gpa, size, attributes;
>       int ret;
>   
> -    if (!machine_require_guest_memfd(current_machine))
> +    if (!machine_require_guest_memfd_private(current_machine)) {
>           return -EINVAL;
> +    }
>   
>       gpa = run->hypercall.args[0];
>       size = run->hypercall.args[1] * TARGET_PAGE_SIZE;



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 10/12] memory: Rename memory_region_init_ram_guest_memfd() to *_private()
  2025-12-15 20:52 ` [PATCH v3 10/12] memory: Rename memory_region_init_ram_guest_memfd() " Peter Xu
@ 2025-12-16  6:56   ` Xiaoyao Li
  2026-06-02 21:49   ` Michael Roth
  1 sibling, 0 replies; 47+ messages in thread
From: Xiaoyao Li @ 2025-12-16  6:56 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	Fabiano Rosas, Alexey Kardashevskiy

On 12/16/2025 4:52 AM, Peter Xu wrote:
> Differentiate it from fully shared guest-memfd use cases.
> 
> Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>

> ---
>   include/system/memory.h | 10 +++++-----
>   backends/igvm.c         |  4 ++--
>   hw/i386/pc.c            |  4 ++--
>   hw/i386/pc_sysfw.c      |  4 ++--
>   hw/i386/x86-common.c    |  4 ++--
>   system/memory.c         | 10 +++++-----
>   6 files changed, 18 insertions(+), 18 deletions(-)
> 
> diff --git a/include/system/memory.h b/include/system/memory.h
> index 9b58303bb8..b3d000a563 100644
> --- a/include/system/memory.h
> +++ b/include/system/memory.h
> @@ -1693,11 +1693,11 @@ bool memory_region_init_ram(MemoryRegion *mr,
>                               uint64_t size,
>                               Error **errp);
>   
> -bool memory_region_init_ram_guest_memfd(MemoryRegion *mr,
> -                                        Object *owner,
> -                                        const char *name,
> -                                        uint64_t size,
> -                                        Error **errp);
> +bool memory_region_init_ram_guest_memfd_private(MemoryRegion *mr,
> +                                                Object *owner,
> +                                                const char *name,
> +                                                uint64_t size,
> +                                                Error **errp);
>   
>   /**
>    * memory_region_init_rom: Initialize a ROM memory region.
> diff --git a/backends/igvm.c b/backends/igvm.c
> index 905bd8d989..91631829e5 100644
> --- a/backends/igvm.c
> +++ b/backends/igvm.c
> @@ -221,8 +221,8 @@ static void *qigvm_prepare_memory(QIgvm *ctx, uint64_t addr, uint64_t size,
>               g_strdup_printf("igvm.%X", region_identifier);
>           igvm_pages = g_new0(MemoryRegion, 1);
>           if (ctx->cgs && ctx->cgs->require_guest_memfd) {
> -            if (!memory_region_init_ram_guest_memfd(igvm_pages, NULL,
> -                                                    region_name, size, errp)) {
> +            if (!memory_region_init_ram_guest_memfd_private(
> +                    igvm_pages, NULL, region_name, size, errp)) {
>                   return NULL;
>               }
>           } else {
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index b2d55ceb5e..41dfbbdcf0 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -963,8 +963,8 @@ void pc_memory_init(PCMachineState *pcms,
>       if (!is_tdx_vm()) {
>           option_rom_mr = g_malloc(sizeof(*option_rom_mr));
>           if (machine_require_guest_memfd_private(machine)) {
> -            memory_region_init_ram_guest_memfd(option_rom_mr, NULL, "pc.rom",
> -                                            PC_ROM_SIZE, &error_fatal);
> +            memory_region_init_ram_guest_memfd_private(
> +                option_rom_mr, NULL, "pc.rom", PC_ROM_SIZE, &error_fatal);
>           } else {
>               memory_region_init_ram(option_rom_mr, NULL, "pc.rom", PC_ROM_SIZE,
>                                   &error_fatal);
> diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
> index 1c37258654..ad55d4eba6 100644
> --- a/hw/i386/pc_sysfw.c
> +++ b/hw/i386/pc_sysfw.c
> @@ -53,8 +53,8 @@ static void pc_isa_bios_init(PCMachineState *pcms, MemoryRegion *isa_bios,
>       /* map the last 128KB of the BIOS in ISA space */
>       isa_bios_size = MIN(flash_size, 128 * KiB);
>       if (machine_require_guest_memfd_private(MACHINE(pcms))) {
> -        memory_region_init_ram_guest_memfd(isa_bios, NULL, "isa-bios",
> -                                           isa_bios_size, &error_fatal);
> +        memory_region_init_ram_guest_memfd_private(
> +            isa_bios, NULL, "isa-bios", isa_bios_size, &error_fatal);
>       } else {
>           memory_region_init_ram(isa_bios, NULL, "isa-bios", isa_bios_size,
>                                  &error_fatal);
> diff --git a/hw/i386/x86-common.c b/hw/i386/x86-common.c
> index 33ac7fb6e9..27854a9164 100644
> --- a/hw/i386/x86-common.c
> +++ b/hw/i386/x86-common.c
> @@ -1045,8 +1045,8 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
>           goto bios_error;
>       }
>       if (machine_require_guest_memfd_private(MACHINE(x86ms))) {
> -        memory_region_init_ram_guest_memfd(&x86ms->bios, NULL, "pc.bios",
> -                                           bios_size, &error_fatal);
> +        memory_region_init_ram_guest_memfd_private(
> +            &x86ms->bios, NULL, "pc.bios", bios_size, &error_fatal);
>           if (is_tdx_vm()) {
>               tdx_set_tdvf_region(&x86ms->bios);
>           }
> diff --git a/system/memory.c b/system/memory.c
> index d70968c966..28810dcb29 100644
> --- a/system/memory.c
> +++ b/system/memory.c
> @@ -3746,11 +3746,11 @@ bool memory_region_init_ram(MemoryRegion *mr,
>       return true;
>   }
>   
> -bool memory_region_init_ram_guest_memfd(MemoryRegion *mr,
> -                                        Object *owner,
> -                                        const char *name,
> -                                        uint64_t size,
> -                                        Error **errp)
> +bool memory_region_init_ram_guest_memfd_private(MemoryRegion *mr,
> +                                                Object *owner,
> +                                                const char *name,
> +                                                uint64_t size,
> +                                                Error **errp)
>   {
>       DeviceState *owner_dev;
>   



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 01/12] kvm: Decouple memory attribute check from kvm_guest_memfd_supported
  2025-12-15 20:51 ` [PATCH v3 01/12] kvm: Decouple memory attribute check from kvm_guest_memfd_supported Peter Xu
@ 2025-12-16 12:41   ` Xiaoyao Li
  2025-12-23 16:56     ` Peter Xu
  2025-12-16 13:53   ` Fabiano Rosas
  2026-06-02  1:10   ` Michael Roth
  2 siblings, 1 reply; 47+ messages in thread
From: Xiaoyao Li @ 2025-12-16 12:41 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	Fabiano Rosas, Alexey Kardashevskiy

Hi Peter,

On 12/16/2025 4:51 AM, Peter Xu wrote:
> diff --git a/system/physmem.c b/system/physmem.c
> index c9869e4049..3555d2f6f7 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -2211,6 +2211,14 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
>                          object_get_typename(OBJECT(current_machine->cgs)));
>               goto out_free;
>           }
> +
> +        if (!kvm_private_memory_attribute_supported()) {
> +            error_setg(errp, "cannot set up private guest memory for %s: "
> +                       " KVM does not support private memory attribute",

There is a redundant space at the beginning of this string, since the
previous line already ends with one.

Please help fix it. Thanks!

> +                       object_get_typename(OBJECT(current_machine->cgs)));
> +            goto out_free;
> +        }
> +
>           assert(new_block->guest_memfd < 0);
>   
>           ret = ram_block_coordinated_discard_require(true);
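
For reference, the doubled space comes from C's adjacent string literal
concatenation: the two pieces are joined byte for byte, so a trailing space
on the first literal plus a leading space on the second gives two spaces in
the final message.  A small standalone sketch (the SKETCH_* macros and the
helper are illustrative names, not QEMU code):

```c
#include <assert.h>
#include <string.h>

/* As posted: the first literal ends with a space, the second begins with one. */
#define SKETCH_MSG_AS_POSTED "cannot set up private guest memory for %s: " \
                             " KVM does not support private memory attribute"

/* With the leading space of the second literal dropped. */
#define SKETCH_MSG_FIXED "cannot set up private guest memory for %s: " \
                         "KVM does not support private memory attribute"

/* The two concatenated strings differ by exactly one byte: the extra space. */
static_assert(sizeof(SKETCH_MSG_AS_POSTED) == sizeof(SKETCH_MSG_FIXED) + 1,
              "exactly one redundant space");

/* Returns 1 if the string contains two consecutive spaces, 0 otherwise. */
static int sketch_has_double_space(const char *s)
{
    return strstr(s, "  ") != NULL;
}
```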



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 01/12] kvm: Decouple memory attribute check from kvm_guest_memfd_supported
  2025-12-15 20:51 ` [PATCH v3 01/12] kvm: Decouple memory attribute check from kvm_guest_memfd_supported Peter Xu
  2025-12-16 12:41   ` Xiaoyao Li
@ 2025-12-16 13:53   ` Fabiano Rosas
  2025-12-23 17:02     ` Peter Xu
  2026-06-02  1:10   ` Michael Roth
  2 siblings, 1 reply; 47+ messages in thread
From: Fabiano Rosas @ 2025-12-16 13:53 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Alexey Kardashevskiy, Li Xiaoyao

Peter Xu <peterx@redhat.com> writes:

> From: Xiaoyao Li <xiaoyao.li@intel.com>
>
> With the mmap support of guest memfd, KVM allows userspace to create
> guest memfd serving as normal non-private memory for X86 DEFAULT VM.
> However, KVM doesn't support private memory attribute for X86 DEFAULT
> VM.
>
> Make kvm_guest_memfd_supported not rely on KVM_MEMORY_ATTRIBUTE_PRIVATE
> and check KVM_MEMORY_ATTRIBUTE_PRIVATE separately when the machine
> requires guest_memfd to serve as private memory.
>
> This allows QEMU to create guest memfd with mmap to serve as the memory
> backend for X86 DEFAULT VM.
>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  include/system/kvm.h   | 1 +
>  accel/kvm/kvm-all.c    | 8 ++++++--
>  accel/stubs/kvm-stub.c | 5 +++++
>  system/physmem.c       | 8 ++++++++
>  4 files changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/include/system/kvm.h b/include/system/kvm.h
> index 8f9eecf044..b5811c90f1 100644
> --- a/include/system/kvm.h
> +++ b/include/system/kvm.h
> @@ -561,6 +561,7 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp);
>  
>  int kvm_set_memory_attributes_private(hwaddr start, uint64_t size);
>  int kvm_set_memory_attributes_shared(hwaddr start, uint64_t size);
> +bool kvm_private_memory_attribute_supported(void);
>  
>  int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private);
>  
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 28006d73c5..59836ebdff 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -1501,6 +1501,11 @@ int kvm_set_memory_attributes_shared(hwaddr start, uint64_t size)
>      return kvm_set_memory_attributes(start, size, 0);
>  }
>  
> +bool kvm_private_memory_attribute_supported(void)
> +{
> +    return !!(kvm_supported_memory_attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE);
> +}
> +
>  /* Called with KVMMemoryListener.slots_lock held */
>  static void kvm_set_phys_mem(KVMMemoryListener *kml,
>                               MemoryRegionSection *section, bool add)
> @@ -2781,8 +2786,7 @@ static int kvm_init(AccelState *as, MachineState *ms)
>      kvm_supported_memory_attributes = kvm_vm_check_extension(s, KVM_CAP_MEMORY_ATTRIBUTES);
>      kvm_guest_memfd_supported =
>          kvm_vm_check_extension(s, KVM_CAP_GUEST_MEMFD) &&
> -        kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2) &&
> -        (kvm_supported_memory_attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE);
> +        kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2);
>      kvm_pre_fault_memory_supported = kvm_vm_check_extension(s, KVM_CAP_PRE_FAULT_MEMORY);
>  
>      if (s->kernel_irqchip_split == ON_OFF_AUTO_AUTO) {
> diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
> index 68cd33ba97..73f04eb589 100644
> --- a/accel/stubs/kvm-stub.c
> +++ b/accel/stubs/kvm-stub.c
> @@ -125,3 +125,8 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
>  {
>      return -ENOSYS;
>  }
> +
> +bool kvm_private_memory_attribute_supported(void)
> +{
> +    return false;
> +}
> diff --git a/system/physmem.c b/system/physmem.c
> index c9869e4049..3555d2f6f7 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -2211,6 +2211,14 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
>                         object_get_typename(OBJECT(current_machine->cgs)));
>              goto out_free;
>          }
> +
> +        if (!kvm_private_memory_attribute_supported()) {
> +            error_setg(errp, "cannot set up private guest memory for %s: "
> +                       " KVM does not support private memory attribute",
> +                       object_get_typename(OBJECT(current_machine->cgs)));
> +            goto out_free;
> +        }

Hm, it took me a while to understand why this is under (new_block->flags
& RAM_GUEST_MEMFD) but checking for private memory support. If it's at
all feasible I would just squash all those patches doing
s/guest_memfd/guest_memfd_private/ to avoid having intermediate patches
where the terminology is not aligned.

Anyway, up to you. For this one:

Reviewed-by: Fabiano Rosas <farosas@suse.de>
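
To spell out the split this patch makes, here is a rough standalone sketch
of the two checks after decoupling.  The struct, the SKETCH_ATTR_PRIVATE
value and the function names are assumptions for illustration only; the
real code reads these from kvm_vm_check_extension() in kvm_init().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for KVM_MEMORY_ATTRIBUTE_PRIVATE; the value is illustrative only. */
#define SKETCH_ATTR_PRIVATE (UINT64_C(1) << 3)
static_assert(SKETCH_ATTR_PRIVATE != 0, "attribute bit must be non-zero");

/* Hypothetical snapshot of what gets probed from the kernel at init time. */
typedef struct {
    bool cap_guest_memfd;      /* stands in for KVM_CAP_GUEST_MEMFD */
    bool cap_user_memory2;     /* stands in for KVM_CAP_USER_MEMORY2 */
    uint64_t supported_attrs;  /* stands in for KVM_CAP_MEMORY_ATTRIBUTES */
} SketchKvmCaps;

/* guest-memfd availability no longer depends on the private attribute. */
static bool sketch_guest_memfd_supported(const SketchKvmCaps *c)
{
    return c->cap_guest_memfd && c->cap_user_memory2;
}

/* Checked separately, only when private guest memory is required. */
static bool sketch_private_attr_supported(const SketchKvmCaps *c)
{
    return !!(c->supported_attrs & SKETCH_ATTR_PRIVATE);
}
```

Under these assumptions, a default x86 VM reports guest-memfd as usable
while the private attribute check still fails, which is the case the
ram_block_add() hunk above rejects for confidential guests only.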



^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 02/12] kvm: Detect guest-memfd flags supported
  2025-12-15 20:51 ` [PATCH v3 02/12] kvm: Detect guest-memfd flags supported Peter Xu
@ 2025-12-16 13:54   ` Fabiano Rosas
  2026-06-02  1:29   ` Michael Roth
  1 sibling, 0 replies; 47+ messages in thread
From: Fabiano Rosas @ 2025-12-16 13:54 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Alexey Kardashevskiy, Li Xiaoyao

Peter Xu <peterx@redhat.com> writes:

> Detect supported guest-memfd flags by the current kernel, and reject
> creations of guest-memfd using invalid flags.  When the cap isn't
> available, then no flag is supported.
>
> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 03/12] kvm: Provide explicit error for kvm_create_guest_memfd()
  2025-12-15 20:51 ` [PATCH v3 03/12] kvm: Provide explicit error for kvm_create_guest_memfd() Peter Xu
  2025-12-16  4:03   ` Xiaoyao Li
@ 2025-12-16 13:55   ` Fabiano Rosas
  2026-06-02  1:31   ` Michael Roth
  2 siblings, 0 replies; 47+ messages in thread
From: Fabiano Rosas @ 2025-12-16 13:55 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Alexey Kardashevskiy, Li Xiaoyao

Peter Xu <peterx@redhat.com> writes:

> So that a descriptive error string is returned when KVM is not enabled or
> not compiled in.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v3 08/12] hostmem: Support fully shared guest memfd to back a VM
  2025-12-15 20:51 ` [PATCH v3 08/12] hostmem: Support fully shared guest memfd to back a VM Peter Xu
  2025-12-16  6:54   ` Xiaoyao Li
@ 2025-12-16 14:02   ` Fabiano Rosas
  2026-06-02 21:40   ` Michael Roth
  2 siblings, 0 replies; 47+ messages in thread
From: Fabiano Rosas @ 2025-12-16 14:02 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Alexey Kardashevskiy, Li Xiaoyao

Peter Xu <peterx@redhat.com> writes:

> Host backends currently support guest-memfd only by detecting whether the
> VM is confidential.  There is no way yet to choose, at the memory backend
> level, to use it fully shared.  So far, using guest-memfd always implies two
> layers of memory backends, where the guest-memfd provides only the private
> set of pages.
>
> This patch introduces a way so that QEMU can consume guest memfd as the
> only source of memory to back the object (aka, fully shared).
>
> To use the fully shared guest-memfd, one can add a memfd object with:
>
>   -object memory-backend-memfd,guest-memfd=on,share=on
>
> Note that share=on is required with fully shared guest_memfd.
>
> PS: there's a trivial touch-up on fd<0 check, because the stub to create
> guest-memfd may return negative but not -1.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>


* Re: [PATCH v3 11/12] tests/migration-test: Support guest-memfd init shared mem type
  2025-12-15 20:52 ` [PATCH v3 11/12] tests/migration-test: Support guest-memfd init shared mem type Peter Xu
@ 2025-12-16 14:18   ` Fabiano Rosas
  2025-12-23 17:09     ` Peter Xu
  0 siblings, 1 reply; 47+ messages in thread
From: Fabiano Rosas @ 2025-12-16 14:18 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Alexey Kardashevskiy, Li Xiaoyao

Peter Xu <peterx@redhat.com> writes:

> Support the guest-memfd type when the fd has init share enabled.  It means
> the gmemfd can be used similarly to memfd.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  tests/qtest/migration/framework.h |  4 +++
>  tests/qtest/migration/framework.c | 60 +++++++++++++++++++++++++++++++
>  2 files changed, 64 insertions(+)
>
> diff --git a/tests/qtest/migration/framework.h b/tests/qtest/migration/framework.h
> index ed85ed502d..b4c5edcad3 100644
> --- a/tests/qtest/migration/framework.h
> +++ b/tests/qtest/migration/framework.h
> @@ -34,6 +34,10 @@ typedef enum {
>       * but only anonymously allocated.
>       */
>      MEM_TYPE_MEMFD,
> +    /*
> +     * Use guest-memfd, shared mappings.
> +     */
> +    MEM_TYPE_GUEST_MEMFD,
>      MEM_TYPE_NUM,
>  } MemType;
>  
> diff --git a/tests/qtest/migration/framework.c b/tests/qtest/migration/framework.c
> index e35839c95f..9aa353bac6 100644
> --- a/tests/qtest/migration/framework.c
> +++ b/tests/qtest/migration/framework.c
> @@ -26,6 +26,10 @@
>  #include "qemu/range.h"
>  #include "qemu/sockets.h"
>  
> +#ifdef CONFIG_LINUX
> +#include <linux/kvm.h>
> +#include <sys/ioctl.h>
> +#endif
>  
>  #define QEMU_VM_FILE_MAGIC 0x5145564d
>  #define QEMU_ENV_SRC "QTEST_QEMU_BINARY_SRC"
> @@ -283,6 +287,9 @@ static char *migrate_mem_type_get_opts(MemType type, const char *memory_size)
>      case MEM_TYPE_MEMFD:
>          backend = g_strdup("-object memory-backend-memfd");
>          break;
> +    case MEM_TYPE_GUEST_MEMFD:
> +        backend = g_strdup("-object memory-backend-memfd,guest-memfd=on");
> +        break;
>      default:
>          g_assert_not_reached();
>          break;
> @@ -425,8 +432,55 @@ int migrate_args(char **from, char **to, const char *uri, MigrateStart *args)
>      return 0;
>  }
>  
> +static bool kvm_guest_memfd_init_shared_supported(const char **reason)

Should be in migration-util.c, like kvm_dirty_ring_supported() and
ufd_version_check().

> +{
> +    assert(*reason == NULL);
> +
> +#ifdef CONFIG_LINUX
> +    int ret, fd = -1;
> +
> +    if (!migration_get_env()->has_kvm) {
> +        *reason = "KVM is not enabled in the current QEMU build";
> +        goto out;
> +    }
> +
> +    fd = open("/dev/kvm", O_RDWR);
> +    if (fd < 0) {
> +        *reason = "KVM module isn't available or missing permission";
> +        goto out;
> +    }
> +
> +    ret = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_GUEST_MEMFD);
> +    if (!ret) {
> +        *reason = "KVM module doesn't support guest-memfd";
> +        goto out;
> +    }
> +
> +    ret = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_GUEST_MEMFD_FLAGS);
> +    if (ret < 0) {

Should this be <= ? I see there's a window between the addition of
KVM_CAP_GUEST_MEMFD and KVM_CAP_GUEST_MEMFD_FLAGS in the kernel.

> +        *reason = "KVM doesn't support KVM_CAP_GUEST_MEMFD_FLAGS";
> +        goto out;
> +    }
> +
> +    if (!(ret & GUEST_MEMFD_FLAG_INIT_SHARED)) {
> +        *reason = "KVM doesn't support GUEST_MEMFD_FLAG_INIT_SHARED";
> +        goto out;
> +    }
> +out:
> +    if (fd >= 0) {
> +        close(fd);
> +    }
> +#else
> +    *reason = "KVM not supported on non-Linux OS";
> +#endif
> +
> +    return !*reason;
> +}
> +
>  static bool migrate_mem_type_prepare(MemType type)
>  {
> +    const char *reason = NULL;
> +
>      switch (type) {
>      case MEM_TYPE_SHMEM:
>          if (!g_file_test("/dev/shm", G_FILE_TEST_IS_DIR)) {
> @@ -434,6 +488,12 @@ static bool migrate_mem_type_prepare(MemType type)
>              return false;
>          }
>          break;
> +    case MEM_TYPE_GUEST_MEMFD:
> +        if (!kvm_guest_memfd_init_shared_supported(&reason)) {
> +            g_test_skip(reason);
> +            return false;
> +        }
> +        break;
>      default:
>          break;
>      }


* Re: [PATCH v3 12/12] tests/migration-test: Add a precopy test for guest-memfd
  2025-12-15 20:52 ` [PATCH v3 12/12] tests/migration-test: Add a precopy test for guest-memfd Peter Xu
@ 2025-12-16 14:20   ` Fabiano Rosas
  0 siblings, 0 replies; 47+ messages in thread
From: Fabiano Rosas @ 2025-12-16 14:20 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Juraj Marcin, David Hildenbrand, Paolo Bonzini, Chenyi Qiang,
	peterx, Alexey Kardashevskiy, Li Xiaoyao

Peter Xu <peterx@redhat.com> writes:

> Add a plain tcp test for guest-memfd.  Note that the test will be
> automatically skipped whenever not supported (e.g. qemu compiled without
> KVM, or host kernel doesn't support kvm, or old kernels, etc.).
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  tests/qtest/migration/precopy-tests.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c
> index 57ca623de5..88d2627efd 100644
> --- a/tests/qtest/migration/precopy-tests.c
> +++ b/tests/qtest/migration/precopy-tests.c
> @@ -215,6 +215,16 @@ static void test_precopy_tcp_plain(void)
>      test_precopy_common(&args);
>  }
>  
> +static void test_precopy_tcp_plain_gmemfd(void)
> +{
> +    MigrateCommon args = {
> +        .listen_uri = "tcp:127.0.0.1:0",
> +        .start.mem_type = MEM_TYPE_GUEST_MEMFD,
> +    };
> +
> +    test_precopy_common(&args);
> +}
> +
>  static void test_precopy_tcp_switchover_ack(void)
>  {
>      MigrateCommon args = {
> @@ -1276,6 +1286,8 @@ void migration_test_add_precopy(MigrationTestEnv *env)
>          return;
>      }
>  
> +    migration_test_add("/migration/precopy/tcp/plain/guest-memfd",
> +                       test_precopy_tcp_plain_gmemfd);
>      migration_test_add("/migration/precopy/tcp/plain/switchover-ack",
>                         test_precopy_tcp_switchover_ack);

Reviewed-by: Fabiano Rosas <farosas@suse.de>


* Re: [PATCH v3 01/12] kvm: Decouple memory attribute check from kvm_guest_memfd_supported
  2025-12-16 12:41   ` Xiaoyao Li
@ 2025-12-23 16:56     ` Peter Xu
  0 siblings, 0 replies; 47+ messages in thread
From: Peter Xu @ 2025-12-23 16:56 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Fabiano Rosas, Alexey Kardashevskiy

On Tue, Dec 16, 2025 at 08:41:50PM +0800, Xiaoyao Li wrote:
> Hi Peter,
> 
> On 12/16/2025 4:51 AM, Peter Xu wrote:
> > diff --git a/system/physmem.c b/system/physmem.c
> > index c9869e4049..3555d2f6f7 100644
> > --- a/system/physmem.c
> > +++ b/system/physmem.c
> > @@ -2211,6 +2211,14 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
> >                          object_get_typename(OBJECT(current_machine->cgs)));
> >               goto out_free;
> >           }
> > +
> > +        if (!kvm_private_memory_attribute_supported()) {
> > +            error_setg(errp, "cannot set up private guest memory for %s: "
> > +                       " KVM does not support private memory attribute",
> 
> There is one redundant blank space at the beginning since the previous line
> leaves one at the end.
> 
> Please help fix it. Thanks!

Sure!
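
For reference, a minimal sketch of why the extra space shows up: adjacent
string literals are concatenated as-is, so a trailing space on one line plus
a leading space on the next yields two spaces in the final message.  The
array names below are made up for illustration and are not part of the
patch.

```c
/* As posted: the first literal ends with a space, the second starts with one. */
static const char fmt_posted[] =
    "cannot set up private guest memory for %s: "
    " KVM does not support private memory attribute";

/* With the leading space of the second literal dropped. */
static const char fmt_fixed[] =
    "cannot set up private guest memory for %s: "
    "KVM does not support private memory attribute";
```

Only the second literal changes; the format argument stays the same.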

-- 
Peter Xu



* Re: [PATCH v3 01/12] kvm: Decouple memory attribute check from kvm_guest_memfd_supported
  2025-12-16 13:53   ` Fabiano Rosas
@ 2025-12-23 17:02     ` Peter Xu
  0 siblings, 0 replies; 47+ messages in thread
From: Peter Xu @ 2025-12-23 17:02 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Alexey Kardashevskiy, Li Xiaoyao

On Tue, Dec 16, 2025 at 10:53:04AM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > From: Xiaoyao Li <xiaoyao.li@intel.com>
> >
> > With the mmap support of guest memfd, KVM allows userspace to create
> > guest memfd serving as normal non-private memory for X86 DEFAULT VM.
> > However, KVM doesn't support private memory attribute for X86 DEFAULT
> > VM.
> >
> > Make kvm_guest_memfd_supported not rely on KVM_MEMORY_ATTRIBUTE_PRIVATE
> > and check KVM_MEMORY_ATTRIBUTE_PRIVATE separately when the machine
> > requires guest_memfd to serve as private memory.
> >
> > This allows QEMU to create guest memfd with mmap to serve as the memory
> > backend for X86 DEFAULT VM.
> >
> > Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  include/system/kvm.h   | 1 +
> >  accel/kvm/kvm-all.c    | 8 ++++++--
> >  accel/stubs/kvm-stub.c | 5 +++++
> >  system/physmem.c       | 8 ++++++++
> >  4 files changed, 20 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/system/kvm.h b/include/system/kvm.h
> > index 8f9eecf044..b5811c90f1 100644
> > --- a/include/system/kvm.h
> > +++ b/include/system/kvm.h
> > @@ -561,6 +561,7 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp);
> >  
> >  int kvm_set_memory_attributes_private(hwaddr start, uint64_t size);
> >  int kvm_set_memory_attributes_shared(hwaddr start, uint64_t size);
> > +bool kvm_private_memory_attribute_supported(void);
> >  
> >  int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private);
> >  
> > diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> > index 28006d73c5..59836ebdff 100644
> > --- a/accel/kvm/kvm-all.c
> > +++ b/accel/kvm/kvm-all.c
> > @@ -1501,6 +1501,11 @@ int kvm_set_memory_attributes_shared(hwaddr start, uint64_t size)
> >      return kvm_set_memory_attributes(start, size, 0);
> >  }
> >  
> > +bool kvm_private_memory_attribute_supported(void)
> > +{
> > +    return !!(kvm_supported_memory_attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE);
> > +}
> > +
> >  /* Called with KVMMemoryListener.slots_lock held */
> >  static void kvm_set_phys_mem(KVMMemoryListener *kml,
> >                               MemoryRegionSection *section, bool add)
> > @@ -2781,8 +2786,7 @@ static int kvm_init(AccelState *as, MachineState *ms)
> >      kvm_supported_memory_attributes = kvm_vm_check_extension(s, KVM_CAP_MEMORY_ATTRIBUTES);
> >      kvm_guest_memfd_supported =
> >          kvm_vm_check_extension(s, KVM_CAP_GUEST_MEMFD) &&
> > -        kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2) &&
> > -        (kvm_supported_memory_attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE);
> > +        kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2);
> >      kvm_pre_fault_memory_supported = kvm_vm_check_extension(s, KVM_CAP_PRE_FAULT_MEMORY);
> >  
> >      if (s->kernel_irqchip_split == ON_OFF_AUTO_AUTO) {
> > diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
> > index 68cd33ba97..73f04eb589 100644
> > --- a/accel/stubs/kvm-stub.c
> > +++ b/accel/stubs/kvm-stub.c
> > @@ -125,3 +125,8 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
> >  {
> >      return -ENOSYS;
> >  }
> > +
> > +bool kvm_private_memory_attribute_supported(void)
> > +{
> > +    return false;
> > +}
> > diff --git a/system/physmem.c b/system/physmem.c
> > index c9869e4049..3555d2f6f7 100644
> > --- a/system/physmem.c
> > +++ b/system/physmem.c
> > @@ -2211,6 +2211,14 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
> >                         object_get_typename(OBJECT(current_machine->cgs)));
> >              goto out_free;
> >          }
> > +
> > +        if (!kvm_private_memory_attribute_supported()) {
> > +            error_setg(errp, "cannot set up private guest memory for %s: "
> > +                       " KVM does not support private memory attribute",
> > +                       object_get_typename(OBJECT(current_machine->cgs)));
> > +            goto out_free;
> > +        }
> 
> Hm, it took me a while to understand why this is under (new_block->flags
> & RAM_GUEST_MEMFD) but checking for private memory support. If it's at
> all feasible I would just squash all those patches doing
> s/guest_memfd/guest_memfd_private/ to avoid having intermediate patches
> where the terminology is not aligned.

Yeah, the hope is it'll stop being confusing once this series is applied.

Keeping them separate is logically more sensible: it not only makes review
easier, but also follows the rule that each commit should be self-contained
and minimal.

> 
> Anyway, up to you. For this one:
> 
> Reviewed-by: Fabiano Rosas <farosas@suse.de>

Thanks!

-- 
Peter Xu



* Re: [PATCH v3 05/12] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE
  2025-12-16  5:49   ` Xiaoyao Li
@ 2025-12-23 17:04     ` Peter Xu
  0 siblings, 0 replies; 47+ messages in thread
From: Peter Xu @ 2025-12-23 17:04 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Fabiano Rosas, Alexey Kardashevskiy

On Tue, Dec 16, 2025 at 01:49:27PM +0800, Xiaoyao Li wrote:
> On 12/16/2025 4:51 AM, Peter Xu wrote:
> > This name is too generic, and can conflict with in-place guest-memfd
> > support.  Add a _PRIVATE suffix to show what it really means: it is always
> > silently using an internal guest-memfd to back a shared host backend,
> > rather than used in-place.
> > 
> > This paves way for in-place guest-memfd, which means we can have a ramblock
> > that allocates pages completely from guest-memfd (private or shared).
> 
> Well, the term "in-place" needs to be changed to "init-shared".

Right.. I'll fix those and keep the r-b.

-- 
Peter Xu



* Re: [PATCH v3 11/12] tests/migration-test: Support guest-memfd init shared mem type
  2025-12-16 14:18   ` Fabiano Rosas
@ 2025-12-23 17:09     ` Peter Xu
  0 siblings, 0 replies; 47+ messages in thread
From: Peter Xu @ 2025-12-23 17:09 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Alexey Kardashevskiy, Li Xiaoyao

On Tue, Dec 16, 2025 at 11:18:48AM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > Support the guest-memfd type when the fd has init share enabled.  It means
> > the gmemfd can be used similarly to memfd.
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  tests/qtest/migration/framework.h |  4 +++
> >  tests/qtest/migration/framework.c | 60 +++++++++++++++++++++++++++++++
> >  2 files changed, 64 insertions(+)
> >
> > diff --git a/tests/qtest/migration/framework.h b/tests/qtest/migration/framework.h
> > index ed85ed502d..b4c5edcad3 100644
> > --- a/tests/qtest/migration/framework.h
> > +++ b/tests/qtest/migration/framework.h
> > @@ -34,6 +34,10 @@ typedef enum {
> >       * but only anonymously allocated.
> >       */
> >      MEM_TYPE_MEMFD,
> > +    /*
> > +     * Use guest-memfd, shared mappings.
> > +     */
> > +    MEM_TYPE_GUEST_MEMFD,
> >      MEM_TYPE_NUM,
> >  } MemType;
> >  
> > diff --git a/tests/qtest/migration/framework.c b/tests/qtest/migration/framework.c
> > index e35839c95f..9aa353bac6 100644
> > --- a/tests/qtest/migration/framework.c
> > +++ b/tests/qtest/migration/framework.c
> > @@ -26,6 +26,10 @@
> >  #include "qemu/range.h"
> >  #include "qemu/sockets.h"
> >  
> > +#ifdef CONFIG_LINUX
> > +#include <linux/kvm.h>
> > +#include <sys/ioctl.h>
> > +#endif
> >  
> >  #define QEMU_VM_FILE_MAGIC 0x5145564d
> >  #define QEMU_ENV_SRC "QTEST_QEMU_BINARY_SRC"
> > @@ -283,6 +287,9 @@ static char *migrate_mem_type_get_opts(MemType type, const char *memory_size)
> >      case MEM_TYPE_MEMFD:
> >          backend = g_strdup("-object memory-backend-memfd");
> >          break;
> > +    case MEM_TYPE_GUEST_MEMFD:
> > +        backend = g_strdup("-object memory-backend-memfd,guest-memfd=on");
> > +        break;
> >      default:
> >          g_assert_not_reached();
> >          break;
> > @@ -425,8 +432,55 @@ int migrate_args(char **from, char **to, const char *uri, MigrateStart *args)
> >      return 0;
> >  }
> >  
> > +static bool kvm_guest_memfd_init_shared_supported(const char **reason)
> 
> Should be in migration-util.c, like kvm_dirty_ring_supported() and
> ufd_version_check().

Ah.. sure.

> 
> > +{
> > +    assert(*reason == NULL);
> > +
> > +#ifdef CONFIG_LINUX
> > +    int ret, fd = -1;
> > +
> > +    if (!migration_get_env()->has_kvm) {
> > +        *reason = "KVM is not enabled in the current QEMU build";
> > +        goto out;
> > +    }
> > +
> > +    fd = open("/dev/kvm", O_RDWR);
> > +    if (fd < 0) {
> > +        *reason = "KVM module isn't available or missing permission";
> > +        goto out;
> > +    }
> > +
> > +    ret = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_GUEST_MEMFD);
> > +    if (!ret) {
> > +        *reason = "KVM module doesn't support guest-memfd";
> > +        goto out;
> > +    }
> > +
> > +    ret = ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_GUEST_MEMFD_FLAGS);
> > +    if (ret < 0) {
> 
> Should this be <= ? I see there's a window between the addition of
> KVM_CAP_GUEST_MEMFD and KVM_CAP_GUEST_MEMFD_FLAGS in the kernel.

That is checked right below [1], so ret==0 will fail with a better error
message:

> 
> > +        *reason = "KVM doesn't support KVM_CAP_GUEST_MEMFD_FLAGS";
> > +        goto out;
> > +    }
> > +
> > +    if (!(ret & GUEST_MEMFD_FLAG_INIT_SHARED)) {

[1]

> > +        *reason = "KVM doesn't support GUEST_MEMFD_FLAG_INIT_SHARED";
> > +        goto out;
> > +    }
> > +out:
> > +    if (fd >= 0) {
> > +        close(fd);
> > +    }
> > +#else
> > +    *reason = "KVM not supported on non-Linux OS";
> > +#endif
> > +
> > +    return !*reason;
> > +}
> > +
> >  static bool migrate_mem_type_prepare(MemType type)
> >  {
> > +    const char *reason = NULL;
> > +
> >      switch (type) {
> >      case MEM_TYPE_SHMEM:
> >          if (!g_file_test("/dev/shm", G_FILE_TEST_IS_DIR)) {
> > @@ -434,6 +488,12 @@ static bool migrate_mem_type_prepare(MemType type)
> >              return false;
> >          }
> >          break;
> > +    case MEM_TYPE_GUEST_MEMFD:
> > +        if (!kvm_guest_memfd_init_shared_supported(&reason)) {
> > +            g_test_skip(reason);
> > +            return false;
> > +        }
> > +        break;
> >      default:
> >          break;
> >      }
> 
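
To spell out the two cases, here is a minimal sketch of how the return value
of the KVM_CAP_GUEST_MEMFD_FLAGS probe maps to a skip reason.  The helper
name and SKETCH_FLAG_INIT_SHARED are hypothetical stand-ins (the bit value
in particular is an assumption); only the two checks mirror the patch.

```c
#include <stddef.h>

/* Stand-in for GUEST_MEMFD_FLAG_INIT_SHARED; the bit value is an assumption. */
#define SKETCH_FLAG_INIT_SHARED (1 << 1)

/*
 * Hypothetical helper: take the result of
 * ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_GUEST_MEMFD_FLAGS) and return a
 * skip reason, or NULL when init-shared guest-memfd looks usable.
 */
static const char *sketch_gmemfd_skip_reason(int ret)
{
    if (ret < 0) {
        /* The ioctl itself failed. */
        return "KVM doesn't support KVM_CAP_GUEST_MEMFD_FLAGS";
    }
    if (!(ret & SKETCH_FLAG_INIT_SHARED)) {
        /* ret == 0 also lands here, since no flag bit is set. */
        return "KVM doesn't support GUEST_MEMFD_FLAG_INIT_SHARED";
    }
    return NULL;
}
```

So a kernel that has KVM_CAP_GUEST_MEMFD but returns 0 for the flags
capability is still skipped, just by the second check rather than the first.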

-- 
Peter Xu



* Re: [PATCH v3 01/12] kvm: Decouple memory attribute check from kvm_guest_memfd_supported
  2025-12-15 20:51 ` [PATCH v3 01/12] kvm: Decouple memory attribute check from kvm_guest_memfd_supported Peter Xu
  2025-12-16 12:41   ` Xiaoyao Li
  2025-12-16 13:53   ` Fabiano Rosas
@ 2026-06-02  1:10   ` Michael Roth
  2 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2026-06-02  1:10 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

On Mon, Dec 15, 2025 at 03:51:52PM -0500, Peter Xu wrote:
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> 
> With the mmap support of guest memfd, KVM allows userspace to create
> guest memfd serving as normal non-private memory for X86 DEFAULT VM.
> However, KVM doesn't support private memory attribute for X86 DEFAULT
> VM.
> 
> Make kvm_guest_memfd_supported not rely on KVM_MEMORY_ATTRIBUTE_PRIVATE
> and check KVM_MEMORY_ATTRIBUTE_PRIVATE separately when the machine
> requires guest_memfd to serve as private memory.
> 
> This allows QEMU to create guest memfd with mmap to serve as the memory
> backend for X86 DEFAULT VM.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Michael Roth <michael.roth@amd.com>

> ---
>  include/system/kvm.h   | 1 +
>  accel/kvm/kvm-all.c    | 8 ++++++--
>  accel/stubs/kvm-stub.c | 5 +++++
>  system/physmem.c       | 8 ++++++++
>  4 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/include/system/kvm.h b/include/system/kvm.h
> index 8f9eecf044..b5811c90f1 100644
> --- a/include/system/kvm.h
> +++ b/include/system/kvm.h
> @@ -561,6 +561,7 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp);
>  
>  int kvm_set_memory_attributes_private(hwaddr start, uint64_t size);
>  int kvm_set_memory_attributes_shared(hwaddr start, uint64_t size);
> +bool kvm_private_memory_attribute_supported(void);
>  
>  int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private);
>  
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 28006d73c5..59836ebdff 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -1501,6 +1501,11 @@ int kvm_set_memory_attributes_shared(hwaddr start, uint64_t size)
>      return kvm_set_memory_attributes(start, size, 0);
>  }
>  
> +bool kvm_private_memory_attribute_supported(void)
> +{
> +    return !!(kvm_supported_memory_attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE);
> +}
> +
>  /* Called with KVMMemoryListener.slots_lock held */
>  static void kvm_set_phys_mem(KVMMemoryListener *kml,
>                               MemoryRegionSection *section, bool add)
> @@ -2781,8 +2786,7 @@ static int kvm_init(AccelState *as, MachineState *ms)
>      kvm_supported_memory_attributes = kvm_vm_check_extension(s, KVM_CAP_MEMORY_ATTRIBUTES);
>      kvm_guest_memfd_supported =
>          kvm_vm_check_extension(s, KVM_CAP_GUEST_MEMFD) &&
> -        kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2) &&
> -        (kvm_supported_memory_attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE);
> +        kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2);
>      kvm_pre_fault_memory_supported = kvm_vm_check_extension(s, KVM_CAP_PRE_FAULT_MEMORY);
>  
>      if (s->kernel_irqchip_split == ON_OFF_AUTO_AUTO) {
> diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
> index 68cd33ba97..73f04eb589 100644
> --- a/accel/stubs/kvm-stub.c
> +++ b/accel/stubs/kvm-stub.c
> @@ -125,3 +125,8 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
>  {
>      return -ENOSYS;
>  }
> +
> +bool kvm_private_memory_attribute_supported(void)
> +{
> +    return false;
> +}
> diff --git a/system/physmem.c b/system/physmem.c
> index c9869e4049..3555d2f6f7 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -2211,6 +2211,14 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
>                         object_get_typename(OBJECT(current_machine->cgs)));
>              goto out_free;
>          }
> +
> +        if (!kvm_private_memory_attribute_supported()) {
> +            error_setg(errp, "cannot set up private guest memory for %s: "
> +                       " KVM does not support private memory attribute",
> +                       object_get_typename(OBJECT(current_machine->cgs)));
> +            goto out_free;
> +        }
> +
>          assert(new_block->guest_memfd < 0);
>  
>          ret = ram_block_coordinated_discard_require(true);
> -- 
> 2.50.1
> 
> 


* Re: [PATCH v3 02/12] kvm: Detect guest-memfd flags supported
  2025-12-15 20:51 ` [PATCH v3 02/12] kvm: Detect guest-memfd flags supported Peter Xu
  2025-12-16 13:54   ` Fabiano Rosas
@ 2026-06-02  1:29   ` Michael Roth
  1 sibling, 0 replies; 47+ messages in thread
From: Michael Roth @ 2026-06-02  1:29 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

On Mon, Dec 15, 2025 at 03:51:53PM -0500, Peter Xu wrote:
> Detect supported guest-memfd flags by the current kernel, and reject
> creations of guest-memfd using invalid flags.  When the cap isn't
> available, then no flag is supported.
> 
> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  accel/kvm/kvm-all.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 59836ebdff..68d57c1af0 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -108,6 +108,7 @@ static int kvm_sstep_flags;
>  static bool kvm_immediate_exit;
>  static uint64_t kvm_supported_memory_attributes;
>  static bool kvm_guest_memfd_supported;
> +static uint64_t kvm_guest_memfd_flags_supported;
>  static hwaddr kvm_max_slot_size = ~0;
>  
>  static const KVMCapabilityInfo kvm_required_capabilites[] = {
> @@ -2787,6 +2788,10 @@ static int kvm_init(AccelState *as, MachineState *ms)
>      kvm_guest_memfd_supported =
>          kvm_vm_check_extension(s, KVM_CAP_GUEST_MEMFD) &&
>          kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2);
> +
> +    ret = kvm_vm_check_extension(s, KVM_CAP_GUEST_MEMFD_FLAGS);
> +    kvm_guest_memfd_flags_supported = ret > 0 ? ret : 0;

kvm_vm_check_extension() already zeroes out negative return values, so this
should be able to use the same format as the line below.
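
A rough sketch of what I mean, with hypothetical stand-ins rather than the
real QEMU functions: sketch_check_extension() models the clamping described
above, so the caller can assign the result directly, and the later flag
check simply masks against it.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for kvm_vm_check_extension(): models the behaviour
 * described above, where a negative ioctl result is reported as 0.
 */
static int sketch_check_extension(int raw_ioctl_ret)
{
    return raw_ioctl_ret < 0 ? 0 : raw_ioctl_ret;
}

/* Direct assignment, without a second "ret > 0 ? ret : 0" clamp. */
static uint64_t sketch_flags_supported(int raw_ioctl_ret)
{
    return (uint64_t)sketch_check_extension(raw_ioctl_ret);
}

/* Non-zero result means the caller asked for a flag that is not supported. */
static uint64_t sketch_unsupported_flags(uint64_t flags, uint64_t supported)
{
    return flags & ~supported;
}
```

When the capability is missing, the supported mask is 0 and every non-zero
flag gets rejected, which matches the commit message.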

But either way:

Reviewed-by: Michael Roth <michael.roth@amd.com>

> +
>      kvm_pre_fault_memory_supported = kvm_vm_check_extension(s, KVM_CAP_PRE_FAULT_MEMORY);
>  
>      if (s->kernel_irqchip_split == ON_OFF_AUTO_AUTO) {
> @@ -4492,6 +4497,13 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
>          return -1;
>      }
>  
> +    if (flags & ~kvm_guest_memfd_flags_supported) {
> +        error_setg(errp, "Current KVM instance does not support "
> +                   "guest-memfd flag: 0x%"PRIx64,
> +                   flags & ~kvm_guest_memfd_flags_supported);
> +        return -1;
> +    }
> +
>      fd = kvm_vm_ioctl(kvm_state, KVM_CREATE_GUEST_MEMFD, &guest_memfd);
>      if (fd < 0) {
>          error_setg_errno(errp, errno, "Error creating KVM guest_memfd");
> -- 
> 2.50.1
> 
> 


* Re: [PATCH v3 03/12] kvm: Provide explicit error for kvm_create_guest_memfd()
  2025-12-15 20:51 ` [PATCH v3 03/12] kvm: Provide explicit error for kvm_create_guest_memfd() Peter Xu
  2025-12-16  4:03   ` Xiaoyao Li
  2025-12-16 13:55   ` Fabiano Rosas
@ 2026-06-02  1:31   ` Michael Roth
  2 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2026-06-02  1:31 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

On Mon, Dec 15, 2025 at 03:51:54PM -0500, Peter Xu wrote:
> So that there will be a descriptive error string returned when KVM is not
> enabled or not compiled in.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Michael Roth <michael.roth@amd.com>

> ---
>  accel/kvm/kvm-all.c    | 5 +++++
>  accel/stubs/kvm-stub.c | 1 +
>  2 files changed, 6 insertions(+)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 68d57c1af0..c32fbcf9cc 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -4492,6 +4492,11 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
>          .flags = flags,
>      };
>  
> +    if (!kvm_enabled()) {
> +        error_setg(errp, "guest-memfd requires KVM accelerator");
> +        return -1;
> +    }
> +
>      if (!kvm_guest_memfd_supported) {
>          error_setg(errp, "KVM does not support guest_memfd");
>          return -1;
> diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
> index 73f04eb589..01b1d6285e 100644
> --- a/accel/stubs/kvm-stub.c
> +++ b/accel/stubs/kvm-stub.c
> @@ -123,6 +123,7 @@ bool kvm_hwpoisoned_mem(void)
>  
>  int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
>  {
> +    error_setg(errp, "KVM is not enabled");
>      return -ENOSYS;
>  }
>  
> -- 
> 2.50.1
> 
> 


* Re: [PATCH v3 04/12] ramblock: Rename guest_memfd to guest_memfd_private
  2025-12-15 20:51 ` [PATCH v3 04/12] ramblock: Rename guest_memfd to guest_memfd_private Peter Xu
@ 2026-06-02  1:37   ` Michael Roth
  0 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2026-06-02  1:37 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

On Mon, Dec 15, 2025 at 03:51:55PM -0500, Peter Xu wrote:
> Rename the field to reflect the fact that the guest_memfd in this case only
> backs private portion of the ramblock rather than all of it.
> 
> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Michael Roth <michael.roth@amd.com>

> ---
>  include/system/memory.h   |  7 ++++---
>  include/system/ramblock.h |  7 ++++++-
>  accel/kvm/kvm-all.c       |  2 +-
>  system/memory.c           |  2 +-
>  system/physmem.c          | 21 +++++++++++----------
>  5 files changed, 23 insertions(+), 16 deletions(-)
> 
> diff --git a/include/system/memory.h b/include/system/memory.h
> index 3bd5ffa5e0..2384575065 100644
> --- a/include/system/memory.h
> +++ b/include/system/memory.h
> @@ -1823,10 +1823,11 @@ static inline bool memory_region_is_romd(MemoryRegion *mr)
>  bool memory_region_is_protected(MemoryRegion *mr);
>  
>  /**
> - * memory_region_has_guest_memfd: check whether a memory region has guest_memfd
> - *     associated
> + * memory_region_has_guest_memfd: check whether a memory region has
> + *     guest_memfd_private associated
>   *
> - * Returns %true if a memory region's ram_block has valid guest_memfd assigned.
> + * Returns %true if a memory region's ram_block has guest_memfd_private
> + * assigned.
>   *
>   * @mr: the memory region being queried
>   */
> diff --git a/include/system/ramblock.h b/include/system/ramblock.h
> index 76694fe1b5..9ecf7f970c 100644
> --- a/include/system/ramblock.h
> +++ b/include/system/ramblock.h
> @@ -40,7 +40,12 @@ struct RAMBlock {
>      Error *cpr_blocker;
>      int fd;
>      uint64_t fd_offset;
> -    int guest_memfd;
> +    /*
> +     * When RAM_GUEST_MEMFD_PRIVATE flag is set, this ramblock can have
> +     * private pages backed by guest_memfd_private specified, while shared
> +     * pages are backed by the ramblock on its own.
> +     */
> +    int guest_memfd_private;
>      RamBlockAttributes *attributes;
>      size_t page_size;
>      /* dirty bitmap used during migration */
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index c32fbcf9cc..1126b6f477 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -1603,7 +1603,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
>          mem->ram_start_offset = ram_start_offset;
>          mem->ram = ram;
>          mem->flags = kvm_mem_flags(mr);
> -        mem->guest_memfd = mr->ram_block->guest_memfd;
> +        mem->guest_memfd = mr->ram_block->guest_memfd_private;
>          mem->guest_memfd_offset = mem->guest_memfd >= 0 ?
>                                    (uint8_t*)ram - mr->ram_block->host : 0;
>  
> diff --git a/system/memory.c b/system/memory.c
> index 8b84661ae3..355b1fa26b 100644
> --- a/system/memory.c
> +++ b/system/memory.c
> @@ -1899,7 +1899,7 @@ bool memory_region_is_protected(MemoryRegion *mr)
>  
>  bool memory_region_has_guest_memfd(MemoryRegion *mr)
>  {
> -    return mr->ram_block && mr->ram_block->guest_memfd >= 0;
> +    return mr->ram_block && mr->ram_block->guest_memfd_private >= 0;
>  }
>  
>  uint8_t memory_region_get_dirty_log_mask(MemoryRegion *mr)
> diff --git a/system/physmem.c b/system/physmem.c
> index 3555d2f6f7..c3c7a81310 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -2219,7 +2219,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
>              goto out_free;
>          }
>  
> -        assert(new_block->guest_memfd < 0);
> +        assert(new_block->guest_memfd_private < 0);
>  
>          ret = ram_block_coordinated_discard_require(true);
>          if (ret < 0) {
> @@ -2229,9 +2229,9 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
>              goto out_free;
>          }
>  
> -        new_block->guest_memfd = kvm_create_guest_memfd(new_block->max_length,
> -                                                        0, errp);
> -        if (new_block->guest_memfd < 0) {
> +        new_block->guest_memfd_private =
> +            kvm_create_guest_memfd(new_block->max_length, 0, errp);
> +        if (new_block->guest_memfd_private < 0) {
>              qemu_mutex_unlock_ramlist();
>              goto out_free;
>          }
> @@ -2248,7 +2248,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
>          new_block->attributes = ram_block_attributes_create(new_block);
>          if (!new_block->attributes) {
>              error_setg(errp, "Failed to create ram block attribute");
> -            close(new_block->guest_memfd);
> +            close(new_block->guest_memfd_private);
>              ram_block_coordinated_discard_require(false);
>              qemu_mutex_unlock_ramlist();
>              goto out_free;
> @@ -2385,7 +2385,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, ram_addr_t max_size,
>      new_block->max_length = max_size;
>      new_block->resized = resized;
>      new_block->flags = ram_flags;
> -    new_block->guest_memfd = -1;
> +    new_block->guest_memfd_private = -1;
>      new_block->host = file_ram_alloc(new_block, max_size, fd,
>                                       file_size < offset + max_size,
>                                       offset, errp);
> @@ -2558,7 +2558,7 @@ RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
>      new_block->used_length = size;
>      new_block->max_length = max_size;
>      new_block->fd = -1;
> -    new_block->guest_memfd = -1;
> +    new_block->guest_memfd_private = -1;
>      new_block->page_size = qemu_real_host_page_size();
>      new_block->host = host;
>      new_block->flags = ram_flags;
> @@ -2609,9 +2609,9 @@ static void reclaim_ramblock(RAMBlock *block)
>          qemu_anon_ram_free(block->host, block->max_length);
>      }
>  
> -    if (block->guest_memfd >= 0) {
> +    if (block->guest_memfd_private >= 0) {
>          ram_block_attributes_destroy(block->attributes);
> -        close(block->guest_memfd);
> +        close(block->guest_memfd_private);
>          ram_block_coordinated_discard_require(false);
>      }
>  
> @@ -4222,7 +4222,8 @@ int ram_block_discard_guest_memfd_range(RAMBlock *rb, uint64_t offset,
>  
>  #ifdef CONFIG_FALLOCATE_PUNCH_HOLE
>      /* ignore fd_offset with guest_memfd */
> -    ret = fallocate(rb->guest_memfd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
> +    ret = fallocate(rb->guest_memfd_private,
> +                    FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
>                      offset, length);
>  
>      if (ret) {
> -- 
> 2.50.1
> 
> 



* Re: [PATCH v3 05/12] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE
  2025-12-15 20:51 ` [PATCH v3 05/12] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE Peter Xu
  2025-12-16  5:49   ` Xiaoyao Li
@ 2026-06-02  1:39   ` Michael Roth
  1 sibling, 0 replies; 47+ messages in thread
From: Michael Roth @ 2026-06-02  1:39 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

On Mon, Dec 15, 2025 at 03:51:56PM -0500, Peter Xu wrote:
> This name is too generic and can conflict with in-place guest-memfd
> support.  Add a _PRIVATE suffix to show what it really means: it always
> silently uses an internal guest-memfd to back a shared host backend,
> rather than being used in-place.
> 
> This paves the way for in-place guest-memfd, where a ramblock can allocate
> its pages entirely from guest-memfd (private or shared).
> 
> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Michael Roth <michael.roth@amd.com>

> ---
>  include/system/memory.h   | 8 ++++----
>  include/system/ram_addr.h | 2 +-
>  backends/hostmem-file.c   | 2 +-
>  backends/hostmem-memfd.c  | 2 +-
>  backends/hostmem-ram.c    | 2 +-
>  backends/hostmem-shm.c    | 2 +-
>  system/memory.c           | 3 ++-
>  system/physmem.c          | 8 ++++----
>  8 files changed, 15 insertions(+), 14 deletions(-)
> 
> diff --git a/include/system/memory.h b/include/system/memory.h
> index 2384575065..1f49f9a0ff 100644
> --- a/include/system/memory.h
> +++ b/include/system/memory.h
> @@ -263,7 +263,7 @@ typedef struct IOMMUTLBEvent {
>  #define RAM_READONLY_FD (1 << 11)
>  
>  /* RAM can be private that has kvm guest memfd backend */
> -#define RAM_GUEST_MEMFD   (1 << 12)
> +#define RAM_GUEST_MEMFD_PRIVATE   (1 << 12)
>  
>  /*
>   * In RAMBlock creation functions, if MAP_SHARED is 0 in the flags parameter,
> @@ -1401,7 +1401,7 @@ bool memory_region_init_ram_nomigrate(MemoryRegion *mr,
>   *        must be unique within any device
>   * @size: size of the region.
>   * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_NORESERVE,
> - *             RAM_GUEST_MEMFD.
> + *             RAM_GUEST_MEMFD_PRIVATE.
>   * @errp: pointer to Error*, to store an error if it happens.
>   *
>   * Note that this function does not do anything to cause the data in the
> @@ -1463,7 +1463,7 @@ bool memory_region_init_resizeable_ram(MemoryRegion *mr,
>   *         (getpagesize()) will be used.
>   * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
>   *             RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
> - *             RAM_READONLY_FD, RAM_GUEST_MEMFD
> + *             RAM_READONLY_FD, RAM_GUEST_MEMFD_PRIVATE
>   * @path: the path in which to allocate the RAM.
>   * @offset: offset within the file referenced by path
>   * @errp: pointer to Error*, to store an error if it happens.
> @@ -1493,7 +1493,7 @@ bool memory_region_init_ram_from_file(MemoryRegion *mr,
>   * @size: size of the region.
>   * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
>   *             RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
> - *             RAM_READONLY_FD, RAM_GUEST_MEMFD
> + *             RAM_READONLY_FD, RAM_GUEST_MEMFD_PRIVATE
>   * @fd: the fd to mmap.
>   * @offset: offset within the file referenced by fd
>   * @errp: pointer to Error*, to store an error if it happens.
> diff --git a/include/system/ram_addr.h b/include/system/ram_addr.h
> index 683485980c..930d3824d7 100644
> --- a/include/system/ram_addr.h
> +++ b/include/system/ram_addr.h
> @@ -92,7 +92,7 @@ static inline unsigned long int ramblock_recv_bitmap_offset(void *host_addr,
>   *  @resized: callback after calls to qemu_ram_resize
>   *  @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
>   *              RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
> - *              RAM_READONLY_FD, RAM_GUEST_MEMFD
> + *              RAM_READONLY_FD, RAM_GUEST_MEMFD_PRIVATE
>   *  @mem_path or @fd: specify the backing file or device
>   *  @offset: Offset into target file
>   *  @grow: extend file if necessary (but an empty file is always extended).
> diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
> index 8e3219c061..1f20cd8fd6 100644
> --- a/backends/hostmem-file.c
> +++ b/backends/hostmem-file.c
> @@ -86,7 +86,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>      ram_flags |= fb->readonly ? RAM_READONLY_FD : 0;
>      ram_flags |= fb->rom == ON_OFF_AUTO_ON ? RAM_READONLY : 0;
>      ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> -    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
> +    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
>      ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
>      ram_flags |= RAM_NAMED_FILE;
>      return memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
> diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c
> index 923239f9cf..3f3e485709 100644
> --- a/backends/hostmem-memfd.c
> +++ b/backends/hostmem-memfd.c
> @@ -60,7 +60,7 @@ have_fd:
>      backend->aligned = true;
>      ram_flags = backend->share ? RAM_SHARED : RAM_PRIVATE;
>      ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> -    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
> +    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
>      return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend), name,
>                                            backend->size, ram_flags, fd, 0, errp);
>  }
> diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
> index 062b1abb11..96ad29112d 100644
> --- a/backends/hostmem-ram.c
> +++ b/backends/hostmem-ram.c
> @@ -30,7 +30,7 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>      name = host_memory_backend_get_name(backend);
>      ram_flags = backend->share ? RAM_SHARED : RAM_PRIVATE;
>      ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> -    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
> +    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
>      return memory_region_init_ram_flags_nomigrate(&backend->mr, OBJECT(backend),
>                                                    name, backend->size,
>                                                    ram_flags, errp);
> diff --git a/backends/hostmem-shm.c b/backends/hostmem-shm.c
> index 806e2670e0..e86fb2e0aa 100644
> --- a/backends/hostmem-shm.c
> +++ b/backends/hostmem-shm.c
> @@ -54,7 +54,7 @@ have_fd:
>      /* Let's do the same as memory-backend-ram,share=on would do. */
>      ram_flags = RAM_SHARED;
>      ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> -    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
> +    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
>  
>      return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend),
>                                                backend_name, backend->size,
> diff --git a/system/memory.c b/system/memory.c
> index 355b1fa26b..e8c6d484e6 100644
> --- a/system/memory.c
> +++ b/system/memory.c
> @@ -3755,7 +3755,8 @@ bool memory_region_init_ram_guest_memfd(MemoryRegion *mr,
>      DeviceState *owner_dev;
>  
>      if (!memory_region_init_ram_flags_nomigrate(mr, owner, name, size,
> -                                                RAM_GUEST_MEMFD, errp)) {
> +                                                RAM_GUEST_MEMFD_PRIVATE,
> +                                                errp)) {
>          return false;
>      }
>      /* This will assert if owner is neither NULL nor a DeviceState.
> diff --git a/system/physmem.c b/system/physmem.c
> index c3c7a81310..d30fd690d1 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -2203,7 +2203,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
>          }
>      }
>  
> -    if (new_block->flags & RAM_GUEST_MEMFD) {
> +    if (new_block->flags & RAM_GUEST_MEMFD_PRIVATE) {
>          int ret;
>  
>          if (!kvm_enabled()) {
> @@ -2341,7 +2341,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, ram_addr_t max_size,
>      /* Just support these ram flags by now. */
>      assert((ram_flags & ~(RAM_SHARED | RAM_PMEM | RAM_NORESERVE |
>                            RAM_PROTECTED | RAM_NAMED_FILE | RAM_READONLY |
> -                          RAM_READONLY_FD | RAM_GUEST_MEMFD |
> +                          RAM_READONLY_FD | RAM_GUEST_MEMFD_PRIVATE |
>                            RAM_RESIZEABLE)) == 0);
>      assert(max_size >= size);
>  
> @@ -2498,7 +2498,7 @@ RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
>      ram_flags &= ~RAM_PRIVATE;
>  
>      assert((ram_flags & ~(RAM_SHARED | RAM_RESIZEABLE | RAM_PREALLOC |
> -                          RAM_NORESERVE | RAM_GUEST_MEMFD)) == 0);
> +                          RAM_NORESERVE | RAM_GUEST_MEMFD_PRIVATE)) == 0);
>      assert(!host ^ (ram_flags & RAM_PREALLOC));
>      assert(max_size >= size);
>  
> @@ -2581,7 +2581,7 @@ RAMBlock *qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
>  RAMBlock *qemu_ram_alloc(ram_addr_t size, uint32_t ram_flags,
>                           MemoryRegion *mr, Error **errp)
>  {
> -    assert((ram_flags & ~(RAM_SHARED | RAM_NORESERVE | RAM_GUEST_MEMFD |
> +    assert((ram_flags & ~(RAM_SHARED | RAM_NORESERVE | RAM_GUEST_MEMFD_PRIVATE |
>                            RAM_PRIVATE)) == 0);
>      return qemu_ram_alloc_internal(size, size, NULL, NULL, ram_flags, mr, errp);
>  }
> -- 
> 2.50.1
> 
> 



* Re: [PATCH v3 06/12] memory: Rename memory_region_has_guest_memfd() to *_private()
  2025-12-15 20:51 ` [PATCH v3 06/12] memory: Rename memory_region_has_guest_memfd() to *_private() Peter Xu
@ 2026-06-02  1:40   ` Michael Roth
  0 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2026-06-02  1:40 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

On Mon, Dec 15, 2025 at 03:51:57PM -0500, Peter Xu wrote:
> Rename the function with "_private" suffix, to show that it returns true
> only if it has an internal guest-memfd to back private pages (rather than
> fully shared guest-memfd).
> 
> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Michael Roth <michael.roth@amd.com>

> ---
>  include/system/memory.h | 4 ++--
>  accel/kvm/kvm-all.c     | 6 +++---
>  system/memory.c         | 2 +-
>  3 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/include/system/memory.h b/include/system/memory.h
> index 1f49f9a0ff..9b58303bb8 100644
> --- a/include/system/memory.h
> +++ b/include/system/memory.h
> @@ -1823,7 +1823,7 @@ static inline bool memory_region_is_romd(MemoryRegion *mr)
>  bool memory_region_is_protected(MemoryRegion *mr);
>  
>  /**
> - * memory_region_has_guest_memfd: check whether a memory region has
> + * memory_region_has_guest_memfd_private: check whether a memory region has
>   *     guest_memfd_private associated
>   *
>   * Returns %true if a memory region's ram_block has guest_memfd_private
> @@ -1831,7 +1831,7 @@ bool memory_region_is_protected(MemoryRegion *mr);
>   *
>   * @mr: the memory region being queried
>   */
> -bool memory_region_has_guest_memfd(MemoryRegion *mr);
> +bool memory_region_has_guest_memfd_private(MemoryRegion *mr);
>  
>  /**
>   * memory_region_get_iommu: check whether a memory region is an iommu
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 1126b6f477..0b7ce5a9dd 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -666,7 +666,7 @@ static int kvm_mem_flags(MemoryRegion *mr)
>      if (readonly && kvm_readonly_mem_allowed) {
>          flags |= KVM_MEM_READONLY;
>      }
> -    if (memory_region_has_guest_memfd(mr)) {
> +    if (memory_region_has_guest_memfd_private(mr)) {
>          assert(kvm_guest_memfd_supported);
>          flags |= KVM_MEM_GUEST_MEMFD;
>      }
> @@ -1615,7 +1615,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
>              abort();
>          }
>  
> -        if (memory_region_has_guest_memfd(mr)) {
> +        if (memory_region_has_guest_memfd_private(mr)) {
>              err = kvm_set_memory_attributes_private(start_addr, slot_size);
>              if (err) {
>                  error_report("%s: failed to set memory attribute private: %s",
> @@ -3101,7 +3101,7 @@ int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private)
>          return ret;
>      }
>  
> -    if (!memory_region_has_guest_memfd(mr)) {
> +    if (!memory_region_has_guest_memfd_private(mr)) {
>          /*
>           * Because vMMIO region must be shared, guest TD may convert vMMIO
>           * region to shared explicitly.  Don't complain such case.  See
> diff --git a/system/memory.c b/system/memory.c
> index e8c6d484e6..d70968c966 100644
> --- a/system/memory.c
> +++ b/system/memory.c
> @@ -1897,7 +1897,7 @@ bool memory_region_is_protected(MemoryRegion *mr)
>      return mr->ram && (mr->ram_block->flags & RAM_PROTECTED);
>  }
>  
> -bool memory_region_has_guest_memfd(MemoryRegion *mr)
> +bool memory_region_has_guest_memfd_private(MemoryRegion *mr)
>  {
>      return mr->ram_block && mr->ram_block->guest_memfd_private >= 0;
>  }
> -- 
> 2.50.1
> 
> 



* Re: [PATCH v3 07/12] hostmem: Rename guest_memfd to guest_memfd_private
  2025-12-15 20:51 ` [PATCH v3 07/12] hostmem: Rename guest_memfd to guest_memfd_private Peter Xu
  2025-12-16  5:54   ` Xiaoyao Li
@ 2026-06-02 18:56   ` Michael Roth
  1 sibling, 0 replies; 47+ messages in thread
From: Michael Roth @ 2026-06-02 18:56 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

On Mon, Dec 15, 2025 at 03:51:58PM -0500, Peter Xu wrote:
> Rename the HostMemoryBackend.guest_memfd field to reflect what it really
> means: whether the backend needs a guest_memfd to back the private portion
> of its mapping.  This will help with clarity when we introduce in-place
> guest_memfd for hostmem.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Michael Roth <michael.roth@amd.com>

> ---
>  include/system/hostmem.h | 2 +-
>  backends/hostmem-file.c  | 2 +-
>  backends/hostmem-memfd.c | 2 +-
>  backends/hostmem-ram.c   | 2 +-
>  backends/hostmem-shm.c   | 2 +-
>  backends/hostmem.c       | 2 +-
>  6 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/include/system/hostmem.h b/include/system/hostmem.h
> index 88fa791ac7..dcbf81aeae 100644
> --- a/include/system/hostmem.h
> +++ b/include/system/hostmem.h
> @@ -76,7 +76,7 @@ struct HostMemoryBackend {
>      uint64_t size;
>      bool merge, dump, use_canonical_path;
>      bool prealloc, is_mapped, share, reserve;
> -    bool guest_memfd, aligned;
> +    bool guest_memfd_private, aligned;
>      uint32_t prealloc_threads;
>      ThreadContext *prealloc_context;
>      DECLARE_BITMAP(host_nodes, MAX_NODES + 1);
> diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
> index 1f20cd8fd6..0e4cfd6dc6 100644
> --- a/backends/hostmem-file.c
> +++ b/backends/hostmem-file.c
> @@ -86,7 +86,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>      ram_flags |= fb->readonly ? RAM_READONLY_FD : 0;
>      ram_flags |= fb->rom == ON_OFF_AUTO_ON ? RAM_READONLY : 0;
>      ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> -    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
> +    ram_flags |= backend->guest_memfd_private ? RAM_GUEST_MEMFD_PRIVATE : 0;
>      ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
>      ram_flags |= RAM_NAMED_FILE;
>      return memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
> diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c
> index 3f3e485709..ea93f034e4 100644
> --- a/backends/hostmem-memfd.c
> +++ b/backends/hostmem-memfd.c
> @@ -60,7 +60,7 @@ have_fd:
>      backend->aligned = true;
>      ram_flags = backend->share ? RAM_SHARED : RAM_PRIVATE;
>      ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> -    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
> +    ram_flags |= backend->guest_memfd_private ? RAM_GUEST_MEMFD_PRIVATE : 0;
>      return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend), name,
>                                            backend->size, ram_flags, fd, 0, errp);
>  }
> diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
> index 96ad29112d..6a507fad77 100644
> --- a/backends/hostmem-ram.c
> +++ b/backends/hostmem-ram.c
> @@ -30,7 +30,7 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>      name = host_memory_backend_get_name(backend);
>      ram_flags = backend->share ? RAM_SHARED : RAM_PRIVATE;
>      ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> -    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
> +    ram_flags |= backend->guest_memfd_private ? RAM_GUEST_MEMFD_PRIVATE : 0;
>      return memory_region_init_ram_flags_nomigrate(&backend->mr, OBJECT(backend),
>                                                    name, backend->size,
>                                                    ram_flags, errp);
> diff --git a/backends/hostmem-shm.c b/backends/hostmem-shm.c
> index e86fb2e0aa..4766db6aad 100644
> --- a/backends/hostmem-shm.c
> +++ b/backends/hostmem-shm.c
> @@ -54,7 +54,7 @@ have_fd:
>      /* Let's do the same as memory-backend-ram,share=on would do. */
>      ram_flags = RAM_SHARED;
>      ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> -    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
> +    ram_flags |= backend->guest_memfd_private ? RAM_GUEST_MEMFD_PRIVATE : 0;
>  
>      return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend),
>                                                backend_name, backend->size,
> diff --git a/backends/hostmem.c b/backends/hostmem.c
> index 35734d6f4d..70450733db 100644
> --- a/backends/hostmem.c
> +++ b/backends/hostmem.c
> @@ -288,7 +288,7 @@ static void host_memory_backend_init(Object *obj)
>      /* TODO: convert access to globals to compat properties */
>      backend->merge = machine_mem_merge(machine);
>      backend->dump = machine_dump_guest_core(machine);
> -    backend->guest_memfd = machine_require_guest_memfd(machine);
> +    backend->guest_memfd_private = machine_require_guest_memfd(machine);
>      backend->reserve = true;
>      backend->prealloc_threads = machine->smp.cpus;
>  }
> -- 
> 2.50.1
> 
> 



* Re: [PATCH v3 08/12] hostmem: Support fully shared guest memfd to back a VM
  2025-12-15 20:51 ` [PATCH v3 08/12] hostmem: Support fully shared guest memfd to back a VM Peter Xu
  2025-12-16  6:54   ` Xiaoyao Li
  2025-12-16 14:02   ` Fabiano Rosas
@ 2026-06-02 21:40   ` Michael Roth
  2026-06-05  7:23     ` David Hildenbrand (Arm)
  2 siblings, 1 reply; 47+ messages in thread
From: Michael Roth @ 2026-06-02 21:40 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

On Mon, Dec 15, 2025 at 03:51:59PM -0500, Peter Xu wrote:
> Host backends currently support guest-memfd only by detecting whether the
> VM is confidential.  There is not yet a way to choose, at the memory
> backend level, to use it fully shared.  So far, using guest-memfd always
> implies two layers of memory backends, with the guest-memfd providing only
> the private set of pages.
> 
> This patch introduces a way so that QEMU can consume guest memfd as the
> only source of memory to back the object (aka, fully shared).
> 
> To use the fully shared guest-memfd, one can add a memfd object with:
> 
>   -object memory-backend-memfd,guest-memfd=on,share=on
> 
> Note that share=on is required with fully shared guest_memfd.
> 
> PS: there's a trivial touch-up changing the fd check from == -1 to < 0,
> because the stub to create guest-memfd may return a negative value other
> than -1.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  qapi/qom.json            |  6 ++++-
>  backends/hostmem-memfd.c | 53 ++++++++++++++++++++++++++++++++++++----
>  2 files changed, 53 insertions(+), 6 deletions(-)
> 
> diff --git a/qapi/qom.json b/qapi/qom.json
> index 6f5c9de0f0..9ebf17bfc7 100644
> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -763,13 +763,17 @@
>  # @seal: if true, create a sealed-file, which will block further
>  #     resizing of the memory (default: true)
>  #
> +# @guest-memfd: if true, use guest-memfd to back the memory region.
> +#     (default: false, since: 11.0)
> +#
>  # Since: 2.12
>  ##
>  { 'struct': 'MemoryBackendMemfdProperties',
>    'base': 'MemoryBackendProperties',
>    'data': { '*hugetlb': 'bool',
>              '*hugetlbsize': 'size',
> -            '*seal': 'bool' },
> +            '*seal': 'bool',
> +            '*guest-memfd': 'bool' },
>    'if': 'CONFIG_LINUX' }
>  
>  ##
> diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c
> index ea93f034e4..9299cd7675 100644
> --- a/backends/hostmem-memfd.c
> +++ b/backends/hostmem-memfd.c
> @@ -18,6 +18,8 @@
>  #include "qapi/error.h"
>  #include "qom/object.h"
>  #include "migration/cpr.h"
> +#include "system/kvm.h"
> +#include <linux/kvm.h>
>  
>  OBJECT_DECLARE_SIMPLE_TYPE(HostMemoryBackendMemfd, MEMORY_BACKEND_MEMFD)
>  
> @@ -28,6 +30,13 @@ struct HostMemoryBackendMemfd {
>      bool hugetlb;
>      uint64_t hugetlbsize;
>      bool seal;
> +    /*
> +     * NOTE: this differs from HostMemoryBackend's guest_memfd_private,
> +     * which represents an internal private guest-memfd that only backs
> +     * private pages.  Instead, this flag marks that the memory backend
> +     * will use the guest-memfd pages 100% in-place.

I think there was previous discussion in v1 or v2 about reserving the
"in-place" naming for "in-place conversion of memory" rather than
mmap-able guest_memfd backends.

> +     */
> +    bool guest_memfd;
>  };
>  
>  static bool
> @@ -47,11 +56,26 @@ memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>          goto have_fd;
>      }
>  
> -    fd = qemu_memfd_create(TYPE_MEMORY_BACKEND_MEMFD, backend->size,
> -                           m->hugetlb, m->hugetlbsize, m->seal ?
> -                           F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL : 0,
> -                           errp);
> -    if (fd == -1) {
> +    if (m->guest_memfd) {
> +        /* User chose to use fully shared guest-memfd to back the VM. */
> +        if (!backend->share) {
> +            error_setg(errp, "Guest-memfd=on must be used with share=on");
> +            return false;
> +        }
> +
> +        /* TODO: add huge page support */

Until that's added, the related options should be disabled. m->seal also
doesn't seem to be applicable to the guest_memfd case.
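
As a rough, untested illustration of the kind of check meant here (not taken
from the patch): the struct and function below are hypothetical stand-ins,
not the real HostMemoryBackendMemfd or any existing QEMU helper. Because
seal defaults to true for this backend, the sketch only rejects the hugetlb
options and leaves the question of how to treat seal open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, trimmed-down mirror of the options involved; the field
 * names follow the patch, but this is not the real QEMU structure.
 */
struct memfd_opts {
    bool share;
    bool guest_memfd;
    bool hugetlb;
    uint64_t hugetlbsize;
    bool seal;
};

/*
 * Returns NULL when the option combination is acceptable, or a static
 * error string describing the first conflict found.
 */
const char *memfd_opts_check(const struct memfd_opts *o)
{
    assert(o != NULL);

    if (!o->guest_memfd) {
        /* Plain memfd: nothing extra to validate in this sketch. */
        return NULL;
    }
    if (!o->share) {
        return "guest-memfd=on must be used with share=on";
    }
    if (o->hugetlb || o->hugetlbsize) {
        /* Huge pages are still a TODO for the guest-memfd path. */
        return "guest-memfd=on does not support hugetlb options yet";
    }
    /*
     * seal is deliberately not checked: it defaults to true, so a real
     * patch would need to decide whether to ignore it or to reject only
     * an explicit user setting.
     */
    return NULL;
}
```

In the real code the equivalent checks would presumably sit next to the
existing share=on test in memfd_backend_memory_alloc() and report through
error_setg().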

-Mike

> +        fd = kvm_create_guest_memfd(backend->size,
> +                                    GUEST_MEMFD_FLAG_MMAP |
> +                                    GUEST_MEMFD_FLAG_INIT_SHARED,
> +                                    errp);
> +    } else {
> +        fd = qemu_memfd_create(TYPE_MEMORY_BACKEND_MEMFD, backend->size,
> +                               m->hugetlb, m->hugetlbsize, m->seal ?
> +                               F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL : 0,
> +                               errp);
> +    }
> +
> +    if (fd < 0) {
>          return false;
>      }
>      cpr_save_fd(name, 0, fd);
> @@ -65,6 +89,18 @@ have_fd:
>                                            backend->size, ram_flags, fd, 0, errp);
>  }
>  
> +static bool
> +memfd_backend_get_guest_memfd(Object *o, Error **errp)
> +{
> +    return MEMORY_BACKEND_MEMFD(o)->guest_memfd;
> +}
> +
> +static void
> +memfd_backend_set_guest_memfd(Object *o, bool value, Error **errp)
> +{
> +    MEMORY_BACKEND_MEMFD(o)->guest_memfd = value;
> +}
> +
>  static bool
>  memfd_backend_get_hugetlb(Object *o, Error **errp)
>  {
> @@ -152,6 +188,13 @@ memfd_backend_class_init(ObjectClass *oc, const void *data)
>          object_class_property_set_description(oc, "hugetlbsize",
>                                                "Huge pages size (ex: 2M, 1G)");
>      }
> +
> +    object_class_property_add_bool(oc, "guest-memfd",
> +                                   memfd_backend_get_guest_memfd,
> +                                   memfd_backend_set_guest_memfd);
> +    object_class_property_set_description(oc, "guest-memfd",
> +                                          "Use guest memfd");
> +
>      object_class_property_add_bool(oc, "seal",
>                                     memfd_backend_get_seal,
>                                     memfd_backend_set_seal);
> -- 
> 2.50.1
> 
> 



* Re: [PATCH v3 09/12] machine: Rename machine_require_guest_memfd() to *_private()
  2025-12-15 20:52 ` [PATCH v3 09/12] machine: Rename machine_require_guest_memfd() to *_private() Peter Xu
  2025-12-16  6:55   ` Xiaoyao Li
@ 2026-06-02 21:46   ` Michael Roth
  1 sibling, 0 replies; 47+ messages in thread
From: Michael Roth @ 2026-06-02 21:46 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

On Mon, Dec 15, 2025 at 03:52:00PM -0500, Peter Xu wrote:
> Differentiate it from fully shared guest-memfd use cases.
> 
> While at it, add proper brackets in kvm_handle_hc_map_gpa_range(); otherwise
> checkpatch may complain.
> 
> Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Michael Roth <michael.roth@amd.com>

> ---
>  include/hw/boards.h   | 2 +-
>  backends/hostmem.c    | 2 +-
>  hw/core/machine.c     | 2 +-
>  hw/i386/pc.c          | 2 +-
>  hw/i386/pc_sysfw.c    | 4 ++--
>  hw/i386/x86-common.c  | 4 ++--
>  target/i386/kvm/kvm.c | 3 ++-
>  7 files changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index a48ed4f86a..3a0a051d19 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -42,7 +42,7 @@ bool machine_usb(MachineState *machine);
>  int machine_phandle_start(MachineState *machine);
>  bool machine_dump_guest_core(MachineState *machine);
>  bool machine_mem_merge(MachineState *machine);
> -bool machine_require_guest_memfd(MachineState *machine);
> +bool machine_require_guest_memfd_private(MachineState *machine);
>  HotpluggableCPUList *machine_query_hotpluggable_cpus(MachineState *machine);
>  void machine_set_cpu_numa_node(MachineState *machine,
>                                 const CpuInstanceProperties *props,
> diff --git a/backends/hostmem.c b/backends/hostmem.c
> index 70450733db..e2dcae50c4 100644
> --- a/backends/hostmem.c
> +++ b/backends/hostmem.c
> @@ -288,7 +288,7 @@ static void host_memory_backend_init(Object *obj)
>      /* TODO: convert access to globals to compat properties */
>      backend->merge = machine_mem_merge(machine);
>      backend->dump = machine_dump_guest_core(machine);
> -    backend->guest_memfd_private = machine_require_guest_memfd(machine);
> +    backend->guest_memfd_private = machine_require_guest_memfd_private(machine);
>      backend->reserve = true;
>      backend->prealloc_threads = machine->smp.cpus;
>  }
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 27372bb01e..3bdce197f7 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -1376,7 +1376,7 @@ bool machine_mem_merge(MachineState *machine)
>      return machine->mem_merge;
>  }
>  
> -bool machine_require_guest_memfd(MachineState *machine)
> +bool machine_require_guest_memfd_private(MachineState *machine)
>  {
>      return machine->cgs && machine->cgs->require_guest_memfd;
>  }
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index f8b919cb6c..b2d55ceb5e 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -962,7 +962,7 @@ void pc_memory_init(PCMachineState *pcms,
>  
>      if (!is_tdx_vm()) {
>          option_rom_mr = g_malloc(sizeof(*option_rom_mr));
> -        if (machine_require_guest_memfd(machine)) {
> +        if (machine_require_guest_memfd_private(machine)) {
>              memory_region_init_ram_guest_memfd(option_rom_mr, NULL, "pc.rom",
>                                              PC_ROM_SIZE, &error_fatal);
>          } else {
> diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
> index 1a12b635ad..1c37258654 100644
> --- a/hw/i386/pc_sysfw.c
> +++ b/hw/i386/pc_sysfw.c
> @@ -52,7 +52,7 @@ static void pc_isa_bios_init(PCMachineState *pcms, MemoryRegion *isa_bios,
>  
>      /* map the last 128KB of the BIOS in ISA space */
>      isa_bios_size = MIN(flash_size, 128 * KiB);
> -    if (machine_require_guest_memfd(MACHINE(pcms))) {
> +    if (machine_require_guest_memfd_private(MACHINE(pcms))) {
>          memory_region_init_ram_guest_memfd(isa_bios, NULL, "isa-bios",
>                                             isa_bios_size, &error_fatal);
>      } else {
> @@ -71,7 +71,7 @@ static void pc_isa_bios_init(PCMachineState *pcms, MemoryRegion *isa_bios,
>             ((uint8_t*)flash_ptr) + (flash_size - isa_bios_size),
>             isa_bios_size);
>  
> -    if (!machine_require_guest_memfd(current_machine)) {
> +    if (!machine_require_guest_memfd_private(current_machine)) {
>          memory_region_set_readonly(isa_bios, true);
>      }
>  }
> diff --git a/hw/i386/x86-common.c b/hw/i386/x86-common.c
> index c844749900..33ac7fb6e9 100644
> --- a/hw/i386/x86-common.c
> +++ b/hw/i386/x86-common.c
> @@ -1044,7 +1044,7 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
>          (bios_size % 65536) != 0) {
>          goto bios_error;
>      }
> -    if (machine_require_guest_memfd(MACHINE(x86ms))) {
> +    if (machine_require_guest_memfd_private(MACHINE(x86ms))) {
>          memory_region_init_ram_guest_memfd(&x86ms->bios, NULL, "pc.bios",
>                                             bios_size, &error_fatal);
>          if (is_tdx_vm()) {
> @@ -1074,7 +1074,7 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
>      }
>      g_free(filename);
>  
> -    if (!machine_require_guest_memfd(MACHINE(x86ms))) {
> +    if (!machine_require_guest_memfd_private(MACHINE(x86ms))) {
>          /* map the last 128KB of the BIOS in ISA space */
>          x86_isa_bios_init(&x86ms->isa_bios, rom_memory, &x86ms->bios,
>                            !isapc_ram_fw);
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 60c7981138..5d0d02bcaf 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -6050,8 +6050,9 @@ static int kvm_handle_hc_map_gpa_range(X86CPU *cpu, struct kvm_run *run)
>      uint64_t gpa, size, attributes;
>      int ret;
>  
> -    if (!machine_require_guest_memfd(current_machine))
> +    if (!machine_require_guest_memfd_private(current_machine)) {
>          return -EINVAL;
> +    }
>  
>      gpa = run->hypercall.args[0];
>      size = run->hypercall.args[1] * TARGET_PAGE_SIZE;
> -- 
> 2.50.1
> 
> 
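
For readers skimming the rename, here is a minimal standalone sketch of what
the renamed predicate checks, based on the function body shown in the
hw/core/machine.c hunk above. The Ex* types are hypothetical stand-ins for
QEMU's MachineState and ConfidentialGuestSupport, reduced to the two fields
the hunk touches; this is an illustration, not QEMU code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for ConfidentialGuestSupport (illustration only). */
typedef struct ExCgs {
    bool require_guest_memfd;
} ExCgs;

/* Hypothetical stand-in for MachineState (illustration only). */
typedef struct ExMachine {
    ExCgs *cgs; /* NULL when no confidential-guest support is configured */
} ExMachine;

/*
 * Mirrors the logic in the hunk: private guest-memfd is required only when
 * confidential-guest support is configured and it asks for guest-memfd.
 */
static bool ex_machine_require_guest_memfd_private(const ExMachine *machine)
{
    return machine->cgs && machine->cgs->require_guest_memfd;
}
```

In this sketch a VM without confidential-guest support leaves cgs NULL, so the
predicate stays false even if its RAM were backed by a fully shared
guest-memfd, which is presumably the distinction the _private suffix is meant
to make visible.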



* Re: [PATCH v3 10/12] memory: Rename memory_region_init_ram_guest_memfd() to *_private()
  2025-12-15 20:52 ` [PATCH v3 10/12] memory: Rename memory_region_init_ram_guest_memfd() " Peter Xu
  2025-12-16  6:56   ` Xiaoyao Li
@ 2026-06-02 21:49   ` Michael Roth
  1 sibling, 0 replies; 47+ messages in thread
From: Michael Roth @ 2026-06-02 21:49 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

On Mon, Dec 15, 2025 at 03:52:01PM -0500, Peter Xu wrote:
> Differentiate it from fully shared guest-memfd use cases.
> 
> Suggested-by: Xiaoyao Li <xiaoyao.li@intel.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Michael Roth <michael.roth@amd.com>

> ---
>  include/system/memory.h | 10 +++++-----
>  backends/igvm.c         |  4 ++--
>  hw/i386/pc.c            |  4 ++--
>  hw/i386/pc_sysfw.c      |  4 ++--
>  hw/i386/x86-common.c    |  4 ++--
>  system/memory.c         | 10 +++++-----
>  6 files changed, 18 insertions(+), 18 deletions(-)
> 
> diff --git a/include/system/memory.h b/include/system/memory.h
> index 9b58303bb8..b3d000a563 100644
> --- a/include/system/memory.h
> +++ b/include/system/memory.h
> @@ -1693,11 +1693,11 @@ bool memory_region_init_ram(MemoryRegion *mr,
>                              uint64_t size,
>                              Error **errp);
>  
> -bool memory_region_init_ram_guest_memfd(MemoryRegion *mr,
> -                                        Object *owner,
> -                                        const char *name,
> -                                        uint64_t size,
> -                                        Error **errp);
> +bool memory_region_init_ram_guest_memfd_private(MemoryRegion *mr,
> +                                                Object *owner,
> +                                                const char *name,
> +                                                uint64_t size,
> +                                                Error **errp);
>  
>  /**
>   * memory_region_init_rom: Initialize a ROM memory region.
> diff --git a/backends/igvm.c b/backends/igvm.c
> index 905bd8d989..91631829e5 100644
> --- a/backends/igvm.c
> +++ b/backends/igvm.c
> @@ -221,8 +221,8 @@ static void *qigvm_prepare_memory(QIgvm *ctx, uint64_t addr, uint64_t size,
>              g_strdup_printf("igvm.%X", region_identifier);
>          igvm_pages = g_new0(MemoryRegion, 1);
>          if (ctx->cgs && ctx->cgs->require_guest_memfd) {
> -            if (!memory_region_init_ram_guest_memfd(igvm_pages, NULL,
> -                                                    region_name, size, errp)) {
> +            if (!memory_region_init_ram_guest_memfd_private(
> +                    igvm_pages, NULL, region_name, size, errp)) {
>                  return NULL;
>              }
>          } else {
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index b2d55ceb5e..41dfbbdcf0 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -963,8 +963,8 @@ void pc_memory_init(PCMachineState *pcms,
>      if (!is_tdx_vm()) {
>          option_rom_mr = g_malloc(sizeof(*option_rom_mr));
>          if (machine_require_guest_memfd_private(machine)) {
> -            memory_region_init_ram_guest_memfd(option_rom_mr, NULL, "pc.rom",
> -                                            PC_ROM_SIZE, &error_fatal);
> +            memory_region_init_ram_guest_memfd_private(
> +                option_rom_mr, NULL, "pc.rom", PC_ROM_SIZE, &error_fatal);
>          } else {
>              memory_region_init_ram(option_rom_mr, NULL, "pc.rom", PC_ROM_SIZE,
>                                  &error_fatal);
> diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
> index 1c37258654..ad55d4eba6 100644
> --- a/hw/i386/pc_sysfw.c
> +++ b/hw/i386/pc_sysfw.c
> @@ -53,8 +53,8 @@ static void pc_isa_bios_init(PCMachineState *pcms, MemoryRegion *isa_bios,
>      /* map the last 128KB of the BIOS in ISA space */
>      isa_bios_size = MIN(flash_size, 128 * KiB);
>      if (machine_require_guest_memfd_private(MACHINE(pcms))) {
> -        memory_region_init_ram_guest_memfd(isa_bios, NULL, "isa-bios",
> -                                           isa_bios_size, &error_fatal);
> +        memory_region_init_ram_guest_memfd_private(
> +            isa_bios, NULL, "isa-bios", isa_bios_size, &error_fatal);
>      } else {
>          memory_region_init_ram(isa_bios, NULL, "isa-bios", isa_bios_size,
>                                 &error_fatal);
> diff --git a/hw/i386/x86-common.c b/hw/i386/x86-common.c
> index 33ac7fb6e9..27854a9164 100644
> --- a/hw/i386/x86-common.c
> +++ b/hw/i386/x86-common.c
> @@ -1045,8 +1045,8 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
>          goto bios_error;
>      }
>      if (machine_require_guest_memfd_private(MACHINE(x86ms))) {
> -        memory_region_init_ram_guest_memfd(&x86ms->bios, NULL, "pc.bios",
> -                                           bios_size, &error_fatal);
> +        memory_region_init_ram_guest_memfd_private(
> +            &x86ms->bios, NULL, "pc.bios", bios_size, &error_fatal);
>          if (is_tdx_vm()) {
>              tdx_set_tdvf_region(&x86ms->bios);
>          }
> diff --git a/system/memory.c b/system/memory.c
> index d70968c966..28810dcb29 100644
> --- a/system/memory.c
> +++ b/system/memory.c
> @@ -3746,11 +3746,11 @@ bool memory_region_init_ram(MemoryRegion *mr,
>      return true;
>  }
>  
> -bool memory_region_init_ram_guest_memfd(MemoryRegion *mr,
> -                                        Object *owner,
> -                                        const char *name,
> -                                        uint64_t size,
> -                                        Error **errp)
> +bool memory_region_init_ram_guest_memfd_private(MemoryRegion *mr,
> +                                                Object *owner,
> +                                                const char *name,
> +                                                uint64_t size,
> +                                                Error **errp)
>  {
>      DeviceState *owner_dev;
>  
> -- 
> 2.50.1
> 
> 



* Re: [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends
  2025-12-15 20:51 [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Peter Xu
                   ` (11 preceding siblings ...)
  2025-12-15 20:52 ` [PATCH v3 12/12] tests/migration-test: Add a precopy test for guest-memfd Peter Xu
@ 2026-06-02 22:02 ` Michael Roth
  2026-06-03 19:27   ` Peter Xu
  12 siblings, 1 reply; 47+ messages in thread
From: Michael Roth @ 2026-06-02 22:02 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

On Mon, Dec 15, 2025 at 03:51:51PM -0500, Peter Xu wrote:
> v1: https://lore.kernel.org/r/20251023185913.2923322-1-peterx@redhat.com
> v2: https://lore.kernel.org/r/20251119172913.577392-1-peterx@redhat.com
> 
> v3:
> - Collect R-bs from Xiaoyao
> - Rebased to 10.2-rc3; no dependency needed now, as those got merged
> - Reorder patches, touch up commit messages or comments on in-place misuse
> - Added patch "kvm: Provide explicit error for kvm_create_guest_memfd()" [Xiaoyao]
> - Added one patch for renaming machine_require_guest_memfd() [Xiaoyao]
> - Added one patch for renaming memory_region_init_ram_guest_memfd() [Xiaoyao]
> 
> =========8<===========
> 
> This series allows QEMU to consume init-shared guest-memfd as a common
> memory backend. Before this series, guest-memfd was only used in CoCo, and
> the fds were created implicitly whenever a CoCo environment was detected.
> When used in init-shared mode, the guest-memfd is specified directly on
> the command line, just like other types of memory backends.
> 
> In the current patchset, I reused the memory-backend-memfd object rather
> than creating a new type of object.  After all, guest-memfd (at least from
> the userspace POV) works similarly to a memfd, except that it is tailored
> for the VM use case.
> 
> This approach so far also does not involve gmem bindings to KVM instances,
> hence it is not prone to issues when the same chunk of RAM is attached
> to more than one KVM memslot.
> 
> Now, instead of using a normal memfd backend with:
> 
>   -object memory-backend-memfd,id=ID,size=SIZE,share=on
> 
> One can also boot a VM with guest-memfd:
> 
>   -object memory-backend-memfd,id=ID,size=SIZE,share=on,guest-memfd=on

Hi Peter,

I'm working on enabling support for this, as well as enabling in-place
conversion support for confidential VMs[1]. In my series I added a
dedicated memory-backend-guest-memfd to handle using mmapable
guest_memfd to back normal VMs (and confidential VMs with in-place
conversion enabled on top). Xiaoyao mentioned we had some overlap and
potential inter-dependencies between our series so I took some notes
on the differences which I've included at the bottom of this email...

But at a high-level I think this series is further along in implementing
guest_memfd for normal VMs, and I would plan to just mostly rebase my
in-place conversion patches on top of your series. However I think it
would be a good idea to go with a dedicated memory-backend-guest-memfd
for reasons I outlined in my notes, so maybe this needs to be discussed
more.

I also saw you were open to having someone pick up these patches if you
don't think you'll have a chance to get to them near-term, so I'd be
happy to pick them up if that's preferable.

Thanks!

-Mike

[1] https://lore.kernel.org/qemu-devel/20260528000416.8161-1-michael.roth@amd.com/

Comparisons to the above patchset:

  [PATCH v3 01/12] kvm: Decouple memory attribute check from kvm_guest_memfd_supported 
    - similar to:
        [PATCH 01/12] accel/kvm: Decouple guest_memfd checks from memory attribute checks
    - to allow mmap case, both defer error handling to ram_block_add() + RAM_GUEST_MEMFD path
    - pros: adds nice kvm_private_memory_attribute_supported() helper
    - cons: my patch checks/prints error via kvm_create_guest_memfd(), which
      makes it a more re-usable error since ram_block_add() isn't the only
      caller.
    - IMO, I think we should merge the pros of your patch into my similar patch
      and add your Co-developed-by, but also fine to keep yours as-is and deal
      with anything else needed as a follow-up patch
  [PATCH v3 02/12] kvm: Detect guest-memfd flags supported
    - similar to the kvm_supported_guest_memfd_flags / kvm_create_guest_memfd_shared()
      additions that are part of:
        [02/12] hostmem: Introduce dedicated memory backend for guest_memfd
    - This patch could be treated as a common dependency of the above and I can
      drop the corresponding changes from my patch
  [PATCH v3 03/12] kvm: Provide explicit error for kvm_create_guest_memfd()
    - Keep as-is
  [PATCH v3 04/12] ramblock: Rename guest_memfd to guest_memfd_private
    - Keep as-is
  [PATCH v3 05/12] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE
    - Keep as-is
  [PATCH v3 06/12] memory: Rename memory_region_has_guest_memfd() to *_private()
    - Keep as-is
  [PATCH v3 07/12] hostmem: Rename guest_memfd to guest_memfd_private
    - Keep as-is
  [PATCH v3 08/12] hostmem: Support fully shared guest memfd to back a VM
    - alternative to:
        [02/12] hostmem: Introduce dedicated memory backend for guest_memfd
    - pros: re-uses infrastructure from hostmem-memfd
    - pros: less command-line changes vs. dedicated hostmem-guest-memfd (less libvirt changes?)
    - cons: less flexibility vs. a dedicated backend
    - cons: more risk of memfd vs guest_memfd behavior/options diverging over
            time and having less commonality (e.g. if hugetlb has special options
            we wouldn't need to muddy the existing documentation for normal
            memfds or introduce alternative options alongside)
    - IMO, a clean-slate patch only requires ~90 lines of potentially-duplicate
      code, and that's offset to some degree by needing less special-casing
      throughout hostmem-memfd.c (e.g. this patchset adds 55 lines on top), and
      it seems worthwhile given some of the advanced use-cases planned around
      guest_memfd (hugetlb, DAX-like functionality, and persisting userspace
      across kexec) that might require special handling/options for very
      different use-cases than normal memfds.
  [PATCH v3 09/12] machine: Rename machine_require_guest_memfd() to *_private()
    - Keep as-is
    - (all these renames are a nice cleanup/prep and will help a lot with making
      in-place conversion handling more readable)
  [PATCH v3 10/12] memory: Rename memory_region_init_ram_guest_memfd() to *_private()
    - Keep as-is
    - (all these renames are a nice cleanup/prep and will help a lot with making
      in-place conversion handling more readable)
  [PATCH v3 11/12] tests/migration-test: Support guest-memfd init shared mem type
    - Keep as-is
  [PATCH v3 12/12] tests/migration-test: Add a precopy test for guest-memfd
    - Keep as-is
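
As a rough illustration of the flag-detection idea discussed for patch 02
above, the sketch below checks requested guest_memfd creation flags against a
supported mask before attempting creation. The names and bit values are
hypothetical placeholders, not the kernel's UAPI constants or either series'
helpers, and the assumption that a fully shared backend needs both an mmap
flag and an init-shared flag is mine; in real code the supported mask would
presumably come from a KVM capability query.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag bits for illustration; not the kernel UAPI values. */
#define EX_GMEM_FLAG_MMAP        (UINT64_C(1) << 0)
#define EX_GMEM_FLAG_INIT_SHARED (UINT64_C(1) << 1)

/* True only if every requested flag bit is present in the supported mask. */
static bool ex_gmem_flags_supported(uint64_t supported, uint64_t requested)
{
    return (requested & ~supported) == 0;
}

/*
 * Assumption for this sketch: a fully shared backend needs both the mmap
 * and the init-shared flag to be supported.
 */
static bool ex_gmem_can_back_normal_vm(uint64_t supported)
{
    return ex_gmem_flags_supported(supported,
                                   EX_GMEM_FLAG_MMAP |
                                   EX_GMEM_FLAG_INIT_SHARED);
}
```

Detecting the mask once and testing against it keeps the "unsupported" error
in one place, regardless of which caller ends up creating the fd.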

> 
> The init-shared guest-memfd relies on a very recent Linux kernel, as the
> mmap() support only landed in v6.18-rc2.  When running on an older kernel,
> we'll see errors like:
> 
>   qemu-system-x86_64: KVM does not support guest_memfd
> 
> One thing to mention is that live migration is supported by default;
> however, postcopy is not supported yet.  Postcopy support depends on some
> kernel work that needs to be merged in Linux first.
> 
> Thanks,
> 
> Peter Xu (11):
>   kvm: Detect guest-memfd flags supported
>   kvm: Provide explicit error for kvm_create_guest_memfd()
>   ramblock: Rename guest_memfd to guest_memfd_private
>   memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE
>   memory: Rename memory_region_has_guest_memfd() to *_private()
>   hostmem: Rename guest_memfd to guest_memfd_private
>   hostmem: Support fully shared guest memfd to back a VM
>   machine: Rename machine_require_guest_memfd() to *_private()
>   memory: Rename memory_region_init_ram_guest_memfd() to *_private()
>   tests/migration-test: Support guest-memfd init shared mem type
>   tests/migration-test: Add a precopy test for guest-memfd
> 
> Xiaoyao Li (1):
>   kvm: Decouple memory attribute check from kvm_guest_memfd_supported
> 
>  qapi/qom.json                         |  6 ++-
>  include/hw/boards.h                   |  2 +-
>  include/system/hostmem.h              |  2 +-
>  include/system/kvm.h                  |  1 +
>  include/system/memory.h               | 27 ++++++------
>  include/system/ram_addr.h             |  2 +-
>  include/system/ramblock.h             |  7 +++-
>  tests/qtest/migration/framework.h     |  4 ++
>  accel/kvm/kvm-all.c                   | 33 ++++++++++++---
>  accel/stubs/kvm-stub.c                |  6 +++
>  backends/hostmem-file.c               |  2 +-
>  backends/hostmem-memfd.c              | 55 +++++++++++++++++++++---
>  backends/hostmem-ram.c                |  2 +-
>  backends/hostmem-shm.c                |  2 +-
>  backends/hostmem.c                    |  2 +-
>  backends/igvm.c                       |  4 +-
>  hw/core/machine.c                     |  2 +-
>  hw/i386/pc.c                          |  6 +--
>  hw/i386/pc_sysfw.c                    |  8 ++--
>  hw/i386/x86-common.c                  |  8 ++--
>  system/memory.c                       | 17 ++++----
>  system/physmem.c                      | 37 ++++++++++-------
>  target/i386/kvm/kvm.c                 |  3 +-
>  tests/qtest/migration/framework.c     | 60 +++++++++++++++++++++++++++
>  tests/qtest/migration/precopy-tests.c | 12 ++++++
>  25 files changed, 239 insertions(+), 71 deletions(-)
> 
> -- 
> 2.50.1
> 
> 



* Re: [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends
  2026-06-02 22:02 ` [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Michael Roth
@ 2026-06-03 19:27   ` Peter Xu
  2026-06-04 22:36     ` Michael Roth
  0 siblings, 1 reply; 47+ messages in thread
From: Peter Xu @ 2026-06-03 19:27 UTC (permalink / raw)
  To: Michael Roth
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

On Tue, Jun 02, 2026 at 05:02:29PM -0500, Michael Roth wrote:
> On Mon, Dec 15, 2025 at 03:51:51PM -0500, Peter Xu wrote:
> > v1: https://lore.kernel.org/r/20251023185913.2923322-1-peterx@redhat.com
> > v2: https://lore.kernel.org/r/20251119172913.577392-1-peterx@redhat.com
> > 
> > v3:
> > - Collect R-bs from Xiaoyao
> > - Rebased to 10.2-rc3; no dependency needed now, as those got merged
> > - Reorder patches, touch up commit messages or comments on in-place misuse
> > - Added patch "kvm: Provide explicit error for kvm_create_guest_memfd()" [Xiaoyao]
> > - Added one patch for renaming machine_require_guest_memfd() [Xiaoyao]
> > - Added one patch for renaming memory_region_init_ram_guest_memfd() [Xiaoyao]
> > 
> > =========8<===========
> > 
> > This series allows QEMU to consume init-shared guest-memfd as a common
> > memory backend. Before this series, guest-memfd was only used in CoCo, and
> > the fds were created implicitly whenever a CoCo environment was detected.
> > When used in init-shared mode, the guest-memfd is specified directly on
> > the command line, just like other types of memory backends.
> > 
> > In the current patchset, I reused the memory-backend-memfd object rather
> > than creating a new type of object.  After all, guest-memfd (at least from
> > the userspace POV) works similarly to a memfd, except that it is tailored
> > for the VM use case.
> > 
> > This approach so far also does not involve gmem bindings to KVM instances,
> > hence it is not prone to issues when the same chunk of RAM is attached
> > to more than one KVM memslot.
> > 
> > Now, instead of using a normal memfd backend with:
> > 
> >   -object memory-backend-memfd,id=ID,size=SIZE,share=on
> > 
> > One can also boot a VM with guest-memfd:
> > 
> >   -object memory-backend-memfd,id=ID,size=SIZE,share=on,guest-memfd=on
> 
> Hi Peter,

Hi, Michael,

> 
> I'm working on enabling support for this, as well as enabling in-place
> conversion support for confidential VMs[1]. In my series I added a
> dedicated memory-backend-guest-memfd to handle using mmapable
> guest_memfd to back normal VMs (and confidential VMs with in-place
> conversion enabled on top). Xiaoyao mentioned we had some overlap and
> potential inter-dependencies between our series so I took some notes
> on the differences which I've included at the bottom of this email...

To Xiaoyao: thanks for linking these works, and also for answering the
other question I raised in the separate thread.

> 
> But at a high-level I think this series is further along in implementing
> guest_memfd for normal VMs, and I would plan to just mostly rebase my
> in-place conversion patches on top of your series. However I think it
> would be a good idea to go with a dedicated memory-backend-guest-memfd
> for reasons I outlined in my notes, so maybe this needs to be discussed
> more.

To me, it was just natural to reuse the memfd backend when working on this,
because conceptually they're really the same: I guess that's also why
guest-memfd is named guest-memfd (not guest-special-fd etc.).

I don't have a strong feeling here; hostmem-memfd.c is tiny, so duplicating
it isn't a major concern either way.  It's just that I don't yet see when
gmem will become special.

Say, all of the features that memfd provides can easily be applied to
guest-memfd either now or at some point later:

  - hugetlb/hugetlbsize: already one of them; I believe it's almost certain
    that 1G pages will come to gmem soon
  - seal: I don't see why we can't seal a gmemfd too; maybe it'll come, since
    in general the whole seal concept can apply to gmem too.
  - cpr support on memfd (or anything about live update that may happen on
    gmem in the future): I believe gmem also wants it.

IIUC it's a matter of whether we expect future guest-memfd properties that
would not apply to memfd?

> 
> I also saw you were open to having someone pick up these patches if you
> don't think you'll have a chance to get to them near-term, so I'd be
> happy to pick them up if that's preferable.

Sure!  Indeed I don't have the bandwidth to keep working on this one in the
near future. Please feel free to pick whatever is needed into your series.

Thanks,

> 
> Thanks!
> 
> -Mike
> 
> [1] https://lore.kernel.org/qemu-devel/20260528000416.8161-1-michael.roth@amd.com/
> 
> Comparisons to the above patchset:
> 
>   [PATCH v3 01/12] kvm: Decouple memory attribute check from kvm_guest_memfd_supported 
>     - similar to:
>         [PATCH 01/12] accel/kvm: Decouple guest_memfd checks from memory attribute checks
>     - to allow mmap case, both defer error handling to ram_block_add() + RAM_GUEST_MEMFD path
>     - pros: adds nice kvm_private_memory_attribute_supported() helper
>     - cons: my patch checks/prints error via kvm_create_guest_memfd(), which
>       makes it a more re-usable error since ram_block_add() isn't the only
>       caller.
>     - IMO, I think we should merge the pros of your patch into my similar patch
>       and add your Co-developed-by, but also fine to keep yours as-is and deal
>       with anything else needed as a follow-up patch
>   [PATCH v3 02/12] kvm: Detect guest-memfd flags supported
>     - similar to the kvm_supported_guest_memfd_flags / kvm_create_guest_memfd_shared()
>       additions that are part of:
>         [02/12] hostmem: Introduce dedicated memory backend for guest_memfd
>     - This patch could be treated as a common dependency of the above and I can
>       drop the corresponding changes from my patch
>   [PATCH v3 03/12] kvm: Provide explicit error for kvm_create_guest_memfd()
>     - Keep as-is
>   [PATCH v3 04/12] ramblock: Rename guest_memfd to guest_memfd_private
>     - Keep as-is
>   [PATCH v3 05/12] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE
>     - Keep as-is
>   [PATCH v3 06/12] memory: Rename memory_region_has_guest_memfd() to *_private()
>     - Keep as-is
>   [PATCH v3 07/12] hostmem: Rename guest_memfd to guest_memfd_private
>     - Keep as-is
>   [PATCH v3 08/12] hostmem: Support fully shared guest memfd to back a VM
>     - alternative to:
>         [02/12] hostmem: Introduce dedicated memory backend for guest_memfd
>     - pros: re-uses infrastructure from hostmem-memfd
>     - pros: less command-line changes vs. dedicated hostmem-guest-memfd (less libvirt changes?)
>     - cons: less flexibility vs. a dedicated backend
>     - cons: more risk of memfd vs guest_memfd behavior/options diverging over
>             time and having less commonality (e.g. if hugetlb has special options
>             we wouldn't need to muddy the existing documentation for normal
>             memfds or introduce alternative options alongside)
>     - IMO, a clean-slate patch only requires ~90 lines of potentially-duplicate
>       code, and that's offset to some degree by needing less special-casing
>       throughout hostmem-memfd.c (e.g. this patchset adds 55 lines on top), and
>       it seems worthwhile given some of the advanced use-cases planned around
>       guest_memfd (hugetlb, DAX-like functionality, and persisting userspace
>       across kexec) that might require special handling/options for very
>       different use-cases than normal memfds.
>   [PATCH v3 09/12] machine: Rename machine_require_guest_memfd() to *_private()
>     - Keep as-is
>     - (all these renames are a nice cleanup/prep and will help a lot with making
>       in-place conversion handling more readable)
>   [PATCH v3 10/12] memory: Rename memory_region_init_ram_guest_memfd() to *_private()
>     - Keep as-is
>     - (all these renames are a nice cleanup/prep and will help a lot with making
>       in-place conversion handling more readable)
>   [PATCH v3 11/12] tests/migration-test: Support guest-memfd init shared mem type
>     - Keep as-is
>   [PATCH v3 12/12] tests/migration-test: Add a precopy test for guest-memfd
>     - Keep as-is
> 
> > 
> > The init-shared guest-memfd relies on a very recent Linux kernel, as the
> > mmap() support only landed in v6.18-rc2.  When running on an older kernel,
> > we'll see errors like:
> > 
> >   qemu-system-x86_64: KVM does not support guest_memfd
> > 
> > One thing to mention is that live migration is supported by default;
> > however, postcopy is not supported yet.  Postcopy support depends on some
> > kernel work that needs to be merged in Linux first.
> > 
> > Thanks,
> > 
> > Peter Xu (11):
> >   kvm: Detect guest-memfd flags supported
> >   kvm: Provide explicit error for kvm_create_guest_memfd()
> >   ramblock: Rename guest_memfd to guest_memfd_private
> >   memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE
> >   memory: Rename memory_region_has_guest_memfd() to *_private()
> >   hostmem: Rename guest_memfd to guest_memfd_private
> >   hostmem: Support fully shared guest memfd to back a VM
> >   machine: Rename machine_require_guest_memfd() to *_private()
> >   memory: Rename memory_region_init_ram_guest_memfd() to *_private()
> >   tests/migration-test: Support guest-memfd init shared mem type
> >   tests/migration-test: Add a precopy test for guest-memfd
> > 
> > Xiaoyao Li (1):
> >   kvm: Decouple memory attribute check from kvm_guest_memfd_supported
> > 
> >  qapi/qom.json                         |  6 ++-
> >  include/hw/boards.h                   |  2 +-
> >  include/system/hostmem.h              |  2 +-
> >  include/system/kvm.h                  |  1 +
> >  include/system/memory.h               | 27 ++++++------
> >  include/system/ram_addr.h             |  2 +-
> >  include/system/ramblock.h             |  7 +++-
> >  tests/qtest/migration/framework.h     |  4 ++
> >  accel/kvm/kvm-all.c                   | 33 ++++++++++++---
> >  accel/stubs/kvm-stub.c                |  6 +++
> >  backends/hostmem-file.c               |  2 +-
> >  backends/hostmem-memfd.c              | 55 +++++++++++++++++++++---
> >  backends/hostmem-ram.c                |  2 +-
> >  backends/hostmem-shm.c                |  2 +-
> >  backends/hostmem.c                    |  2 +-
> >  backends/igvm.c                       |  4 +-
> >  hw/core/machine.c                     |  2 +-
> >  hw/i386/pc.c                          |  6 +--
> >  hw/i386/pc_sysfw.c                    |  8 ++--
> >  hw/i386/x86-common.c                  |  8 ++--
> >  system/memory.c                       | 17 ++++----
> >  system/physmem.c                      | 37 ++++++++++-------
> >  target/i386/kvm/kvm.c                 |  3 +-
> >  tests/qtest/migration/framework.c     | 60 +++++++++++++++++++++++++++
> >  tests/qtest/migration/precopy-tests.c | 12 ++++++
> >  25 files changed, 239 insertions(+), 71 deletions(-)
> > 
> > -- 
> > 2.50.1
> > 
> > 
> 

-- 
Peter Xu




* Re: [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends
  2026-06-03 19:27   ` Peter Xu
@ 2026-06-04 22:36     ` Michael Roth
  2026-06-05 14:57       ` Peter Xu
  0 siblings, 1 reply; 47+ messages in thread
From: Michael Roth @ 2026-06-04 22:36 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

On Wed, Jun 03, 2026 at 03:27:17PM -0400, Peter Xu wrote:
> On Tue, Jun 02, 2026 at 05:02:29PM -0500, Michael Roth wrote:
> > On Mon, Dec 15, 2025 at 03:51:51PM -0500, Peter Xu wrote:
> > > v1: https://lore.kernel.org/r/20251023185913.2923322-1-peterx@redhat.com
> > > v2: https://lore.kernel.org/r/20251119172913.577392-1-peterx@redhat.com
> > > 
> > > v3:
> > > - Collect R-bs from Xiaoyao
> > > - Rebased to 10.2-rc3; no dependency needed now, as those got merged
> > > - Reorder patches, touch up commit messages or comments on in-place misuse
> > > - Added patch "kvm: Provide explicit error for kvm_create_guest_memfd()" [Xiaoyao]
> > > - Added one patch for renaming machine_require_guest_memfd() [Xiaoyao]
> > > - Added one patch for renaming memory_region_init_ram_guest_memfd() [Xiaoyao]
> > > 
> > > =========8<===========
> > > 
> > > This series allows QEMU to consume init-shared guest-memfd as a common
> > > memory backend. Before this series, guest-memfd was only used in CoCo, and
> > > the fds were created implicitly whenever a CoCo environment was detected.
> > > When used in init-shared mode, the guest-memfd is specified directly on
> > > the command line, just like other types of memory backends.
> > > 
> > > In the current patchset, I reused the memory-backend-memfd object rather
> > > than creating a new type of object.  After all, guest-memfd (at least from
> > > the userspace POV) works similarly to a memfd, except that it is tailored
> > > for the VM use case.
> > > 
> > > This approach so far also does not involve gmem bindings to KVM instances,
> > > hence it is not prone to issues when the same chunk of RAM is attached
> > > to more than one KVM memslot.
> > > 
> > > Now, instead of using a normal memfd backend with:
> > > 
> > >   -object memory-backend-memfd,id=ID,size=SIZE,share=on
> > > 
> > > One can also boot a VM with guest-memfd:
> > > 
> > >   -object memory-backend-memfd,id=ID,size=SIZE,share=on,guest-memfd=on
> > 
> > Hi Peter,
> 
> Hi, Michael,
> 
> > 
> > I'm working on enabling support for this, as well as enabling in-place
> > conversion support for confidential VMs[1]. In my series I added a
> > dedicated memory-backend-guest-memfd to handle using mmapable
> > guest_memfd to back normal VMs (and confidential VMs with in-place
> > conversion enabled on top). Xiaoyao mentioned we had some overlap and
> > potential inter-dependencies between our series so I took some notes
> > on the differences which I've included at the bottom of this email...
> 
> To Xiaoyao: thanks for linking these works, and also thanks for answering
> other question I raised in the separate thread.
> 
> > 
> > But at a high-level I think this series is further along in implementing
> > guest_memfd for normal VMs, and I would plan to just mostly rebase my
> > in-place conversion patches on top of your series. However I think it
> > would be a good idea to go with a dedicated memory-backend-guest-memfd
> > for reasons I outlined in my notes, so maybe this needs to be discussed
> > more.
> 
> To me, it was just natural when working on that to reuse memfd backend,
> because conceptually they're really the same: I guess guest-memfd is named
> as guest-memfd (not guest-special-fd etc.) also because of that.

It's true that it's a 'memfd for guest stuff', but that 'guest stuff' is
becoming a pretty wild set of additional features that I think could lead
to some 'interesting' options that will never have any line-of-sight for
normal memfd's.

> 
> I don't have a strong feeling here, hostmem-memfd.c is tiny so duplicating
> isn't a major concern even if so.  It's just that I don't yet see when gmem
> will become special.
> 
> Say, all of the features that memfd provides can easily be applied to
> guest-memfd either now or at some point later:
> 
>   - hugetlb/hugetlbsize being one of them already, I believe we almost know
>     1G will happen to gmem soon
>   - seal: I don't see why we can't seal a gmemfd too.. maybe it'll come, in
>     general the whole seal concept can apply to gmem too.
>   - cpr support on memfd (or anything about live update in the future to
>     happen on gmem): I believe gmem also want it..
> 
> IIUC it's a matter of if we expect future property of guest-memfd that will
> stop applying to memfd anymore?

Yah, I think that's the main thing to consider. There are a few things in the
pipeline where the options associated with guest_memfd might diverge
quite a bit from memfd:

  - hugetlb: yes, these could potentially use the same options memfd
    uses, and I'm guessing that will end up being the case, but one
    large gap there is that shared memory is always split to 4K, which
    we've accepted for now, but if you consider use-cases like DPDK
    there can still be major performance bottlenecks that would drive
    us to try to enable larger mappings for the shared ranges, and then
    we'd end up with guest-memfd-specific parameters intermixed with
    normal memfd options, and our related documentation would need to
    cover these differences case by case
  - DAX-like stuff: there are some proposals for making device memory
    available to use as private guest memory, and since 'guest-memfd'
    is generally responsible for managing private memory, it will
    likely end up being extended to handle this at some point. One
    proposal/PoC[1] would involve at least needing additional options
    for the /dev/dax path, but there have also been discussions about
    having a general notion of custom allocators that can be plugged
    into guest_memfd, and some of these might have overlapping options
    WRT things like hugepages/etc. But at a high-level, DAX would map
    more to memory-backend-file than memory-backend-memfd, so we'd
    already be crossing up some wires there.
  - live update: there's work[2] on enabling preservation of confidential
    guest memory across kexec by preserving it through guest_memfd. This
    one is still a bit mind-blowing to me but I could see us needing
    some additional options here that would really make no sense for
    memfd.
  - directmap removal: these[3] patches allow a new guest_memfd flag to
    be set to unmap guest_memfd pages from kernel directmap to help
    mitigate speculative attacks, probably would involve a new option
    as well that wouldn't be applicable to normal memfds

It could also end up that even memory-backend-guest-memfd is too
generic, and that some of these would involve a more specialized memory
backend where maybe they can share a common base class for some of the
core guest_memfd stuff but otherwise be separate backends with their
own specific options. So to me, starting off building up
memory-backend-memfd seems like a potential misstep, whereas we don't
really lose much to start with a clean slate.

[1] DAX: https://lwn.net/ml/all/20260423170219.281618-1-dave.jiang@intel.com/
[2] LUO: https://lore.kernel.org/all/cover.1779080766.git.tarunsahu@google.com/#r
[3] directmap removal: https://lore.kernel.org/kvm/20260317141031.514-1-kalyazin@amazon.com/

> 
> > 
> > I also saw you were open to having someone pick up these patches if you
> > don't think you'll have a chance to get to them near-term, so I'd be
> > happy to pick them up if that's preferable.
> 
> Sure!  Indeed I don't have bandwidth to keep working on this one in the
> near future. Please feel free to pick whatever needed into your series.

Ok, sounds good, I'll pick these up for my next posting and incorporate
any changes/comments that might still be pending at that time.

Thanks for getting things to this stage!

-Mike

> 
> Thanks,
> 
> > 
> > Thanks!
> > 
> > -Mike
> > 
> > [1] https://lore.kernel.org/qemu-devel/20260528000416.8161-1-michael.roth@amd.com/
> > 
> > Comparisons to the above patchset:
> > 
> >   [PATCH v3 01/12] kvm: Decouple memory attribute check from kvm_guest_memfd_supported 
> >     - similar to:
> >         [PATCH 01/12] accel/kvm: Decouple guest_memfd checks from memory attribute checks
> >     - to allow mmap case, both defer error handling to ram_block_add() + RAM_GUEST_MEMFD path
> >     - pros: adds nice kvm_private_memory_attribute_supported() helper
> >     - cons: my patch checks/prints error via kvm_create_guest_memfd(), which
> >       makes it a more re-usable error since ram_block_add() isn't the only
> >       caller.
> >     - IMO, I think we should merge the pros of your patch into my similar patch
> >       and add your Co-developed-by, but also fine to keep yours as-is and deal
> >       with anything else needed as a follow-up patch
> >   [PATCH v3 02/12] kvm: Detect guest-memfd flags supported
> >     - similar to the kvm_supported_guest_memfd_flags / kvm_create_guest_memfd_shared()
> >       additions that are part of:
> >         [02/12] hostmem: Introduce dedicated memory backend for guest_memfd
> >     - This patch could be treated as a common dependency of the above and I can
> >       drop the corresponding changes from my patch
> >   [PATCH v3 03/12] kvm: Provide explicit error for kvm_create_guest_memfd()
> >     - Keep as-is
> >   [PATCH v3 04/12] ramblock: Rename guest_memfd to guest_memfd_private
> >     - Keep as-is
> >   [PATCH v3 05/12] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE
> >     - Keep as-is
> >   [PATCH v3 06/12] memory: Rename memory_region_has_guest_memfd() to *_private()
> >     - Keep as-is
> >   [PATCH v3 07/12] hostmem: Rename guest_memfd to guest_memfd_private
> >     - Keep as-is
> >   [PATCH v3 08/12] hostmem: Support fully shared guest memfd to back a VM
> >     - alternative to:
> >         [02/12] hostmem: Introduce dedicated memory backend for guest_memfd
> >     - pros: re-uses infrastructure from hostmem-memfd
> >     - pros: less command-line changes vs. dedicated hostmem-guest-memfd (less libvirt changes?)
> >     - cons: less flexibility vs. a dedicated backend
> >     - cons: more risk of memfd vs guest_memfd behavior/options diverging over
> >             time and having less commonality (e.g. if hugetlb has special options
> >             we wouldn't need to muddy the existing documentation for normal
> >             memfds or introduce alternative options alongside)
> >     - IMO, a clean slate patch only requires ~90 lines of potentially-duplicate
> >       code, and that's offset to some degree by needing less special-casing
> >       throughout hostmem-memfd.c (e.g. this patchset adds 55 lines on top), and
> >       it seems worthwhile given some of the advanced use-cases planned around
> >       guest_memfd (hugetlb, DAX-like functionality, and persisting userspace
> >       across kexec) that might require special handling/options for very
> >       different use-cases than normal memfds.
> >   [PATCH v3 09/12] machine: Rename machine_require_guest_memfd() to *_private()
> >     - Keep as-is
> >     - (all these renames are a nice cleanup/prep and will help a lot with making
> >       in-place conversion handling more readable)
> >   [PATCH v3 10/12] memory: Rename memory_region_init_ram_guest_memfd() to *_private()
> >     - Keep as-is
> >     - (all these renames are a nice cleanup/prep and will help a lot with making
> >       in-place conversion handling more readable)
> >   [PATCH v3 11/12] tests/migration-test: Support guest-memfd init shared mem type
> >     - Keep as-is
> >   [PATCH v3 12/12] tests/migration-test: Add a precopy test for guest-memfd
> >     - Keep as-is
> > 
> > > 
> > > The init-shared guest-memfd relies on almost the latest linux, as the
> > > mmap() support just landed v6.18-rc2.  When run it on an older qemu, we'll
> > > see errors like:
> > > 
> > >   qemu-system-x86_64: KVM does not support guest_memfd
> > > 
> > > One thing to mention is live migration is by default supported, however
> > > postcopy is still currently not supported.  The postcopy support will have
> > > some kernel dependency work to be merged in Linux first.
> > > 
> > > Thanks,
> > > 
> > > Peter Xu (11):
> > >   kvm: Detect guest-memfd flags supported
> > >   kvm: Provide explicit error for kvm_create_guest_memfd()
> > >   ramblock: Rename guest_memfd to guest_memfd_private
> > >   memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE
> > >   memory: Rename memory_region_has_guest_memfd() to *_private()
> > >   hostmem: Rename guest_memfd to guest_memfd_private
> > >   hostmem: Support fully shared guest memfd to back a VM
> > >   machine: Rename machine_require_guest_memfd() to *_private()
> > >   memory: Rename memory_region_init_ram_guest_memfd() to *_private()
> > >   tests/migration-test: Support guest-memfd init shared mem type
> > >   tests/migration-test: Add a precopy test for guest-memfd
> > > 
> > > Xiaoyao Li (1):
> > >   kvm: Decouple memory attribute check from kvm_guest_memfd_supported
> > > 
> > >  qapi/qom.json                         |  6 ++-
> > >  include/hw/boards.h                   |  2 +-
> > >  include/system/hostmem.h              |  2 +-
> > >  include/system/kvm.h                  |  1 +
> > >  include/system/memory.h               | 27 ++++++------
> > >  include/system/ram_addr.h             |  2 +-
> > >  include/system/ramblock.h             |  7 +++-
> > >  tests/qtest/migration/framework.h     |  4 ++
> > >  accel/kvm/kvm-all.c                   | 33 ++++++++++++---
> > >  accel/stubs/kvm-stub.c                |  6 +++
> > >  backends/hostmem-file.c               |  2 +-
> > >  backends/hostmem-memfd.c              | 55 +++++++++++++++++++++---
> > >  backends/hostmem-ram.c                |  2 +-
> > >  backends/hostmem-shm.c                |  2 +-
> > >  backends/hostmem.c                    |  2 +-
> > >  backends/igvm.c                       |  4 +-
> > >  hw/core/machine.c                     |  2 +-
> > >  hw/i386/pc.c                          |  6 +--
> > >  hw/i386/pc_sysfw.c                    |  8 ++--
> > >  hw/i386/x86-common.c                  |  8 ++--
> > >  system/memory.c                       | 17 ++++----
> > >  system/physmem.c                      | 37 ++++++++++-------
> > >  target/i386/kvm/kvm.c                 |  3 +-
> > >  tests/qtest/migration/framework.c     | 60 +++++++++++++++++++++++++++
> > >  tests/qtest/migration/precopy-tests.c | 12 ++++++
> > >  25 files changed, 239 insertions(+), 71 deletions(-)
> > > 
> > > -- 
> > > 2.50.1
> > > 
> > > 
> > 
> 
> -- 
> Peter Xu
> 



* Re: [PATCH v3 08/12] hostmem: Support fully shared guest memfd to back a VM
  2026-06-02 21:40   ` Michael Roth
@ 2026-06-05  7:23     ` David Hildenbrand (Arm)
  2026-06-05 11:23       ` David Hildenbrand (Arm)
  0 siblings, 1 reply; 47+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-05  7:23 UTC (permalink / raw)
  To: Michael Roth, Peter Xu
  Cc: qemu-devel, Juraj Marcin, Paolo Bonzini, Chenyi Qiang,
	Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

>>  static bool
>> @@ -47,11 +56,26 @@ memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>>          goto have_fd;
>>      }
>>  
>> -    fd = qemu_memfd_create(TYPE_MEMORY_BACKEND_MEMFD, backend->size,
>> -                           m->hugetlb, m->hugetlbsize, m->seal ?
>> -                           F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL : 0,
>> -                           errp);
>> -    if (fd == -1) {
>> +    if (m->guest_memfd) {
>> +        /* User choose to use fully shared guest-memfd to back the VM.. */
>> +        if (!backend->share) {
>> +            error_setg(errp, "Guest-memfd=on must be used with share=on");
>> +            return false;
>> +        }
>> +
>> +        /* TODO: add huge page support */
> 
> Until that's added, the related options should be disabled. m->seal as
> well doesn't seem to be applicable for guest_memfd case.

Which raises the question whether we want a new backend (I think you implement
that in your version).

guest_memfd is rather different in some aspects (e.g., resizing is already
completely forbidden).

What are the pros and cons of either approach?
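
To make the review comment above concrete, here is a minimal sketch of how
the shared memfd backend might reject options that do not apply once
guest-memfd=on is given. The struct and function names are hypothetical
stand-ins and not the QEMU types; only the share=on requirement comes from
the quoted patch, while the hugetlb and seal checks are assumptions that
follow the review suggestion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the memfd backend's user-visible options. */
struct example_memfd_opts {
    bool share;
    bool guest_memfd;
    bool hugetlb;
    bool seal;
};

/*
 * Return NULL if the option combination is acceptable, otherwise a static
 * message describing the first rejected option.
 */
static const char *example_check_opts(const struct example_memfd_opts *o)
{
    assert(o != NULL);

    if (!o->guest_memfd) {
        return NULL; /* plain memfd: nothing extra to check here */
    }
    if (!o->share) {
        /* This requirement is taken from the quoted patch. */
        return "guest-memfd=on must be used with share=on";
    }
    if (o->hugetlb) {
        /* Assumption: refuse until huge page support exists. */
        return "guest-memfd=on does not support hugetlb yet";
    }
    if (o->seal) {
        /* Assumption: sealing is treated as not applicable. */
        return "guest-memfd=on does not support seal";
    }
    return NULL;
}
```

Each such check is special-casing that a dedicated backend would not need,
which is part of the trade-off being asked about. Also note that, as far as
I know, seal defaults to on for the memfd backend, so a real patch would
have to tell an explicit seal=on apart from the default.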

-- 
Cheers,

David



* Re: [PATCH v3 08/12] hostmem: Support fully shared guest memfd to back a VM
  2026-06-05  7:23     ` David Hildenbrand (Arm)
@ 2026-06-05 11:23       ` David Hildenbrand (Arm)
  0 siblings, 0 replies; 47+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-05 11:23 UTC (permalink / raw)
  To: Michael Roth, Peter Xu
  Cc: qemu-devel, Juraj Marcin, Paolo Bonzini, Chenyi Qiang,
	Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

On 6/5/26 09:23, David Hildenbrand (Arm) wrote:
>>>  static bool
>>> @@ -47,11 +56,26 @@ memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>>>          goto have_fd;
>>>      }
>>>  
>>> -    fd = qemu_memfd_create(TYPE_MEMORY_BACKEND_MEMFD, backend->size,
>>> -                           m->hugetlb, m->hugetlbsize, m->seal ?
>>> -                           F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL : 0,
>>> -                           errp);
>>> -    if (fd == -1) {
>>> +    if (m->guest_memfd) {
>>> +        /* User choose to use fully shared guest-memfd to back the VM.. */
>>> +        if (!backend->share) {
>>> +            error_setg(errp, "Guest-memfd=on must be used with share=on");
>>> +            return false;
>>> +        }
>>> +
>>> +        /* TODO: add huge page support */
>>
>> Until that's added, the related options should be disabled. m->seal as
>> well doesn't seem to be applicable for guest_memfd case.
> 
> Which raises the question whether we want a new backend (I think you implement
> that in your version).
> 
> guest_memfd is rather different in some aspects (e.g., resizing is already
> completely forbidden).
> 
> What are the pros and cons of either approach?
> 

Ah, I see a discussion on that starting here:

https://lore.kernel.org/r/lpkcfd2crgparcd64ydry3ocryx3sfc5gj5pzrrms4nwvw6j4c@ulc3wa3rmefo


-- 
Cheers,

David



* Re: [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends
  2026-06-04 22:36     ` Michael Roth
@ 2026-06-05 14:57       ` Peter Xu
  2026-06-08 17:59         ` Michael Roth
  0 siblings, 1 reply; 47+ messages in thread
From: Peter Xu @ 2026-06-05 14:57 UTC (permalink / raw)
  To: Michael Roth
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

On Thu, Jun 04, 2026 at 05:36:42PM -0500, Michael Roth wrote:
> > IIUC it's a matter of if we expect future property of guest-memfd that will
> > stop applying to memfd anymore?
> 
> Yah, I think that's the main thing to consider. There's a few things in the
> pipeline where the options associated with guest_memfd might diverge
> quite a bit from memfd:

Thanks for all the context.  I'll throw some random questions below,
some of them may not be directly related to the current discussion, but
please bear with me.

> 
>   - hugetlb: yes, these could potentially use the same options memfd
>     uses, and I'm guessing that will end up being the case, but one
>     large gap there is that shared memory is always split to 4K, which
>     we've accepted for now, but if you consider use-cases like DPDK
>     there can still be major performance bottlenecks that would drive
>     us to try to enable larger mappings for the shared ranges, and then
>     we'd end up with guest-memfd-specific parameters intermixed with
>     normal memfd options, and our related documentation would need to
>     cover these differences case by case

The first thing I thought about is mTHP and how it can also be similarly
applied to normal memfd (now, or in the future, that I'm not sure).

Before that..  shouldn't the whole concept of private mem / gmem be about
reducing the area mapped by the host (including dpdk, if we're talking
about things like OpenVswitch)?  Can you roughly describe how huge mappings
are expected to be allowed in such a case?  Does it mean the guest driver
should also know to allocate huge contiguous physical mem for DMA only?

>   - DAX-like stuff: there are some proposals for making device memory
>     available to use as private guest memory, and since 'guest-memfd'
>     is generally responsible for managing private memory, it will
>     likely end up being extended to handle this at some point. One
>     proposal/PoC[1] would involve at least needing additional options
>     for the /dev/dax path, but there have also been discussions about
>     having a general notion of custom allocators that can be plugged
>     into guest_memfd, and some of these might have overlapping options
>     WRT things like hugepages/etc. But at a high-level, DAX would map
>     more to memory-backend-file than memory-backend-memfd, so we'd
>     already be crossing up some wires there.

I have no deep understanding on this, but IIUC we used to stick with
memory-backend-file for dax.  Why switch to memory-backend-guest-memfd?
Are we still exposing a dax via a file path ultimately, even with CoCo?

Note, here I want to differentiate two concepts: QEMU interfacing and
kernel/KVM interfacing.  I mean, I have a gut feeling that for coco dax we
could still stick with memory-backend-file, even if internally we can still
use new KVM ioctls to set them up: there's no rule to say only
memory-backend-guest-memfd can use the KVM ioctl.  IMHO they're different
stories, and here I'm focused more on the QEMU interfacing that we're
discussing here.

IMHO for QEMU's interfacing, any memory-backend should play one solo role
which is to point to QEMU (as a hypervisor) a backing store for some piece
of resource that can be used as guest memory backend.  It doesn't need to
have any implication on how we implement that backend internally.

>   - live update: there's work[2] on enabling preservation of confidential
>     guest memory across kexec by preserving it through guest_memfd. This
>     one is still a bit mind-blowing to me but I could see us needing
>     some additional options here that would really make no sense for
>     memfd.

Could you elaborate what kind of parameter you would expect?

I'm not sure if you have looked into QEMU's CPR approach; the memfd
backend is now really the core of supporting such infrastructure, where fds
can be persisted.  For live update, it'll be persisted across kexec and kernel
switchover.  For CPR, it actually also works with cpr-reboot, with its
own tricky way to persist memory.

In general, what I want to say is, I really think they should play the same
role in the live update case too: if we need to register some fd for
persistency, we need to register gmem, kvm, but also memfd if some of them
are attached to the current VM, right?

>   - directmap removal: these[3] patches allow a new guest_memfd flag to
>     be set to unmap guest_memfd pages from kernel directmap to help
>     mitigate speculative attacks, probably would involve a new option
>     as well that wouldn't be applicable to normal memfds

Now the question is, do we want to remove directmap for "some" memory
backend, or do we want to remove it per-VM?

This is another thing I want to make sure we're on the same page: I want to
make sure we don't introduce per-VM setup for memory backends.

Say, "init-shared" or "in-place CoCo", what should we use for one gmem fd?
IMHO it shouldn't be a parameter in the memory-backend.  It should be a
parameter for the -machine or some similar per-vm setup, which will apply
to all gmemfd across the current VM.

My understanding is directmap removal is similar in this case, which seems
to be a per-VM (rather than per-memory-backend) attribute?  We can still
operate on that per-memory-backend, but then it'll be internally, the
backends need to understand the VM setup and do things properly, IMHO.
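
A rough sketch of the per-VM idea described above, where every backend
derives its guest-memfd creation flags from one VM-wide policy instead of
taking them as its own properties. All names and flag values below are
hypothetical and are not the kernel uAPI; in particular, the assumption
that a non-confidential VM maps to init-shared and that in-place
conversion still wants mmap is only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration; not the kernel's values. */
#define EXAMPLE_GMEM_FLAG_MMAP          (UINT64_C(1) << 0)
#define EXAMPLE_GMEM_FLAG_INIT_SHARED   (UINT64_C(1) << 1)
#define EXAMPLE_GMEM_FLAG_NO_DIRECT_MAP (UINT64_C(1) << 2)

/* Hypothetical VM-wide policy, e.g. filled in from -machine options. */
struct example_vm_policy {
    bool confidential;   /* CoCo guest using in-place conversion */
    bool no_direct_map;  /* drop guest pages from the kernel directmap */
};

/* Every backend calls this, so the policy applies to all gmem fds alike. */
static uint64_t example_gmem_flags(const struct example_vm_policy *vm)
{
    uint64_t flags = EXAMPLE_GMEM_FLAG_MMAP;

    assert(vm != NULL);

    if (!vm->confidential) {
        flags |= EXAMPLE_GMEM_FLAG_INIT_SHARED;
    }
    if (vm->no_direct_map) {
        flags |= EXAMPLE_GMEM_FLAG_NO_DIRECT_MAP;
    }
    return flags;
}
```

With this shape the backends stay free of per-VM parameters, and two
backends attached to the same VM cannot end up with conflicting settings.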

> 
> It could also end up that even memory-backend-guest-memfd is too
> generic, and that some of these would involve a more specialized memory
> backend where maybe they can share a common base class for some of the
> core guest_memfd stuff but otherwise be separate backends with their
> own specific options. So to me, starting off building up
> memory-backend-memfd seems like a potential misstep, whereas we don't
> really lose much to start with a clean slate.
> 
> [1] DAX: https://lwn.net/ml/all/20260423170219.281618-1-dave.jiang@intel.com/
> [2] LUO: https://lore.kernel.org/all/cover.1779080766.git.tarunsahu@google.com/#r
> [3] directmap removal: https://lore.kernel.org/kvm/20260317141031.514-1-kalyazin@amazon.com/
> 
> > 
> > > 
> > > I also saw you were open to having someone pick up these patches if you
> > > don't think you'll have a chance to get to them near-term, so I'd be
> > > happy to pick them up if that's preferable.
> > 
> > Sure!  Indeed I don't have bandwidth to keep working on this one in the
> > near future. Please feel free to pick whatever needed into your series.
> 
> Ok, sounds good, I'll pick these up for my next posting and incorporate
> any changes/comments that might still be pending at that time.
> 
> Thanks for getting things to this stage!

Thanks for picking it up!  Juraj in our team may have some future
exploration on gmem over 1G for postcopy on init-shared, so it's great the
code is moving closer to that direction.

Thanks,

-- 
Peter Xu




* Re: [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends
  2026-06-05 14:57       ` Peter Xu
@ 2026-06-08 17:59         ` Michael Roth
  0 siblings, 0 replies; 47+ messages in thread
From: Michael Roth @ 2026-06-08 17:59 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Juraj Marcin, David Hildenbrand, Paolo Bonzini,
	Chenyi Qiang, Fabiano Rosas, Alexey Kardashevskiy, Li Xiaoyao

On Fri, Jun 05, 2026 at 10:57:34AM -0400, Peter Xu wrote:
> On Thu, Jun 04, 2026 at 05:36:42PM -0500, Michael Roth wrote:
> > > IIUC it's a matter of if we expect future property of guest-memfd that will
> > > stop applying to memfd anymore?
> > 
> > Yah, I think that's the main thing to consider. There's a few things in the
> > pipeline where the options associated with guest_memfd might diverge
> > quite a bit from memfd:
> 
> Thanks for all these contexts.  I'll throw some random questions below,
> some of them may not be directly related to the current discussion, but
> please bear with me.
> 
> > 
> >   - hugetlb: yes, these could potentially use the same options memfd
> >     uses, and I'm guessing that will end up being the case, but one
> >     large gap there is that shared memory is always split to 4K, which
> >     we've accepted for now, but if you consider use-cases like DPDK
> >     there can still be major performance bottlenecks that would drive
> >     us to try to enable larger mappings for the shared ranges, and then
> >     we'd end up with guest-memfd-specific parameters intermixed with
> >     normal memfd options, and our related documentation would need to
> >     cover these differences case by case
> 
> The first thing I thought about is mTHP and how it can also be similarly
> applied to normal memfd (now, or in the future, that I'm not sure).

That might work, though for some architectures the shared pages might
benefit from a wider range of hugepage granularities than private. For
instance with SNP it might make sense to expose 1GB hugepages but then
limit them to 2MB granularity for private (since private pages get 2MB
TLB entries anyway so rebuilding from 2MB->1GB is a waste)

https://github.com/AMDESE/amdsev#prepare-host

But IIUC that sort of support will likely depend on some mm/ changes
around refcounting that probably won't be completed any time soon, so it's
hard to anticipate what that'll end up looking like.

> 
> Before that..  shouldn't the whole concept of private mem / gmem about
> reducing the area of mapping the host (including dpdk, if we're talking
> about things like OpenVswitch)?  Can you roughly describe how huge mapping
> is expected to be allowed in such case?  Does it mean the guest driver
> should also be aware to allocate huge continuous physical mem for DMA only?

Guest drivers would generally end up either going through SWIOTLB to get
access to shared memory, or convert their pages to shared directly prior
to use. Optimizations for hugepages would work roughly the same as any
optimizations that have already been done for the non-confidential case.
Shared guest allocations/conversions at granularities smaller than the
backing page/hugepage size are where confidential VMs start to pay an
extra tax in guest_memfd and potentially with the security architecture
(e.g. extra RMP table checks for SNP, not just the TLB/TLB misses).

It seems to be in everyone's best interest to have a common shared
memory pool that gets initialized/converted/replenished at hugepage-sized
granularities. Here's one patchset[1] that takes the natural choice of
doing this through SWIOTLB. Not sure that's ultimately what it will look
like but I think it's safe to expect some level of optimization.
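
To illustrate the granularity point above, a small sketch that checks
whether a shared conversion range lines up with the backing page size and
counts how many backing pages it touches. The helper names are made up for
this example, and the 4K/2M sizes are simply the ones mentioned in this
thread; nothing here reflects the actual guest_memfd implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_SZ_4K (UINT64_C(4) << 10)
#define EXAMPLE_SZ_2M (UINT64_C(2) << 20)

/* True if [gpa, gpa + len) consists only of whole pages of size pgsz. */
static bool example_range_is_aligned(uint64_t gpa, uint64_t len, uint64_t pgsz)
{
    /* pgsz must be a power of two for the mask arithmetic below. */
    assert(pgsz != 0 && (pgsz & (pgsz - 1)) == 0);

    return len != 0 && (gpa & (pgsz - 1)) == 0 && (len & (pgsz - 1)) == 0;
}

/* Number of backing pages of size pgsz touched by [gpa, gpa + len). */
static uint64_t example_pages_touched(uint64_t gpa, uint64_t len, uint64_t pgsz)
{
    uint64_t first, last;

    assert(pgsz != 0 && (pgsz & (pgsz - 1)) == 0);

    if (len == 0) {
        return 0;
    }
    first = gpa & ~(pgsz - 1);             /* start of first backing page */
    last = (gpa + len - 1) & ~(pgsz - 1);  /* start of last backing page */
    return (last - first) / pgsz + 1;
}
```

A conversion of two 4K pages that straddles a 2M boundary touches two 2M
backing pages without covering either one, which is the kind of case where
the hugepage has to be split. A pool that converts in whole 2M units
avoids that.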

So with all these pieces in play, what I would expect is that DPDK
applications could access these shared buffers using hugepages both in
the guest and host-side as they do today, since the whole hugepage/folio
will be homogenous with the above guest optimizations... but that still
requires the above-mentioned mm/ rework for refcounting to allow hugepages
for shared ranges so this is another thing that we probably won't need to
deal with any time soon, but I think that's roughly what we could expect
it to look like eventually.

[1] https://lkml.org/lkml/2024/1/12/65

> 
> >   - DAX-like stuff: there are some proposals for making device memory
> >     available to use as private guest memory, and since 'guest-memfd'
> >     is generally responsible for managing private memory, it will
> >     likely end up being extended to handle this at some point. One
> >     proposal/PoC[1] would involve at least needing additional options
> >     for the /dev/dax path, but there have also been discussions about
> >     having a general notion of custom allocators that can be plugged
> >     into guest_memfd, and some of these might have overlapping options
> >     WRT things like hugepages/etc. But at a high-level, DAX would map
> >     more to memory-backend-file than memory-backend-memfd, so we'd
> >     already be crossing up some wires there.
> 
> I have no deep understanding on this, but IIUC we used to stick with
> memory-backend-file for dax.  Why switch to memory-backend-guest-memfd?
> Are we still exposing a dax via a file path ultimately, even with CoCo?

I touched on this a bit below, but I don't necessarily think
memory-backend-guest-memfd should handle DAX, it's just one example
where we clearly need to think beyond 'memfd', but are still potentially
in the realm of 'guest_memfd' depending on what the API ends up looking
like.

But I agree with your below point that we don't need the backend to
match up with implementation details of how guest_memfd works
internally, and that the core point that memory-backend-file might still
end up seeming like the most appropriate way for a QEMU user to specify
a DAX path, even if internally it's still using guest_memfd.

Though going that route, we'd still have a
memory-backend-file,...,guest_memfd=on that brings 'memfd's' back into
the discussion. We could take it a step further and rename the
'guest_memfd' backend option to 'securable', but maybe this ends up
being the right level of balance between 'i need to open a file that
can do guest_memfd-related stuff', and 'i need to create a guest_memfd
instance that can handle a DAX path'.

My thinking was that since the hugepage PoC already implements the notion
of custom allocators in the uAPI, and that there's been talk of 'pluggable'
backends for guest_memfd, that the kernel would also need to do a reasonable
job in creating a consistent uAPI/documentation, such that the hugepage/DAX
cases would end up looking something like:

  memory-backend-guest-memfd,allocator=hugetlb,pagesize=2M,...
  memory-backend-guest-memfd,allocator=dax,path=/dev/daxX,pagesize=2M,...

which is firmly on the 'i need to create a guest_memfd instance that's
backed by a DAX path' end of the spectrum, compared to the more abstracted
approach you're suggesting, and so for the most part we'd be passing
through the kernel options/documentation to users vs. abstracting it and
then touching on it case-by-case in 'memfd'/'file'/etc. documentation.

Personally, I'm not sure at this point which approach will end up being
the more workable one. But it is harder/more-confusing to start with
memory-backend-guest-memfd, and then go back to e.g.
memory-backend-file,guest_memfd=on later for future extensions. So I'm
starting to lean toward doing the minimal
'<existing_backend>,guest_memfd=on' thing for now, and then just
deprecating it if we really feel like we need a more direct interface
like memory-backend-guest-memfd down the road.

Does that seem reasonable for a starting point? I feel like we'll be
better positioned to make a better long-term decision once some of these
patchsets are further along.

> 
> Note, here I want to differentiate two concepts: QEMU interfacing and
> kernel/KVM interfacing.  I mean, I have a gut feeling that for coco dax we
> could still stick with memory-backend-file, even if internally we can still
> use new KVM ioctls to set them up: there's no rule to say only
> memory-backend-guest-memfd can use the KVM ioctl.  IMHO they're different
> stories, and here I'm focused more on the QEMU interfacing that we're
> discussing here.
> 
> IMHO for QEMU's interfacing, any memory-backend should play one solo role
> which is to point to QEMU (as a hypervisor) a backing store for some piece
> of resource that can be used as guest memory backend.  It doesn't need to
> have any implication on how we implement that backend internally.
> 
> >   - live update: there's work[2] on enabling preservation of confidential
> >     guest memory across kexec by preserving it through guest_memfd. This
> >     one is still a bit mind-blowing to me but I could see us needing
> >     some additional options here that would really make no sense for
> >     memfd.
> 
> Could you elaborate what kind of parameter you would expect?

I was thinking stuff like the metadata that would be needed to rebuild a
KVM instance with the same GPA->HPA mappings to the pages previously
allocated by guest_memfd. It makes sense that each backend has its own
associated metadata so that each can be restored in turn, but yes there
would also need to be some common state like KVM itself that needs to be
serialized, and this would probably have separate options. So in theory it
wouldn't need to be tied to the backend, but IMO it feels very natural
to imagine the options looking something like that.

> 
> I'm not sure if you have investigated QEMU's CPR approach, now memfd
> backend is really the core of supporting such infrastructure, where fds can
> be persisted.  For live update, it'll be persisted across kexec and kernel
> switchover.  For CPR, it actually also works with cpr-reboot, with its
> own tricky way to persist memory.
> 
> In general, what I want to say is, I really think they should play the same
> role in terms of the live update case too: if we need to register some fd for
> persistency, we need to register gmem, kvm, but also memfd if some of them
> are attached to the current VM, right?

I definitely need to look into this more (and intra-host live update for
guest_memfd/in-place conversion in general), but for guest memory
persistence it seems like we'd generally be relying on
memory-backend-file=<path> as a target/src for serializing guest memory
to persistent storage for the normal/'memfd' case.

But for confidential VMs we don't just need the data for a particular GPA,
but the original HPA and maybe details like the associated shared/private
memory attributes, which is why I'm thinking we might need something like a
separate path argument for that, or maybe QEMU abstracting this out into its
own user-configurable format.
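
As a rough illustration of what such per-backend metadata might carry, here
is a minimal sketch in Python. The record layout, the field names, and the
JSON encoding are all assumptions made up for this example; they do not
correspond to any existing kernel or QEMU format.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class PreservedRange:
    """One preserved guest range (hypothetical layout)."""
    gpa: int        # guest physical address of the range
    hpa: int        # host physical address that backed it before kexec
    size: int       # length of the range in bytes
    private: bool   # shared/private memory attribute for the range

def save_backend(backend_id, ranges):
    """Serialize one backend's ranges to a JSON string."""
    return json.dumps({"backend": backend_id,
                       "ranges": [asdict(r) for r in ranges]})

def load_backend(blob):
    """Rebuild the backend id and its ranges from a JSON string."""
    doc = json.loads(blob)
    return doc["backend"], [PreservedRange(**r) for r in doc["ranges"]]

ranges = [
    PreservedRange(gpa=0x0, hpa=0x1_0000_0000, size=0x20_0000, private=True),
    PreservedRange(gpa=0x20_0000, hpa=0x1_4000_0000, size=0x20_0000, private=False),
]
blob = save_backend("ram0", ranges)
restored_id, restored = load_backend(blob)
```

The point is only that the HPA and the shared/private attribute travel with
each GPA range, which a plain memory-backend-file dump of guest data would
not capture.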

> 
> >   - directmap removal: these[3] patches allow a new guest_memfd flag to
> >     be set to unmap guest_memfd pages from kernel directmap to help
> >     mitigate speculative attacks, probably would involve a new option
> >     as well that wouldn't be applicable to normal memfds
> 
> Now the question is, do we want to remove directmap for "some" memory
> backend, or do we want to remove it per-VM?
> 
> This is another thing I want to make sure we're on the same page: I want to
> make sure we don't introduce per-VM setup for memory backends.
> 
> Say, "init-shared" or "in-place CoCo", what should we use for one gmem fd?
> IMHO it shouldn't be a parameter in the memory-backend.  It should be a
> parameter for the -machine or some similar per-vm setup, which will apply
> to all gmemfd across the current VM.
> 
> My understanding is directmap removal is similar in this case, which seems
> to be a per-VM (rather than per-memory-backend) attribute?  We can still
> operate on that per-memory-backend, but then it'll be internally, the
> backends need to understand the VM setup and do things properly, IMHO.

I think all-or-nothing would be most common, but it's completely
controlled at the guest_memfd inode level so it would support that sort
of flexibility if needed. One side effect is that setting it currently
sets AS_NO_DIRECT_MAP which can have some performance downsides...
maaaaaybe that's enough for someone to want to fine-tune 'isolated' vs.
'non-isolated' GPA ranges?

So I think it's pretty safe to say we don't *need* to expose this
functionality per-backend/inode initially, and if we end up preferring a
global option then that's probably fine too. So we can probably set this
example aside for now.
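
For what it's worth, if both levels ever did get exposed, the resolution
could be as simple as the sketch below (Python, illustrative only). The
per-VM default and the optional per-backend override are hypothetical; no
such QEMU options exist today, and the names are invented for this example.

```python
def effective_no_direct_map(machine_default, backend_override=None):
    """Return the setting for one backend.

    A per-backend override wins when given; otherwise the per-VM
    default applies (both knobs are hypothetical).
    """
    if backend_override is None:
        return machine_default
    return backend_override

# Per-VM default: remove the direct map everywhere ...
machine_default = True
# ... except ram1, which opts out (e.g. for performance reasons).
backends = {"ram0": None, "ram1": False}

resolved = {name: effective_no_direct_map(machine_default, override)
            for name, override in backends.items()}
```

With only the global option implemented, every backend would simply take the
machine_default branch.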

> 
> > 
> > It could also end up that even memory-backend-guest-memfd is too
> > generic, and that some of these would involve a more specialized memory
> > backend where maybe they can share a common base class for some of the
> > core guest_memfd stuff but otherwise be separate backends with their
> > own specific options. So to me, starting off building up
> > memory-backend-memfd seems like a potential misstep, whereas we don't
> > really lose much to start with a clean slate.
> > 
> > [1] DAX: https://lwn.net/ml/all/20260423170219.281618-1-dave.jiang@intel.com/
> > [2] LUO: https://lore.kernel.org/all/cover.1779080766.git.tarunsahu@google.com/#r
> > [3] directmap removal: https://lore.kernel.org/kvm/20260317141031.514-1-kalyazin@amazon.com/
> > 
> > > 
> > > > 
> > > > I also saw you were open to having someone pick up these patches if you
> > > > don't think you'll have a chance to get to them near-term, so I'd be
> > > > happy to pick them up if that's preferable.
> > > 
> > > Sure!  Indeed I don't have bandwidth to keep working on this one in the
> > > near future. Please feel free to pick whatever needed into your series.
> > 
> > Ok, sounds good, I'll pick these up for my next posting and incorporate
> > any changes/comments that might still be pending at that time.
> > 
> > Thanks for getting things to this stage!
> 
> Thanks for picking it up!  Juraj in our team may have some future
> exploration on gmem over 1G for postcopy on init-shared, so it's great the
> code is moving closer to that direction.

Nice, lots of interesting work ahead it seems :)

Thanks,

Mike

> 
> Thanks,
> 
> -- 
> Peter Xu
> 
> 



end of thread, other threads:[~2026-06-08 18:16 UTC | newest]

Thread overview: 47+ messages
2025-12-15 20:51 [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Peter Xu
2025-12-15 20:51 ` [PATCH v3 01/12] kvm: Decouple memory attribute check from kvm_guest_memfd_supported Peter Xu
2025-12-16 12:41   ` Xiaoyao Li
2025-12-23 16:56     ` Peter Xu
2025-12-16 13:53   ` Fabiano Rosas
2025-12-23 17:02     ` Peter Xu
2026-06-02  1:10   ` Michael Roth
2025-12-15 20:51 ` [PATCH v3 02/12] kvm: Detect guest-memfd flags supported Peter Xu
2025-12-16 13:54   ` Fabiano Rosas
2026-06-02  1:29   ` Michael Roth
2025-12-15 20:51 ` [PATCH v3 03/12] kvm: Provide explicit error for kvm_create_guest_memfd() Peter Xu
2025-12-16  4:03   ` Xiaoyao Li
2025-12-16 13:55   ` Fabiano Rosas
2026-06-02  1:31   ` Michael Roth
2025-12-15 20:51 ` [PATCH v3 04/12] ramblock: Rename guest_memfd to guest_memfd_private Peter Xu
2026-06-02  1:37   ` Michael Roth
2025-12-15 20:51 ` [PATCH v3 05/12] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE Peter Xu
2025-12-16  5:49   ` Xiaoyao Li
2025-12-23 17:04     ` Peter Xu
2026-06-02  1:39   ` Michael Roth
2025-12-15 20:51 ` [PATCH v3 06/12] memory: Rename memory_region_has_guest_memfd() to *_private() Peter Xu
2026-06-02  1:40   ` Michael Roth
2025-12-15 20:51 ` [PATCH v3 07/12] hostmem: Rename guest_memfd to guest_memfd_private Peter Xu
2025-12-16  5:54   ` Xiaoyao Li
2026-06-02 18:56   ` Michael Roth
2025-12-15 20:51 ` [PATCH v3 08/12] hostmem: Support fully shared guest memfd to back a VM Peter Xu
2025-12-16  6:54   ` Xiaoyao Li
2025-12-16 14:02   ` Fabiano Rosas
2026-06-02 21:40   ` Michael Roth
2026-06-05  7:23     ` David Hildenbrand (Arm)
2026-06-05 11:23       ` David Hildenbrand (Arm)
2025-12-15 20:52 ` [PATCH v3 09/12] machine: Rename machine_require_guest_memfd() to *_private() Peter Xu
2025-12-16  6:55   ` Xiaoyao Li
2026-06-02 21:46   ` Michael Roth
2025-12-15 20:52 ` [PATCH v3 10/12] memory: Rename memory_region_init_ram_guest_memfd() " Peter Xu
2025-12-16  6:56   ` Xiaoyao Li
2026-06-02 21:49   ` Michael Roth
2025-12-15 20:52 ` [PATCH v3 11/12] tests/migration-test: Support guest-memfd init shared mem type Peter Xu
2025-12-16 14:18   ` Fabiano Rosas
2025-12-23 17:09     ` Peter Xu
2025-12-15 20:52 ` [PATCH v3 12/12] tests/migration-test: Add a precopy test for guest-memfd Peter Xu
2025-12-16 14:20   ` Fabiano Rosas
2026-06-02 22:02 ` [PATCH v3 00/12] KVM/hostmem: Support init-shared guest-memfd as VM backends Michael Roth
2026-06-03 19:27   ` Peter Xu
2026-06-04 22:36     ` Michael Roth
2026-06-05 14:57       ` Peter Xu
2026-06-08 17:59         ` Michael Roth
