[PATCH 0/8] KVM/hostmem: Support in-place guest-memfd as VM backends

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/8] KVM/hostmem: Support in-place guest-memfd as VM backends
@ 2025-10-23 18:59 Peter Xu
  2025-10-23 18:59 ` [PATCH 1/8] linux-headers: Update to v6.18-rc2 Peter Xu
                   ` (7 more replies)
  0 siblings, 8 replies; 15+ messages in thread
From: Peter Xu @ 2025-10-23 18:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: peterx, Paolo Bonzini, Fabiano Rosas, Chenyi Qiang,
	David Hildenbrand, Alexey Kardashevskiy, Li Xiaoyao, Juraj Marcin

This series allows QEMU to consume guest-memfd in-place, to be a common
memory backend. Before this series, guest-memfd was only used in CoCo and
the fds will be created implicitly whenever CoCo environment is detected.

In the current patchset, I reused the memory-backend-memfd object, rather
than creating a new type of object.  After all, guest-memfd (at least from
userspace POV) really works very similarly like a memfd, except that it was
tailored for VM's use case.

So instead of using a normal memfd backend using:

  -object memory-backend-memfd,id=ID,size=SIZE,share=on

One can also boot a VM with guest-memfd:

  -object memory-backend-memfd,id=ID,size=SIZE,share=on,guest-memfd=on

The in-place guest-memfd here relies on almost the latest linux, as the
mmap() support just landed v6.18-rc2.  When run it on an older qemu, we'll
see errors like:

  qemu-system-x86_64: KVM does not support guest_memfd

One thing to mention is live migration is by default supported, however
postcopy is still currently not supported.  The postcopy support will have
some kernel dependency work that was still being reviewed on mm list, so it
will be a separate work TBD.

Thanks,

Peter Xu (8):
  linux-headers: Update to v6.18-rc2
  kvm: Allow kvm_guest_memfd_supported for non-private use case
  kvm: Detect guest-memfd flags supported
  memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE
  memory: Rename memory_region_has_guest_memfd() to *_private()
  ramblock: Rename guest_memfd to guest_memfd_private
  hostmem: Rename guest_memfd to guest_memfd_private
  hostmem: Support in-place guest memfd to back a VM

 qapi/qom.json                                 |  6 +-
 include/standard-headers/linux/ethtool.h      |  1 +
 include/standard-headers/linux/fuse.h         | 22 +++++-
 .../linux/input-event-codes.h                 |  1 +
 include/standard-headers/linux/input.h        | 22 +++++-
 include/standard-headers/linux/pci_regs.h     | 10 +++
 include/standard-headers/linux/virtio_ids.h   |  1 +
 include/system/hostmem.h                      |  2 +-
 include/system/memory.h                       | 16 ++---
 include/system/ram_addr.h                     |  2 +-
 include/system/ramblock.h                     |  7 +-
 linux-headers/asm-loongarch/kvm.h             |  1 +
 linux-headers/asm-riscv/kvm.h                 | 23 ++++++-
 linux-headers/asm-riscv/ptrace.h              |  4 +-
 linux-headers/asm-x86/kvm.h                   | 34 ++++++++++
 linux-headers/asm-x86/unistd_64.h             |  1 +
 linux-headers/asm-x86/unistd_x32.h            |  1 +
 linux-headers/linux/kvm.h                     |  3 +
 linux-headers/linux/psp-sev.h                 | 10 ++-
 linux-headers/linux/stddef.h                  |  1 -
 linux-headers/linux/vduse.h                   |  2 +-
 linux-headers/linux/vhost.h                   |  4 +-
 accel/kvm/kvm-all.c                           | 26 ++++---
 backends/hostmem-file.c                       |  2 +-
 backends/hostmem-memfd.c                      | 68 +++++++++++++++++--
 backends/hostmem-ram.c                        |  2 +-
 backends/hostmem.c                            |  2 +-
 system/memory.c                               |  6 +-
 system/physmem.c                              | 29 ++++----
 29 files changed, 253 insertions(+), 56 deletions(-)

-- 
2.50.1



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/8] linux-headers: Update to v6.18-rc2
  2025-10-23 18:59 [PATCH 0/8] KVM/hostmem: Support in-place guest-memfd as VM backends Peter Xu
@ 2025-10-23 18:59 ` Peter Xu
  2025-10-23 18:59 ` [PATCH 2/8] kvm: Allow kvm_guest_memfd_supported for non-private use case Peter Xu
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2025-10-23 18:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: peterx, Paolo Bonzini, Fabiano Rosas, Chenyi Qiang,
	David Hildenbrand, Alexey Kardashevskiy, Li Xiaoyao, Juraj Marcin

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/standard-headers/linux/ethtool.h      |  1 +
 include/standard-headers/linux/fuse.h         | 22 ++++++++++--
 .../linux/input-event-codes.h                 |  1 +
 include/standard-headers/linux/input.h        | 22 +++++++++++-
 include/standard-headers/linux/pci_regs.h     | 10 ++++++
 include/standard-headers/linux/virtio_ids.h   |  1 +
 linux-headers/asm-loongarch/kvm.h             |  1 +
 linux-headers/asm-riscv/kvm.h                 | 23 ++++++++++++-
 linux-headers/asm-riscv/ptrace.h              |  4 +--
 linux-headers/asm-x86/kvm.h                   | 34 +++++++++++++++++++
 linux-headers/asm-x86/unistd_64.h             |  1 +
 linux-headers/asm-x86/unistd_x32.h            |  1 +
 linux-headers/linux/kvm.h                     |  3 ++
 linux-headers/linux/psp-sev.h                 | 10 +++++-
 linux-headers/linux/stddef.h                  |  1 -
 linux-headers/linux/vduse.h                   |  2 +-
 linux-headers/linux/vhost.h                   |  4 +--
 17 files changed, 130 insertions(+), 11 deletions(-)

diff --git a/include/standard-headers/linux/ethtool.h b/include/standard-headers/linux/ethtool.h
index eb80314028..dc24512d28 100644
--- a/include/standard-headers/linux/ethtool.h
+++ b/include/standard-headers/linux/ethtool.h
@@ -2380,6 +2380,7 @@ enum {
 #define	RXH_L4_B_0_1	(1 << 6) /* src port in case of TCP/UDP/SCTP */
 #define	RXH_L4_B_2_3	(1 << 7) /* dst port in case of TCP/UDP/SCTP */
 #define	RXH_GTP_TEID	(1 << 8) /* teid in case of GTP */
+#define	RXH_IP6_FL	(1 << 9) /* IPv6 flow label */
 #define	RXH_DISCARD	(1 << 31)
 
 #define	RX_CLS_FLOW_DISC	0xffffffffffffffffULL
diff --git a/include/standard-headers/linux/fuse.h b/include/standard-headers/linux/fuse.h
index d8b2fd67e1..abf3a78858 100644
--- a/include/standard-headers/linux/fuse.h
+++ b/include/standard-headers/linux/fuse.h
@@ -235,6 +235,11 @@
  *
  *  7.44
  *  - add FUSE_NOTIFY_INC_EPOCH
+ *
+ *  7.45
+ *  - add FUSE_COPY_FILE_RANGE_64
+ *  - add struct fuse_copy_file_range_out
+ *  - add FUSE_NOTIFY_PRUNE
  */
 
 #ifndef _LINUX_FUSE_H
@@ -266,7 +271,7 @@
 #define FUSE_KERNEL_VERSION 7
 
 /** Minor version number of this interface */
-#define FUSE_KERNEL_MINOR_VERSION 44
+#define FUSE_KERNEL_MINOR_VERSION 45
 
 /** The node ID of the root inode */
 #define FUSE_ROOT_ID 1
@@ -653,6 +658,7 @@ enum fuse_opcode {
 	FUSE_SYNCFS		= 50,
 	FUSE_TMPFILE		= 51,
 	FUSE_STATX		= 52,
+	FUSE_COPY_FILE_RANGE_64	= 53,
 
 	/* CUSE specific operations */
 	CUSE_INIT		= 4096,
@@ -671,7 +677,7 @@ enum fuse_notify_code {
 	FUSE_NOTIFY_DELETE = 6,
 	FUSE_NOTIFY_RESEND = 7,
 	FUSE_NOTIFY_INC_EPOCH = 8,
-	FUSE_NOTIFY_CODE_MAX,
+	FUSE_NOTIFY_PRUNE = 9,
 };
 
 /* The read buffer is required to be at least 8k, but may be much larger */
@@ -1110,6 +1116,12 @@ struct fuse_notify_retrieve_in {
 	uint64_t	dummy4;
 };
 
+struct fuse_notify_prune_out {
+	uint32_t	count;
+	uint32_t	padding;
+	uint64_t	spare;
+};
+
 struct fuse_backing_map {
 	int32_t		fd;
 	uint32_t	flags;
@@ -1122,6 +1134,7 @@ struct fuse_backing_map {
 #define FUSE_DEV_IOC_BACKING_OPEN	_IOW(FUSE_DEV_IOC_MAGIC, 1, \
 					     struct fuse_backing_map)
 #define FUSE_DEV_IOC_BACKING_CLOSE	_IOW(FUSE_DEV_IOC_MAGIC, 2, uint32_t)
+#define FUSE_DEV_IOC_SYNC_INIT		_IO(FUSE_DEV_IOC_MAGIC, 3)
 
 struct fuse_lseek_in {
 	uint64_t	fh;
@@ -1144,6 +1157,11 @@ struct fuse_copy_file_range_in {
 	uint64_t	flags;
 };
 
+/* For FUSE_COPY_FILE_RANGE_64 */
+struct fuse_copy_file_range_out {
+	uint64_t	bytes_copied;
+};
+
 #define FUSE_SETUPMAPPING_FLAG_WRITE (1ull << 0)
 #define FUSE_SETUPMAPPING_FLAG_READ (1ull << 1)
 struct fuse_setupmapping_in {
diff --git a/include/standard-headers/linux/input-event-codes.h b/include/standard-headers/linux/input-event-codes.h
index 00dc9caac9..c914ccd723 100644
--- a/include/standard-headers/linux/input-event-codes.h
+++ b/include/standard-headers/linux/input-event-codes.h
@@ -27,6 +27,7 @@
 #define INPUT_PROP_TOPBUTTONPAD		0x04	/* softbuttons at top of pad */
 #define INPUT_PROP_POINTING_STICK	0x05	/* is a pointing stick */
 #define INPUT_PROP_ACCELEROMETER	0x06	/* has accelerometer */
+#define INPUT_PROP_HAPTIC_TOUCHPAD	0x07	/* is a haptic touchpad */
 
 #define INPUT_PROP_MAX			0x1f
 #define INPUT_PROP_CNT			(INPUT_PROP_MAX + 1)
diff --git a/include/standard-headers/linux/input.h b/include/standard-headers/linux/input.h
index d4512c20b5..9aff211dd5 100644
--- a/include/standard-headers/linux/input.h
+++ b/include/standard-headers/linux/input.h
@@ -426,6 +426,24 @@ struct ff_rumble_effect {
 	uint16_t weak_magnitude;
 };
 
+/**
+ * struct ff_haptic_effect
+ * @hid_usage: hid_usage according to Haptics page (WAVEFORM_CLICK, etc.)
+ * @vendor_id: the waveform vendor ID if hid_usage is in the vendor-defined range
+ * @vendor_waveform_page: the vendor waveform page if hid_usage is in the vendor-defined range
+ * @intensity: strength of the effect as percentage
+ * @repeat_count: number of times to retrigger effect
+ * @retrigger_period: time before effect is retriggered (in ms)
+ */
+struct ff_haptic_effect {
+	uint16_t hid_usage;
+	uint16_t vendor_id;
+	uint8_t  vendor_waveform_page;
+	uint16_t intensity;
+	uint16_t repeat_count;
+	uint16_t retrigger_period;
+};
+
 /**
  * struct ff_effect - defines force feedback effect
  * @type: type of the effect (FF_CONSTANT, FF_PERIODIC, FF_RAMP, FF_SPRING,
@@ -462,6 +480,7 @@ struct ff_effect {
 		struct ff_periodic_effect periodic;
 		struct ff_condition_effect condition[2]; /* One for each axis */
 		struct ff_rumble_effect rumble;
+		struct ff_haptic_effect haptic;
 	} u;
 };
 
@@ -469,6 +488,7 @@ struct ff_effect {
  * Force feedback effect types
  */
 
+#define FF_HAPTIC		0x4f
 #define FF_RUMBLE	0x50
 #define FF_PERIODIC	0x51
 #define FF_CONSTANT	0x52
@@ -478,7 +498,7 @@ struct ff_effect {
 #define FF_INERTIA	0x56
 #define FF_RAMP		0x57
 
-#define FF_EFFECT_MIN	FF_RUMBLE
+#define FF_EFFECT_MIN	FF_HAPTIC
 #define FF_EFFECT_MAX	FF_RAMP
 
 /*
diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h
index f5b17745de..07e06aafec 100644
--- a/include/standard-headers/linux/pci_regs.h
+++ b/include/standard-headers/linux/pci_regs.h
@@ -207,6 +207,9 @@
 
 /* Capability lists */
 
+#define PCI_CAP_ID_MASK		0x00ff	/* Capability ID mask */
+#define PCI_CAP_LIST_NEXT_MASK	0xff00	/* Next Capability Pointer mask */
+
 #define PCI_CAP_LIST_ID		0	/* Capability ID */
 #define  PCI_CAP_ID_PM		0x01	/* Power Management */
 #define  PCI_CAP_ID_AGP		0x02	/* Accelerated Graphics Port */
@@ -776,6 +779,12 @@
 #define  PCI_ERR_UNC_MCBTLP	0x00800000	/* MC blocked TLP */
 #define  PCI_ERR_UNC_ATOMEG	0x01000000	/* Atomic egress blocked */
 #define  PCI_ERR_UNC_TLPPRE	0x02000000	/* TLP prefix blocked */
+#define  PCI_ERR_UNC_POISON_BLK	0x04000000	/* Poisoned TLP Egress Blocked */
+#define  PCI_ERR_UNC_DMWR_BLK	0x08000000	/* DMWr Request Egress Blocked */
+#define  PCI_ERR_UNC_IDE_CHECK	0x10000000	/* IDE Check Failed */
+#define  PCI_ERR_UNC_MISR_IDE	0x20000000	/* Misrouted IDE TLP */
+#define  PCI_ERR_UNC_PCRC_CHECK	0x40000000	/* PCRC Check Failed */
+#define  PCI_ERR_UNC_XLAT_BLK	0x80000000	/* TLP Translation Egress Blocked */
 #define PCI_ERR_UNCOR_MASK	0x08	/* Uncorrectable Error Mask */
 	/* Same bits as above */
 #define PCI_ERR_UNCOR_SEVER	0x0c	/* Uncorrectable Error Severity */
@@ -798,6 +807,7 @@
 #define  PCI_ERR_CAP_ECRC_CHKC		0x00000080 /* ECRC Check Capable */
 #define  PCI_ERR_CAP_ECRC_CHKE		0x00000100 /* ECRC Check Enable */
 #define  PCI_ERR_CAP_PREFIX_LOG_PRESENT	0x00000800 /* TLP Prefix Log Present */
+#define  PCI_ERR_CAP_COMP_TIME_LOG	0x00001000 /* Completion Timeout Prefix/Header Log Capable */
 #define  PCI_ERR_CAP_TLP_LOG_FLIT	0x00040000 /* TLP was logged in Flit Mode */
 #define  PCI_ERR_CAP_TLP_LOG_SIZE	0x00f80000 /* Logged TLP Size (only in Flit mode) */
 #define PCI_ERR_HEADER_LOG	0x1c	/* Header Log Register (16 bytes) */
diff --git a/include/standard-headers/linux/virtio_ids.h b/include/standard-headers/linux/virtio_ids.h
index 7aa2eb7662..6c12db16fa 100644
--- a/include/standard-headers/linux/virtio_ids.h
+++ b/include/standard-headers/linux/virtio_ids.h
@@ -68,6 +68,7 @@
 #define VIRTIO_ID_AUDIO_POLICY		39 /* virtio audio policy */
 #define VIRTIO_ID_BT			40 /* virtio bluetooth */
 #define VIRTIO_ID_GPIO			41 /* virtio gpio */
+#define VIRTIO_ID_SPI			45 /* virtio spi */
 
 /*
  * Virtio Transitional IDs
diff --git a/linux-headers/asm-loongarch/kvm.h b/linux-headers/asm-loongarch/kvm.h
index 5f354f5c68..57ba1a563b 100644
--- a/linux-headers/asm-loongarch/kvm.h
+++ b/linux-headers/asm-loongarch/kvm.h
@@ -103,6 +103,7 @@ struct kvm_fpu {
 #define  KVM_LOONGARCH_VM_FEAT_PMU		5
 #define  KVM_LOONGARCH_VM_FEAT_PV_IPI		6
 #define  KVM_LOONGARCH_VM_FEAT_PV_STEALTIME	7
+#define  KVM_LOONGARCH_VM_FEAT_PTW		8
 
 /* Device Control API on vcpu fd */
 #define KVM_LOONGARCH_VCPU_CPUCFG	0
diff --git a/linux-headers/asm-riscv/kvm.h b/linux-headers/asm-riscv/kvm.h
index ef27d4289d..759a4852c0 100644
--- a/linux-headers/asm-riscv/kvm.h
+++ b/linux-headers/asm-riscv/kvm.h
@@ -9,7 +9,7 @@
 #ifndef __LINUX_KVM_RISCV_H
 #define __LINUX_KVM_RISCV_H
 
-#ifndef __ASSEMBLY__
+#ifndef __ASSEMBLER__
 
 #include <linux/types.h>
 #include <asm/bitsperlong.h>
@@ -56,6 +56,7 @@ struct kvm_riscv_config {
 	unsigned long mimpid;
 	unsigned long zicboz_block_size;
 	unsigned long satp_mode;
+	unsigned long zicbop_block_size;
 };
 
 /* CORE registers for KVM_GET_ONE_REG and KVM_SET_ONE_REG */
@@ -185,6 +186,10 @@ enum KVM_RISCV_ISA_EXT_ID {
 	KVM_RISCV_ISA_EXT_ZICCRSE,
 	KVM_RISCV_ISA_EXT_ZAAMO,
 	KVM_RISCV_ISA_EXT_ZALRSC,
+	KVM_RISCV_ISA_EXT_ZICBOP,
+	KVM_RISCV_ISA_EXT_ZFBFMIN,
+	KVM_RISCV_ISA_EXT_ZVFBFMIN,
+	KVM_RISCV_ISA_EXT_ZVFBFWMA,
 	KVM_RISCV_ISA_EXT_MAX,
 };
 
@@ -205,6 +210,7 @@ enum KVM_RISCV_SBI_EXT_ID {
 	KVM_RISCV_SBI_EXT_DBCN,
 	KVM_RISCV_SBI_EXT_STA,
 	KVM_RISCV_SBI_EXT_SUSP,
+	KVM_RISCV_SBI_EXT_FWFT,
 	KVM_RISCV_SBI_EXT_MAX,
 };
 
@@ -214,6 +220,18 @@ struct kvm_riscv_sbi_sta {
 	unsigned long shmem_hi;
 };
 
+struct kvm_riscv_sbi_fwft_feature {
+	unsigned long enable;
+	unsigned long flags;
+	unsigned long value;
+};
+
+/* SBI FWFT extension registers for KVM_GET_ONE_REG and KVM_SET_ONE_REG */
+struct kvm_riscv_sbi_fwft {
+	struct kvm_riscv_sbi_fwft_feature misaligned_deleg;
+	struct kvm_riscv_sbi_fwft_feature pointer_masking;
+};
+
 /* Possible states for kvm_riscv_timer */
 #define KVM_RISCV_TIMER_STATE_OFF	0
 #define KVM_RISCV_TIMER_STATE_ON	1
@@ -297,6 +315,9 @@ struct kvm_riscv_sbi_sta {
 #define KVM_REG_RISCV_SBI_STA		(0x0 << KVM_REG_RISCV_SUBTYPE_SHIFT)
 #define KVM_REG_RISCV_SBI_STA_REG(name)		\
 		(offsetof(struct kvm_riscv_sbi_sta, name) / sizeof(unsigned long))
+#define KVM_REG_RISCV_SBI_FWFT		(0x1 << KVM_REG_RISCV_SUBTYPE_SHIFT)
+#define KVM_REG_RISCV_SBI_FWFT_REG(name)	\
+		(offsetof(struct kvm_riscv_sbi_fwft, name) / sizeof(unsigned long))
 
 /* Device Control API: RISC-V AIA */
 #define KVM_DEV_RISCV_APLIC_ALIGN		0x1000
diff --git a/linux-headers/asm-riscv/ptrace.h b/linux-headers/asm-riscv/ptrace.h
index 1e3166caca..a3f8211ede 100644
--- a/linux-headers/asm-riscv/ptrace.h
+++ b/linux-headers/asm-riscv/ptrace.h
@@ -6,7 +6,7 @@
 #ifndef _ASM_RISCV_PTRACE_H
 #define _ASM_RISCV_PTRACE_H
 
-#ifndef __ASSEMBLY__
+#ifndef __ASSEMBLER__
 
 #include <linux/types.h>
 
@@ -127,6 +127,6 @@ struct __riscv_v_regset_state {
  */
 #define RISCV_MAX_VLENB (8192)
 
-#endif /* __ASSEMBLY__ */
+#endif /* __ASSEMBLER__ */
 
 #endif /* _ASM_RISCV_PTRACE_H */
diff --git a/linux-headers/asm-x86/kvm.h b/linux-headers/asm-x86/kvm.h
index f0c1a730d9..3bb38f6c3a 100644
--- a/linux-headers/asm-x86/kvm.h
+++ b/linux-headers/asm-x86/kvm.h
@@ -35,6 +35,11 @@
 #define MC_VECTOR 18
 #define XM_VECTOR 19
 #define VE_VECTOR 20
+#define CP_VECTOR 21
+
+#define HV_VECTOR 28
+#define VC_VECTOR 29
+#define SX_VECTOR 30
 
 /* Select x86 specific features in <linux/kvm.h> */
 #define __KVM_HAVE_PIT
@@ -409,6 +414,35 @@ struct kvm_xcrs {
 	__u64 padding[16];
 };
 
+#define KVM_X86_REG_TYPE_MSR		2
+#define KVM_X86_REG_TYPE_KVM		3
+
+#define KVM_X86_KVM_REG_SIZE(reg)						\
+({										\
+	reg == KVM_REG_GUEST_SSP ? KVM_REG_SIZE_U64 : 0;			\
+})
+
+#define KVM_X86_REG_TYPE_SIZE(type, reg)					\
+({										\
+	__u64 type_size = (__u64)type << 32;					\
+										\
+	type_size |= type == KVM_X86_REG_TYPE_MSR ? KVM_REG_SIZE_U64 :		\
+		     type == KVM_X86_REG_TYPE_KVM ? KVM_X86_KVM_REG_SIZE(reg) :	\
+		     0;								\
+	type_size;								\
+})
+
+#define KVM_X86_REG_ID(type, index)				\
+	(KVM_REG_X86 | KVM_X86_REG_TYPE_SIZE(type, index) | index)
+
+#define KVM_X86_REG_MSR(index)					\
+	KVM_X86_REG_ID(KVM_X86_REG_TYPE_MSR, index)
+#define KVM_X86_REG_KVM(index)					\
+	KVM_X86_REG_ID(KVM_X86_REG_TYPE_KVM, index)
+
+/* KVM-defined registers starting from 0 */
+#define KVM_REG_GUEST_SSP	0
+
 #define KVM_SYNC_X86_REGS      (1UL << 0)
 #define KVM_SYNC_X86_SREGS     (1UL << 1)
 #define KVM_SYNC_X86_EVENTS    (1UL << 2)
diff --git a/linux-headers/asm-x86/unistd_64.h b/linux-headers/asm-x86/unistd_64.h
index 2f55bebb81..26c258d1a6 100644
--- a/linux-headers/asm-x86/unistd_64.h
+++ b/linux-headers/asm-x86/unistd_64.h
@@ -337,6 +337,7 @@
 #define __NR_io_pgetevents 333
 #define __NR_rseq 334
 #define __NR_uretprobe 335
+#define __NR_uprobe 336
 #define __NR_pidfd_send_signal 424
 #define __NR_io_uring_setup 425
 #define __NR_io_uring_enter 426
diff --git a/linux-headers/asm-x86/unistd_x32.h b/linux-headers/asm-x86/unistd_x32.h
index 8cc8673f15..65c2aed946 100644
--- a/linux-headers/asm-x86/unistd_x32.h
+++ b/linux-headers/asm-x86/unistd_x32.h
@@ -290,6 +290,7 @@
 #define __NR_io_pgetevents (__X32_SYSCALL_BIT + 333)
 #define __NR_rseq (__X32_SYSCALL_BIT + 334)
 #define __NR_uretprobe (__X32_SYSCALL_BIT + 335)
+#define __NR_uprobe (__X32_SYSCALL_BIT + 336)
 #define __NR_pidfd_send_signal (__X32_SYSCALL_BIT + 424)
 #define __NR_io_uring_setup (__X32_SYSCALL_BIT + 425)
 #define __NR_io_uring_enter (__X32_SYSCALL_BIT + 426)
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index be704965d8..4ea28ef7ca 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -954,6 +954,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_ARM_EL2_E2H0 241
 #define KVM_CAP_RISCV_MP_STATE_RESET 242
 #define KVM_CAP_ARM_CACHEABLE_PFNMAP_SUPPORTED 243
+#define KVM_CAP_GUEST_MEMFD_FLAGS 244
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;
@@ -1590,6 +1591,8 @@ struct kvm_memory_attributes {
 #define KVM_MEMORY_ATTRIBUTE_PRIVATE           (1ULL << 3)
 
 #define KVM_CREATE_GUEST_MEMFD	_IOWR(KVMIO,  0xd4, struct kvm_create_guest_memfd)
+#define GUEST_MEMFD_FLAG_MMAP		(1ULL << 0)
+#define GUEST_MEMFD_FLAG_INIT_SHARED	(1ULL << 1)
 
 struct kvm_create_guest_memfd {
 	__u64 size;
diff --git a/linux-headers/linux/psp-sev.h b/linux-headers/linux/psp-sev.h
index 113c4ceb78..c525125ea8 100644
--- a/linux-headers/linux/psp-sev.h
+++ b/linux-headers/linux/psp-sev.h
@@ -185,6 +185,10 @@ struct sev_user_data_get_id2 {
  * @mask_chip_id: whether chip id is present in attestation reports or not
  * @mask_chip_key: whether attestation reports are signed or not
  * @vlek_en: VLEK (Version Loaded Endorsement Key) hashstick is loaded
+ * @feature_info: whether SNP_FEATURE_INFO command is available
+ * @rapl_dis: whether RAPL is disabled
+ * @ciphertext_hiding_cap: whether platform has ciphertext hiding capability
+ * @ciphertext_hiding_en: whether ciphertext hiding is enabled
  * @rsvd1: reserved
  * @guest_count: the number of guest currently managed by the firmware
  * @current_tcb_version: current TCB version
@@ -200,7 +204,11 @@ struct sev_user_data_snp_status {
 	__u32 mask_chip_id:1;		/* Out */
 	__u32 mask_chip_key:1;		/* Out */
 	__u32 vlek_en:1;		/* Out */
-	__u32 rsvd1:29;
+	__u32 feature_info:1;		/* Out */
+	__u32 rapl_dis:1;		/* Out */
+	__u32 ciphertext_hiding_cap:1;	/* Out */
+	__u32 ciphertext_hiding_en:1;	/* Out */
+	__u32 rsvd1:25;
 	__u32 guest_count;		/* Out */
 	__u64 current_tcb_version;	/* Out */
 	__u64 reported_tcb_version;	/* Out */
diff --git a/linux-headers/linux/stddef.h b/linux-headers/linux/stddef.h
index e1fcfcf3b3..48ee4438e0 100644
--- a/linux-headers/linux/stddef.h
+++ b/linux-headers/linux/stddef.h
@@ -3,7 +3,6 @@
 #define _LINUX_STDDEF_H
 
 
-
 #ifndef __always_inline
 #define __always_inline __inline__
 #endif
diff --git a/linux-headers/linux/vduse.h b/linux-headers/linux/vduse.h
index f46269af34..da6ac89af1 100644
--- a/linux-headers/linux/vduse.h
+++ b/linux-headers/linux/vduse.h
@@ -237,7 +237,7 @@ struct vduse_iova_umem {
  * struct vduse_iova_info - information of one IOVA region
  * @start: start of the IOVA region
  * @last: last of the IOVA region
- * @capability: capability of the IOVA regsion
+ * @capability: capability of the IOVA region
  * @reserved: for future use, needs to be initialized to zero
  *
  * Structure used by VDUSE_IOTLB_GET_INFO ioctl to get information of
diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
index 283348b64a..c57674a6aa 100644
--- a/linux-headers/linux/vhost.h
+++ b/linux-headers/linux/vhost.h
@@ -260,7 +260,7 @@
  * When fork_owner is set to VHOST_FORK_OWNER_KTHREAD:
  *   - Vhost will create vhost workers as kernel threads.
  */
-#define VHOST_SET_FORK_FROM_OWNER _IOW(VHOST_VIRTIO, 0x83, __u8)
+#define VHOST_SET_FORK_FROM_OWNER _IOW(VHOST_VIRTIO, 0x84, __u8)
 
 /**
  * VHOST_GET_FORK_OWNER - Get the current fork_owner flag for the vhost device.
@@ -268,6 +268,6 @@
  *
  * @return: An 8-bit value indicating the current thread mode.
  */
-#define VHOST_GET_FORK_FROM_OWNER _IOR(VHOST_VIRTIO, 0x84, __u8)
+#define VHOST_GET_FORK_FROM_OWNER _IOR(VHOST_VIRTIO, 0x85, __u8)
 
 #endif
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/8] kvm: Allow kvm_guest_memfd_supported for non-private use case
  2025-10-23 18:59 [PATCH 0/8] KVM/hostmem: Support in-place guest-memfd as VM backends Peter Xu
  2025-10-23 18:59 ` [PATCH 1/8] linux-headers: Update to v6.18-rc2 Peter Xu
@ 2025-10-23 18:59 ` Peter Xu
  2025-10-24  2:30   ` Xiaoyao Li
  2025-10-23 18:59 ` [PATCH 3/8] kvm: Detect guest-memfd flags supported Peter Xu
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Peter Xu @ 2025-10-23 18:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: peterx, Paolo Bonzini, Fabiano Rosas, Chenyi Qiang,
	David Hildenbrand, Alexey Kardashevskiy, Li Xiaoyao, Juraj Marcin

Guest-memfd is not 100% attached to private, it's a VM-specific memory
provider.  Allow it to be created even without private attributes, for
example, when the VM can use the guest-memfd memory completely shared.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 accel/kvm/kvm-all.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index f9254ae654..1425dfd8b3 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2779,10 +2779,8 @@ static int kvm_init(AccelState *as, MachineState *ms)
     }
 
     kvm_supported_memory_attributes = kvm_vm_check_extension(s, KVM_CAP_MEMORY_ATTRIBUTES);
-    kvm_guest_memfd_supported =
-        kvm_vm_check_extension(s, KVM_CAP_GUEST_MEMFD) &&
-        kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2) &&
-        (kvm_supported_memory_attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE);
+    kvm_guest_memfd_supported = kvm_vm_check_extension(s, KVM_CAP_GUEST_MEMFD) &&
+        kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2);
     kvm_pre_fault_memory_supported = kvm_vm_check_extension(s, KVM_CAP_PRE_FAULT_MEMORY);
 
     if (s->kernel_irqchip_split == ON_OFF_AUTO_AUTO) {
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/8] kvm: Detect guest-memfd flags supported
  2025-10-23 18:59 [PATCH 0/8] KVM/hostmem: Support in-place guest-memfd as VM backends Peter Xu
  2025-10-23 18:59 ` [PATCH 1/8] linux-headers: Update to v6.18-rc2 Peter Xu
  2025-10-23 18:59 ` [PATCH 2/8] kvm: Allow kvm_guest_memfd_supported for non-private use case Peter Xu
@ 2025-10-23 18:59 ` Peter Xu
  2025-10-24  3:52   ` Xiaoyao Li
  2025-10-23 18:59 ` [PATCH 4/8] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE Peter Xu
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Peter Xu @ 2025-10-23 18:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: peterx, Paolo Bonzini, Fabiano Rosas, Chenyi Qiang,
	David Hildenbrand, Alexey Kardashevskiy, Li Xiaoyao, Juraj Marcin

Detect supported guest-memfd flags by the current kernel, and reject
creations of guest-memfd using invalid flags.  When the cap isn't
available, then no flag is supported.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 accel/kvm/kvm-all.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 1425dfd8b3..48a8f6424f 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -108,6 +108,7 @@ static int kvm_sstep_flags;
 static bool kvm_immediate_exit;
 static uint64_t kvm_supported_memory_attributes;
 static bool kvm_guest_memfd_supported;
+static uint64_t kvm_guest_memfd_flags_supported;
 static hwaddr kvm_max_slot_size = ~0;
 
 static const KVMCapabilityInfo kvm_required_capabilites[] = {
@@ -2781,6 +2782,11 @@ static int kvm_init(AccelState *as, MachineState *ms)
     kvm_supported_memory_attributes = kvm_vm_check_extension(s, KVM_CAP_MEMORY_ATTRIBUTES);
     kvm_guest_memfd_supported = kvm_vm_check_extension(s, KVM_CAP_GUEST_MEMFD) &&
         kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2);
+    ret = kvm_vm_check_extension(s, KVM_CAP_GUEST_MEMFD_FLAGS);
+    if (ret > 0)
+        kvm_guest_memfd_flags_supported = (uint64_t)ret;
+    else
+        kvm_guest_memfd_flags_supported = 0;
     kvm_pre_fault_memory_supported = kvm_vm_check_extension(s, KVM_CAP_PRE_FAULT_MEMORY);
 
     if (s->kernel_irqchip_split == ON_OFF_AUTO_AUTO) {
@@ -4486,6 +4492,12 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
         return -1;
     }
 
+    if (flags & ~kvm_guest_memfd_flags_supported) {
+        error_setg(errp, "KVM does not support guest-memfd flag: 0x%"PRIx64,
+                   flags & ~kvm_guest_memfd_flags_supported);
+        return -1;
+    }
+
     fd = kvm_vm_ioctl(kvm_state, KVM_CREATE_GUEST_MEMFD, &guest_memfd);
     if (fd < 0) {
         error_setg_errno(errp, errno, "Error creating KVM guest_memfd");
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/8] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE
  2025-10-23 18:59 [PATCH 0/8] KVM/hostmem: Support in-place guest-memfd as VM backends Peter Xu
                   ` (2 preceding siblings ...)
  2025-10-23 18:59 ` [PATCH 3/8] kvm: Detect guest-memfd flags supported Peter Xu
@ 2025-10-23 18:59 ` Peter Xu
  2025-10-24  9:17   ` Xiaoyao Li
  2025-10-23 18:59 ` [PATCH 5/8] memory: Rename memory_region_has_guest_memfd() to *_private() Peter Xu
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 15+ messages in thread
From: Peter Xu @ 2025-10-23 18:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: peterx, Paolo Bonzini, Fabiano Rosas, Chenyi Qiang,
	David Hildenbrand, Alexey Kardashevskiy, Li Xiaoyao, Juraj Marcin

This name is too generic, and can conflict with in-place guest-memfd
support.  Add a _PRIVATE suffix to show what it really means: it is always
silently using an internal guest-memfd to back a shared host backend,
rather than used in-place.

This paves way for in-place guest-memfd, which means we can have a ramblock
that allocates pages completely from guest-memfd (private or shared).

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/system/memory.h   | 8 ++++----
 include/system/ram_addr.h | 2 +-
 backends/hostmem-file.c   | 2 +-
 backends/hostmem-memfd.c  | 2 +-
 backends/hostmem-ram.c    | 2 +-
 system/memory.c           | 2 +-
 system/physmem.c          | 8 ++++----
 7 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/system/memory.h b/include/system/memory.h
index 3bd5ffa5e0..2c1a5e06b4 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -263,7 +263,7 @@ typedef struct IOMMUTLBEvent {
 #define RAM_READONLY_FD (1 << 11)
 
 /* RAM can be private that has kvm guest memfd backend */
-#define RAM_GUEST_MEMFD   (1 << 12)
+#define RAM_GUEST_MEMFD_PRIVATE   (1 << 12)
 
 /*
  * In RAMBlock creation functions, if MAP_SHARED is 0 in the flags parameter,
@@ -1401,7 +1401,7 @@ bool memory_region_init_ram_nomigrate(MemoryRegion *mr,
  *        must be unique within any device
  * @size: size of the region.
  * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_NORESERVE,
- *             RAM_GUEST_MEMFD.
+ *             RAM_GUEST_MEMFD_PRIVATE.
  * @errp: pointer to Error*, to store an error if it happens.
  *
  * Note that this function does not do anything to cause the data in the
@@ -1463,7 +1463,7 @@ bool memory_region_init_resizeable_ram(MemoryRegion *mr,
  *         (getpagesize()) will be used.
  * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
  *             RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
- *             RAM_READONLY_FD, RAM_GUEST_MEMFD
+ *             RAM_READONLY_FD, RAM_GUEST_MEMFD_PRIVATE
  * @path: the path in which to allocate the RAM.
  * @offset: offset within the file referenced by path
  * @errp: pointer to Error*, to store an error if it happens.
@@ -1493,7 +1493,7 @@ bool memory_region_init_ram_from_file(MemoryRegion *mr,
  * @size: size of the region.
  * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
  *             RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
- *             RAM_READONLY_FD, RAM_GUEST_MEMFD
+ *             RAM_READONLY_FD, RAM_GUEST_MEMFD_PRIVATE
  * @fd: the fd to mmap.
  * @offset: offset within the file referenced by fd
  * @errp: pointer to Error*, to store an error if it happens.
diff --git a/include/system/ram_addr.h b/include/system/ram_addr.h
index 683485980c..930d3824d7 100644
--- a/include/system/ram_addr.h
+++ b/include/system/ram_addr.h
@@ -92,7 +92,7 @@ static inline unsigned long int ramblock_recv_bitmap_offset(void *host_addr,
  *  @resized: callback after calls to qemu_ram_resize
  *  @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
  *              RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
- *              RAM_READONLY_FD, RAM_GUEST_MEMFD
+ *              RAM_READONLY_FD, RAM_GUEST_MEMFD_PRIVATE
  *  @mem_path or @fd: specify the backing file or device
  *  @offset: Offset into target file
  *  @grow: extend file if necessary (but an empty file is always extended).
diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 8e3219c061..1f20cd8fd6 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -86,7 +86,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
     ram_flags |= fb->readonly ? RAM_READONLY_FD : 0;
     ram_flags |= fb->rom == ON_OFF_AUTO_ON ? RAM_READONLY : 0;
     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
-    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
+    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
     ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
     ram_flags |= RAM_NAMED_FILE;
     return memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c
index 923239f9cf..3f3e485709 100644
--- a/backends/hostmem-memfd.c
+++ b/backends/hostmem-memfd.c
@@ -60,7 +60,7 @@ have_fd:
     backend->aligned = true;
     ram_flags = backend->share ? RAM_SHARED : RAM_PRIVATE;
     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
-    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
+    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
     return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend), name,
                                           backend->size, ram_flags, fd, 0, errp);
 }
diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
index 062b1abb11..96ad29112d 100644
--- a/backends/hostmem-ram.c
+++ b/backends/hostmem-ram.c
@@ -30,7 +30,7 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
     name = host_memory_backend_get_name(backend);
     ram_flags = backend->share ? RAM_SHARED : RAM_PRIVATE;
     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
-    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
+    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
     return memory_region_init_ram_flags_nomigrate(&backend->mr, OBJECT(backend),
                                                   name, backend->size,
                                                   ram_flags, errp);
diff --git a/system/memory.c b/system/memory.c
index 8b84661ae3..006b03ce1c 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -3755,7 +3755,7 @@ bool memory_region_init_ram_guest_memfd(MemoryRegion *mr,
     DeviceState *owner_dev;
 
     if (!memory_region_init_ram_flags_nomigrate(mr, owner, name, size,
-                                                RAM_GUEST_MEMFD, errp)) {
+                                                RAM_GUEST_MEMFD_PRIVATE, errp)) {
         return false;
     }
     /* This will assert if owner is neither NULL nor a DeviceState.
diff --git a/system/physmem.c b/system/physmem.c
index a340ca3e61..1a186739a8 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2203,7 +2203,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
         }
     }
 
-    if (new_block->flags & RAM_GUEST_MEMFD) {
+    if (new_block->flags & RAM_GUEST_MEMFD_PRIVATE) {
         int ret;
 
         if (!kvm_enabled()) {
@@ -2333,7 +2333,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, ram_addr_t max_size,
     /* Just support these ram flags by now. */
     assert((ram_flags & ~(RAM_SHARED | RAM_PMEM | RAM_NORESERVE |
                           RAM_PROTECTED | RAM_NAMED_FILE | RAM_READONLY |
-                          RAM_READONLY_FD | RAM_GUEST_MEMFD |
+                          RAM_READONLY_FD | RAM_GUEST_MEMFD_PRIVATE |
                           RAM_RESIZEABLE)) == 0);
     assert(max_size >= size);
 
@@ -2490,7 +2490,7 @@ RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
     ram_flags &= ~RAM_PRIVATE;
 
     assert((ram_flags & ~(RAM_SHARED | RAM_RESIZEABLE | RAM_PREALLOC |
-                          RAM_NORESERVE | RAM_GUEST_MEMFD)) == 0);
+                          RAM_NORESERVE | RAM_GUEST_MEMFD_PRIVATE)) == 0);
     assert(!host ^ (ram_flags & RAM_PREALLOC));
     assert(max_size >= size);
 
@@ -2573,7 +2573,7 @@ RAMBlock *qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
 RAMBlock *qemu_ram_alloc(ram_addr_t size, uint32_t ram_flags,
                          MemoryRegion *mr, Error **errp)
 {
-    assert((ram_flags & ~(RAM_SHARED | RAM_NORESERVE | RAM_GUEST_MEMFD |
+    assert((ram_flags & ~(RAM_SHARED | RAM_NORESERVE | RAM_GUEST_MEMFD_PRIVATE |
                           RAM_PRIVATE)) == 0);
     return qemu_ram_alloc_internal(size, size, NULL, NULL, ram_flags, mr, errp);
 }
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/8] memory: Rename memory_region_has_guest_memfd() to *_private()
  2025-10-23 18:59 [PATCH 0/8] KVM/hostmem: Support in-place guest-memfd as VM backends Peter Xu
                   ` (3 preceding siblings ...)
  2025-10-23 18:59 ` [PATCH 4/8] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE Peter Xu
@ 2025-10-23 18:59 ` Peter Xu
  2025-10-23 18:59 ` [PATCH 6/8] ramblock: Rename guest_memfd to guest_memfd_private Peter Xu
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2025-10-23 18:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: peterx, Paolo Bonzini, Fabiano Rosas, Chenyi Qiang,
	David Hildenbrand, Alexey Kardashevskiy, Li Xiaoyao, Juraj Marcin

Rename the function with "_private" suffix, to show that it returns true
only if it has an internal guest-memfd to back private pages (rather than
in-place guest-memfd).

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/system/memory.h | 6 +++---
 accel/kvm/kvm-all.c     | 6 +++---
 system/memory.c         | 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/system/memory.h b/include/system/memory.h
index 2c1a5e06b4..4428701a9f 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -1823,14 +1823,14 @@ static inline bool memory_region_is_romd(MemoryRegion *mr)
 bool memory_region_is_protected(MemoryRegion *mr);
 
 /**
- * memory_region_has_guest_memfd: check whether a memory region has guest_memfd
- *     associated
+ * memory_region_has_guest_memfd_private: check whether a memory region has
+ *     guest_memfd associated
  *
  * Returns %true if a memory region's ram_block has valid guest_memfd assigned.
  *
  * @mr: the memory region being queried
  */
-bool memory_region_has_guest_memfd(MemoryRegion *mr);
+bool memory_region_has_guest_memfd_private(MemoryRegion *mr);
 
 /**
  * memory_region_get_iommu: check whether a memory region is an iommu
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 48a8f6424f..6521648ce9 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -666,7 +666,7 @@ static int kvm_mem_flags(MemoryRegion *mr)
     if (readonly && kvm_readonly_mem_allowed) {
         flags |= KVM_MEM_READONLY;
     }
-    if (memory_region_has_guest_memfd(mr)) {
+    if (memory_region_has_guest_memfd_private(mr)) {
         assert(kvm_guest_memfd_supported);
         flags |= KVM_MEM_GUEST_MEMFD;
     }
@@ -1610,7 +1610,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
             abort();
         }
 
-        if (memory_region_has_guest_memfd(mr)) {
+        if (memory_region_has_guest_memfd_private(mr)) {
             err = kvm_set_memory_attributes_private(start_addr, slot_size);
             if (err) {
                 error_report("%s: failed to set memory attribute private: %s",
@@ -3096,7 +3096,7 @@ int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private)
         return ret;
     }
 
-    if (!memory_region_has_guest_memfd(mr)) {
+    if (!memory_region_has_guest_memfd_private(mr)) {
         /*
          * Because vMMIO region must be shared, guest TD may convert vMMIO
          * region to shared explicitly.  Don't complain such case.  See
diff --git a/system/memory.c b/system/memory.c
index 006b03ce1c..5f05e5d73e 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -1897,7 +1897,7 @@ bool memory_region_is_protected(MemoryRegion *mr)
     return mr->ram && (mr->ram_block->flags & RAM_PROTECTED);
 }
 
-bool memory_region_has_guest_memfd(MemoryRegion *mr)
+bool memory_region_has_guest_memfd_private(MemoryRegion *mr)
 {
     return mr->ram_block && mr->ram_block->guest_memfd >= 0;
 }
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 6/8] ramblock: Rename guest_memfd to guest_memfd_private
  2025-10-23 18:59 [PATCH 0/8] KVM/hostmem: Support in-place guest-memfd as VM backends Peter Xu
                   ` (4 preceding siblings ...)
  2025-10-23 18:59 ` [PATCH 5/8] memory: Rename memory_region_has_guest_memfd() to *_private() Peter Xu
@ 2025-10-23 18:59 ` Peter Xu
  2025-10-23 18:59 ` [PATCH 7/8] hostmem: " Peter Xu
  2025-10-23 18:59 ` [PATCH 8/8] hostmem: Support in-place guest memfd to back a VM Peter Xu
  7 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2025-10-23 18:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: peterx, Paolo Bonzini, Fabiano Rosas, Chenyi Qiang,
	David Hildenbrand, Alexey Kardashevskiy, Li Xiaoyao, Juraj Marcin

Rename the field to reflect the fact that the guest_memfd in this case only
backs private portion of the ramblock rather than all of it.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/system/memory.h   |  2 +-
 include/system/ramblock.h |  7 ++++++-
 accel/kvm/kvm-all.c       |  2 +-
 system/memory.c           |  2 +-
 system/physmem.c          | 21 +++++++++++----------
 5 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/include/system/memory.h b/include/system/memory.h
index 4428701a9f..5c38018f4a 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -1826,7 +1826,7 @@ bool memory_region_is_protected(MemoryRegion *mr);
  * memory_region_has_guest_memfd_private: check whether a memory region has
  *     guest_memfd associated
  *
- * Returns %true if a memory region's ram_block has valid guest_memfd assigned.
+ * Returns %true if a memory region's ram_block has guest_memfd_private assigned.
  *
  * @mr: the memory region being queried
  */
diff --git a/include/system/ramblock.h b/include/system/ramblock.h
index 76694fe1b5..9ecf7f970c 100644
--- a/include/system/ramblock.h
+++ b/include/system/ramblock.h
@@ -40,7 +40,12 @@ struct RAMBlock {
     Error *cpr_blocker;
     int fd;
     uint64_t fd_offset;
-    int guest_memfd;
+    /*
+     * When RAM_GUEST_MEMFD_PRIVATE flag is set, this ramblock can have
+     * private pages backed by guest_memfd_private specified, while shared
+     * pages are backed by the ramblock on its own.
+     */
+    int guest_memfd_private;
     RamBlockAttributes *attributes;
     size_t page_size;
     /* dirty bitmap used during migration */
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 6521648ce9..687f33a2bb 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1598,7 +1598,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
         mem->ram_start_offset = ram_start_offset;
         mem->ram = ram;
         mem->flags = kvm_mem_flags(mr);
-        mem->guest_memfd = mr->ram_block->guest_memfd;
+        mem->guest_memfd = mr->ram_block->guest_memfd_private;
         mem->guest_memfd_offset = mem->guest_memfd >= 0 ?
                                   (uint8_t*)ram - mr->ram_block->host : 0;
 
diff --git a/system/memory.c b/system/memory.c
index 5f05e5d73e..dadcc21d0e 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -1899,7 +1899,7 @@ bool memory_region_is_protected(MemoryRegion *mr)
 
 bool memory_region_has_guest_memfd_private(MemoryRegion *mr)
 {
-    return mr->ram_block && mr->ram_block->guest_memfd >= 0;
+    return mr->ram_block && mr->ram_block->guest_memfd_private >= 0;
 }
 
 uint8_t memory_region_get_dirty_log_mask(MemoryRegion *mr)
diff --git a/system/physmem.c b/system/physmem.c
index 1a186739a8..66fa4c7b6a 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2211,7 +2211,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
                        object_get_typename(OBJECT(current_machine->cgs)));
             goto out_free;
         }
-        assert(new_block->guest_memfd < 0);
+        assert(new_block->guest_memfd_private < 0);
 
         ret = ram_block_coordinated_discard_require(true);
         if (ret < 0) {
@@ -2221,9 +2221,9 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
             goto out_free;
         }
 
-        new_block->guest_memfd = kvm_create_guest_memfd(new_block->max_length,
-                                                        0, errp);
-        if (new_block->guest_memfd < 0) {
+        new_block->guest_memfd_private =
+            kvm_create_guest_memfd(new_block->max_length, 0, errp);
+        if (new_block->guest_memfd_private < 0) {
             qemu_mutex_unlock_ramlist();
             goto out_free;
         }
@@ -2240,7 +2240,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
         new_block->attributes = ram_block_attributes_create(new_block);
         if (!new_block->attributes) {
             error_setg(errp, "Failed to create ram block attribute");
-            close(new_block->guest_memfd);
+            close(new_block->guest_memfd_private);
             ram_block_coordinated_discard_require(false);
             qemu_mutex_unlock_ramlist();
             goto out_free;
@@ -2377,7 +2377,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, ram_addr_t max_size,
     new_block->max_length = max_size;
     new_block->resized = resized;
     new_block->flags = ram_flags;
-    new_block->guest_memfd = -1;
+    new_block->guest_memfd_private = -1;
     new_block->host = file_ram_alloc(new_block, max_size, fd,
                                      file_size < offset + max_size,
                                      offset, errp);
@@ -2550,7 +2550,7 @@ RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
     new_block->used_length = size;
     new_block->max_length = max_size;
     new_block->fd = -1;
-    new_block->guest_memfd = -1;
+    new_block->guest_memfd_private = -1;
     new_block->page_size = qemu_real_host_page_size();
     new_block->host = host;
     new_block->flags = ram_flags;
@@ -2601,9 +2601,9 @@ static void reclaim_ramblock(RAMBlock *block)
         qemu_anon_ram_free(block->host, block->max_length);
     }
 
-    if (block->guest_memfd >= 0) {
+    if (block->guest_memfd_private >= 0) {
         ram_block_attributes_destroy(block->attributes);
-        close(block->guest_memfd);
+        close(block->guest_memfd_private);
         ram_block_coordinated_discard_require(false);
     }
 
@@ -4211,7 +4211,8 @@ int ram_block_discard_guest_memfd_range(RAMBlock *rb, uint64_t offset,
 
 #ifdef CONFIG_FALLOCATE_PUNCH_HOLE
     /* ignore fd_offset with guest_memfd */
-    ret = fallocate(rb->guest_memfd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+    ret = fallocate(rb->guest_memfd_private,
+                    FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                     offset, length);
 
     if (ret) {
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 7/8] hostmem: Rename guest_memfd to guest_memfd_private
  2025-10-23 18:59 [PATCH 0/8] KVM/hostmem: Support in-place guest-memfd as VM backends Peter Xu
                   ` (5 preceding siblings ...)
  2025-10-23 18:59 ` [PATCH 6/8] ramblock: Rename guest_memfd to guest_memfd_private Peter Xu
@ 2025-10-23 18:59 ` Peter Xu
  2025-10-23 18:59 ` [PATCH 8/8] hostmem: Support in-place guest memfd to back a VM Peter Xu
  7 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2025-10-23 18:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: peterx, Paolo Bonzini, Fabiano Rosas, Chenyi Qiang,
	David Hildenbrand, Alexey Kardashevskiy, Li Xiaoyao, Juraj Marcin

Rename the HostMemoryBackend.guest_memfd field to reflect what it really
means, on whether it needs guest_memfd to back its private portion of
mapping.  This will help on clearance when we introduce in-place
guest_memfd for hostmem.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/system/hostmem.h | 2 +-
 backends/hostmem-file.c  | 2 +-
 backends/hostmem-memfd.c | 2 +-
 backends/hostmem-ram.c   | 2 +-
 backends/hostmem.c       | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/system/hostmem.h b/include/system/hostmem.h
index 88fa791ac7..dcbf81aeae 100644
--- a/include/system/hostmem.h
+++ b/include/system/hostmem.h
@@ -76,7 +76,7 @@ struct HostMemoryBackend {
     uint64_t size;
     bool merge, dump, use_canonical_path;
     bool prealloc, is_mapped, share, reserve;
-    bool guest_memfd, aligned;
+    bool guest_memfd_private, aligned;
     uint32_t prealloc_threads;
     ThreadContext *prealloc_context;
     DECLARE_BITMAP(host_nodes, MAX_NODES + 1);
diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index 1f20cd8fd6..0e4cfd6dc6 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -86,7 +86,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
     ram_flags |= fb->readonly ? RAM_READONLY_FD : 0;
     ram_flags |= fb->rom == ON_OFF_AUTO_ON ? RAM_READONLY : 0;
     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
-    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
+    ram_flags |= backend->guest_memfd_private ? RAM_GUEST_MEMFD_PRIVATE : 0;
     ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
     ram_flags |= RAM_NAMED_FILE;
     return memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c
index 3f3e485709..ea93f034e4 100644
--- a/backends/hostmem-memfd.c
+++ b/backends/hostmem-memfd.c
@@ -60,7 +60,7 @@ have_fd:
     backend->aligned = true;
     ram_flags = backend->share ? RAM_SHARED : RAM_PRIVATE;
     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
-    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
+    ram_flags |= backend->guest_memfd_private ? RAM_GUEST_MEMFD_PRIVATE : 0;
     return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend), name,
                                           backend->size, ram_flags, fd, 0, errp);
 }
diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
index 96ad29112d..6a507fad77 100644
--- a/backends/hostmem-ram.c
+++ b/backends/hostmem-ram.c
@@ -30,7 +30,7 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
     name = host_memory_backend_get_name(backend);
     ram_flags = backend->share ? RAM_SHARED : RAM_PRIVATE;
     ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
-    ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD_PRIVATE : 0;
+    ram_flags |= backend->guest_memfd_private ? RAM_GUEST_MEMFD_PRIVATE : 0;
     return memory_region_init_ram_flags_nomigrate(&backend->mr, OBJECT(backend),
                                                   name, backend->size,
                                                   ram_flags, errp);
diff --git a/backends/hostmem.c b/backends/hostmem.c
index 35734d6f4d..70450733db 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -288,7 +288,7 @@ static void host_memory_backend_init(Object *obj)
     /* TODO: convert access to globals to compat properties */
     backend->merge = machine_mem_merge(machine);
     backend->dump = machine_dump_guest_core(machine);
-    backend->guest_memfd = machine_require_guest_memfd(machine);
+    backend->guest_memfd_private = machine_require_guest_memfd(machine);
     backend->reserve = true;
     backend->prealloc_threads = machine->smp.cpus;
 }
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 8/8] hostmem: Support in-place guest memfd to back a VM
  2025-10-23 18:59 [PATCH 0/8] KVM/hostmem: Support in-place guest-memfd as VM backends Peter Xu
                   ` (6 preceding siblings ...)
  2025-10-23 18:59 ` [PATCH 7/8] hostmem: " Peter Xu
@ 2025-10-23 18:59 ` Peter Xu
  2025-10-24  9:01   ` Xiaoyao Li
  7 siblings, 1 reply; 15+ messages in thread
From: Peter Xu @ 2025-10-23 18:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: peterx, Paolo Bonzini, Fabiano Rosas, Chenyi Qiang,
	David Hildenbrand, Alexey Kardashevskiy, Li Xiaoyao, Juraj Marcin

Host backends supports guest-memfd now by detecting whether it's a
confidential VM.  There's no way to choose it yet from the memory level to
use it in-place.  If we use guest-memfd, it so far always implies we need
two layers of memory backends, while the guest-memfd only provides the
private set of pages.

This patch introduces a way so that QEMU can consume guest memfd as the
only source of memory to back the object (aka, in place), rather than
having another backend supporting the pages converted to shared.

To use the in-place guest-memfd, one can add a memfd object with:

  -object memory-backend-memfd,guest-memfd=on,share=on

Note that share=on is required with in-place guest_memfd.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 qapi/qom.json            |  6 +++-
 backends/hostmem-memfd.c | 66 +++++++++++++++++++++++++++++++++++++---
 2 files changed, 67 insertions(+), 5 deletions(-)

diff --git a/qapi/qom.json b/qapi/qom.json
index 830cb2ffe7..6b090fe9a0 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -764,13 +764,17 @@
 # @seal: if true, create a sealed-file, which will block further
 #     resizing of the memory (default: true)
 #
+# @guest-memfd: if true, use guest-memfd to back the memory region.
+#     (default: false, since: 10.2)
+#
 # Since: 2.12
 ##
 { 'struct': 'MemoryBackendMemfdProperties',
   'base': 'MemoryBackendProperties',
   'data': { '*hugetlb': 'bool',
             '*hugetlbsize': 'size',
-            '*seal': 'bool' },
+            '*seal': 'bool',
+            '*guest-memfd': 'bool' },
   'if': 'CONFIG_LINUX' }
 
 ##
diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c
index ea93f034e4..1fa16c1e1d 100644
--- a/backends/hostmem-memfd.c
+++ b/backends/hostmem-memfd.c
@@ -18,6 +18,8 @@
 #include "qapi/error.h"
 #include "qom/object.h"
 #include "migration/cpr.h"
+#include "system/kvm.h"
+#include <linux/kvm.h>
 
 OBJECT_DECLARE_SIMPLE_TYPE(HostMemoryBackendMemfd, MEMORY_BACKEND_MEMFD)
 
@@ -28,6 +30,13 @@ struct HostMemoryBackendMemfd {
     bool hugetlb;
     uint64_t hugetlbsize;
     bool seal;
+    /*
+     * NOTE: this differs from HostMemoryBackend's guest_memfd_private,
+     * which represents a internally private guest-memfd that only backs
+     * private pages.  Instead, this flag marks the memory backend will
+     * 100% use the guest-memfd pages in-place.
+     */
+    bool guest_memfd;
 };
 
 static bool
@@ -47,10 +56,40 @@ memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
         goto have_fd;
     }
 
-    fd = qemu_memfd_create(TYPE_MEMORY_BACKEND_MEMFD, backend->size,
-                           m->hugetlb, m->hugetlbsize, m->seal ?
-                           F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL : 0,
-                           errp);
+    if (m->guest_memfd) {
+        /* User choose to use in-place guest-memfd to back the VM.. */
+        if (!backend->share) {
+            error_setg(errp, "In-place guest-memfd must be used with share=on");
+            return false;
+        }
+
+        /*
+         * This is the request to have a guest-memfd to back private pages.
+         * In-place guest-memfd doesn't work like that.  Disable it for now
+         * to make it simple, so that each memory backend can only have
+         * guest-memfd either as private, or fully shared.
+         */
+        if (backend->guest_memfd_private) {
+            error_setg(errp, "In-place guest-memfd cannot be used with another "
+                       "private guest-memfd");
+            return false;
+        }
+
+        /* TODO: add huge page support */
+        fd = kvm_create_guest_memfd(backend->size,
+                                    GUEST_MEMFD_FLAG_MMAP |
+                                    GUEST_MEMFD_FLAG_INIT_SHARED,
+                                    errp);
+        if (fd < 0) {
+            return false;
+        }
+    } else {
+        fd = qemu_memfd_create(TYPE_MEMORY_BACKEND_MEMFD, backend->size,
+                               m->hugetlb, m->hugetlbsize, m->seal ?
+                               F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL : 0,
+                               errp);
+    }
+
     if (fd == -1) {
         return false;
     }
@@ -65,6 +104,18 @@ have_fd:
                                           backend->size, ram_flags, fd, 0, errp);
 }
 
+static bool
+memfd_backend_get_guest_memfd(Object *o, Error **errp)
+{
+    return MEMORY_BACKEND_MEMFD(o)->guest_memfd;
+}
+
+static void
+memfd_backend_set_guest_memfd(Object *o, bool value, Error **errp)
+{
+    MEMORY_BACKEND_MEMFD(o)->guest_memfd = value;
+}
+
 static bool
 memfd_backend_get_hugetlb(Object *o, Error **errp)
 {
@@ -152,6 +203,13 @@ memfd_backend_class_init(ObjectClass *oc, const void *data)
         object_class_property_set_description(oc, "hugetlbsize",
                                               "Huge pages size (ex: 2M, 1G)");
     }
+
+    object_class_property_add_bool(oc, "guest-memfd",
+                                   memfd_backend_get_guest_memfd,
+                                   memfd_backend_set_guest_memfd);
+    object_class_property_set_description(oc, "guest-memfd",
+                                          "Use guest memfd");
+
     object_class_property_add_bool(oc, "seal",
                                    memfd_backend_get_seal,
                                    memfd_backend_set_seal);
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/8] kvm: Allow kvm_guest_memfd_supported for non-private use case
  2025-10-23 18:59 ` [PATCH 2/8] kvm: Allow kvm_guest_memfd_supported for non-private use case Peter Xu
@ 2025-10-24  2:30   ` Xiaoyao Li
  0 siblings, 0 replies; 15+ messages in thread
From: Xiaoyao Li @ 2025-10-24  2:30 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Paolo Bonzini, Fabiano Rosas, Chenyi Qiang, David Hildenbrand,
	Alexey Kardashevskiy, Juraj Marcin

On 10/24/2025 2:59 AM, Peter Xu wrote:
> Guest-memfd is not 100% attached to private, it's a VM-specific memory
> provider.  Allow it to be created even without private attributes, for
> example, when the VM can use the guest-memfd memory completely shared.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   accel/kvm/kvm-all.c | 6 ++----
>   1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index f9254ae654..1425dfd8b3 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -2779,10 +2779,8 @@ static int kvm_init(AccelState *as, MachineState *ms)
>       }
>   
>       kvm_supported_memory_attributes = kvm_vm_check_extension(s, KVM_CAP_MEMORY_ATTRIBUTES);
> -    kvm_guest_memfd_supported =
> -        kvm_vm_check_extension(s, KVM_CAP_GUEST_MEMFD) &&
> -        kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2) &&
> -        (kvm_supported_memory_attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE);
> +    kvm_guest_memfd_supported = kvm_vm_check_extension(s, KVM_CAP_GUEST_MEMFD) &&
> +        kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2);
>       kvm_pre_fault_memory_supported = kvm_vm_check_extension(s, KVM_CAP_PRE_FAULT_MEMORY);
>   
>       if (s->kernel_irqchip_split == ON_OFF_AUTO_AUTO) {


The check on KVM_MEMORY_ATTRIBUTE_PRIVATE is dropped silently. But using 
guest memfd to serve as private memory does requires the support of 
KVM_MEMORY_ATTRIBUTE_PRIVATE.

My version of the patch was


Author: Xiaoyao Li <xiaoyao.li@intel.com>
Date:   Sat Jul 19 00:56:57 2025 +0800

     kvm: Decouple memory attribute check from kvm_guest_memfd_supported

     With the mmap support of guest memfd, KVM allows usersapce to create
     guest memfd serving as normal non-private memory for X86 DEFEAULT VM.
     However, KVM doesn't support private memory attriute for X86 DEFAULT
     VM.

     Make kvm_guest_memfd_supported not rely on KVM_MEMORY_ATTRIBUTE_PRIVATE
     and check KVM_MEMORY_ATTRIBUTE_PRIVATE separately when the machine
     requires guest_memfd to serve as private memory.

     This allows QMEU to create guest memfd with mmap to serve as the memory
     backend for X86 DEFAULT VM.

     Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index f9254ae65466..96c194ce54cd 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1501,6 +1501,11 @@ int kvm_set_memory_attributes_shared(hwaddr 
start, uint64_t size)
      return kvm_set_memory_attributes(start, size, 0);
  }

+bool kvm_private_memory_attribute_supported(void)
+{
+    return !!(kvm_supported_memory_attributes & 
KVM_MEMORY_ATTRIBUTE_PRIVATE);
+}
+
  /* Called with KVMMemoryListener.slots_lock held */
  static void kvm_set_phys_mem(KVMMemoryListener *kml,
                               MemoryRegionSection *section, bool add)
@@ -2781,8 +2786,7 @@ static int kvm_init(AccelState *as, MachineState *ms)
      kvm_supported_memory_attributes = kvm_vm_check_extension(s, 
KVM_CAP_MEMORY_ATTRIBUTES);
      kvm_guest_memfd_supported =
          kvm_vm_check_extension(s, KVM_CAP_GUEST_MEMFD) &&
-        kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2) &&
-        (kvm_supported_memory_attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE);
+        kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2);
      kvm_pre_fault_memory_supported = kvm_vm_check_extension(s, 
KVM_CAP_PRE_FAULT_MEMORY);

      if (s->kernel_irqchip_split == ON_OFF_AUTO_AUTO) {
diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
index 68cd33ba9735..73f04eb589ef 100644
--- a/accel/stubs/kvm-stub.c
+++ b/accel/stubs/kvm-stub.c
@@ -125,3 +125,8 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t 
flags, Error **errp)
  {
      return -ENOSYS;
  }
+
+bool kvm_private_memory_attribute_supported(void)
+{
+    return false;
+}
diff --git a/include/system/kvm.h b/include/system/kvm.h
index 8f9eecf044c2..b5811c90f1cc 100644
--- a/include/system/kvm.h
+++ b/include/system/kvm.h
@@ -561,6 +561,7 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t 
flags, Error **errp);

  int kvm_set_memory_attributes_private(hwaddr start, uint64_t size);
  int kvm_set_memory_attributes_shared(hwaddr start, uint64_t size);
+bool kvm_private_memory_attribute_supported(void);

  int kvm_convert_memory(hwaddr start, hwaddr size, bool to_private);

diff --git a/system/physmem.c b/system/physmem.c
index a340ca3e6166..7704572a5745 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2211,6 +2211,14 @@ static void ram_block_add(RAMBlock *new_block, 
Error **errp)
                         object_get_typename(OBJECT(current_machine->cgs)));
              goto out_free;
          }
+
+        if (!kvm_private_memory_attribute_supported()) {
+            error_setg(errp, "cannot set up private guest memory for %s: "
+                       " KVM does not support private memory attribute",
+                       object_get_typename(OBJECT(current_machine->cgs)));
+            goto out_free;
+        }
+
          assert(new_block->guest_memfd < 0);

          ret = ram_block_coordinated_discard_require(true);


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/8] kvm: Detect guest-memfd flags supported
  2025-10-23 18:59 ` [PATCH 3/8] kvm: Detect guest-memfd flags supported Peter Xu
@ 2025-10-24  3:52   ` Xiaoyao Li
  0 siblings, 0 replies; 15+ messages in thread
From: Xiaoyao Li @ 2025-10-24  3:52 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Paolo Bonzini, Fabiano Rosas, Chenyi Qiang, David Hildenbrand,
	Alexey Kardashevskiy, Juraj Marcin

On 10/24/2025 2:59 AM, Peter Xu wrote:
> Detect supported guest-memfd flags by the current kernel, and reject
> creations of guest-memfd using invalid flags.  When the cap isn't
> available, then no flag is supported.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   accel/kvm/kvm-all.c | 12 ++++++++++++
>   1 file changed, 12 insertions(+)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 1425dfd8b3..48a8f6424f 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -108,6 +108,7 @@ static int kvm_sstep_flags;
>   static bool kvm_immediate_exit;
>   static uint64_t kvm_supported_memory_attributes;
>   static bool kvm_guest_memfd_supported;
> +static uint64_t kvm_guest_memfd_flags_supported;
>   static hwaddr kvm_max_slot_size = ~0;
>   
>   static const KVMCapabilityInfo kvm_required_capabilites[] = {
> @@ -2781,6 +2782,11 @@ static int kvm_init(AccelState *as, MachineState *ms)
>       kvm_supported_memory_attributes = kvm_vm_check_extension(s, KVM_CAP_MEMORY_ATTRIBUTES);
>       kvm_guest_memfd_supported = kvm_vm_check_extension(s, KVM_CAP_GUEST_MEMFD) &&
>           kvm_vm_check_extension(s, KVM_CAP_USER_MEMORY2);
> +    ret = kvm_vm_check_extension(s, KVM_CAP_GUEST_MEMFD_FLAGS);
> +    if (ret > 0)
> +        kvm_guest_memfd_flags_supported = (uint64_t)ret;
> +    else
> +        kvm_guest_memfd_flags_supported = 0;

Nit:
1. QEMU's coding style always requires curly braces.
2. is the (uint64_t) necessary?
3. can we name it "kvm_supported_guest_memfd_flags" to make it 
consistent with "kvm_supported_memory_attributes"?

so how about

kvm_supported_guest_memfd_flags = kvm_vm_check_extension(s, 
KVM_CAP_GUEST_MEMFD_FLAGS);
     if (kvm_supported_guest_memfd_flags < 0) {
         kvm_supported_guest_memfd_flags = 0;
     }

>       kvm_pre_fault_memory_supported = kvm_vm_check_extension(s, KVM_CAP_PRE_FAULT_MEMORY);
>   
>       if (s->kernel_irqchip_split == ON_OFF_AUTO_AUTO) {
> @@ -4486,6 +4492,12 @@ int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
>           return -1;
>       }
>   
> +    if (flags & ~kvm_guest_memfd_flags_supported) {
> +        error_setg(errp, "KVM does not support guest-memfd flag: 0x%"PRIx64,
> +                   flags & ~kvm_guest_memfd_flags_supported);
> +        return -1;
> +    }
> +
>       fd = kvm_vm_ioctl(kvm_state, KVM_CREATE_GUEST_MEMFD, &guest_memfd);
>       if (fd < 0) {
>           error_setg_errno(errp, errno, "Error creating KVM guest_memfd");



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 8/8] hostmem: Support in-place guest memfd to back a VM
  2025-10-23 18:59 ` [PATCH 8/8] hostmem: Support in-place guest memfd to back a VM Peter Xu
@ 2025-10-24  9:01   ` Xiaoyao Li
  2025-10-24 15:22     ` Peter Xu
  0 siblings, 1 reply; 15+ messages in thread
From: Xiaoyao Li @ 2025-10-24  9:01 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Paolo Bonzini, Fabiano Rosas, Chenyi Qiang, David Hildenbrand,
	Alexey Kardashevskiy, Juraj Marcin

On 10/24/2025 2:59 AM, Peter Xu wrote:
> Host backends supports guest-memfd now by detecting whether it's a
> confidential VM.  There's no way to choose it yet from the memory level to
> use it in-place.  If we use guest-memfd, it so far always implies we need
> two layers of memory backends, while the guest-memfd only provides the
> private set of pages.
> 
> This patch introduces a way so that QEMU can consume guest memfd as the
> only source of memory to back the object (aka, in place), rather than
> having another backend supporting the pages converted to shared.
> 
> To use the in-place guest-memfd, one can add a memfd object with:
> 
>    -object memory-backend-memfd,guest-memfd=on,share=on
> 
> Note that share=on is required with in-place guest_memfd.

First, I'm not sure "in-place" is the proper wording here. At first 
glance on the series, I thought it's something related to "in-place" 
page conversion. After reading a bit, I really that it is enabling guest 
memfd with mmap support to serve as normal memory backend.

Second, my POC implementation chose to implement a separate and specific 
memory-backend type "memory-backend-guest-memfd". Your approach to add 
an option of "guest-memfd" to memory-backend-memfd looks OK to me and it 
requires less code. But I think we need to explicitly error out to users 
when they set "guest_memfd" to on with unsupported properties 
configured, e.g., "hugetlb", "hugetlbsize", and "seal".

Third, the intended usage of gmem with mmap from KVM/kernel's 
perspective is userspace configures the meomry slot by passing the gmem 
fd to @guest_memfd and @guest_memfd of struct 
kvm_userspace_memory_region2 instead of passing the user address 
returned by mmap of the fd to @userspace_addr return mmap() as this 
patch does. Surely the usage of this path works. But when QEMU is going 
to support in-place conversion of gmem, we has to pass the @guest_memfd.
Well, this is no issue now and we can handle it in the future when needed.

> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   qapi/qom.json            |  6 +++-
>   backends/hostmem-memfd.c | 66 +++++++++++++++++++++++++++++++++++++---
>   2 files changed, 67 insertions(+), 5 deletions(-)
> 
> diff --git a/qapi/qom.json b/qapi/qom.json
> index 830cb2ffe7..6b090fe9a0 100644
> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -764,13 +764,17 @@
>   # @seal: if true, create a sealed-file, which will block further
>   #     resizing of the memory (default: true)
>   #
> +# @guest-memfd: if true, use guest-memfd to back the memory region.
> +#     (default: false, since: 10.2)
> +#
>   # Since: 2.12
>   ##
>   { 'struct': 'MemoryBackendMemfdProperties',
>     'base': 'MemoryBackendProperties',
>     'data': { '*hugetlb': 'bool',
>               '*hugetlbsize': 'size',
> -            '*seal': 'bool' },
> +            '*seal': 'bool',
> +            '*guest-memfd': 'bool' },
>     'if': 'CONFIG_LINUX' }
>   
>   ##
> diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c
> index ea93f034e4..1fa16c1e1d 100644
> --- a/backends/hostmem-memfd.c
> +++ b/backends/hostmem-memfd.c
> @@ -18,6 +18,8 @@
>   #include "qapi/error.h"
>   #include "qom/object.h"
>   #include "migration/cpr.h"
> +#include "system/kvm.h"
> +#include <linux/kvm.h>
>   
>   OBJECT_DECLARE_SIMPLE_TYPE(HostMemoryBackendMemfd, MEMORY_BACKEND_MEMFD)
>   
> @@ -28,6 +30,13 @@ struct HostMemoryBackendMemfd {
>       bool hugetlb;
>       uint64_t hugetlbsize;
>       bool seal;
> +    /*
> +     * NOTE: this differs from HostMemoryBackend's guest_memfd_private,
> +     * which represents a internally private guest-memfd that only backs
> +     * private pages.  Instead, this flag marks the memory backend will
> +     * 100% use the guest-memfd pages in-place.
> +     */
> +    bool guest_memfd;
>   };
>   
>   static bool
> @@ -47,10 +56,40 @@ memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>           goto have_fd;
>       }
>   
> -    fd = qemu_memfd_create(TYPE_MEMORY_BACKEND_MEMFD, backend->size,
> -                           m->hugetlb, m->hugetlbsize, m->seal ?
> -                           F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL : 0,
> -                           errp);
> +    if (m->guest_memfd) {
> +        /* User choose to use in-place guest-memfd to back the VM.. */
> +        if (!backend->share) {
> +            error_setg(errp, "In-place guest-memfd must be used with share=on");
> +            return false;
> +        }
> +
> +        /*
> +         * This is the request to have a guest-memfd to back private pages.
> +         * In-place guest-memfd doesn't work like that.  Disable it for now
> +         * to make it simple, so that each memory backend can only have
> +         * guest-memfd either as private, or fully shared.
> +         */
> +        if (backend->guest_memfd_private) {
> +            error_setg(errp, "In-place guest-memfd cannot be used with another "
> +                       "private guest-memfd");
> +            return false;
> +        }

Add kvm_enabled() here, otherwise the following calling of 
kvm_create_guest_memfd() emits confusing information when accelerator is 
not configured as KVM, e.g., -machine q35,accel=tcg

qemu-system-x86: KVM does not support guest_memfd


> +        /* TODO: add huge page support */
> +        fd = kvm_create_guest_memfd(backend->size,
> +                                    GUEST_MEMFD_FLAG_MMAP |
> +                                    GUEST_MEMFD_FLAG_INIT_SHARED,
> +                                    errp);
> +        if (fd < 0) {
> +            return false;
> +        }
> +    } else {
> +        fd = qemu_memfd_create(TYPE_MEMORY_BACKEND_MEMFD, backend->size,
> +                               m->hugetlb, m->hugetlbsize, m->seal ?
> +                               F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL : 0,
> +                               errp);
> +    }
> +
>       if (fd == -1) {
>           return false;
>       }
> @@ -65,6 +104,18 @@ have_fd:
>                                             backend->size, ram_flags, fd, 0, errp);
>   }
>   
> +static bool
> +memfd_backend_get_guest_memfd(Object *o, Error **errp)
> +{
> +    return MEMORY_BACKEND_MEMFD(o)->guest_memfd;
> +}
> +
> +static void
> +memfd_backend_set_guest_memfd(Object *o, bool value, Error **errp)
> +{
> +    MEMORY_BACKEND_MEMFD(o)->guest_memfd = value;
> +}
> +
>   static bool
>   memfd_backend_get_hugetlb(Object *o, Error **errp)
>   {
> @@ -152,6 +203,13 @@ memfd_backend_class_init(ObjectClass *oc, const void *data)
>           object_class_property_set_description(oc, "hugetlbsize",
>                                                 "Huge pages size (ex: 2M, 1G)");
>       }
> +
> +    object_class_property_add_bool(oc, "guest-memfd",
> +                                   memfd_backend_get_guest_memfd,
> +                                   memfd_backend_set_guest_memfd);
> +    object_class_property_set_description(oc, "guest-memfd",
> +                                          "Use guest memfd");
> +
>       object_class_property_add_bool(oc, "seal",
>                                      memfd_backend_get_seal,
>                                      memfd_backend_set_seal);



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 4/8] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE
  2025-10-23 18:59 ` [PATCH 4/8] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE Peter Xu
@ 2025-10-24  9:17   ` Xiaoyao Li
  0 siblings, 0 replies; 15+ messages in thread
From: Xiaoyao Li @ 2025-10-24  9:17 UTC (permalink / raw)
  To: Peter Xu, qemu-devel
  Cc: Paolo Bonzini, Fabiano Rosas, Chenyi Qiang, David Hildenbrand,
	Alexey Kardashevskiy, Juraj Marcin

On 10/24/2025 2:59 AM, Peter Xu wrote:
> This name is too generic, and can conflict with in-place guest-memfd
> support.  Add a _PRIVATE suffix to show what it really means: it is always
> silently using an internal guest-memfd to back a shared host backend,
> rather than used in-place.
> 
> This paves way for in-place guest-memfd, which means we can have a ramblock
> that allocates pages completely from guest-memfd (private or shared).

It's for patch 4-7. Regarding the rename. How about:

- RAM_GUEST_MEMFD => RAM_PRIVATE_MEMORY
- backend->guest_memfd => backend->private_memory
- machine_require_guest_memfd() => machine_require_private_memory()
- cgs->require_guest_memfd => cgs->require_private_memory

For CoCo VMs, what they require is the support of private memory, while 
the guest_memfd is how linux provides private memory support. But with 
mmap support added to guest memfd, it can serve as shared/non-private 
memory as well. Futher, in the future when in-place conversion support 
is implemented, a single guest memfd can serve as both shared and 
private in different parts. So guest_memfd_private will be confusing at 
that time.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 8/8] hostmem: Support in-place guest memfd to back a VM
  2025-10-24  9:01   ` Xiaoyao Li
@ 2025-10-24 15:22     ` Peter Xu
  2025-10-27  5:24       ` Xiaoyao Li
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Xu @ 2025-10-24 15:22 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: qemu-devel, Paolo Bonzini, Fabiano Rosas, Chenyi Qiang,
	David Hildenbrand, Alexey Kardashevskiy, Juraj Marcin

On Fri, Oct 24, 2025 at 05:01:44PM +0800, Xiaoyao Li wrote:
> On 10/24/2025 2:59 AM, Peter Xu wrote:
> > Host backends supports guest-memfd now by detecting whether it's a
> > confidential VM.  There's no way to choose it yet from the memory level to
> > use it in-place.  If we use guest-memfd, it so far always implies we need
> > two layers of memory backends, while the guest-memfd only provides the
> > private set of pages.
> > 
> > This patch introduces a way so that QEMU can consume guest memfd as the
> > only source of memory to back the object (aka, in place), rather than
> > having another backend supporting the pages converted to shared.
> > 
> > To use the in-place guest-memfd, one can add a memfd object with:
> > 
> >    -object memory-backend-memfd,guest-memfd=on,share=on
> > 
> > Note that share=on is required with in-place guest_memfd.
> 
> First, I'm not sure "in-place" is the proper wording here. At first glance
> on the series, I thought it's something related to "in-place" page
> conversion. After reading a bit, I really that it is enabling guest memfd
> with mmap support to serve as normal memory backend.

It'll be only proper in current context of qemu, but yes I'm aware CoCo
also has such idea, so at least I should have come up with something
better. My bad.  When I wrote the patches a while ago it wasn't as clear,
and I didn't pay attention when I prepare them upstream.

> 
> Second, my POC implementation chose to implement a separate and specific
> memory-backend type "memory-backend-guest-memfd". Your approach to add an
> option of "guest-memfd" to memory-backend-memfd looks OK to me and it
> requires less code. But I think we need to explicitly error out to users
> when they set "guest_memfd" to on with unsupported properties configured,
> e.g., "hugetlb", "hugetlbsize", and "seal".

In my local tree I actually reused hugetlb* parameters, that needs
Ackerley's 1G kernel patches, and some mine on top.

Before I go and reply your other series..  I was definitely not aware that
anyone has been working on it!  Could you share a pointer?  Or is it still
in a private branch?

I'm more than happy to drop this series if you have an older / better
version.  Then I can rebase whatever I work on top.

> 
> Third, the intended usage of gmem with mmap from KVM/kernel's perspective is
> userspace configures the meomry slot by passing the gmem fd to @guest_memfd
> and @guest_memfd of struct kvm_userspace_memory_region2 instead of passing
> the user address returned by mmap of the fd to @userspace_addr return mmap()
> as this patch does. Surely the usage of this path works. But when QEMU is
> going to support in-place conversion of gmem, we has to pass the
> @guest_memfd.
> Well, this is no issue now and we can handle it in the future when needed.

Yes, that's something the private guest-memfd would need.  For completely
shared guest-memfd, IIUC we will use a lot of different code paths, the
goal is to make old APIs work not only for KVM_SET_USER_MEMORY_REGION, but
for all the rest modules like vhost-kernel, vhost-user, and so on.

> 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >   qapi/qom.json            |  6 +++-
> >   backends/hostmem-memfd.c | 66 +++++++++++++++++++++++++++++++++++++---
> >   2 files changed, 67 insertions(+), 5 deletions(-)
> > 
> > diff --git a/qapi/qom.json b/qapi/qom.json
> > index 830cb2ffe7..6b090fe9a0 100644
> > --- a/qapi/qom.json
> > +++ b/qapi/qom.json
> > @@ -764,13 +764,17 @@
> >   # @seal: if true, create a sealed-file, which will block further
> >   #     resizing of the memory (default: true)
> >   #
> > +# @guest-memfd: if true, use guest-memfd to back the memory region.
> > +#     (default: false, since: 10.2)
> > +#
> >   # Since: 2.12
> >   ##
> >   { 'struct': 'MemoryBackendMemfdProperties',
> >     'base': 'MemoryBackendProperties',
> >     'data': { '*hugetlb': 'bool',
> >               '*hugetlbsize': 'size',
> > -            '*seal': 'bool' },
> > +            '*seal': 'bool',
> > +            '*guest-memfd': 'bool' },
> >     'if': 'CONFIG_LINUX' }
> >   ##
> > diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c
> > index ea93f034e4..1fa16c1e1d 100644
> > --- a/backends/hostmem-memfd.c
> > +++ b/backends/hostmem-memfd.c
> > @@ -18,6 +18,8 @@
> >   #include "qapi/error.h"
> >   #include "qom/object.h"
> >   #include "migration/cpr.h"
> > +#include "system/kvm.h"
> > +#include <linux/kvm.h>
> >   OBJECT_DECLARE_SIMPLE_TYPE(HostMemoryBackendMemfd, MEMORY_BACKEND_MEMFD)
> > @@ -28,6 +30,13 @@ struct HostMemoryBackendMemfd {
> >       bool hugetlb;
> >       uint64_t hugetlbsize;
> >       bool seal;
> > +    /*
> > +     * NOTE: this differs from HostMemoryBackend's guest_memfd_private,
> > +     * which represents a internally private guest-memfd that only backs
> > +     * private pages.  Instead, this flag marks the memory backend will
> > +     * 100% use the guest-memfd pages in-place.
> > +     */
> > +    bool guest_memfd;
> >   };
> >   static bool
> > @@ -47,10 +56,40 @@ memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
> >           goto have_fd;
> >       }
> > -    fd = qemu_memfd_create(TYPE_MEMORY_BACKEND_MEMFD, backend->size,
> > -                           m->hugetlb, m->hugetlbsize, m->seal ?
> > -                           F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL : 0,
> > -                           errp);
> > +    if (m->guest_memfd) {
> > +        /* User choose to use in-place guest-memfd to back the VM.. */
> > +        if (!backend->share) {
> > +            error_setg(errp, "In-place guest-memfd must be used with share=on");
> > +            return false;
> > +        }
> > +
> > +        /*
> > +         * This is the request to have a guest-memfd to back private pages.
> > +         * In-place guest-memfd doesn't work like that.  Disable it for now
> > +         * to make it simple, so that each memory backend can only have
> > +         * guest-memfd either as private, or fully shared.
> > +         */
> > +        if (backend->guest_memfd_private) {
> > +            error_setg(errp, "In-place guest-memfd cannot be used with another "
> > +                       "private guest-memfd");
> > +            return false;
> > +        }
> 
> Add kvm_enabled() here, otherwise the following calling of
> kvm_create_guest_memfd() emits confusing information when accelerator is not
> configured as KVM, e.g., -machine q35,accel=tcg
> 
> qemu-system-x86: KVM does not support guest_memfd
> 
> 
> > +        /* TODO: add huge page support */
> > +        fd = kvm_create_guest_memfd(backend->size,
> > +                                    GUEST_MEMFD_FLAG_MMAP |
> > +                                    GUEST_MEMFD_FLAG_INIT_SHARED,
> > +                                    errp);
> > +        if (fd < 0) {
> > +            return false;
> > +        }
> > +    } else {
> > +        fd = qemu_memfd_create(TYPE_MEMORY_BACKEND_MEMFD, backend->size,
> > +                               m->hugetlb, m->hugetlbsize, m->seal ?
> > +                               F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL : 0,
> > +                               errp);
> > +    }
> > +
> >       if (fd == -1) {
> >           return false;
> >       }
> > @@ -65,6 +104,18 @@ have_fd:
> >                                             backend->size, ram_flags, fd, 0, errp);
> >   }
> > +static bool
> > +memfd_backend_get_guest_memfd(Object *o, Error **errp)
> > +{
> > +    return MEMORY_BACKEND_MEMFD(o)->guest_memfd;
> > +}
> > +
> > +static void
> > +memfd_backend_set_guest_memfd(Object *o, bool value, Error **errp)
> > +{
> > +    MEMORY_BACKEND_MEMFD(o)->guest_memfd = value;
> > +}
> > +
> >   static bool
> >   memfd_backend_get_hugetlb(Object *o, Error **errp)
> >   {
> > @@ -152,6 +203,13 @@ memfd_backend_class_init(ObjectClass *oc, const void *data)
> >           object_class_property_set_description(oc, "hugetlbsize",
> >                                                 "Huge pages size (ex: 2M, 1G)");
> >       }
> > +
> > +    object_class_property_add_bool(oc, "guest-memfd",
> > +                                   memfd_backend_get_guest_memfd,
> > +                                   memfd_backend_set_guest_memfd);
> > +    object_class_property_set_description(oc, "guest-memfd",
> > +                                          "Use guest memfd");
> > +
> >       object_class_property_add_bool(oc, "seal",
> >                                      memfd_backend_get_seal,
> >                                      memfd_backend_set_seal);
> 

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 8/8] hostmem: Support in-place guest memfd to back a VM
  2025-10-24 15:22     ` Peter Xu
@ 2025-10-27  5:24       ` Xiaoyao Li
  0 siblings, 0 replies; 15+ messages in thread
From: Xiaoyao Li @ 2025-10-27  5:24 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, Paolo Bonzini, Fabiano Rosas, Chenyi Qiang,
	David Hildenbrand, Alexey Kardashevskiy, Juraj Marcin

On 10/24/2025 11:22 PM, Peter Xu wrote:
> On Fri, Oct 24, 2025 at 05:01:44PM +0800, Xiaoyao Li wrote:
>> On 10/24/2025 2:59 AM, Peter Xu wrote:
>>> Host backends supports guest-memfd now by detecting whether it's a
>>> confidential VM.  There's no way to choose it yet from the memory level to
>>> use it in-place.  If we use guest-memfd, it so far always implies we need
>>> two layers of memory backends, while the guest-memfd only provides the
>>> private set of pages.
>>>
>>> This patch introduces a way so that QEMU can consume guest memfd as the
>>> only source of memory to back the object (aka, in place), rather than
>>> having another backend supporting the pages converted to shared.
>>>
>>> To use the in-place guest-memfd, one can add a memfd object with:
>>>
>>>     -object memory-backend-memfd,guest-memfd=on,share=on
>>>
>>> Note that share=on is required with in-place guest_memfd.
>>
>> First, I'm not sure "in-place" is the proper wording here. At first glance
>> on the series, I thought it's something related to "in-place" page
>> conversion. After reading a bit, I really that it is enabling guest memfd
>> with mmap support to serve as normal memory backend.
> 
> It'll be only proper in current context of qemu, but yes I'm aware CoCo
> also has such idea, so at least I should have come up with something
> better. My bad.  When I wrote the patches a while ago it wasn't as clear,
> and I didn't pay attention when I prepare them upstream.
> 
>>
>> Second, my POC implementation chose to implement a separate and specific
>> memory-backend type "memory-backend-guest-memfd". Your approach to add an
>> option of "guest-memfd" to memory-backend-memfd looks OK to me and it
>> requires less code. But I think we need to explicitly error out to users
>> when they set "guest_memfd" to on with unsupported properties configured,
>> e.g., "hugetlb", "hugetlbsize", and "seal".
> 
> In my local tree I actually reused hugetlb* parameters, that needs
> Ackerley's 1G kernel patches, and some mine on top.
> 
> Before I go and reply your other series..  I was definitely not aware that
> anyone has been working on it!  Could you share a pointer?  Or is it still
> in a private branch?

I shared it publicly when reviwed and tested KVM series: 
https://lore.kernel.org/all/13654746-3edc-4e4a-ac4f-fa281b83b2ae@intel.com/

The poc branch:

   https://github.com/intel-staging/qemu-tdx.git lxy/gmem-mmap-poc

It was based on the old QEMU and based on old kernel API of v6.18-rc1 
(the API changes on -rc2).

> I'm more than happy to drop this series if you have an older / better
> version.  Then I can rebase whatever I work on top.

I was not authorized to do the QEMU upstream of gmem mmap support inside 
the company. So please keep your series and I'm happy to help review it 
and make it upstreamed.

>>
>> Third, the intended usage of gmem with mmap from KVM/kernel's perspective is
>> userspace configures the meomry slot by passing the gmem fd to @guest_memfd
>> and @guest_memfd of struct kvm_userspace_memory_region2 instead of passing
>> the user address returned by mmap of the fd to @userspace_addr return mmap()
>> as this patch does. Surely the usage of this path works. But when QEMU is
>> going to support in-place conversion of gmem, we has to pass the
>> @guest_memfd.
>> Well, this is no issue now and we can handle it in the future when needed.
> 
> Yes, that's something the private guest-memfd would need.  For completely
> shared guest-memfd, IIUC we will use a lot of different code paths, the
> goal is to make old APIs work not only for KVM_SET_USER_MEMORY_REGION, but
> for all the rest modules like vhost-kernel, vhost-user, and so on.

And if pass the @guest_memfd, we will need to handle the issue of 
aliased: https://lore.kernel.org/all/aH-0MdNJbH19Mhm3@google.com/

>>
>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>> ---
>>>    qapi/qom.json            |  6 +++-
>>>    backends/hostmem-memfd.c | 66 +++++++++++++++++++++++++++++++++++++---
>>>    2 files changed, 67 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/qapi/qom.json b/qapi/qom.json
>>> index 830cb2ffe7..6b090fe9a0 100644
>>> --- a/qapi/qom.json
>>> +++ b/qapi/qom.json
>>> @@ -764,13 +764,17 @@
>>>    # @seal: if true, create a sealed-file, which will block further
>>>    #     resizing of the memory (default: true)
>>>    #
>>> +# @guest-memfd: if true, use guest-memfd to back the memory region.
>>> +#     (default: false, since: 10.2)
>>> +#
>>>    # Since: 2.12
>>>    ##
>>>    { 'struct': 'MemoryBackendMemfdProperties',
>>>      'base': 'MemoryBackendProperties',
>>>      'data': { '*hugetlb': 'bool',
>>>                '*hugetlbsize': 'size',
>>> -            '*seal': 'bool' },
>>> +            '*seal': 'bool',
>>> +            '*guest-memfd': 'bool' },
>>>      'if': 'CONFIG_LINUX' }
>>>    ##
>>> diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c
>>> index ea93f034e4..1fa16c1e1d 100644
>>> --- a/backends/hostmem-memfd.c
>>> +++ b/backends/hostmem-memfd.c
>>> @@ -18,6 +18,8 @@
>>>    #include "qapi/error.h"
>>>    #include "qom/object.h"
>>>    #include "migration/cpr.h"
>>> +#include "system/kvm.h"
>>> +#include <linux/kvm.h>
>>>    OBJECT_DECLARE_SIMPLE_TYPE(HostMemoryBackendMemfd, MEMORY_BACKEND_MEMFD)
>>> @@ -28,6 +30,13 @@ struct HostMemoryBackendMemfd {
>>>        bool hugetlb;
>>>        uint64_t hugetlbsize;
>>>        bool seal;
>>> +    /*
>>> +     * NOTE: this differs from HostMemoryBackend's guest_memfd_private,
>>> +     * which represents a internally private guest-memfd that only backs
>>> +     * private pages.  Instead, this flag marks the memory backend will
>>> +     * 100% use the guest-memfd pages in-place.
>>> +     */
>>> +    bool guest_memfd;
>>>    };
>>>    static bool
>>> @@ -47,10 +56,40 @@ memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>>>            goto have_fd;
>>>        }
>>> -    fd = qemu_memfd_create(TYPE_MEMORY_BACKEND_MEMFD, backend->size,
>>> -                           m->hugetlb, m->hugetlbsize, m->seal ?
>>> -                           F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL : 0,
>>> -                           errp);
>>> +    if (m->guest_memfd) {
>>> +        /* User choose to use in-place guest-memfd to back the VM.. */
>>> +        if (!backend->share) {
>>> +            error_setg(errp, "In-place guest-memfd must be used with share=on");
>>> +            return false;
>>> +        }
>>> +
>>> +        /*
>>> +         * This is the request to have a guest-memfd to back private pages.
>>> +         * In-place guest-memfd doesn't work like that.  Disable it for now
>>> +         * to make it simple, so that each memory backend can only have
>>> +         * guest-memfd either as private, or fully shared.
>>> +         */
>>> +        if (backend->guest_memfd_private) {
>>> +            error_setg(errp, "In-place guest-memfd cannot be used with another "
>>> +                       "private guest-memfd");
>>> +            return false;
>>> +        }
>>
>> Add kvm_enabled() here, otherwise the following calling of
>> kvm_create_guest_memfd() emits confusing information when accelerator is not
>> configured as KVM, e.g., -machine q35,accel=tcg
>>
>> qemu-system-x86: KVM does not support guest_memfd
>>
>>
>>> +        /* TODO: add huge page support */
>>> +        fd = kvm_create_guest_memfd(backend->size,
>>> +                                    GUEST_MEMFD_FLAG_MMAP |
>>> +                                    GUEST_MEMFD_FLAG_INIT_SHARED,
>>> +                                    errp);
>>> +        if (fd < 0) {
>>> +            return false;
>>> +        }
>>> +    } else {
>>> +        fd = qemu_memfd_create(TYPE_MEMORY_BACKEND_MEMFD, backend->size,
>>> +                               m->hugetlb, m->hugetlbsize, m->seal ?
>>> +                               F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL : 0,
>>> +                               errp);
>>> +    }
>>> +
>>>        if (fd == -1) {
>>>            return false;
>>>        }
>>> @@ -65,6 +104,18 @@ have_fd:
>>>                                              backend->size, ram_flags, fd, 0, errp);
>>>    }
>>> +static bool
>>> +memfd_backend_get_guest_memfd(Object *o, Error **errp)
>>> +{
>>> +    return MEMORY_BACKEND_MEMFD(o)->guest_memfd;
>>> +}
>>> +
>>> +static void
>>> +memfd_backend_set_guest_memfd(Object *o, bool value, Error **errp)
>>> +{
>>> +    MEMORY_BACKEND_MEMFD(o)->guest_memfd = value;
>>> +}
>>> +
>>>    static bool
>>>    memfd_backend_get_hugetlb(Object *o, Error **errp)
>>>    {
>>> @@ -152,6 +203,13 @@ memfd_backend_class_init(ObjectClass *oc, const void *data)
>>>            object_class_property_set_description(oc, "hugetlbsize",
>>>                                                  "Huge pages size (ex: 2M, 1G)");
>>>        }
>>> +
>>> +    object_class_property_add_bool(oc, "guest-memfd",
>>> +                                   memfd_backend_get_guest_memfd,
>>> +                                   memfd_backend_set_guest_memfd);
>>> +    object_class_property_set_description(oc, "guest-memfd",
>>> +                                          "Use guest memfd");
>>> +
>>>        object_class_property_add_bool(oc, "seal",
>>>                                       memfd_backend_get_seal,
>>>                                       memfd_backend_set_seal);
>>
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2025-10-27  5:25 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-23 18:59 [PATCH 0/8] KVM/hostmem: Support in-place guest-memfd as VM backends Peter Xu
2025-10-23 18:59 ` [PATCH 1/8] linux-headers: Update to v6.18-rc2 Peter Xu
2025-10-23 18:59 ` [PATCH 2/8] kvm: Allow kvm_guest_memfd_supported for non-private use case Peter Xu
2025-10-24  2:30   ` Xiaoyao Li
2025-10-23 18:59 ` [PATCH 3/8] kvm: Detect guest-memfd flags supported Peter Xu
2025-10-24  3:52   ` Xiaoyao Li
2025-10-23 18:59 ` [PATCH 4/8] memory: Rename RAM_GUEST_MEMFD to RAM_GUEST_MEMFD_PRIVATE Peter Xu
2025-10-24  9:17   ` Xiaoyao Li
2025-10-23 18:59 ` [PATCH 5/8] memory: Rename memory_region_has_guest_memfd() to *_private() Peter Xu
2025-10-23 18:59 ` [PATCH 6/8] ramblock: Rename guest_memfd to guest_memfd_private Peter Xu
2025-10-23 18:59 ` [PATCH 7/8] hostmem: " Peter Xu
2025-10-23 18:59 ` [PATCH 8/8] hostmem: Support in-place guest memfd to back a VM Peter Xu
2025-10-24  9:01   ` Xiaoyao Li
2025-10-24 15:22     ` Peter Xu
2025-10-27  5:24       ` Xiaoyao Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).