* [PATCH v2 0/9] memory-backend-file related improvements and VM templating support
@ 2023-08-22 11:44 David Hildenbrand
2023-08-22 11:44 ` [PATCH v2 1/9] nvdimm: Reject writing label data to ROM instead of crashing QEMU David Hildenbrand
` (8 more replies)
0 siblings, 9 replies; 25+ messages in thread
From: David Hildenbrand @ 2023-08-22 11:44 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-ppc, David Hildenbrand, Paolo Bonzini, Peter Xu,
Igor Mammedov, Thiner Logoer, Philippe Mathieu-Daudé,
Daniel P . Berrangé, Stefan Hajnoczi, Elena Ufimtseva,
Jagannathan Raman, Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
This is the result of the previous discussion of:
* "[PATCH v2] softmmu/physmem: try opening file readonly before failure
in file_ram_open" [1]
* "[PATCH v1 0/3] softmmu/physmem: file_ram_open() readonly
improvements" [2]
After looking into various ways to avoid a new parameter for
memory-backend-file to cleanly support VM templating with R/O files, I
concluded that it might be easier and cleaner to hust have a new parameter.
The alternatives all had their own problems.
Looking back, we could have designed the "readonly=on/off" parameter
slightly differently.
So this series adds a new "rom=on/off/auto" option and wires it up
internally. It uses new internal RAM flags to improve qemu_ram_remap() and
ram_block_discard_range().
Further, improve file_ram_open() with readonly=on and update+add some
documentation.
While working on this and testing some configurations, I realized that
an NVDIMM with label data on ROM does not work as expected (QEMU crashes).
Fix included as patch #1.
No changelog, because too much changed.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Thiner Logoer <logoerthiner1@163.com>
Cc: "Philippe Mathieu-Daudé" <philmd@linaro.org>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Cc: Jagannathan Raman <jag.raman@oracle.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Ani Sinha <anisinha@redhat.com>
Cc: Xiao Guangrong <xiaoguangrong.eric@gmail.com>
Cc: Daniel Henrique Barboza <danielhb413@gmail.com>
Cc: Greg Kurz <groug@kaod.org>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: Eduardo Habkost <eduardo@habkost.net>
[1] https://lkml.kernel.org/r/20230726145912.88545-1-logoerthiner1@163.com
[2] https://lkml.kernel.org/r/20230807190736.572665-1-david@redhat.com
David Hildenbrand (9):
nvdimm: Reject writing label data to ROM instead of crashing QEMU
softmmu/physmem: Distinguish between file access mode and mmap
protection
backends/hostmem-file: Add "rom" property to support VM templating
with R/O files
softmmu/physmem: Remap with proper protection in qemu_ram_remap()
softmmu/physmem: Bail out early in ram_block_discard_range() with
readonly files
softmmu/physmem: Fail creation of new files in file_ram_open() with
readonly=true
softmmu/physmem: Never return directories from file_ram_open()
docs: Don't mention "-mem-path" in multi-process.rst
docs: Start documenting VM templating
backends/hostmem-file.c | 61 +++++++++++++++++++-
docs/devel/multi-process.rst | 5 +-
docs/vm-templating.txt | 109 +++++++++++++++++++++++++++++++++++
hw/acpi/nvdimm.c | 11 +++-
hw/mem/nvdimm.c | 10 +++-
hw/ppc/spapr_nvdimm.c | 3 +-
include/exec/memory.h | 14 +++--
include/exec/ram_addr.h | 8 +--
include/hw/mem/nvdimm.h | 6 ++
qapi/qom.json | 6 +-
qemu-options.hx | 10 +++-
softmmu/memory.c | 8 +--
softmmu/physmem.c | 74 +++++++++++++++++-------
13 files changed, 279 insertions(+), 46 deletions(-)
create mode 100644 docs/vm-templating.txt
--
2.41.0
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH v2 1/9] nvdimm: Reject writing label data to ROM instead of crashing QEMU
2023-08-22 11:44 [PATCH v2 0/9] memory-backend-file related improvements and VM templating support David Hildenbrand
@ 2023-08-22 11:44 ` David Hildenbrand
2023-08-22 19:25 ` Stefan Hajnoczi
2023-08-22 11:44 ` [PATCH v2 2/9] softmmu/physmem: Distinguish between file access mode and mmap protection David Hildenbrand
` (7 subsequent siblings)
8 siblings, 1 reply; 25+ messages in thread
From: David Hildenbrand @ 2023-08-22 11:44 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-ppc, David Hildenbrand, Paolo Bonzini, Peter Xu,
Igor Mammedov, Thiner Logoer, Philippe Mathieu-Daudé,
Daniel P . Berrangé, Stefan Hajnoczi, Elena Ufimtseva,
Jagannathan Raman, Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
Currently, when using a true R/O NVDIMM (ROM memory backend) with a label
area, the VM can easily crash QEMU by trying to write to the label area,
because the ROM memory is mmap'ed without PROT_WRITE.
[root@vm-0 ~]# ndctl disable-region region0
disabled 1 region
[root@vm-0 ~]# ndctl zero-labels nmem0
-> QEMU segfaults
Let's remember whether we have a ROM memory backend and properly
reject the write request:
[root@vm-0 ~]# ndctl disable-region region0
disabled 1 region
[root@vm-0 ~]# ndctl zero-labels nmem0
zeroed 0 nmem
In comparison, on a system with a R/W NVDIMM:
[root@vm-0 ~]# ndctl disable-region region0
disabled 1 region
[root@vm-0 ~]# ndctl zero-labels nmem0
zeroed 1 nmem
For ACPI, just return "unsupported", like if no label exists. For spapr,
return "H_P2", similar to when no label area exists.
Could we rely on the "unarmed" property? Maybe, but it looks cleaner to
only disallow what certainly cannot work.
After all "unarmed=on" primarily means: cannot accept persistent writes. In
theory, there might be setups where devices with "unarmed=on" set could
be used to host non-persistent data (temporary files, system RAM, ...); for
example, in Linux, admins can overwrite the "readonly" setting and still
write to the device -- which will work as long as we're not using ROM.
Allowing writing label data in such configurations can make sense.
Fixes: dbd730e85987 ("nvdimm: check -object memory-backend-file, readonly=on option")
Signed-off-by: David Hildenbrand <david@redhat.com>
---
hw/acpi/nvdimm.c | 11 ++++++++---
hw/mem/nvdimm.c | 10 +++++++---
hw/ppc/spapr_nvdimm.c | 3 ++-
include/hw/mem/nvdimm.h | 6 ++++++
4 files changed, 23 insertions(+), 7 deletions(-)
diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index a3b25a92f3..3cbd41629d 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -670,7 +670,8 @@ static void nvdimm_dsm_label_size(NVDIMMDevice *nvdimm, hwaddr dsm_mem_addr)
}
static uint32_t nvdimm_rw_label_data_check(NVDIMMDevice *nvdimm,
- uint32_t offset, uint32_t length)
+ uint32_t offset, uint32_t length,
+ bool is_write)
{
uint32_t ret = NVDIMM_DSM_RET_STATUS_INVALID;
@@ -690,6 +691,10 @@ static uint32_t nvdimm_rw_label_data_check(NVDIMMDevice *nvdimm,
return ret;
}
+ if (is_write && nvdimm->readonly) {
+ return NVDIMM_DSM_RET_STATUS_UNSUPPORT;
+ }
+
return NVDIMM_DSM_RET_STATUS_SUCCESS;
}
@@ -713,7 +718,7 @@ static void nvdimm_dsm_get_label_data(NVDIMMDevice *nvdimm, NvdimmDsmIn *in,
get_label_data->length);
status = nvdimm_rw_label_data_check(nvdimm, get_label_data->offset,
- get_label_data->length);
+ get_label_data->length, false);
if (status != NVDIMM_DSM_RET_STATUS_SUCCESS) {
nvdimm_dsm_no_payload(status, dsm_mem_addr);
return;
@@ -752,7 +757,7 @@ static void nvdimm_dsm_set_label_data(NVDIMMDevice *nvdimm, NvdimmDsmIn *in,
set_label_data->length);
status = nvdimm_rw_label_data_check(nvdimm, set_label_data->offset,
- set_label_data->length);
+ set_label_data->length, true);
if (status != NVDIMM_DSM_RET_STATUS_SUCCESS) {
nvdimm_dsm_no_payload(status, dsm_mem_addr);
return;
diff --git a/hw/mem/nvdimm.c b/hw/mem/nvdimm.c
index 31080c22c9..1631a7d13f 100644
--- a/hw/mem/nvdimm.c
+++ b/hw/mem/nvdimm.c
@@ -154,6 +154,9 @@ static void nvdimm_prepare_memory_region(NVDIMMDevice *nvdimm, Error **errp)
object_get_canonical_path_component(OBJECT(hostmem)));
return;
}
+ if (memory_region_is_rom(mr)) {
+ nvdimm->readonly = true;
+ }
nvdimm->nvdimm_mr = g_new(MemoryRegion, 1);
memory_region_init_alias(nvdimm->nvdimm_mr, OBJECT(dimm),
@@ -207,15 +210,16 @@ static void nvdimm_unrealize(PCDIMMDevice *dimm)
* label read/write functions.
*/
static void nvdimm_validate_rw_label_data(NVDIMMDevice *nvdimm, uint64_t size,
- uint64_t offset)
+ uint64_t offset, bool is_write)
{
assert((nvdimm->label_size >= size + offset) && (offset + size > offset));
+ assert(!is_write || !nvdimm->readonly);
}
static void nvdimm_read_label_data(NVDIMMDevice *nvdimm, void *buf,
uint64_t size, uint64_t offset)
{
- nvdimm_validate_rw_label_data(nvdimm, size, offset);
+ nvdimm_validate_rw_label_data(nvdimm, size, offset, false);
memcpy(buf, nvdimm->label_data + offset, size);
}
@@ -229,7 +233,7 @@ static void nvdimm_write_label_data(NVDIMMDevice *nvdimm, const void *buf,
"pmem", NULL);
uint64_t backend_offset;
- nvdimm_validate_rw_label_data(nvdimm, size, offset);
+ nvdimm_validate_rw_label_data(nvdimm, size, offset, true);
if (!is_pmem) {
memcpy(nvdimm->label_data + offset, buf, size);
diff --git a/hw/ppc/spapr_nvdimm.c b/hw/ppc/spapr_nvdimm.c
index a8688243a6..60d6d0acc0 100644
--- a/hw/ppc/spapr_nvdimm.c
+++ b/hw/ppc/spapr_nvdimm.c
@@ -320,7 +320,8 @@ static target_ulong h_scm_write_metadata(PowerPCCPU *cpu,
nvdimm = NVDIMM(drc->dev);
if ((offset + len < offset) ||
- (nvdimm->label_size < len + offset)) {
+ (nvdimm->label_size < len + offset) ||
+ nvdimm->readonly) {
return H_P2;
}
diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
index acf887c83d..d3b763453a 100644
--- a/include/hw/mem/nvdimm.h
+++ b/include/hw/mem/nvdimm.h
@@ -77,6 +77,12 @@ struct NVDIMMDevice {
*/
bool unarmed;
+ /*
+ * Whether our DIMM is backed by ROM, and even label data cannot be
+ * written. If set, implies that "unarmed" is also set.
+ */
+ bool readonly;
+
/*
* The PPC64 - spapr requires each nvdimm device have a uuid.
*/
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v2 2/9] softmmu/physmem: Distinguish between file access mode and mmap protection
2023-08-22 11:44 [PATCH v2 0/9] memory-backend-file related improvements and VM templating support David Hildenbrand
2023-08-22 11:44 ` [PATCH v2 1/9] nvdimm: Reject writing label data to ROM instead of crashing QEMU David Hildenbrand
@ 2023-08-22 11:44 ` David Hildenbrand
2023-08-22 13:13 ` ThinerLogoer
2023-08-22 11:44 ` [PATCH v2 3/9] backends/hostmem-file: Add "rom" property to support VM templating with R/O files David Hildenbrand
` (6 subsequent siblings)
8 siblings, 1 reply; 25+ messages in thread
From: David Hildenbrand @ 2023-08-22 11:44 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-ppc, David Hildenbrand, Paolo Bonzini, Peter Xu,
Igor Mammedov, Thiner Logoer, Philippe Mathieu-Daudé,
Daniel P . Berrangé, Stefan Hajnoczi, Elena Ufimtseva,
Jagannathan Raman, Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
There is a difference between how we open a file and how we mmap it,
and we want to support writable private mappings of readonly files. Let's
define RAM_READONLY and RAM_READONLY_FD flags, to replace the single
"readonly" parameter for file-related functions.
In memory_region_init_ram_from_fd() and memory_region_init_ram_from_file(),
initialize mr->readonly based on the new RAM_READONLY flag.
While at it, add some RAM_* flags we missed to add to the list of accepted
flags in the documentation of some functions.
No change in functionality intended. We'll make use of both flags next
and start setting them independently for memory-backend-file.
Signed-off-by: David Hildenbrand <david@redhat.com>
---
backends/hostmem-file.c | 4 ++--
include/exec/memory.h | 14 ++++++++++----
include/exec/ram_addr.h | 8 ++++----
softmmu/memory.c | 8 ++++----
softmmu/physmem.c | 21 ++++++++++-----------
5 files changed, 30 insertions(+), 25 deletions(-)
diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index b4335a80e6..ef2d5533af 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -55,13 +55,13 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
name = host_memory_backend_get_name(backend);
ram_flags = backend->share ? RAM_SHARED : 0;
+ ram_flags |= fb->readonly ? RAM_READONLY | RAM_READONLY_FD : 0;
ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
ram_flags |= RAM_NAMED_FILE;
memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
backend->size, fb->align, ram_flags,
- fb->mem_path, fb->offset, fb->readonly,
- errp);
+ fb->mem_path, fb->offset, errp);
g_free(name);
#endif
}
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 68284428f8..cc68249eda 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -235,6 +235,12 @@ typedef struct IOMMUTLBEvent {
/* RAM is an mmap-ed named file */
#define RAM_NAMED_FILE (1 << 9)
+/* RAM is mmap-ed read-only */
+#define RAM_READONLY (1 << 10)
+
+/* RAM FD is opened read-only */
+#define RAM_READONLY_FD (1 << 11)
+
static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
IOMMUNotifierFlag flags,
hwaddr start, hwaddr end,
@@ -1331,10 +1337,10 @@ void memory_region_init_resizeable_ram(MemoryRegion *mr,
* @align: alignment of the region base address; if 0, the default alignment
* (getpagesize()) will be used.
* @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
- * RAM_NORESERVE,
+ * RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
+ * RAM_READONLY_FD
* @path: the path in which to allocate the RAM.
* @offset: offset within the file referenced by path
- * @readonly: true to open @path for reading, false for read/write.
* @errp: pointer to Error*, to store an error if it happens.
*
* Note that this function does not do anything to cause the data in the
@@ -1348,7 +1354,6 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
uint32_t ram_flags,
const char *path,
ram_addr_t offset,
- bool readonly,
Error **errp);
/**
@@ -1360,7 +1365,8 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
* @name: the name of the region.
* @size: size of the region.
* @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
- * RAM_NORESERVE, RAM_PROTECTED.
+ * RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
+ * RAM_READONLY_FD
* @fd: the fd to mmap.
* @offset: offset within the file referenced by fd
* @errp: pointer to Error*, to store an error if it happens.
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 9f2e3893f5..90676093f5 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -108,10 +108,10 @@ long qemu_maxrampagesize(void);
* @size: the size in bytes of the ram block
* @mr: the memory region where the ram block is
* @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
- * RAM_NORESERVE.
+ * RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
+ * RAM_READONLY_FD
* @mem_path or @fd: specify the backing file or device
* @offset: Offset into target file
- * @readonly: true to open @path for reading, false for read/write.
* @errp: pointer to Error*, to store an error if it happens
*
* Return:
@@ -120,10 +120,10 @@ long qemu_maxrampagesize(void);
*/
RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
uint32_t ram_flags, const char *mem_path,
- off_t offset, bool readonly, Error **errp);
+ off_t offset, Error **errp);
RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
uint32_t ram_flags, int fd, off_t offset,
- bool readonly, Error **errp);
+ Error **errp);
RAMBlock *qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
MemoryRegion *mr, Error **errp);
diff --git a/softmmu/memory.c b/softmmu/memory.c
index 7d9494ce70..d8974a1e65 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1620,18 +1620,17 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
uint32_t ram_flags,
const char *path,
ram_addr_t offset,
- bool readonly,
Error **errp)
{
Error *err = NULL;
memory_region_init(mr, owner, name, size);
mr->ram = true;
- mr->readonly = readonly;
+ mr->readonly = ram_flags & RAM_READONLY;
mr->terminates = true;
mr->destructor = memory_region_destructor_ram;
mr->align = align;
mr->ram_block = qemu_ram_alloc_from_file(size, mr, ram_flags, path,
- offset, readonly, &err);
+ offset, &err);
if (err) {
mr->size = int128_zero();
object_unparent(OBJECT(mr));
@@ -1651,10 +1650,11 @@ void memory_region_init_ram_from_fd(MemoryRegion *mr,
Error *err = NULL;
memory_region_init(mr, owner, name, size);
mr->ram = true;
+ mr->readonly = ram_flags & RAM_READONLY;
mr->terminates = true;
mr->destructor = memory_region_destructor_ram;
mr->ram_block = qemu_ram_alloc_from_fd(size, mr, ram_flags, fd, offset,
- false, &err);
+ &err);
if (err) {
mr->size = int128_zero();
object_unparent(OBJECT(mr));
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 3df73542e1..16d7a16aa8 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1351,7 +1351,6 @@ static int file_ram_open(const char *path,
static void *file_ram_alloc(RAMBlock *block,
ram_addr_t memory,
int fd,
- bool readonly,
bool truncate,
off_t offset,
Error **errp)
@@ -1409,7 +1408,7 @@ static void *file_ram_alloc(RAMBlock *block,
perror("ftruncate");
}
- qemu_map_flags = readonly ? QEMU_MAP_READONLY : 0;
+ qemu_map_flags = (block->flags & RAM_READONLY) ? QEMU_MAP_READONLY : 0;
qemu_map_flags |= (block->flags & RAM_SHARED) ? QEMU_MAP_SHARED : 0;
qemu_map_flags |= (block->flags & RAM_PMEM) ? QEMU_MAP_SYNC : 0;
qemu_map_flags |= (block->flags & RAM_NORESERVE) ? QEMU_MAP_NORESERVE : 0;
@@ -1877,7 +1876,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
#ifdef CONFIG_POSIX
RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
uint32_t ram_flags, int fd, off_t offset,
- bool readonly, Error **errp)
+ Error **errp)
{
RAMBlock *new_block;
Error *local_err = NULL;
@@ -1885,7 +1884,8 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
/* Just support these ram flags by now. */
assert((ram_flags & ~(RAM_SHARED | RAM_PMEM | RAM_NORESERVE |
- RAM_PROTECTED | RAM_NAMED_FILE)) == 0);
+ RAM_PROTECTED | RAM_NAMED_FILE | RAM_READONLY |
+ RAM_READONLY_FD)) == 0);
if (xen_enabled()) {
error_setg(errp, "-mem-path not supported with Xen");
@@ -1920,8 +1920,8 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
new_block->used_length = size;
new_block->max_length = size;
new_block->flags = ram_flags;
- new_block->host = file_ram_alloc(new_block, size, fd, readonly,
- !file_size, offset, errp);
+ new_block->host = file_ram_alloc(new_block, size, fd, !file_size, offset,
+ errp);
if (!new_block->host) {
g_free(new_block);
return NULL;
@@ -1940,20 +1940,19 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
uint32_t ram_flags, const char *mem_path,
- off_t offset, bool readonly, Error **errp)
+ off_t offset, Error **errp)
{
int fd;
bool created;
RAMBlock *block;
- fd = file_ram_open(mem_path, memory_region_name(mr), readonly, &created,
- errp);
+ fd = file_ram_open(mem_path, memory_region_name(mr),
+ ram_flags & RAM_READONLY_FD, &created, errp);
if (fd < 0) {
return NULL;
}
- block = qemu_ram_alloc_from_fd(size, mr, ram_flags, fd, offset, readonly,
- errp);
+ block = qemu_ram_alloc_from_fd(size, mr, ram_flags, fd, offset, errp);
if (!block) {
if (created) {
unlink(mem_path);
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v2 3/9] backends/hostmem-file: Add "rom" property to support VM templating with R/O files
2023-08-22 11:44 [PATCH v2 0/9] memory-backend-file related improvements and VM templating support David Hildenbrand
2023-08-22 11:44 ` [PATCH v2 1/9] nvdimm: Reject writing label data to ROM instead of crashing QEMU David Hildenbrand
2023-08-22 11:44 ` [PATCH v2 2/9] softmmu/physmem: Distinguish between file access mode and mmap protection David Hildenbrand
@ 2023-08-22 11:44 ` David Hildenbrand
2023-08-22 13:27 ` Markus Armbruster
2023-08-22 14:26 ` ThinerLogoer
2023-08-22 11:44 ` [PATCH v2 4/9] softmmu/physmem: Remap with proper protection in qemu_ram_remap() David Hildenbrand
` (5 subsequent siblings)
8 siblings, 2 replies; 25+ messages in thread
From: David Hildenbrand @ 2023-08-22 11:44 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-ppc, David Hildenbrand, Paolo Bonzini, Peter Xu,
Igor Mammedov, Thiner Logoer, Philippe Mathieu-Daudé,
Daniel P . Berrangé, Stefan Hajnoczi, Elena Ufimtseva,
Jagannathan Raman, Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
For now, "share=off,readonly=on" would always result in us opening the
file R/O and mmap'ing the opened file MAP_PRIVATE R/O -- effectively
turning it into ROM.
Especially for VM templating, "share=off" is a common use case. However,
that use case is impossible with files that lack write permissions,
because "share=off,readonly=on" will not give us writable RAM.
The sole user of ROM via memory-backend-file are R/O NVDIMMs, but as we
have users (Kata Containers) that rely on the existing behavior --
malicious VMs should not be able to consume COW memory for R/O NVDIMMs --
we cannot change the semantics of "share=off,readonly=on"
So let's add a new "rom" property with on/off/auto values. "auto" is
the default and what most people will use: for historical reasons, to not
change the old semantics, it defaults to the value of the "readonly"
property.
For VM templating, one can now use:
-object memory-backend-file,share=off,readonly=on,rom=off,...
But we'll disallow:
-object memory-backend-file,share=on,readonly=on,rom=off,...
because we would otherwise get an error when trying to mmap the R/O file
shared and writable. An explicit error message is cleaner.
We will also disallow for now:
-object memory-backend-file,share=off,readonly=off,rom=on,...
-object memory-backend-file,share=on,readonly=off,rom=on,...
It's not harmful, but also not really required for now.
Alternatives that were abandoned:
* Make "unarmed=on" for the NVDIMM set the memory region container
readonly. We would still see a change of ROM->RAM and possibly run
into memslot limits with vhost-user. Further, there might be use cases
for "unarmed=on" that should still allow writing to that memory
(temporary files, system RAM, ...).
* Add a new "readonly=on/off/auto" parameter for NVDIMMs. Similar issues
as with "unarmed=on".
* Make "readonly" consume "on/off/file" instead of being a 'bool' type.
This would slightly changes the behavior of the "readonly" parameter:
values like true/false (as accepted by a 'bool'type) would no longer be
accepted.
Signed-off-by: David Hildenbrand <david@redhat.com>
---
backends/hostmem-file.c | 59 ++++++++++++++++++++++++++++++++++++++++-
qapi/qom.json | 6 ++++-
qemu-options.hx | 10 ++++++-
3 files changed, 72 insertions(+), 3 deletions(-)
diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index ef2d5533af..361d4a8103 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -18,6 +18,8 @@
#include "sysemu/hostmem.h"
#include "qom/object_interfaces.h"
#include "qom/object.h"
+#include "qapi/visitor.h"
+#include "qapi/qapi-visit-common.h"
OBJECT_DECLARE_SIMPLE_TYPE(HostMemoryBackendFile, MEMORY_BACKEND_FILE)
@@ -31,6 +33,7 @@ struct HostMemoryBackendFile {
bool discard_data;
bool is_pmem;
bool readonly;
+ OnOffAuto rom;
};
static void
@@ -53,9 +56,33 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
return;
}
+ switch (fb->rom) {
+ case ON_OFF_AUTO_AUTO:
+ /* Traditionally, opening the file readonly always resulted in ROM. */
+ fb->rom = fb->readonly ? ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
+ break;
+ case ON_OFF_AUTO_ON:
+ if (!fb->readonly) {
+ error_setg(errp, "property 'rom' = 'on' is not supported with"
+ " 'readonly' = 'off'");
+ return;
+ }
+ break;
+ case ON_OFF_AUTO_OFF:
+ if (fb->readonly && backend->share) {
+ error_setg(errp, "property 'rom' = 'off' is incompatible with"
+ " 'readonly' = 'on' and 'share' = 'on'");
+ return;
+ }
+ break;
+ default:
+ assert(false);
+ }
+
name = host_memory_backend_get_name(backend);
ram_flags = backend->share ? RAM_SHARED : 0;
- ram_flags |= fb->readonly ? RAM_READONLY | RAM_READONLY_FD : 0;
+ ram_flags |= fb->readonly ? RAM_READONLY_FD : 0;
+ ram_flags |= fb->rom == ON_OFF_AUTO_ON ? RAM_READONLY : 0;
ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
ram_flags |= RAM_NAMED_FILE;
@@ -201,6 +228,32 @@ static void file_memory_backend_set_readonly(Object *obj, bool value,
fb->readonly = value;
}
+static void file_memory_backend_get_rom(Object *obj, Visitor *v,
+ const char *name, void *opaque,
+ Error **errp)
+{
+ HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj);
+ OnOffAuto rom = fb->rom;
+
+ visit_type_OnOffAuto(v, name, &rom, errp);
+}
+
+static void file_memory_backend_set_rom(Object *obj, Visitor *v,
+ const char *name, void *opaque,
+ Error **errp)
+{
+ HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+ HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj);
+
+ if (host_memory_backend_mr_inited(backend)) {
+ error_setg(errp, "cannot change property '%s' of %s.", name,
+ object_get_typename(obj));
+ return;
+ }
+
+ visit_type_OnOffAuto(v, name, &fb->rom, errp);
+}
+
static void file_backend_unparent(Object *obj)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
@@ -243,6 +296,10 @@ file_backend_class_init(ObjectClass *oc, void *data)
object_class_property_add_bool(oc, "readonly",
file_memory_backend_get_readonly,
file_memory_backend_set_readonly);
+ object_class_property_add(oc, "rom", "OnOffAuto",
+ file_memory_backend_get_rom, file_memory_backend_set_rom, NULL, NULL);
+ object_class_property_set_description(oc, "rom",
+ "Whether to create Read Only Memory (ROM)");
}
static void file_backend_instance_finalize(Object *o)
diff --git a/qapi/qom.json b/qapi/qom.json
index fa3e88c8e6..0cf83c6f39 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -668,6 +668,9 @@
# @readonly: if true, the backing file is opened read-only; if false,
# it is opened read-write. (default: false)
#
+# @rom: whether to create Read Only Memory (ROM). If set to 'auto', it
+# defaults to the value of @readonly. (default: auto, since 8.2)
+#
# Since: 2.1
##
{ 'struct': 'MemoryBackendFileProperties',
@@ -677,7 +680,8 @@
'*discard-data': 'bool',
'mem-path': 'str',
'*pmem': { 'type': 'bool', 'if': 'CONFIG_LIBPMEM' },
- '*readonly': 'bool' } }
+ '*readonly': 'bool',
+ '*rom': 'OnOffAuto' } }
##
# @MemoryBackendMemfdProperties:
diff --git a/qemu-options.hx b/qemu-options.hx
index 29b98c3d4c..03ce0b0a30 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4976,7 +4976,7 @@ SRST
they are specified. Note that the 'id' property must be set. These
objects are placed in the '/objects' path.
- ``-object memory-backend-file,id=id,size=size,mem-path=dir,share=on|off,discard-data=on|off,merge=on|off,dump=on|off,prealloc=on|off,host-nodes=host-nodes,policy=default|preferred|bind|interleave,align=align,offset=offset,readonly=on|off``
+ ``-object memory-backend-file,id=id,size=size,mem-path=dir,share=on|off,discard-data=on|off,merge=on|off,dump=on|off,prealloc=on|off,host-nodes=host-nodes,policy=default|preferred|bind|interleave,align=align,offset=offset,readonly=on|off,rom=on|off|auto``
Creates a memory file backend object, which can be used to back
the guest RAM with huge pages.
@@ -5066,6 +5066,14 @@ SRST
The ``readonly`` option specifies whether the backing file is opened
read-only or read-write (default).
+ The ``rom`` option specifies whether to create Read Only Memory (ROM)
+ that cannot be modified by the VM. If set to ``on``, the VM cannot
+ modify the memory. If set to ``off``, the VM can modify the memory.
+ If set to ``auto`` (default), the value of the ``readonly`` property
+ is used. This option is primarily helpful for VM templating, where we
+ want to open a file readonly (``readonly=on``) and allow private
+ modifications of the memory by the VM (``share=off``, ``rom=off``).
+
``-object memory-backend-ram,id=id,merge=on|off,dump=on|off,share=on|off,prealloc=on|off,size=size,host-nodes=host-nodes,policy=default|preferred|bind|interleave``
Creates a memory backend object, which can be used to back the
guest RAM. Memory backend objects offer more control than the
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v2 4/9] softmmu/physmem: Remap with proper protection in qemu_ram_remap()
2023-08-22 11:44 [PATCH v2 0/9] memory-backend-file related improvements and VM templating support David Hildenbrand
` (2 preceding siblings ...)
2023-08-22 11:44 ` [PATCH v2 3/9] backends/hostmem-file: Add "rom" property to support VM templating with R/O files David Hildenbrand
@ 2023-08-22 11:44 ` David Hildenbrand
2023-08-22 11:44 ` [PATCH v2 5/9] softmmu/physmem: Bail out early in ram_block_discard_range() with readonly files David Hildenbrand
` (4 subsequent siblings)
8 siblings, 0 replies; 25+ messages in thread
From: David Hildenbrand @ 2023-08-22 11:44 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-ppc, David Hildenbrand, Paolo Bonzini, Peter Xu,
Igor Mammedov, Thiner Logoer, Philippe Mathieu-Daudé,
Daniel P . Berrangé, Stefan Hajnoczi, Elena Ufimtseva,
Jagannathan Raman, Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
Let's remap with the proper protection that we can derive from
RAM_READONLY.
Signed-off-by: David Hildenbrand <david@redhat.com>
---
softmmu/physmem.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 16d7a16aa8..2ed83fcefe 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -2070,6 +2070,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
ram_addr_t offset;
int flags;
void *area, *vaddr;
+ int prot;
RAMBLOCK_FOREACH(block) {
offset = addr - block->offset;
@@ -2084,13 +2085,14 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
flags |= block->flags & RAM_SHARED ?
MAP_SHARED : MAP_PRIVATE;
flags |= block->flags & RAM_NORESERVE ? MAP_NORESERVE : 0;
+ prot = PROT_READ;
+ prot |= block->flags & RAM_READONLY ? 0 : PROT_WRITE;
if (block->fd >= 0) {
- area = mmap(vaddr, length, PROT_READ | PROT_WRITE,
- flags, block->fd, offset + block->fd_offset);
+ area = mmap(vaddr, length, prot, flags, block->fd,
+ offset + block->fd_offset);
} else {
flags |= MAP_ANONYMOUS;
- area = mmap(vaddr, length, PROT_READ | PROT_WRITE,
- flags, -1, 0);
+ area = mmap(vaddr, length, prot, flags, -1, 0);
}
if (area != vaddr) {
error_report("Could not remap addr: "
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v2 5/9] softmmu/physmem: Bail out early in ram_block_discard_range() with readonly files
2023-08-22 11:44 [PATCH v2 0/9] memory-backend-file related improvements and VM templating support David Hildenbrand
` (3 preceding siblings ...)
2023-08-22 11:44 ` [PATCH v2 4/9] softmmu/physmem: Remap with proper protection in qemu_ram_remap() David Hildenbrand
@ 2023-08-22 11:44 ` David Hildenbrand
2023-08-22 11:44 ` [PATCH v2 6/9] softmmu/physmem: Fail creation of new files in file_ram_open() with readonly=true David Hildenbrand
` (3 subsequent siblings)
8 siblings, 0 replies; 25+ messages in thread
From: David Hildenbrand @ 2023-08-22 11:44 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-ppc, David Hildenbrand, Paolo Bonzini, Peter Xu,
Igor Mammedov, Thiner Logoer, Philippe Mathieu-Daudé,
Daniel P . Berrangé, Stefan Hajnoczi, Elena Ufimtseva,
Jagannathan Raman, Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
fallocate() will fail, let's print a nicer error message.
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
softmmu/physmem.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 2ed83fcefe..817a7811ee 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -3457,6 +3457,16 @@ int ram_block_discard_range(RAMBlock *rb, uint64_t start, size_t length)
* so a userfault will trigger.
*/
#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
+ /*
+ * fallocate() will fail with readonly files. Let's print a
+ * proper error message.
+ */
+ if (rb->flags & RAM_READONLY_FD) {
+ error_report("ram_block_discard_range: Discarding RAM"
+ " with readonly files is not supported");
+ goto err;
+
+ }
/*
* We'll discard data from the actual file, even though we only
* have a MAP_PRIVATE mapping, possibly messing with other
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v2 6/9] softmmu/physmem: Fail creation of new files in file_ram_open() with readonly=true
2023-08-22 11:44 [PATCH v2 0/9] memory-backend-file related improvements and VM templating support David Hildenbrand
` (4 preceding siblings ...)
2023-08-22 11:44 ` [PATCH v2 5/9] softmmu/physmem: Bail out early in ram_block_discard_range() with readonly files David Hildenbrand
@ 2023-08-22 11:44 ` David Hildenbrand
2023-08-22 11:44 ` [PATCH v2 7/9] softmmu/physmem: Never return directories from file_ram_open() David Hildenbrand
` (2 subsequent siblings)
8 siblings, 0 replies; 25+ messages in thread
From: David Hildenbrand @ 2023-08-22 11:44 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-ppc, David Hildenbrand, Paolo Bonzini, Peter Xu,
Igor Mammedov, Thiner Logoer, Philippe Mathieu-Daudé,
Daniel P . Berrangé, Stefan Hajnoczi, Elena Ufimtseva,
Jagannathan Raman, Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
Currently, if a file does not exist yet, file_ram_open() will create new
empty file and open it writable. However, it even does that when
readonly=true was specified.
Specifying O_RDONLY instead to create a new readonly file would
theoretically work, however, ftruncate() will refuse to resize the new
empty file and we'll get a warning:
ftruncate: Invalid argument
And later eventually more problems when actually mmap'ing that file and
accessing it.
If someone intends to let QEMU open+mmap a file read-only, better
create+resize+fill that file ahead of time outside of QEMU context.
We'll now fail with:
./qemu-system-x86_64 \
-object memory-backend-file,id=ram0,mem-path=tmp,readonly=true,size=1g
qemu-system-x86_64: can't open backing store tmp for guest RAM: No such file or directory
All use cases of readonly files (R/O NVDIMMs, VM templating) work on
existing files, so silently creating new files might just hide user
errors when accidentally specifying a non-existent file.
Note that the only memory-backend-file will end up calling
memory_region_init_ram_from_file() -> qemu_ram_alloc_from_file() ->
file_ram_open().
Move error reporting to the single caller.
Signed-off-by: David Hildenbrand <david@redhat.com>
---
softmmu/physmem.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 817a7811ee..b683c92b46 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1289,8 +1289,7 @@ static int64_t get_file_align(int fd)
static int file_ram_open(const char *path,
const char *region_name,
bool readonly,
- bool *created,
- Error **errp)
+ bool *created)
{
char *filename;
char *sanitized_name;
@@ -1305,6 +1304,10 @@ static int file_ram_open(const char *path,
break;
}
if (errno == ENOENT) {
+ if (readonly) {
+ /* Refuse to create new, readonly files. */
+ return -ENOENT;
+ }
/* @path names a file that doesn't exist, create it */
fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0644);
if (fd >= 0) {
@@ -1334,10 +1337,7 @@ static int file_ram_open(const char *path,
g_free(filename);
}
if (errno != EEXIST && errno != EINTR) {
- error_setg_errno(errp, errno,
- "can't open backing store %s for guest RAM",
- path);
- return -1;
+ return -errno;
}
/*
* Try again on EINTR and EEXIST. The latter happens when
@@ -1947,8 +1947,10 @@ RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
RAMBlock *block;
fd = file_ram_open(mem_path, memory_region_name(mr),
- ram_flags & RAM_READONLY_FD, &created, errp);
+ ram_flags & RAM_READONLY_FD, &created);
if (fd < 0) {
+ error_setg_errno(errp, -fd, "can't open backing store %s for guest RAM",
+ mem_path);
return NULL;
}
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v2 7/9] softmmu/physmem: Never return directories from file_ram_open()
2023-08-22 11:44 [PATCH v2 0/9] memory-backend-file related improvements and VM templating support David Hildenbrand
` (5 preceding siblings ...)
2023-08-22 11:44 ` [PATCH v2 6/9] softmmu/physmem: Fail creation of new files in file_ram_open() with readonly=true David Hildenbrand
@ 2023-08-22 11:44 ` David Hildenbrand
2023-08-22 11:44 ` [PATCH v2 8/9] docs: Don't mention "-mem-path" in multi-process.rst David Hildenbrand
2023-08-22 11:44 ` [PATCH v2 9/9] docs: Start documenting VM templating David Hildenbrand
8 siblings, 0 replies; 25+ messages in thread
From: David Hildenbrand @ 2023-08-22 11:44 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-ppc, David Hildenbrand, Paolo Bonzini, Peter Xu,
Igor Mammedov, Thiner Logoer, Philippe Mathieu-Daudé,
Daniel P . Berrangé, Stefan Hajnoczi, Elena Ufimtseva,
Jagannathan Raman, Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
open() does not fail on directories when opening them readonly (O_RDONLY).
Currently, we succeed opening such directories and fail later during
mmap(), resulting in a misleading error message.
$ ./qemu-system-x86_64 \
-object memory-backend-file,id=ram0,mem-path=tmp,readonly=true,size=1g
qemu-system-x86_64: unable to map backing store for guest RAM: No such device
To identify directories and handle them accordingly in file_ram_open()
also when readonly=true was specified, detect if we just opened a directory
using fstat() instead. Then, fail file_ram_open() right away, similarly
to how we now fail if the file does not exist and we want to open the
file readonly.
With this change, we get a nicer error message:
qemu-system-x86_64: can't open backing store tmp for guest RAM: Is a directory
Note that the only memory-backend-file will end up calling
memory_region_init_ram_from_file() -> qemu_ram_alloc_from_file() ->
file_ram_open().
Reported-by: Thiner Logoer <logoerthiner1@163.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
softmmu/physmem.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index b683c92b46..d0c1650781 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1300,6 +1300,25 @@ static int file_ram_open(const char *path,
for (;;) {
fd = open(path, readonly ? O_RDONLY : O_RDWR);
if (fd >= 0) {
+ /*
+ * open(O_RDONLY) won't fail with EISDIR. Check manually if we
+ * opened a directory and fail similarly to how we fail ENOENT
+ * in readonly mode. Note that mkstemp() would imply O_RDWR.
+ */
+ if (readonly) {
+ struct stat file_stat;
+
+ if (fstat(fd, &file_stat)) {
+ close(fd);
+ if (errno == EINTR) {
+ continue;
+ }
+ return -errno;
+ } else if (S_ISDIR(file_stat.st_mode)) {
+ close(fd);
+ return -EISDIR;
+ }
+ }
/* @path names an existing file, use it */
break;
}
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v2 8/9] docs: Don't mention "-mem-path" in multi-process.rst
2023-08-22 11:44 [PATCH v2 0/9] memory-backend-file related improvements and VM templating support David Hildenbrand
` (6 preceding siblings ...)
2023-08-22 11:44 ` [PATCH v2 7/9] softmmu/physmem: Never return directories from file_ram_open() David Hildenbrand
@ 2023-08-22 11:44 ` David Hildenbrand
2023-08-22 13:21 ` ThinerLogoer
2023-08-22 11:44 ` [PATCH v2 9/9] docs: Start documenting VM templating David Hildenbrand
8 siblings, 1 reply; 25+ messages in thread
From: David Hildenbrand @ 2023-08-22 11:44 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-ppc, David Hildenbrand, Paolo Bonzini, Peter Xu,
Igor Mammedov, Thiner Logoer, Philippe Mathieu-Daudé,
Daniel P . Berrangé, Stefan Hajnoczi, Elena Ufimtseva,
Jagannathan Raman, Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
"-mem-path" corresponds to "memory-backend-file,share=off" and,
therefore, creates a private COW mapping of the file. For multi-proces
QEMU, we need proper shared file-backed memory.
Let's make that clearer.
Signed-off-by: David Hildenbrand <david@redhat.com>
---
docs/devel/multi-process.rst | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/docs/devel/multi-process.rst b/docs/devel/multi-process.rst
index e4801751f2..4ef539c0b0 100644
--- a/docs/devel/multi-process.rst
+++ b/docs/devel/multi-process.rst
@@ -409,8 +409,9 @@ the initial messages sent to the emulation process is a guest memory
table. Each entry in this table consists of a file descriptor and size
that the emulation process can ``mmap()`` to directly access guest
memory, similar to ``vhost_user_set_mem_table()``. Note guest memory
-must be backed by file descriptors, such as when QEMU is given the
-*-mem-path* command line option.
+must be backed by shared file-backed memory, for example, using
+*-object memory-backend-file,share=on* and setting that memory backend
+as RAM for the machine.
IOMMU operations
^^^^^^^^^^^^^^^^
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* [PATCH v2 9/9] docs: Start documenting VM templating
2023-08-22 11:44 [PATCH v2 0/9] memory-backend-file related improvements and VM templating support David Hildenbrand
` (7 preceding siblings ...)
2023-08-22 11:44 ` [PATCH v2 8/9] docs: Don't mention "-mem-path" in multi-process.rst David Hildenbrand
@ 2023-08-22 11:44 ` David Hildenbrand
2023-08-22 13:47 ` Daniel P. Berrangé
2023-08-22 14:23 ` Peter Maydell
8 siblings, 2 replies; 25+ messages in thread
From: David Hildenbrand @ 2023-08-22 11:44 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-ppc, David Hildenbrand, Paolo Bonzini, Peter Xu,
Igor Mammedov, Thiner Logoer, Philippe Mathieu-Daudé,
Daniel P . Berrangé, Stefan Hajnoczi, Elena Ufimtseva,
Jagannathan Raman, Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
Let's add some details about VM templating, focusing on the VM memory
configuration only.
There is much more to VM templating (VM state? block devices?), but I leave
that as future work.
Signed-off-by: David Hildenbrand <david@redhat.com>
---
docs/vm-templating.txt | 109 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 109 insertions(+)
create mode 100644 docs/vm-templating.txt
diff --git a/docs/vm-templating.txt b/docs/vm-templating.txt
new file mode 100644
index 0000000000..419362c1ea
--- /dev/null
+++ b/docs/vm-templating.txt
@@ -0,0 +1,109 @@
+QEMU VM templating
+==================
+
+This document explains how to use VM templating in QEMU.
+
+For now, the focus is on VM memory aspects, and not about how to save and
+restore other VM state (i.e., migrate-to-file with 'x-ignore-shared').
+
+Overview
+--------
+
+With VM templating, a single template VM serves as the starting point for
+new VMs. This allows for fast and efficient replication of VMs, resulting
+in fast startup times and reduced memory consumption.
+
+Conceptually, the VM state is frozen, to then be used as a basis for new
+VMs. The Copy-On-Write mechanism in the operating systems makes
+sure that new VMs are able to read template VM memory; however, any
+modifications stay private and don't modify the original template VM or any
+other created VM.
+
+Memory configuration
+--------------------
+
+In order to create the template VM, we have to make sure that VM memory
+ends up in a file, from where it can be reused for the new VMs:
+
+Supply VM RAM via memory-backend-file, with 'share=on' (modifications go
+to the file) and 'readonly=off' (open the file writable). Note that
+'readonly=off' is implicit.
+
+In the following command-line example, a 2GB VM is created, whereby VM RAM
+is to be stored in the 'template' file.
+
+ qemu [...] -m 2g \
+ -object memory-backend-file,id=pc.ram,mem-path=template,size=2g,share=on,... \
+ -machine q35,memory-backend=pc.ram',
+
+If multiple memory backends are used (vNUMA, DIMMs), configure all
+memory backends accordingly.
+
+Once the VM is in the desired state, stop the VM and save other VM state,
+leaving the current state of VM RAM reside in the file.
+
+In order to have a new VM be based on a template VM, we have to
+configure VM RAM to be based on a template VM RAM file; however, the VM
+should not be able to modify file content.
+
+Supply VM RAM via memory-backend-file, with 'share=off' (modifications stay
+private), 'readonly=on' (open the file readonly) and 'rom=off' (don't make
+the memory readonly for the VM). Note that 'share=off' is implicit and
+that other VM state has to be restored separately.
+
+In the following command-line example, a 2GB VM is created based on the
+existing 2GB file 'template'.
+
+ qemu [...] -m 2g \
+ -object memory-backend-file,id=pc.ram,mem-path=template,size=2g,readonly=on,rom=off,... \
+ -machine q35,memory-backend=pc.ram',
+
+If multiple memory backends are used (vNUMA, DIMMs), configure all
+memory backends accordingly.
+
+Note that '-mem-path' cannot be used for VM templating when creating the
+template VM or when starting new VMs based on a template VM.
+
+Incompatible features
+---------------------
+
+Some features are incompatible with VM templating, as the underlying file
+cannot be modified to discard VM RAM, or to actually share memory with
+another process.
+
+vhost-user and multi-process QEMU
+'''''''''''''''''''''''''''''''''
+
+vhost-user and multi-process QEMU are incompatible with VM templating.
+These technologies rely on shared memory, however, the template VMs
+don't actually share memory ('share=off'), even though they are file-based.
+
+virtio-balloon
+''''''''''''''
+
+virtio-balloon inflation and "free page reporting" cannot discard VM RAM
+and will repeatedly report errors. While virtio-balloon can be used
+for template VMs (e.g., report VM RAM stats), "free page reporting"
+should be disabled and the balloon should not be inflated.
+
+virtio-mem
+''''''''''
+
+virtio-mem cannot discard VM RAM that is managed by the virtio-mem
+device. virtio-mem will fail early when realizing the device. To use
+VM templating with virtio-mem, either hotplug virtio-mem devices to the new
+VM, or don't supply any memory to the template VM using virtio-mem
+(requested-size=0), not using a template VM file as memory backend for the
+virtio-mem device.
+
+VM migration
+''''''''''''
+
+For VM migration, "x-release-ram" similarly relies on discarding of VM
+RAM on the migration source to free up migrated RAM, and will
+repeatedly report errors.
+
+Postcopy live migration fails discarding VM RAM on the migration
+destination early and refuses to activate postcopy live migration. Note
+that postcopy live migration usually only works on selected filesystems
+(shmem/tmpfs, hugetlbfs) either way.
--
2.41.0
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re:[PATCH v2 2/9] softmmu/physmem: Distinguish between file access mode and mmap protection
2023-08-22 11:44 ` [PATCH v2 2/9] softmmu/physmem: Distinguish between file access mode and mmap protection David Hildenbrand
@ 2023-08-22 13:13 ` ThinerLogoer
2023-08-22 13:25 ` [PATCH " David Hildenbrand
0 siblings, 1 reply; 25+ messages in thread
From: ThinerLogoer @ 2023-08-22 13:13 UTC (permalink / raw)
To: David Hildenbrand
Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Peter Xu, Igor Mammedov,
Philippe Mathieu-Daudé, Daniel P . Berrangé,
Stefan Hajnoczi, Elena Ufimtseva, Jagannathan Raman,
Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
Hello,
At 2023-08-22 19:44:50, "David Hildenbrand" <david@redhat.com> wrote:
>There is a difference between how we open a file and how we mmap it,
>and we want to support writable private mappings of readonly files. Let's
>define RAM_READONLY and RAM_READONLY_FD flags, to replace the single
>"readonly" parameter for file-related functions.
>
>In memory_region_init_ram_from_fd() and memory_region_init_ram_from_file(),
>initialize mr->readonly based on the new RAM_READONLY flag.
>
>While at it, add some RAM_* flags we missed to add to the list of accepted
>flags in the documentation of some functions.
>
>No change in functionality intended. We'll make use of both flags next
>and start setting them independently for memory-backend-file.
>
>Signed-off-by: David Hildenbrand <david@redhat.com>
>---
> backends/hostmem-file.c | 4 ++--
> include/exec/memory.h | 14 ++++++++++----
> include/exec/ram_addr.h | 8 ++++----
> softmmu/memory.c | 8 ++++----
> softmmu/physmem.c | 21 ++++++++++-----------
> 5 files changed, 30 insertions(+), 25 deletions(-)
>
>diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
>index b4335a80e6..ef2d5533af 100644
>--- a/backends/hostmem-file.c
>+++ b/backends/hostmem-file.c
>@@ -55,13 +55,13 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>
> name = host_memory_backend_get_name(backend);
> ram_flags = backend->share ? RAM_SHARED : 0;
>+ ram_flags |= fb->readonly ? RAM_READONLY | RAM_READONLY_FD : 0;
> ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
> ram_flags |= RAM_NAMED_FILE;
> memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
> backend->size, fb->align, ram_flags,
>- fb->mem_path, fb->offset, fb->readonly,
>- errp);
>+ fb->mem_path, fb->offset, errp);
> g_free(name);
> #endif
> }
>diff --git a/include/exec/memory.h b/include/exec/memory.h
>index 68284428f8..cc68249eda 100644
>--- a/include/exec/memory.h
>+++ b/include/exec/memory.h
>@@ -235,6 +235,12 @@ typedef struct IOMMUTLBEvent {
> /* RAM is an mmap-ed named file */
> #define RAM_NAMED_FILE (1 << 9)
>
>+/* RAM is mmap-ed read-only */
>+#define RAM_READONLY (1 << 10)
>+
>+/* RAM FD is opened read-only */
>+#define RAM_READONLY_FD (1 << 11)
>+
> static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
> IOMMUNotifierFlag flags,
> hwaddr start, hwaddr end,
>@@ -1331,10 +1337,10 @@ void memory_region_init_resizeable_ram(MemoryRegion *mr,
> * @align: alignment of the region base address; if 0, the default alignment
> * (getpagesize()) will be used.
> * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
>- * RAM_NORESERVE,
>+ * RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
>+ * RAM_READONLY_FD
> * @path: the path in which to allocate the RAM.
> * @offset: offset within the file referenced by path
>- * @readonly: true to open @path for reading, false for read/write.
> * @errp: pointer to Error*, to store an error if it happens.
> *
> * Note that this function does not do anything to cause the data in the
>@@ -1348,7 +1354,6 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
> uint32_t ram_flags,
> const char *path,
> ram_addr_t offset,
>- bool readonly,
> Error **errp);
>
> /**
>@@ -1360,7 +1365,8 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
> * @name: the name of the region.
> * @size: size of the region.
> * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
>- * RAM_NORESERVE, RAM_PROTECTED.
>+ * RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
>+ * RAM_READONLY_FD
> * @fd: the fd to mmap.
> * @offset: offset within the file referenced by fd
> * @errp: pointer to Error*, to store an error if it happens.
>diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
>index 9f2e3893f5..90676093f5 100644
>--- a/include/exec/ram_addr.h
>+++ b/include/exec/ram_addr.h
>@@ -108,10 +108,10 @@ long qemu_maxrampagesize(void);
> * @size: the size in bytes of the ram block
> * @mr: the memory region where the ram block is
> * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
>- * RAM_NORESERVE.
>+ * RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
>+ * RAM_READONLY_FD
> * @mem_path or @fd: specify the backing file or device
> * @offset: Offset into target file
>- * @readonly: true to open @path for reading, false for read/write.
> * @errp: pointer to Error*, to store an error if it happens
> *
> * Return:
>@@ -120,10 +120,10 @@ long qemu_maxrampagesize(void);
> */
> RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
> uint32_t ram_flags, const char *mem_path,
>- off_t offset, bool readonly, Error **errp);
>+ off_t offset, Error **errp);
> RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
> uint32_t ram_flags, int fd, off_t offset,
>- bool readonly, Error **errp);
>+ Error **errp);
>
> RAMBlock *qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
> MemoryRegion *mr, Error **errp);
>diff --git a/softmmu/memory.c b/softmmu/memory.c
>index 7d9494ce70..d8974a1e65 100644
>--- a/softmmu/memory.c
>+++ b/softmmu/memory.c
>@@ -1620,18 +1620,17 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
> uint32_t ram_flags,
> const char *path,
> ram_addr_t offset,
>- bool readonly,
> Error **errp)
> {
> Error *err = NULL;
> memory_region_init(mr, owner, name, size);
> mr->ram = true;
>- mr->readonly = readonly;
>+ mr->readonly = ram_flags & RAM_READONLY;
I only did a quick code auditing, but I suspect that
```
mr->readonly = !!(ram_flags & RAM_READONLY);
```
is safer. So is the other parts of the code probably.
> mr->terminates = true;
> mr->destructor = memory_region_destructor_ram;
> mr->align = align;
> mr->ram_block = qemu_ram_alloc_from_file(size, mr, ram_flags, path,
>- offset, readonly, &err);
>+ offset, &err);
> if (err) {
> mr->size = int128_zero();
> object_unparent(OBJECT(mr));
>@@ -1651,10 +1650,11 @@ void memory_region_init_ram_from_fd(MemoryRegion *mr,
> Error *err = NULL;
> memory_region_init(mr, owner, name, size);
> mr->ram = true;
>+ mr->readonly = ram_flags & RAM_READONLY;
for example here.
--
Regards,
logoerthiner
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re:[PATCH v2 8/9] docs: Don't mention "-mem-path" in multi-process.rst
2023-08-22 11:44 ` [PATCH v2 8/9] docs: Don't mention "-mem-path" in multi-process.rst David Hildenbrand
@ 2023-08-22 13:21 ` ThinerLogoer
2023-08-22 13:24 ` [PATCH " David Hildenbrand
0 siblings, 1 reply; 25+ messages in thread
From: ThinerLogoer @ 2023-08-22 13:21 UTC (permalink / raw)
To: David Hildenbrand
Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Peter Xu, Igor Mammedov,
Philippe Mathieu-Daudé, Daniel P . Berrangé,
Stefan Hajnoczi, Elena Ufimtseva, Jagannathan Raman,
Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
Hello,
At 2023-08-22 19:44:56, "David Hildenbrand" <david@redhat.com> wrote:
>"-mem-path" corresponds to "memory-backend-file,share=off" and,
>therefore, creates a private COW mapping of the file. For multi-proces
>QEMU, we need proper shared file-backed memory.
>
>Let's make that clearer.
>
>Signed-off-by: David Hildenbrand <david@redhat.com>
>---
> docs/devel/multi-process.rst | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
>diff --git a/docs/devel/multi-process.rst b/docs/devel/multi-process.rst
>index e4801751f2..4ef539c0b0 100644
>--- a/docs/devel/multi-process.rst
>+++ b/docs/devel/multi-process.rst
>@@ -409,8 +409,9 @@ the initial messages sent to the emulation process is a guest memory
> table. Each entry in this table consists of a file descriptor and size
> that the emulation process can ``mmap()`` to directly access guest
> memory, similar to ``vhost_user_set_mem_table()``. Note guest memory
>-must be backed by file descriptors, such as when QEMU is given the
>-*-mem-path* command line option.
>+must be backed by shared file-backed memory, for example, using
>+*-object memory-backend-file,share=on* and setting that memory backend
>+as RAM for the machine.
>
> IOMMU operations
> ^^^^^^^^^^^^^^^^
About "-mem-path" and "-object memory-backend-file".
I have mentioned a error message suggestion, maybe you can consider it? The
error message related to "-object memory-backend-file,id=pc.ram" is confusing,
as is shown below: (https://lore.kernel.org/all/2337d9f.16d6.189e8682901.Coremail.logoerthiner1@163.com/)
>
> Wait ... I thought it should not work but it did work today. How bad am I at reading
> the correct part of documentation ...
>
> '-machine q35 -m 512M' is equivalent to '-object
> memory-backend-file,id=pc.ram,size=512M
> -machine q35,memory-backend=pc.ram',
> the latter works, and the two mentioned setup can be
> migrated from one to another.
>
> What I was consistently trying was '-object
> memory-backend-file,id=pc.ram,size=512M -machine q35', and qemu raises an error
> for this in a recent update:
>
> ```
> qemu-system-x86_64: object name 'pc.ram' is reserved for the default RAM backend, it can't be used for any other purposes. Change the object's 'id' to something else
> ```
>
> This error is misleading. Actually in this case, the error report message should be more
> close to:
> ```
> object name 'pc.ram' is reserved for the default RAM backend, it can't
> be used for any other purposes. Change the object's 'id' to something
> else, or append "memory-backend=pc.ram" to -machine arguments
> ```
>
> (I suggest rewriting the error message like this string because of the confusion just now)
>
>
> Even though the default memory backend name is pc.ram, the
> '-machine q35,memory-backend=pc.ram' part explicitly marks that qemu
> uses a memory backend named pc.ram, rather than rely on default.
>
> It seems that if it "rely on default" and memory-backend-file has an id
> of "pc.ram" (in x86_64 of course), it will fail.
>
> Great. Now I will consider using a "-object
> memory-backend-file,id=pc.ram,size=512M
> -machine q35,memory-backend=pc.ram"
--
Regards,
logoerthiner
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 8/9] docs: Don't mention "-mem-path" in multi-process.rst
2023-08-22 13:21 ` ThinerLogoer
@ 2023-08-22 13:24 ` David Hildenbrand
0 siblings, 0 replies; 25+ messages in thread
From: David Hildenbrand @ 2023-08-22 13:24 UTC (permalink / raw)
To: ThinerLogoer
Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Peter Xu, Igor Mammedov,
Philippe Mathieu-Daudé, Daniel P . Berrangé,
Stefan Hajnoczi, Elena Ufimtseva, Jagannathan Raman,
Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
On 22.08.23 15:21, ThinerLogoer wrote:
> Hello,
>
> At 2023-08-22 19:44:56, "David Hildenbrand" <david@redhat.com> wrote:
>> "-mem-path" corresponds to "memory-backend-file,share=off" and,
>> therefore, creates a private COW mapping of the file. For multi-proces
>> QEMU, we need proper shared file-backed memory.
>>
>> Let's make that clearer.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>> docs/devel/multi-process.rst | 5 +++--
>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/docs/devel/multi-process.rst b/docs/devel/multi-process.rst
>> index e4801751f2..4ef539c0b0 100644
>> --- a/docs/devel/multi-process.rst
>> +++ b/docs/devel/multi-process.rst
>> @@ -409,8 +409,9 @@ the initial messages sent to the emulation process is a guest memory
>> table. Each entry in this table consists of a file descriptor and size
>> that the emulation process can ``mmap()`` to directly access guest
>> memory, similar to ``vhost_user_set_mem_table()``. Note guest memory
>> -must be backed by file descriptors, such as when QEMU is given the
>> -*-mem-path* command line option.
>> +must be backed by shared file-backed memory, for example, using
>> +*-object memory-backend-file,share=on* and setting that memory backend
>> +as RAM for the machine.
>>
>> IOMMU operations
>> ^^^^^^^^^^^^^^^^
>
> About "-mem-path" and "-object memory-backend-file".
>
> I have mentioned a error message suggestion, maybe you can consider it? The
> error message related to "-object memory-backend-file,id=pc.ram" is confusing,
> as is shown below: (https://lore.kernel.org/all/2337d9f.16d6.189e8682901.Coremail.logoerthiner1@163.com/)
Oh, I missed that. Let me see if I can clarify that! Thanks!
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 2/9] softmmu/physmem: Distinguish between file access mode and mmap protection
2023-08-22 13:13 ` ThinerLogoer
@ 2023-08-22 13:25 ` David Hildenbrand
0 siblings, 0 replies; 25+ messages in thread
From: David Hildenbrand @ 2023-08-22 13:25 UTC (permalink / raw)
To: ThinerLogoer
Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Peter Xu, Igor Mammedov,
Philippe Mathieu-Daudé, Daniel P . Berrangé,
Stefan Hajnoczi, Elena Ufimtseva, Jagannathan Raman,
Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
On 22.08.23 15:13, ThinerLogoer wrote:
> Hello,
>
> At 2023-08-22 19:44:50, "David Hildenbrand" <david@redhat.com> wrote:
>> There is a difference between how we open a file and how we mmap it,
>> and we want to support writable private mappings of readonly files. Let's
>> define RAM_READONLY and RAM_READONLY_FD flags, to replace the single
>> "readonly" parameter for file-related functions.
>>
>> In memory_region_init_ram_from_fd() and memory_region_init_ram_from_file(),
>> initialize mr->readonly based on the new RAM_READONLY flag.
>>
>> While at it, add some RAM_* flags we missed to add to the list of accepted
>> flags in the documentation of some functions.
>>
>> No change in functionality intended. We'll make use of both flags next
>> and start setting them independently for memory-backend-file.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>> backends/hostmem-file.c | 4 ++--
>> include/exec/memory.h | 14 ++++++++++----
>> include/exec/ram_addr.h | 8 ++++----
>> softmmu/memory.c | 8 ++++----
>> softmmu/physmem.c | 21 ++++++++++-----------
>> 5 files changed, 30 insertions(+), 25 deletions(-)
>>
>> diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
>> index b4335a80e6..ef2d5533af 100644
>> --- a/backends/hostmem-file.c
>> +++ b/backends/hostmem-file.c
>> @@ -55,13 +55,13 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>>
>> name = host_memory_backend_get_name(backend);
>> ram_flags = backend->share ? RAM_SHARED : 0;
>> + ram_flags |= fb->readonly ? RAM_READONLY | RAM_READONLY_FD : 0;
>> ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
>> ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
>> ram_flags |= RAM_NAMED_FILE;
>> memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
>> backend->size, fb->align, ram_flags,
>> - fb->mem_path, fb->offset, fb->readonly,
>> - errp);
>> + fb->mem_path, fb->offset, errp);
>> g_free(name);
>> #endif
>> }
>> diff --git a/include/exec/memory.h b/include/exec/memory.h
>> index 68284428f8..cc68249eda 100644
>> --- a/include/exec/memory.h
>> +++ b/include/exec/memory.h
>> @@ -235,6 +235,12 @@ typedef struct IOMMUTLBEvent {
>> /* RAM is an mmap-ed named file */
>> #define RAM_NAMED_FILE (1 << 9)
>>
>> +/* RAM is mmap-ed read-only */
>> +#define RAM_READONLY (1 << 10)
>> +
>> +/* RAM FD is opened read-only */
>> +#define RAM_READONLY_FD (1 << 11)
>> +
>> static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
>> IOMMUNotifierFlag flags,
>> hwaddr start, hwaddr end,
>> @@ -1331,10 +1337,10 @@ void memory_region_init_resizeable_ram(MemoryRegion *mr,
>> * @align: alignment of the region base address; if 0, the default alignment
>> * (getpagesize()) will be used.
>> * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
>> - * RAM_NORESERVE,
>> + * RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
>> + * RAM_READONLY_FD
>> * @path: the path in which to allocate the RAM.
>> * @offset: offset within the file referenced by path
>> - * @readonly: true to open @path for reading, false for read/write.
>> * @errp: pointer to Error*, to store an error if it happens.
>> *
>> * Note that this function does not do anything to cause the data in the
>> @@ -1348,7 +1354,6 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
>> uint32_t ram_flags,
>> const char *path,
>> ram_addr_t offset,
>> - bool readonly,
>> Error **errp);
>>
>> /**
>> @@ -1360,7 +1365,8 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
>> * @name: the name of the region.
>> * @size: size of the region.
>> * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
>> - * RAM_NORESERVE, RAM_PROTECTED.
>> + * RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
>> + * RAM_READONLY_FD
>> * @fd: the fd to mmap.
>> * @offset: offset within the file referenced by fd
>> * @errp: pointer to Error*, to store an error if it happens.
>> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
>> index 9f2e3893f5..90676093f5 100644
>> --- a/include/exec/ram_addr.h
>> +++ b/include/exec/ram_addr.h
>> @@ -108,10 +108,10 @@ long qemu_maxrampagesize(void);
>> * @size: the size in bytes of the ram block
>> * @mr: the memory region where the ram block is
>> * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
>> - * RAM_NORESERVE.
>> + * RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
>> + * RAM_READONLY_FD
>> * @mem_path or @fd: specify the backing file or device
>> * @offset: Offset into target file
>> - * @readonly: true to open @path for reading, false for read/write.
>> * @errp: pointer to Error*, to store an error if it happens
>> *
>> * Return:
>> @@ -120,10 +120,10 @@ long qemu_maxrampagesize(void);
>> */
>> RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
>> uint32_t ram_flags, const char *mem_path,
>> - off_t offset, bool readonly, Error **errp);
>> + off_t offset, Error **errp);
>> RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
>> uint32_t ram_flags, int fd, off_t offset,
>> - bool readonly, Error **errp);
>> + Error **errp);
>>
>> RAMBlock *qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
>> MemoryRegion *mr, Error **errp);
>> diff --git a/softmmu/memory.c b/softmmu/memory.c
>> index 7d9494ce70..d8974a1e65 100644
>> --- a/softmmu/memory.c
>> +++ b/softmmu/memory.c
>> @@ -1620,18 +1620,17 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
>> uint32_t ram_flags,
>> const char *path,
>> ram_addr_t offset,
>> - bool readonly,
>> Error **errp)
>> {
>> Error *err = NULL;
>> memory_region_init(mr, owner, name, size);
>> mr->ram = true;
>> - mr->readonly = readonly;
>> + mr->readonly = ram_flags & RAM_READONLY;
>
> I only did a quick code auditing, but I suspect that
> ```
> mr->readonly = !!(ram_flags & RAM_READONLY);
> ```
> is safer. So is the other parts of the code probably.
Yes, looks cleaner, thanks!
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 3/9] backends/hostmem-file: Add "rom" property to support VM templating with R/O files
2023-08-22 11:44 ` [PATCH v2 3/9] backends/hostmem-file: Add "rom" property to support VM templating with R/O files David Hildenbrand
@ 2023-08-22 13:27 ` Markus Armbruster
2023-08-22 13:29 ` David Hildenbrand
2023-08-22 14:26 ` ThinerLogoer
1 sibling, 1 reply; 25+ messages in thread
From: Markus Armbruster @ 2023-08-22 13:27 UTC (permalink / raw)
To: David Hildenbrand
Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Peter Xu, Igor Mammedov,
Thiner Logoer, Philippe Mathieu-Daudé,
Daniel P . Berrangé, Stefan Hajnoczi, Elena Ufimtseva,
Jagannathan Raman, Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Eduardo Habkost
David Hildenbrand <david@redhat.com> writes:
> For now, "share=off,readonly=on" would always result in us opening the
> file R/O and mmap'ing the opened file MAP_PRIVATE R/O -- effectively
> turning it into ROM.
>
> Especially for VM templating, "share=off" is a common use case. However,
> that use case is impossible with files that lack write permissions,
> because "share=off,readonly=on" will not give us writable RAM.
>
> The sole user of ROM via memory-backend-file are R/O NVDIMMs, but as we
> have users (Kata Containers) that rely on the existing behavior --
> malicious VMs should not be able to consume COW memory for R/O NVDIMMs --
> we cannot change the semantics of "share=off,readonly=on"
>
> So let's add a new "rom" property with on/off/auto values. "auto" is
> the default and what most people will use: for historical reasons, to not
> change the old semantics, it defaults to the value of the "readonly"
> property.
>
> For VM templating, one can now use:
> -object memory-backend-file,share=off,readonly=on,rom=off,...
>
> But we'll disallow:
> -object memory-backend-file,share=on,readonly=on,rom=off,...
> because we would otherwise get an error when trying to mmap the R/O file
> shared and writable. An explicit error message is cleaner.
>
> We will also disallow for now:
> -object memory-backend-file,share=off,readonly=off,rom=on,...
> -object memory-backend-file,share=on,readonly=off,rom=on,...
> It's not harmful, but also not really required for now.
>
> Alternatives that were abandoned:
> * Make "unarmed=on" for the NVDIMM set the memory region container
> readonly. We would still see a change of ROM->RAM and possibly run
> into memslot limits with vhost-user. Further, there might be use cases
> for "unarmed=on" that should still allow writing to that memory
> (temporary files, system RAM, ...).
> * Add a new "readonly=on/off/auto" parameter for NVDIMMs. Similar issues
> as with "unarmed=on".
> * Make "readonly" consume "on/off/file" instead of being a 'bool' type.
> This would slightly changes the behavior of the "readonly" parameter:
> values like true/false (as accepted by a 'bool'type) would no longer be
> accepted.
>
> Signed-off-by: David Hildenbrand <david@redhat.com>
[...]
> static void file_backend_instance_finalize(Object *o)
> diff --git a/qapi/qom.json b/qapi/qom.json
> index fa3e88c8e6..0cf83c6f39 100644
> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -668,6 +668,9 @@
> # @readonly: if true, the backing file is opened read-only; if false,
> # it is opened read-write. (default: false)
> #
> +# @rom: whether to create Read Only Memory (ROM). If set to 'auto', it
> +# defaults to the value of @readonly. (default: auto, since 8.2)
> +#
> # Since: 2.1
> ##
The commit message discusses how @readonly, @rom and @share interact.
The doc comments don't, and users have to guess.
I can see two ways to help users:
1. Describe their interaction in full, so users can understand how to
get from them what they need.
2. Provide suitable guidance on how to use them.
> { 'struct': 'MemoryBackendFileProperties',
> @@ -677,7 +680,8 @@
> '*discard-data': 'bool',
> 'mem-path': 'str',
> '*pmem': { 'type': 'bool', 'if': 'CONFIG_LIBPMEM' },
> - '*readonly': 'bool' } }
> + '*readonly': 'bool',
> + '*rom': 'OnOffAuto' } }
> ##
> # @MemoryBackendMemfdProperties:
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 29b98c3d4c..03ce0b0a30 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -4976,7 +4976,7 @@ SRST
> they are specified. Note that the 'id' property must be set. These
> objects are placed in the '/objects' path.
>
> - ``-object memory-backend-file,id=id,size=size,mem-path=dir,share=on|off,discard-data=on|off,merge=on|off,dump=on|off,prealloc=on|off,host-nodes=host-nodes,policy=default|preferred|bind|interleave,align=align,offset=offset,readonly=on|off``
> + ``-object memory-backend-file,id=id,size=size,mem-path=dir,share=on|off,discard-data=on|off,merge=on|off,dump=on|off,prealloc=on|off,host-nodes=host-nodes,policy=default|preferred|bind|interleave,align=align,offset=offset,readonly=on|off,rom=on|off|auto``
> Creates a memory file backend object, which can be used to back
> the guest RAM with huge pages.
>
> @@ -5066,6 +5066,14 @@ SRST
> The ``readonly`` option specifies whether the backing file is opened
> read-only or read-write (default).
>
> + The ``rom`` option specifies whether to create Read Only Memory (ROM)
> + that cannot be modified by the VM. If set to ``on``, the VM cannot
> + modify the memory. If set to ``off``, the VM can modify the memory.
> + If set to ``auto`` (default), the value of the ``readonly`` property
> + is used. This option is primarily helpful for VM templating, where we
> + want to open a file readonly (``readonly=on``) and allow private
> + modifications of the memory by the VM (``share=off``, ``rom=off``).
> +
Here, you provide some guidance.
> ``-object memory-backend-ram,id=id,merge=on|off,dump=on|off,share=on|off,prealloc=on|off,size=size,host-nodes=host-nodes,policy=default|preferred|bind|interleave``
> Creates a memory backend object, which can be used to back the
> guest RAM. Memory backend objects offer more control than the
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 3/9] backends/hostmem-file: Add "rom" property to support VM templating with R/O files
2023-08-22 13:27 ` Markus Armbruster
@ 2023-08-22 13:29 ` David Hildenbrand
0 siblings, 0 replies; 25+ messages in thread
From: David Hildenbrand @ 2023-08-22 13:29 UTC (permalink / raw)
To: Markus Armbruster
Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Peter Xu, Igor Mammedov,
Thiner Logoer, Philippe Mathieu-Daudé,
Daniel P.Berrangé, Stefan Hajnoczi, Elena Ufimtseva,
Jagannathan Raman, Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Eduardo Habkost
On 22.08.23 15:27, Markus Armbruster wrote:
> David Hildenbrand <david@redhat.com> writes:
>
>> For now, "share=off,readonly=on" would always result in us opening the
>> file R/O and mmap'ing the opened file MAP_PRIVATE R/O -- effectively
>> turning it into ROM.
>>
>> Especially for VM templating, "share=off" is a common use case. However,
>> that use case is impossible with files that lack write permissions,
>> because "share=off,readonly=on" will not give us writable RAM.
>>
>> The sole user of ROM via memory-backend-file are R/O NVDIMMs, but as we
>> have users (Kata Containers) that rely on the existing behavior --
>> malicious VMs should not be able to consume COW memory for R/O NVDIMMs --
>> we cannot change the semantics of "share=off,readonly=on"
>>
>> So let's add a new "rom" property with on/off/auto values. "auto" is
>> the default and what most people will use: for historical reasons, to not
>> change the old semantics, it defaults to the value of the "readonly"
>> property.
>>
>> For VM templating, one can now use:
>> -object memory-backend-file,share=off,readonly=on,rom=off,...
>>
>> But we'll disallow:
>> -object memory-backend-file,share=on,readonly=on,rom=off,...
>> because we would otherwise get an error when trying to mmap the R/O file
>> shared and writable. An explicit error message is cleaner.
>>
>> We will also disallow for now:
>> -object memory-backend-file,share=off,readonly=off,rom=on,...
>> -object memory-backend-file,share=on,readonly=off,rom=on,...
>> It's not harmful, but also not really required for now.
>>
>> Alternatives that were abandoned:
>> * Make "unarmed=on" for the NVDIMM set the memory region container
>> readonly. We would still see a change of ROM->RAM and possibly run
>> into memslot limits with vhost-user. Further, there might be use cases
>> for "unarmed=on" that should still allow writing to that memory
>> (temporary files, system RAM, ...).
>> * Add a new "readonly=on/off/auto" parameter for NVDIMMs. Similar issues
>> as with "unarmed=on".
>> * Make "readonly" consume "on/off/file" instead of being a 'bool' type.
>> This would slightly changes the behavior of the "readonly" parameter:
>> values like true/false (as accepted by a 'bool'type) would no longer be
>> accepted.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>
> [...]
>
>> static void file_backend_instance_finalize(Object *o)
>> diff --git a/qapi/qom.json b/qapi/qom.json
>> index fa3e88c8e6..0cf83c6f39 100644
>> --- a/qapi/qom.json
>> +++ b/qapi/qom.json
>> @@ -668,6 +668,9 @@
>> # @readonly: if true, the backing file is opened read-only; if false,
>> # it is opened read-write. (default: false)
>> #
>> +# @rom: whether to create Read Only Memory (ROM). If set to 'auto', it
>> +# defaults to the value of @readonly. (default: auto, since 8.2)
>> +#
>> # Since: 2.1
>> ##
>
> The commit message discusses how @readonly, @rom and @share interact.
> The doc comments don't, and users have to guess.
>
> I can see two ways to help users:
>
> 1. Describe their interaction in full, so users can understand how to
> get from them what they need.
>
> 2. Provide suitable guidance on how to use them.
>
>> { 'struct': 'MemoryBackendFileProperties',
>> @@ -677,7 +680,8 @@
>> '*discard-data': 'bool',
>> 'mem-path': 'str',
>> '*pmem': { 'type': 'bool', 'if': 'CONFIG_LIBPMEM' },
>> - '*readonly': 'bool' } }
>> + '*readonly': 'bool',
>> + '*rom': 'OnOffAuto' } }
>> ##
>> # @MemoryBackendMemfdProperties:
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index 29b98c3d4c..03ce0b0a30 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -4976,7 +4976,7 @@ SRST
>> they are specified. Note that the 'id' property must be set. These
>> objects are placed in the '/objects' path.
>>
>> - ``-object memory-backend-file,id=id,size=size,mem-path=dir,share=on|off,discard-data=on|off,merge=on|off,dump=on|off,prealloc=on|off,host-nodes=host-nodes,policy=default|preferred|bind|interleave,align=align,offset=offset,readonly=on|off``
>> + ``-object memory-backend-file,id=id,size=size,mem-path=dir,share=on|off,discard-data=on|off,merge=on|off,dump=on|off,prealloc=on|off,host-nodes=host-nodes,policy=default|preferred|bind|interleave,align=align,offset=offset,readonly=on|off,rom=on|off|auto``
>> Creates a memory file backend object, which can be used to back
>> the guest RAM with huge pages.
>>
>> @@ -5066,6 +5066,14 @@ SRST
>> The ``readonly`` option specifies whether the backing file is opened
>> read-only or read-write (default).
>>
>> + The ``rom`` option specifies whether to create Read Only Memory (ROM)
>> + that cannot be modified by the VM. If set to ``on``, the VM cannot
>> + modify the memory. If set to ``off``, the VM can modify the memory.
>> + If set to ``auto`` (default), the value of the ``readonly`` property
>> + is used. This option is primarily helpful for VM templating, where we
>> + want to open a file readonly (``readonly=on``) and allow private
>> + modifications of the memory by the VM (``share=off``, ``rom=off``).
>> +
>
> Here, you provide some guidance.
Is that sufficient in your opinion? Then I could similarly replicate
(unfortunately) it in the qapi/qom.json doc?
Thanks Markus!
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 9/9] docs: Start documenting VM templating
2023-08-22 11:44 ` [PATCH v2 9/9] docs: Start documenting VM templating David Hildenbrand
@ 2023-08-22 13:47 ` Daniel P. Berrangé
2023-08-22 14:04 ` David Hildenbrand
2023-08-22 14:23 ` Peter Maydell
1 sibling, 1 reply; 25+ messages in thread
From: Daniel P. Berrangé @ 2023-08-22 13:47 UTC (permalink / raw)
To: David Hildenbrand
Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Peter Xu, Igor Mammedov,
Thiner Logoer, Philippe Mathieu-Daudé, Stefan Hajnoczi,
Elena Ufimtseva, Jagannathan Raman, Michael S. Tsirkin, Ani Sinha,
Xiao Guangrong, Daniel Henrique Barboza, Greg Kurz, Eric Blake,
Markus Armbruster, Eduardo Habkost
On Tue, Aug 22, 2023 at 01:44:57PM +0200, David Hildenbrand wrote:
> Let's add some details about VM templating, focusing on the VM memory
> configuration only.
>
> There is much more to VM templating (VM state? block devices?), but I leave
> that as future work.
Then there's the supposedly "unique" hardware identifiers, most notably
VM UUID & NIC MAC addr that don't change if you create many VMs from
a "template". Or from the guest OS there are "unique" things like
/etc/machine-id, SSH host keys, web server certificates, etc.
The vmgenid device at least provides a way for guest OS to get notified
to update its unique resources/identifiers, but doesn't solve the overall
VM UUID. NIC MAC addr could be solved by hotunplug+plug either side of
creating the template & instantiating the template.
>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
> docs/vm-templating.txt | 109 +++++++++++++++++++++++++++++++++++++++++
Can you make this doument RST from the start and link to it from
somewhere appropriate in our documentation. Perhaps it should live
under the docs/system/ directory ?
> 1 file changed, 109 insertions(+)
> create mode 100644 docs/vm-templating.txt
>
> diff --git a/docs/vm-templating.txt b/docs/vm-templating.txt
> new file mode 100644
> index 0000000000..419362c1ea
> --- /dev/null
> +++ b/docs/vm-templating.txt
> @@ -0,0 +1,109 @@
> +QEMU VM templating
> +==================
> +
> +This document explains how to use VM templating in QEMU.
> +
> +For now, the focus is on VM memory aspects, and not about how to save and
> +restore other VM state (i.e., migrate-to-file with 'x-ignore-shared').
> +
> +Overview
> +--------
> +
> +With VM templating, a single template VM serves as the starting point for
> +new VMs. This allows for fast and efficient replication of VMs, resulting
> +in fast startup times and reduced memory consumption.
> +
> +Conceptually, the VM state is frozen, to then be used as a basis for new
> +VMs. The Copy-On-Write mechanism in the operating systems makes
> +sure that new VMs are able to read template VM memory; however, any
> +modifications stay private and don't modify the original template VM or any
> +other created VM.
I feel like we should have a paragraph at the top here explicitly calling
out the dangers of templating, wrt to unique data in the hardware and guest
OS. Don't have to provide solutions, just more of a scarcy "here be dragons"
warning to users who might be tempted to try this.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 9/9] docs: Start documenting VM templating
2023-08-22 13:47 ` Daniel P. Berrangé
@ 2023-08-22 14:04 ` David Hildenbrand
0 siblings, 0 replies; 25+ messages in thread
From: David Hildenbrand @ 2023-08-22 14:04 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Peter Xu, Igor Mammedov,
Thiner Logoer, Philippe Mathieu-Daudé, Stefan Hajnoczi,
Elena Ufimtseva, Jagannathan Raman, Michael S. Tsirkin, Ani Sinha,
Xiao Guangrong, Daniel Henrique Barboza, Greg Kurz, Eric Blake,
Markus Armbruster, Eduardo Habkost
On 22.08.23 15:47, Daniel P. Berrangé wrote:
> On Tue, Aug 22, 2023 at 01:44:57PM +0200, David Hildenbrand wrote:
>> Let's add some details about VM templating, focusing on the VM memory
>> configuration only.
>>
>> There is much more to VM templating (VM state? block devices?), but I leave
>> that as future work.
>
> Then there's the supposedly "unique" hardware identifiers, most notably
> VM UUID & NIC MAC addr that don't change if you create many VMs from
> a "template". Or from the guest OS there are "unique" things like
> /etc/machine-id, SSH host keys, web server certificates, etc.
>
> The vmgenid device at least provides a way for guest OS to get notified
> to update its unique resources/identifiers, but doesn't solve the overall
> VM UUID. NIC MAC addr could be solved by hotunplug+plug either side of
> creating the template & instantiating the template.
>
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>> docs/vm-templating.txt | 109 +++++++++++++++++++++++++++++++++++++++++
>
> Can you make this doument RST from the start and link to it from
> somewhere appropriate in our documentation. Perhaps it should live
> under the docs/system/ directory ?
I blindly did what memory-hotplug.txt and nvdimm.txt do. I can make it a
RST and move under docs/system [+ link it in the index]
>
>> 1 file changed, 109 insertions(+)
>> create mode 100644 docs/vm-templating.txt
>>
>> diff --git a/docs/vm-templating.txt b/docs/vm-templating.txt
>> new file mode 100644
>> index 0000000000..419362c1ea
>> --- /dev/null
>> +++ b/docs/vm-templating.txt
>> @@ -0,0 +1,109 @@
>> +QEMU VM templating
>> +==================
>> +
>> +This document explains how to use VM templating in QEMU.
>> +
>> +For now, the focus is on VM memory aspects, and not about how to save and
>> +restore other VM state (i.e., migrate-to-file with 'x-ignore-shared').
>> +
>> +Overview
>> +--------
>> +
>> +With VM templating, a single template VM serves as the starting point for
>> +new VMs. This allows for fast and efficient replication of VMs, resulting
>> +in fast startup times and reduced memory consumption.
>> +
>> +Conceptually, the VM state is frozen, to then be used as a basis for new
>> +VMs. The Copy-On-Write mechanism in the operating systems makes
>> +sure that new VMs are able to read template VM memory; however, any
>> +modifications stay private and don't modify the original template VM or any
>> +other created VM.
>
> I feel like we should have a paragraph at the top here explicitly calling
> out the dangers of templating, wrt to unique data in the hardware and guest
> OS. Don't have to provide solutions, just more of a scarcy "here be dragons"
> warning to users who might be tempted to try this.
Agreed, I'll use some of your information above, thanks!
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 9/9] docs: Start documenting VM templating
2023-08-22 11:44 ` [PATCH v2 9/9] docs: Start documenting VM templating David Hildenbrand
2023-08-22 13:47 ` Daniel P. Berrangé
@ 2023-08-22 14:23 ` Peter Maydell
2023-08-22 14:31 ` David Hildenbrand
1 sibling, 1 reply; 25+ messages in thread
From: Peter Maydell @ 2023-08-22 14:23 UTC (permalink / raw)
To: David Hildenbrand
Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Peter Xu, Igor Mammedov,
Thiner Logoer, Philippe Mathieu-Daudé,
Daniel P . Berrangé, Stefan Hajnoczi, Elena Ufimtseva,
Jagannathan Raman, Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
On Tue, 22 Aug 2023 at 12:49, David Hildenbrand <david@redhat.com> wrote:
>
> Let's add some details about VM templating, focusing on the VM memory
> configuration only.
>
> There is much more to VM templating (VM state? block devices?), but I leave
> that as future work.
>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
> docs/vm-templating.txt | 109 +++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 109 insertions(+)
> create mode 100644 docs/vm-templating.txt
No new .txt files in docs/, please. Use rst, and incorporate
the information into the correct parts of the manual structure.
thanks
-- PMM
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re:[PATCH v2 3/9] backends/hostmem-file: Add "rom" property to support VM templating with R/O files
2023-08-22 11:44 ` [PATCH v2 3/9] backends/hostmem-file: Add "rom" property to support VM templating with R/O files David Hildenbrand
2023-08-22 13:27 ` Markus Armbruster
@ 2023-08-22 14:26 ` ThinerLogoer
2023-08-23 12:43 ` [PATCH " David Hildenbrand
1 sibling, 1 reply; 25+ messages in thread
From: ThinerLogoer @ 2023-08-22 14:26 UTC (permalink / raw)
To: David Hildenbrand
Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Peter Xu, Igor Mammedov,
Philippe Mathieu-Daudé, Daniel P . Berrangé,
Stefan Hajnoczi, Elena Ufimtseva, Jagannathan Raman,
Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
Hello,
At 2023-08-22 19:44:51, "David Hildenbrand" <david@redhat.com> wrote:
>For now, "share=off,readonly=on" would always result in us opening the
>file R/O and mmap'ing the opened file MAP_PRIVATE R/O -- effectively
>turning it into ROM.
>
>Especially for VM templating, "share=off" is a common use case. However,
>that use case is impossible with files that lack write permissions,
>because "share=off,readonly=on" will not give us writable RAM.
>
>The sole user of ROM via memory-backend-file are R/O NVDIMMs, but as we
>have users (Kata Containers) that rely on the existing behavior --
>malicious VMs should not be able to consume COW memory for R/O NVDIMMs --
>we cannot change the semantics of "share=off,readonly=on"
>
>So let's add a new "rom" property with on/off/auto values. "auto" is
>the default and what most people will use: for historical reasons, to not
>change the old semantics, it defaults to the value of the "readonly"
>property.
>
>For VM templating, one can now use:
> -object memory-backend-file,share=off,readonly=on,rom=off,...
>
>But we'll disallow:
> -object memory-backend-file,share=on,readonly=on,rom=off,...
>because we would otherwise get an error when trying to mmap the R/O file
>shared and writable. An explicit error message is cleaner.
>
>We will also disallow for now:
> -object memory-backend-file,share=off,readonly=off,rom=on,...
> -object memory-backend-file,share=on,readonly=off,rom=on,...
>It's not harmful, but also not really required for now.
>
>Alternatives that were abandoned:
>* Make "unarmed=on" for the NVDIMM set the memory region container
> readonly. We would still see a change of ROM->RAM and possibly run
> into memslot limits with vhost-user. Further, there might be use cases
> for "unarmed=on" that should still allow writing to that memory
> (temporary files, system RAM, ...).
>* Add a new "readonly=on/off/auto" parameter for NVDIMMs. Similar issues
> as with "unarmed=on".
>* Make "readonly" consume "on/off/file" instead of being a 'bool' type.
> This would slightly changes the behavior of the "readonly" parameter:
> values like true/false (as accepted by a 'bool'type) would no longer be
> accepted.
>
>Signed-off-by: David Hildenbrand <david@redhat.com>
>---
> backends/hostmem-file.c | 59 ++++++++++++++++++++++++++++++++++++++++-
> qapi/qom.json | 6 ++++-
> qemu-options.hx | 10 ++++++-
> 3 files changed, 72 insertions(+), 3 deletions(-)
>
>diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
>index ef2d5533af..361d4a8103 100644
>--- a/backends/hostmem-file.c
>+++ b/backends/hostmem-file.c
>@@ -18,6 +18,8 @@
> #include "sysemu/hostmem.h"
> #include "qom/object_interfaces.h"
> #include "qom/object.h"
>+#include "qapi/visitor.h"
>+#include "qapi/qapi-visit-common.h"
>
> OBJECT_DECLARE_SIMPLE_TYPE(HostMemoryBackendFile, MEMORY_BACKEND_FILE)
>
>@@ -31,6 +33,7 @@ struct HostMemoryBackendFile {
> bool discard_data;
> bool is_pmem;
> bool readonly;
>+ OnOffAuto rom;
> };
>
> static void
>@@ -53,9 +56,33 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
> return;
> }
>
>+ switch (fb->rom) {
>+ case ON_OFF_AUTO_AUTO:
>+ /* Traditionally, opening the file readonly always resulted in ROM. */
>+ fb->rom = fb->readonly ? ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
>+ break;
>+ case ON_OFF_AUTO_ON:
>+ if (!fb->readonly) {
>+ error_setg(errp, "property 'rom' = 'on' is not supported with"
>+ " 'readonly' = 'off'");
>+ return;
>+ }
>+ break;
>+ case ON_OFF_AUTO_OFF:
>+ if (fb->readonly && backend->share) {
>+ error_setg(errp, "property 'rom' = 'off' is incompatible with"
>+ " 'readonly' = 'on' and 'share' = 'on'");
>+ return;
>+ }
>+ break;
>+ default:
>+ assert(false);
>+ }
>+
> name = host_memory_backend_get_name(backend);
> ram_flags = backend->share ? RAM_SHARED : 0;
>- ram_flags |= fb->readonly ? RAM_READONLY | RAM_READONLY_FD : 0;
>+ ram_flags |= fb->readonly ? RAM_READONLY_FD : 0;
>+ ram_flags |= fb->rom == ON_OFF_AUTO_ON ? RAM_READONLY : 0;
> ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
> ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
> ram_flags |= RAM_NAMED_FILE;
>@@ -201,6 +228,32 @@ static void file_memory_backend_set_readonly(Object *obj, bool value,
> fb->readonly = value;
> }
>
>+static void file_memory_backend_get_rom(Object *obj, Visitor *v,
>+ const char *name, void *opaque,
>+ Error **errp)
>+{
>+ HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj);
>+ OnOffAuto rom = fb->rom;
>+
>+ visit_type_OnOffAuto(v, name, &rom, errp);
>+}
>+
>+static void file_memory_backend_set_rom(Object *obj, Visitor *v,
>+ const char *name, void *opaque,
>+ Error **errp)
>+{
>+ HostMemoryBackend *backend = MEMORY_BACKEND(obj);
>+ HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj);
>+
>+ if (host_memory_backend_mr_inited(backend)) {
>+ error_setg(errp, "cannot change property '%s' of %s.", name,
>+ object_get_typename(obj));
>+ return;
>+ }
>+
>+ visit_type_OnOffAuto(v, name, &fb->rom, errp);
>+}
>+
> static void file_backend_unparent(Object *obj)
> {
> HostMemoryBackend *backend = MEMORY_BACKEND(obj);
>@@ -243,6 +296,10 @@ file_backend_class_init(ObjectClass *oc, void *data)
> object_class_property_add_bool(oc, "readonly",
> file_memory_backend_get_readonly,
> file_memory_backend_set_readonly);
>+ object_class_property_add(oc, "rom", "OnOffAuto",
>+ file_memory_backend_get_rom, file_memory_backend_set_rom, NULL, NULL);
>+ object_class_property_set_description(oc, "rom",
>+ "Whether to create Read Only Memory (ROM)");
> }
>
> static void file_backend_instance_finalize(Object *o)
>diff --git a/qapi/qom.json b/qapi/qom.json
>index fa3e88c8e6..0cf83c6f39 100644
>--- a/qapi/qom.json
>+++ b/qapi/qom.json
>@@ -668,6 +668,9 @@
> # @readonly: if true, the backing file is opened read-only; if false,
> # it is opened read-write. (default: false)
> #
>+# @rom: whether to create Read Only Memory (ROM). If set to 'auto', it
>+# defaults to the value of @readonly. (default: auto, since 8.2)
>+#
> # Since: 2.1
> ##
> { 'struct': 'MemoryBackendFileProperties',
>@@ -677,7 +680,8 @@
> '*discard-data': 'bool',
> 'mem-path': 'str',
> '*pmem': { 'type': 'bool', 'if': 'CONFIG_LIBPMEM' },
>- '*readonly': 'bool' } }
>+ '*readonly': 'bool',
>+ '*rom': 'OnOffAuto' } }
>
> ##
> # @MemoryBackendMemfdProperties:
>diff --git a/qemu-options.hx b/qemu-options.hx
>index 29b98c3d4c..03ce0b0a30 100644
>--- a/qemu-options.hx
>+++ b/qemu-options.hx
>@@ -4976,7 +4976,7 @@ SRST
> they are specified. Note that the 'id' property must be set. These
> objects are placed in the '/objects' path.
>
>- ``-object memory-backend-file,id=id,size=size,mem-path=dir,share=on|off,discard-data=on|off,merge=on|off,dump=on|off,prealloc=on|off,host-nodes=host-nodes,policy=default|preferred|bind|interleave,align=align,offset=offset,readonly=on|off``
>+ ``-object memory-backend-file,id=id,size=size,mem-path=dir,share=on|off,discard-data=on|off,merge=on|off,dump=on|off,prealloc=on|off,host-nodes=host-nodes,policy=default|preferred|bind|interleave,align=align,offset=offset,readonly=on|off,rom=on|off|auto``
> Creates a memory file backend object, which can be used to back
> the guest RAM with huge pages.
>
>@@ -5066,6 +5066,14 @@ SRST
> The ``readonly`` option specifies whether the backing file is opened
> read-only or read-write (default).
>
>+ The ``rom`` option specifies whether to create Read Only Memory (ROM)
>+ that cannot be modified by the VM. If set to ``on``, the VM cannot
>+ modify the memory. If set to ``off``, the VM can modify the memory.
>+ If set to ``auto`` (default), the value of the ``readonly`` property
>+ is used. This option is primarily helpful for VM templating, where we
>+ want to open a file readonly (``readonly=on``) and allow private
>+ modifications of the memory by the VM (``share=off``, ``rom=off``).
>+
> ``-object memory-backend-ram,id=id,merge=on|off,dump=on|off,share=on|off,prealloc=on|off,size=size,host-nodes=host-nodes,policy=default|preferred|bind|interleave``
> Creates a memory backend object, which can be used to back the
> guest RAM. Memory backend objects offer more control than the
In one word, I'd suggest advertising the existence of "rom" option more invasively, whenever
private file mapping is used.
IMHO you should probably be more invasive here to warn unconditionally when
the memory backend file is going to be opened readwrite but is mapped non shared.
I as a user find the patch series indeed work functionally when I am aware of the "rom"
option - but what if I am not aware, the outcome is still that qemu tried
to open the file readwrite even when it is going to be mapped private.
When the file is readonly, the error message is:
```
qemu-system-x86_64: can't open backing store pc.ram for guest RAM: Permission denied
```
This should be probably helpful if qemu found that the file exists as a regular file and
is mapped private, to say something like
```
qemu-system-x86_64: can't open backing store pc.ram for guest RAM: Permission denied
tip: mapping is private and ram file is probably readonly, maybe you should append "readonly=on,rom=off"
to "-object memory-backend-file,..." option list. see documentation xxx for details
```
to advertise the "rom" option.
--
Regards,
logoerthiner
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 9/9] docs: Start documenting VM templating
2023-08-22 14:23 ` Peter Maydell
@ 2023-08-22 14:31 ` David Hildenbrand
0 siblings, 0 replies; 25+ messages in thread
From: David Hildenbrand @ 2023-08-22 14:31 UTC (permalink / raw)
To: Peter Maydell
Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Peter Xu, Igor Mammedov,
Thiner Logoer, Philippe Mathieu-Daudé,
Daniel P . Berrangé, Stefan Hajnoczi, Elena Ufimtseva,
Jagannathan Raman, Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
On 22.08.23 16:23, Peter Maydell wrote:
> On Tue, 22 Aug 2023 at 12:49, David Hildenbrand <david@redhat.com> wrote:
>>
>> Let's add some details about VM templating, focusing on the VM memory
>> configuration only.
>>
>> There is much more to VM templating (VM state? block devices?), but I leave
>> that as future work.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>> docs/vm-templating.txt | 109 +++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 109 insertions(+)
>> create mode 100644 docs/vm-templating.txt
>
> No new .txt files in docs/, please. Use rst, and incorporate
> the information into the correct parts of the manual structure.
Thanks, already raised by Daniel. Will be an RST and moved under
docs/system.
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 1/9] nvdimm: Reject writing label data to ROM instead of crashing QEMU
2023-08-22 11:44 ` [PATCH v2 1/9] nvdimm: Reject writing label data to ROM instead of crashing QEMU David Hildenbrand
@ 2023-08-22 19:25 ` Stefan Hajnoczi
0 siblings, 0 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2023-08-22 19:25 UTC (permalink / raw)
To: David Hildenbrand
Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Peter Xu, Igor Mammedov,
Thiner Logoer, Philippe Mathieu-Daudé,
Daniel P . Berrangé, Elena Ufimtseva, Jagannathan Raman,
Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
[-- Attachment #1: Type: text/plain, Size: 2023 bytes --]
On Tue, Aug 22, 2023 at 01:44:49PM +0200, David Hildenbrand wrote:
> Currently, when using a true R/O NVDIMM (ROM memory backend) with a label
> area, the VM can easily crash QEMU by trying to write to the label area,
> because the ROM memory is mmap'ed without PROT_WRITE.
>
> [root@vm-0 ~]# ndctl disable-region region0
> disabled 1 region
> [root@vm-0 ~]# ndctl zero-labels nmem0
> -> QEMU segfaults
>
> Let's remember whether we have a ROM memory backend and properly
> reject the write request:
>
> [root@vm-0 ~]# ndctl disable-region region0
> disabled 1 region
> [root@vm-0 ~]# ndctl zero-labels nmem0
> zeroed 0 nmem
>
> In comparison, on a system with a R/W NVDIMM:
>
> [root@vm-0 ~]# ndctl disable-region region0
> disabled 1 region
> [root@vm-0 ~]# ndctl zero-labels nmem0
> zeroed 1 nmem
>
> For ACPI, just return "unsupported", like if no label exists. For spapr,
> return "H_P2", similar to when no label area exists.
>
> Could we rely on the "unarmed" property? Maybe, but it looks cleaner to
> only disallow what certainly cannot work.
>
> After all "unarmed=on" primarily means: cannot accept persistent writes. In
> theory, there might be setups where devices with "unarmed=on" set could
> be used to host non-persistent data (temporary files, system RAM, ...); for
> example, in Linux, admins can overwrite the "readonly" setting and still
> write to the device -- which will work as long as we're not using ROM.
> Allowing writing label data in such configurations can make sense.
>
> Fixes: dbd730e85987 ("nvdimm: check -object memory-backend-file, readonly=on option")
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
> hw/acpi/nvdimm.c | 11 ++++++++---
> hw/mem/nvdimm.c | 10 +++++++---
> hw/ppc/spapr_nvdimm.c | 3 ++-
> include/hw/mem/nvdimm.h | 6 ++++++
> 4 files changed, 23 insertions(+), 7 deletions(-)
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 3/9] backends/hostmem-file: Add "rom" property to support VM templating with R/O files
2023-08-22 14:26 ` ThinerLogoer
@ 2023-08-23 12:43 ` David Hildenbrand
2023-08-23 14:47 ` ThinerLogoer
0 siblings, 1 reply; 25+ messages in thread
From: David Hildenbrand @ 2023-08-23 12:43 UTC (permalink / raw)
To: ThinerLogoer
Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Peter Xu, Igor Mammedov,
Philippe Mathieu-Daudé, Daniel P . Berrangé,
Stefan Hajnoczi, Elena Ufimtseva, Jagannathan Raman,
Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
On 22.08.23 16:26, ThinerLogoer wrote:
> Hello,
>
> At 2023-08-22 19:44:51, "David Hildenbrand" <david@redhat.com> wrote:
>> For now, "share=off,readonly=on" would always result in us opening the
>> file R/O and mmap'ing the opened file MAP_PRIVATE R/O -- effectively
>> turning it into ROM.
>>
>> Especially for VM templating, "share=off" is a common use case. However,
>> that use case is impossible with files that lack write permissions,
>> because "share=off,readonly=on" will not give us writable RAM.
>>
>> The sole user of ROM via memory-backend-file are R/O NVDIMMs, but as we
>> have users (Kata Containers) that rely on the existing behavior --
>> malicious VMs should not be able to consume COW memory for R/O NVDIMMs --
>> we cannot change the semantics of "share=off,readonly=on"
>>
>> So let's add a new "rom" property with on/off/auto values. "auto" is
>> the default and what most people will use: for historical reasons, to not
>> change the old semantics, it defaults to the value of the "readonly"
>> property.
>>
>> For VM templating, one can now use:
>> -object memory-backend-file,share=off,readonly=on,rom=off,...
>>
>> But we'll disallow:
>> -object memory-backend-file,share=on,readonly=on,rom=off,...
>> because we would otherwise get an error when trying to mmap the R/O file
>> shared and writable. An explicit error message is cleaner.
>>
>> We will also disallow for now:
>> -object memory-backend-file,share=off,readonly=off,rom=on,...
>> -object memory-backend-file,share=on,readonly=off,rom=on,...
>> It's not harmful, but also not really required for now.
>>
>> Alternatives that were abandoned:
>> * Make "unarmed=on" for the NVDIMM set the memory region container
>> readonly. We would still see a change of ROM->RAM and possibly run
>> into memslot limits with vhost-user. Further, there might be use cases
>> for "unarmed=on" that should still allow writing to that memory
>> (temporary files, system RAM, ...).
>> * Add a new "readonly=on/off/auto" parameter for NVDIMMs. Similar issues
>> as with "unarmed=on".
>> * Make "readonly" consume "on/off/file" instead of being a 'bool' type.
>> This would slightly changes the behavior of the "readonly" parameter:
>> values like true/false (as accepted by a 'bool'type) would no longer be
>> accepted.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>> backends/hostmem-file.c | 59 ++++++++++++++++++++++++++++++++++++++++-
>> qapi/qom.json | 6 ++++-
>> qemu-options.hx | 10 ++++++-
>> 3 files changed, 72 insertions(+), 3 deletions(-)
>>
>> diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
>> index ef2d5533af..361d4a8103 100644
>> --- a/backends/hostmem-file.c
>> +++ b/backends/hostmem-file.c
>> @@ -18,6 +18,8 @@
>> #include "sysemu/hostmem.h"
>> #include "qom/object_interfaces.h"
>> #include "qom/object.h"
>> +#include "qapi/visitor.h"
>> +#include "qapi/qapi-visit-common.h"
>>
>> OBJECT_DECLARE_SIMPLE_TYPE(HostMemoryBackendFile, MEMORY_BACKEND_FILE)
>>
>> @@ -31,6 +33,7 @@ struct HostMemoryBackendFile {
>> bool discard_data;
>> bool is_pmem;
>> bool readonly;
>> + OnOffAuto rom;
>> };
>>
>> static void
>> @@ -53,9 +56,33 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
>> return;
>> }
>>
>> + switch (fb->rom) {
>> + case ON_OFF_AUTO_AUTO:
>> + /* Traditionally, opening the file readonly always resulted in ROM. */
>> + fb->rom = fb->readonly ? ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
>> + break;
>> + case ON_OFF_AUTO_ON:
>> + if (!fb->readonly) {
>> + error_setg(errp, "property 'rom' = 'on' is not supported with"
>> + " 'readonly' = 'off'");
>> + return;
>> + }
>> + break;
>> + case ON_OFF_AUTO_OFF:
>> + if (fb->readonly && backend->share) {
>> + error_setg(errp, "property 'rom' = 'off' is incompatible with"
>> + " 'readonly' = 'on' and 'share' = 'on'");
>> + return;
>> + }
>> + break;
>> + default:
>> + assert(false);
>> + }
>> +
>> name = host_memory_backend_get_name(backend);
>> ram_flags = backend->share ? RAM_SHARED : 0;
>> - ram_flags |= fb->readonly ? RAM_READONLY | RAM_READONLY_FD : 0;
>> + ram_flags |= fb->readonly ? RAM_READONLY_FD : 0;
>> + ram_flags |= fb->rom == ON_OFF_AUTO_ON ? RAM_READONLY : 0;
>> ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
>> ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
>> ram_flags |= RAM_NAMED_FILE;
>> @@ -201,6 +228,32 @@ static void file_memory_backend_set_readonly(Object *obj, bool value,
>> fb->readonly = value;
>> }
>>
>> +static void file_memory_backend_get_rom(Object *obj, Visitor *v,
>> + const char *name, void *opaque,
>> + Error **errp)
>> +{
>> + HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj);
>> + OnOffAuto rom = fb->rom;
>> +
>> + visit_type_OnOffAuto(v, name, &rom, errp);
>> +}
>> +
>> +static void file_memory_backend_set_rom(Object *obj, Visitor *v,
>> + const char *name, void *opaque,
>> + Error **errp)
>> +{
>> + HostMemoryBackend *backend = MEMORY_BACKEND(obj);
>> + HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj);
>> +
>> + if (host_memory_backend_mr_inited(backend)) {
>> + error_setg(errp, "cannot change property '%s' of %s.", name,
>> + object_get_typename(obj));
>> + return;
>> + }
>> +
>> + visit_type_OnOffAuto(v, name, &fb->rom, errp);
>> +}
>> +
>> static void file_backend_unparent(Object *obj)
>> {
>> HostMemoryBackend *backend = MEMORY_BACKEND(obj);
>> @@ -243,6 +296,10 @@ file_backend_class_init(ObjectClass *oc, void *data)
>> object_class_property_add_bool(oc, "readonly",
>> file_memory_backend_get_readonly,
>> file_memory_backend_set_readonly);
>> + object_class_property_add(oc, "rom", "OnOffAuto",
>> + file_memory_backend_get_rom, file_memory_backend_set_rom, NULL, NULL);
>> + object_class_property_set_description(oc, "rom",
>> + "Whether to create Read Only Memory (ROM)");
>> }
>>
>> static void file_backend_instance_finalize(Object *o)
>> diff --git a/qapi/qom.json b/qapi/qom.json
>> index fa3e88c8e6..0cf83c6f39 100644
>> --- a/qapi/qom.json
>> +++ b/qapi/qom.json
>> @@ -668,6 +668,9 @@
>> # @readonly: if true, the backing file is opened read-only; if false,
>> # it is opened read-write. (default: false)
>> #
>> +# @rom: whether to create Read Only Memory (ROM). If set to 'auto', it
>> +# defaults to the value of @readonly. (default: auto, since 8.2)
>> +#
>> # Since: 2.1
>> ##
>> { 'struct': 'MemoryBackendFileProperties',
>> @@ -677,7 +680,8 @@
>> '*discard-data': 'bool',
>> 'mem-path': 'str',
>> '*pmem': { 'type': 'bool', 'if': 'CONFIG_LIBPMEM' },
>> - '*readonly': 'bool' } }
>> + '*readonly': 'bool',
>> + '*rom': 'OnOffAuto' } }
>>
>> ##
>> # @MemoryBackendMemfdProperties:
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index 29b98c3d4c..03ce0b0a30 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -4976,7 +4976,7 @@ SRST
>> they are specified. Note that the 'id' property must be set. These
>> objects are placed in the '/objects' path.
>>
>> - ``-object memory-backend-file,id=id,size=size,mem-path=dir,share=on|off,discard-data=on|off,merge=on|off,dump=on|off,prealloc=on|off,host-nodes=host-nodes,policy=default|preferred|bind|interleave,align=align,offset=offset,readonly=on|off``
>> + ``-object memory-backend-file,id=id,size=size,mem-path=dir,share=on|off,discard-data=on|off,merge=on|off,dump=on|off,prealloc=on|off,host-nodes=host-nodes,policy=default|preferred|bind|interleave,align=align,offset=offset,readonly=on|off,rom=on|off|auto``
>> Creates a memory file backend object, which can be used to back
>> the guest RAM with huge pages.
>>
>> @@ -5066,6 +5066,14 @@ SRST
>> The ``readonly`` option specifies whether the backing file is opened
>> read-only or read-write (default).
>>
>> + The ``rom`` option specifies whether to create Read Only Memory (ROM)
>> + that cannot be modified by the VM. If set to ``on``, the VM cannot
>> + modify the memory. If set to ``off``, the VM can modify the memory.
>> + If set to ``auto`` (default), the value of the ``readonly`` property
>> + is used. This option is primarily helpful for VM templating, where we
>> + want to open a file readonly (``readonly=on``) and allow private
>> + modifications of the memory by the VM (``share=off``, ``rom=off``).
>> +
>> ``-object memory-backend-ram,id=id,merge=on|off,dump=on|off,share=on|off,prealloc=on|off,size=size,host-nodes=host-nodes,policy=default|preferred|bind|interleave``
>> Creates a memory backend object, which can be used to back the
>> guest RAM. Memory backend objects offer more control than the
>
> In one word, I'd suggest advertising the existence of "rom" option more invasively, whenever
> private file mapping is used.
>
> IMHO you should probably be more invasive here to warn unconditionally when
> the memory backend file is going to be opened readwrite but is mapped non shared.
As Daniel said, we should not add new warnings for sane use cases. But we can indeed give a hint when opening the file failed, see below.
>
> I as a user find the patch series indeed work functionally when I am aware of the "rom"
> option - but what if I am not aware, the outcome is still that qemu tried
> to open the file readwrite even when it is going to be mapped private.
Yes, the implicit "readonly=off" is active in any case, and we cannot change that due to existing users unfortunately.
>
> When the file is readonly, the error message is:
> ```
> qemu-system-x86_64: can't open backing store pc.ram for guest RAM: Permission denied
> ```
>
> This should be probably helpful if qemu found that the file exists as a regular file and
> is mapped private, to say something like
>
> ```
> qemu-system-x86_64: can't open backing store pc.ram for guest RAM: Permission denied
> tip: mapping is private and ram file is probably readonly, maybe you should append "readonly=on,rom=off"
> to "-object memory-backend-file,..." option list. see documentation xxx for details
> ```
What about the following, if we can indeed open the file R/O and we're dealing witha private mapping:
qemu-system-x86_64: can't open backing store tmp-file for guest RAM: Permission denied
Consider opening the backing store read-only using '-object memory-backend-file,readonly=on,rom=off,...' (see "VM templating" documentation)
?
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re:Re: [PATCH v2 3/9] backends/hostmem-file: Add "rom" property to support VM templating with R/O files
2023-08-23 12:43 ` [PATCH " David Hildenbrand
@ 2023-08-23 14:47 ` ThinerLogoer
2023-08-23 14:59 ` David Hildenbrand
0 siblings, 1 reply; 25+ messages in thread
From: ThinerLogoer @ 2023-08-23 14:47 UTC (permalink / raw)
To: David Hildenbrand
Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Peter Xu, Igor Mammedov,
Philippe Mathieu-Daudé, Daniel P . Berrangé,
Stefan Hajnoczi, Elena Ufimtseva, Jagannathan Raman,
Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
At 2023-08-23 20:43:48, "David Hildenbrand" <david@redhat.com> wrote:
>>> + The ``rom`` option specifies whether to create Read Only Memory (ROM)
>>> + that cannot be modified by the VM. If set to ``on``, the VM cannot
>>> + modify the memory. If set to ``off``, the VM can modify the memory.
>>> + If set to ``auto`` (default), the value of the ``readonly`` property
>>> + is used. This option is primarily helpful for VM templating, where we
>>> + want to open a file readonly (``readonly=on``) and allow private
>>> + modifications of the memory by the VM (``share=off``, ``rom=off``).
>>> +
>>> ``-object memory-backend-ram,id=id,merge=on|off,dump=on|off,share=on|off,prealloc=on|off,size=size,host-nodes=host-nodes,policy=default|preferred|bind|interleave``
>>> Creates a memory backend object, which can be used to back the
>>> guest RAM. Memory backend objects offer more control than the
>>
>> In one word, I'd suggest advertising the existence of "rom" option more invasively, whenever
>> private file mapping is used.
>>
>> IMHO you should probably be more invasive here to warn unconditionally when
>> the memory backend file is going to be opened readwrite but is mapped non shared.
>
>As Daniel said, we should not add new warnings for sane use cases. But we can indeed give a hint when opening the file failed, see below.
>
Well I don't think it's completely sane use cases though we can keep it for backward
compatibility.
A nonshared memory map backed by an fd opened readwrite is probably
problematic and not what one usually expect. Either it should be shared
or it should opened readonly. Current behavior sticks to
the file being opened readwrite by default but warning can be raised when
readonly holds (file path does not prefix with /dev, and is a nonempty regular file,
and mapping is private), indicate that the user may probably want a
"readonly=on,rom=off" setup.
>>
>> I as a user find the patch series indeed work functionally when I am aware of the "rom"
>> option - but what if I am not aware, the outcome is still that qemu tried
>> to open the file readwrite even when it is going to be mapped private.
>
>Yes, the implicit "readonly=off" is active in any case, and we cannot change that due to existing users unfortunately.
>
Yeah I am ok with that now, and I found a way for my setup to work with your patches,
so I am happy with it.
>>
>> When the file is readonly, the error message is:
>> ```
>> qemu-system-x86_64: can't open backing store pc.ram for guest RAM: Permission denied
>> ```
>>
>> This should be probably helpful if qemu found that the file exists as a regular file and
>> is mapped private, to say something like
>>
>> ```
>> qemu-system-x86_64: can't open backing store pc.ram for guest RAM: Permission denied
>> tip: mapping is private and ram file is probably readonly, maybe you should append "readonly=on,rom=off"
>> to "-object memory-backend-file,..." option list. see documentation xxx for details
>> ```
>
>What about the following, if we can indeed open the file R/O and we're dealing witha private mapping:
>
>qemu-system-x86_64: can't open backing store tmp-file for guest RAM: Permission denied
>Consider opening the backing store read-only using '-object memory-backend-file,readonly=on,rom=off,...' (see "VM templating" documentation)
>
>?
This is good. Wording might need improvement. (Are qemu error messages always this cold?)
--
Regards,
logoerthiner
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH v2 3/9] backends/hostmem-file: Add "rom" property to support VM templating with R/O files
2023-08-23 14:47 ` ThinerLogoer
@ 2023-08-23 14:59 ` David Hildenbrand
0 siblings, 0 replies; 25+ messages in thread
From: David Hildenbrand @ 2023-08-23 14:59 UTC (permalink / raw)
To: ThinerLogoer
Cc: qemu-devel, qemu-ppc, Paolo Bonzini, Peter Xu, Igor Mammedov,
Philippe Mathieu-Daudé, Daniel P . Berrangé,
Stefan Hajnoczi, Elena Ufimtseva, Jagannathan Raman,
Michael S. Tsirkin, Ani Sinha, Xiao Guangrong,
Daniel Henrique Barboza, Greg Kurz, Eric Blake, Markus Armbruster,
Eduardo Habkost
On 23.08.23 16:47, ThinerLogoer wrote:
> At 2023-08-23 20:43:48, "David Hildenbrand" <david@redhat.com> wrote:
>>>> + The ``rom`` option specifies whether to create Read Only Memory (ROM)
>>>> + that cannot be modified by the VM. If set to ``on``, the VM cannot
>>>> + modify the memory. If set to ``off``, the VM can modify the memory.
>>>> + If set to ``auto`` (default), the value of the ``readonly`` property
>>>> + is used. This option is primarily helpful for VM templating, where we
>>>> + want to open a file readonly (``readonly=on``) and allow private
>>>> + modifications of the memory by the VM (``share=off``, ``rom=off``).
>>>> +
>>>> ``-object memory-backend-ram,id=id,merge=on|off,dump=on|off,share=on|off,prealloc=on|off,size=size,host-nodes=host-nodes,policy=default|preferred|bind|interleave``
>>>> Creates a memory backend object, which can be used to back the
>>>> guest RAM. Memory backend objects offer more control than the
>>>
>>> In one word, I'd suggest advertising the existence of "rom" option more invasively, whenever
>>> private file mapping is used.
>>>
>>> IMHO you should probably be more invasive here to warn unconditionally when
>>> the memory backend file is going to be opened readwrite but is mapped non shared.
>>
>> As Daniel said, we should not add new warnings for sane use cases. But we can indeed give a hint when opening the file failed, see below.
>>
>
> Well I don't think it's completely sane use cases though we can keep it for backward
> compatibility.
Well, yes, but these are sane use cases that have been using that way of
dealing with files forever. We cannot simply change their usage,
unfortunately.
Having that said, your important use case is currently VM templating.
Note that that's not what the many other QEMU users actually do.
So I can understand why you want a different behavior and more hints;
but we have to balance a bit here (after all, I'm writing documentation
how to do VM templating for a reason ;) ).
[...]
>>>
>>> When the file is readonly, the error message is:
>>> ```
>>> qemu-system-x86_64: can't open backing store pc.ram for guest RAM: Permission denied
>>> ```
>>>
>>> This should be probably helpful if qemu found that the file exists as a regular file and
>>> is mapped private, to say something like
>>>
>>> ```
>>> qemu-system-x86_64: can't open backing store pc.ram for guest RAM: Permission denied
>>> tip: mapping is private and ram file is probably readonly, maybe you should append "readonly=on,rom=off"
>>> to "-object memory-backend-file,..." option list. see documentation xxx for details
>>> ```
>>
>> What about the following, if we can indeed open the file R/O and we're dealing witha private mapping:
>>
>> qemu-system-x86_64: can't open backing store tmp-file for guest RAM: Permission denied
>> Consider opening the backing store read-only using '-object memory-backend-file,readonly=on,rom=off,...' (see "VM templating" documentation)
>>
>> ?
>
> This is good. Wording might need improvement. (Are qemu error messages always this cold?)
Maybe. Maybe just my writing style :P
Looking at most error_append_hint(), they are fairly neutral/cold.
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2023-08-23 15:00 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-22 11:44 [PATCH v2 0/9] memory-backend-file related improvements and VM templating support David Hildenbrand
2023-08-22 11:44 ` [PATCH v2 1/9] nvdimm: Reject writing label data to ROM instead of crashing QEMU David Hildenbrand
2023-08-22 19:25 ` Stefan Hajnoczi
2023-08-22 11:44 ` [PATCH v2 2/9] softmmu/physmem: Distinguish between file access mode and mmap protection David Hildenbrand
2023-08-22 13:13 ` ThinerLogoer
2023-08-22 13:25 ` [PATCH " David Hildenbrand
2023-08-22 11:44 ` [PATCH v2 3/9] backends/hostmem-file: Add "rom" property to support VM templating with R/O files David Hildenbrand
2023-08-22 13:27 ` Markus Armbruster
2023-08-22 13:29 ` David Hildenbrand
2023-08-22 14:26 ` ThinerLogoer
2023-08-23 12:43 ` [PATCH " David Hildenbrand
2023-08-23 14:47 ` ThinerLogoer
2023-08-23 14:59 ` David Hildenbrand
2023-08-22 11:44 ` [PATCH v2 4/9] softmmu/physmem: Remap with proper protection in qemu_ram_remap() David Hildenbrand
2023-08-22 11:44 ` [PATCH v2 5/9] softmmu/physmem: Bail out early in ram_block_discard_range() with readonly files David Hildenbrand
2023-08-22 11:44 ` [PATCH v2 6/9] softmmu/physmem: Fail creation of new files in file_ram_open() with readonly=true David Hildenbrand
2023-08-22 11:44 ` [PATCH v2 7/9] softmmu/physmem: Never return directories from file_ram_open() David Hildenbrand
2023-08-22 11:44 ` [PATCH v2 8/9] docs: Don't mention "-mem-path" in multi-process.rst David Hildenbrand
2023-08-22 13:21 ` ThinerLogoer
2023-08-22 13:24 ` [PATCH " David Hildenbrand
2023-08-22 11:44 ` [PATCH v2 9/9] docs: Start documenting VM templating David Hildenbrand
2023-08-22 13:47 ` Daniel P. Berrangé
2023-08-22 14:04 ` David Hildenbrand
2023-08-22 14:23 ` Peter Maydell
2023-08-22 14:31 ` David Hildenbrand
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).