From: David Hildenbrand <david@redhat.com>
To: Peter Xu <peterx@redhat.com>, Steve Sistare <steven.sistare@oracle.com>
Cc: qemu-devel@nongnu.org, Fabiano Rosas <farosas@suse.de>,
Marcel Apfelbaum <marcel.apfelbaum@gmail.com>,
Eduardo Habkost <eduardo@habkost.net>,
Philippe Mathieu-Daude <philmd@linaro.org>,
Paolo Bonzini <pbonzini@redhat.com>,
"Daniel P. Berrange" <berrange@redhat.com>,
Markus Armbruster <armbru@redhat.com>,
Igor Mammedov <imammedo@redhat.com>
Subject: Re: [PATCH V2 01/13] machine: alloc-anon option
Date: Fri, 4 Oct 2024 12:14:35 +0200 [thread overview]
Message-ID: <2143f803-439e-4b8b-ae92-07caa913d646@redhat.com> (raw)
In-Reply-To: <Zv7C7MeVP2X8bEJU@x1n>
On 03.10.24 18:14, Peter Xu wrote:
> On Mon, Sep 30, 2024 at 12:40:32PM -0700, Steve Sistare wrote:
>> Allocate anonymous memory using mmap MAP_ANON or memfd_create depending
>> on the value of the anon-alloc machine property. This option applies to
>> memory allocated as a side effect of creating various devices. It does
>> not apply to memory-backend-objects, whether explicitly specified on
>> the command line, or implicitly created by the -m command line option.
>>
>> The memfd option is intended to support new migration modes, in which the
>> memory region can be transferred in place to a new QEMU process, by sending
>> the memfd file descriptor to the process. Memory contents are preserved,
>> and if the mode also transfers device descriptors, then pages that are
>> locked in memory for DMA remain locked. This behavior is a pre-requisite
>> for supporting vfio, vdpa, and iommufd devices with the new modes.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>
> [Igor seems missing in the loop; added]
>
>> ---
>> hw/core/machine.c | 19 +++++++++++++++++++
>> include/hw/boards.h | 1 +
>> qapi/machine.json | 14 ++++++++++++++
>> qemu-options.hx | 11 +++++++++++
>> system/physmem.c | 35 +++++++++++++++++++++++++++++++++++
>> system/trace-events | 3 +++
>> 6 files changed, 83 insertions(+)
>>
>> diff --git a/hw/core/machine.c b/hw/core/machine.c
>> index adaba17..a89a32b 100644
>> --- a/hw/core/machine.c
>> +++ b/hw/core/machine.c
>> @@ -460,6 +460,20 @@ static void machine_set_mem_merge(Object *obj, bool value, Error **errp)
>> ms->mem_merge = value;
>> }
>>
>> +static int machine_get_anon_alloc(Object *obj, Error **errp)
>> +{
>> + MachineState *ms = MACHINE(obj);
>> +
>> + return ms->anon_alloc;
>> +}
>> +
>> +static void machine_set_anon_alloc(Object *obj, int value, Error **errp)
>> +{
>> + MachineState *ms = MACHINE(obj);
>> +
>> + ms->anon_alloc = value;
>> +}
>> +
>> static bool machine_get_usb(Object *obj, Error **errp)
>> {
>> MachineState *ms = MACHINE(obj);
>> @@ -1078,6 +1092,11 @@ static void machine_class_init(ObjectClass *oc, void *data)
>> object_class_property_set_description(oc, "mem-merge",
>> "Enable/disable memory merge support");
>>
>> + object_class_property_add_enum(oc, "anon-alloc", "AnonAllocOption",
>> + &AnonAllocOption_lookup,
>> + machine_get_anon_alloc,
>> + machine_set_anon_alloc);
>> +
>> object_class_property_add_bool(oc, "usb",
>> machine_get_usb, machine_set_usb);
>> object_class_property_set_description(oc, "usb",
>> diff --git a/include/hw/boards.h b/include/hw/boards.h
>> index 5966069..5a87647 100644
>> --- a/include/hw/boards.h
>> +++ b/include/hw/boards.h
>> @@ -393,6 +393,7 @@ struct MachineState {
>> bool enable_graphics;
>> ConfidentialGuestSupport *cgs;
>> HostMemoryBackend *memdev;
>> + AnonAllocOption anon_alloc;
>> /*
>> * convenience alias to ram_memdev_id backend memory region
>> * or to numa container memory region
>> diff --git a/qapi/machine.json b/qapi/machine.json
>> index a6b8795..d4a63f5 100644
>> --- a/qapi/machine.json
>> +++ b/qapi/machine.json
>> @@ -1898,3 +1898,17 @@
>> { 'command': 'x-query-interrupt-controllers',
>> 'returns': 'HumanReadableText',
>> 'features': [ 'unstable' ]}
>> +
>> +##
>> +# @AnonAllocOption:
>> +#
>> +# An enumeration of the options for allocating anonymous guest memory.
>> +#
>> +# @mmap: allocate using mmap MAP_ANON
>> +#
>> +# @memfd: allocate using memfd_create
>> +#
>> +# Since: 9.2
>> +##
>> +{ 'enum': 'AnonAllocOption',
>> + 'data': [ 'mmap', 'memfd' ] }
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index d94e2cb..90ab943 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -38,6 +38,7 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
>> " nvdimm=on|off controls NVDIMM support (default=off)\n"
>> " memory-encryption=@var{} memory encryption object to use (default=none)\n"
>> " hmat=on|off controls ACPI HMAT support (default=off)\n"
>> + " anon-alloc=mmap|memfd allocate anonymous guest RAM using mmap MAP_ANON or memfd_create (default: mmap)\n"
>> " memory-backend='backend-id' specifies explicitly provided backend for main RAM (default=none)\n"
>> " cxl-fmw.0.targets.0=firsttarget,cxl-fmw.0.targets.1=secondtarget,cxl-fmw.0.size=size[,cxl-fmw.0.interleave-granularity=granularity]\n",
>> QEMU_ARCH_ALL)
>> @@ -101,6 +102,16 @@ SRST
>> Enables or disables ACPI Heterogeneous Memory Attribute Table
>> (HMAT) support. The default is off.
>>
>> + ``anon-alloc=mmap|memfd``
>> + Allocate anonymous guest RAM using mmap MAP_ANON (the default)
>> + or memfd_create. This option applies to memory allocated as a
>> + side effect of creating various devices. It does not apply to
>> + memory-backend-objects, whether explicitly specified on the
>> + command line, or implicitly created by the -m command line
>> + option.
>> +
>> + Some migration modes require anon-alloc=memfd.
>> +
>> ``memory-backend='id'``
>> An alternative to legacy ``-mem-path`` and ``mem-prealloc`` options.
>> Allows to use a memory backend as main RAM.
>> diff --git a/system/physmem.c b/system/physmem.c
>> index dc1db3a..174f7e0 100644
>> --- a/system/physmem.c
>> +++ b/system/physmem.c
>> @@ -47,6 +47,7 @@
>> #include "qemu/qemu-print.h"
>> #include "qemu/log.h"
>> #include "qemu/memalign.h"
>> +#include "qemu/memfd.h"
>> #include "exec/memory.h"
>> #include "exec/ioport.h"
>> #include "sysemu/dma.h"
>> @@ -69,6 +70,8 @@
>>
>> #include "qemu/pmem.h"
>>
>> +#include "qapi/qapi-types-migration.h"
>> +#include "migration/options.h"
>> #include "migration/vmstate.h"
>>
>> #include "qemu/range.h"
>> @@ -1849,6 +1852,35 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
>> qemu_mutex_unlock_ramlist();
>> return;
>> }
>> +
>> + } else if (current_machine->anon_alloc == ANON_ALLOC_OPTION_MEMFD &&
>> + !object_dynamic_cast(new_block->mr->parent_obj.parent,
>> + TYPE_MEMORY_BACKEND)) {
>
> This is pretty fragile.. if someone adds yet another layer on top of memory
> backend objects, the ownership links can change and this might silently run
> into something else even without any warning..
>
> I wished we dig into what is missing, but maybe that's too trivial. If
> not, we still need to make this as solid. Perhaps that can be a ram flag
> and let relevant callers pass in that flag explicitly.
How would they decide whether or not we want to set the flag in the
current configuration?
>
> I think RAM_SHARED can actually be that flag already - I mean, in all paths
> that we may create anon mem (but not memory-backend-* objects), is it
> always safe we always switch to RAM_SHARED from anon?
Do you mean only setting the flag (-> anonymous shmem) or switching also
to memfd, which is a bigger change?
--
Cheers,
David / dhildenb
next prev parent reply other threads:[~2024-10-04 10:15 UTC|newest]
Thread overview: 79+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-30 19:40 [PATCH V2 00/13] Live update: cpr-transfer Steve Sistare
2024-09-30 19:40 ` [PATCH V2 01/13] machine: alloc-anon option Steve Sistare
2024-10-03 16:14 ` Peter Xu
2024-10-04 10:14 ` David Hildenbrand [this message]
2024-10-04 12:33 ` Peter Xu
2024-10-04 12:54 ` David Hildenbrand
2024-10-04 13:24 ` Peter Xu
2024-10-07 16:23 ` David Hildenbrand
2024-10-07 19:05 ` Peter Xu
2024-10-07 15:36 ` Peter Xu
2024-10-07 19:30 ` Steven Sistare
2024-09-30 19:40 ` [PATCH V2 02/13] migration: cpr-state Steve Sistare
2024-10-07 14:14 ` Peter Xu
2024-10-07 19:30 ` Steven Sistare
2024-09-30 19:40 ` [PATCH V2 03/13] migration: save cpr mode Steve Sistare
2024-10-07 15:18 ` Peter Xu
2024-10-07 19:31 ` Steven Sistare
2024-10-07 20:10 ` Peter Xu
2024-10-08 15:57 ` Steven Sistare
2024-09-30 19:40 ` [PATCH V2 04/13] migration: stop vm earlier for cpr Steve Sistare
2024-10-07 15:27 ` Peter Xu
2024-10-07 20:52 ` Steven Sistare
2024-10-08 15:35 ` Peter Xu
2024-10-08 19:13 ` Steven Sistare
2024-09-30 19:40 ` [PATCH V2 05/13] physmem: preserve ram blocks " Steve Sistare
2024-10-07 15:49 ` Peter Xu
2024-10-07 16:28 ` Peter Xu
2024-10-08 15:17 ` Steven Sistare
2024-10-08 16:26 ` Peter Xu
2024-10-08 21:05 ` Steven Sistare
2024-10-08 21:32 ` Peter Xu
2024-10-31 20:32 ` Steven Sistare
2024-09-30 19:40 ` [PATCH V2 06/13] hostmem-memfd: preserve " Steve Sistare
2024-10-07 15:52 ` Peter Xu
2024-09-30 19:40 ` [PATCH V2 07/13] migration: SCM_RIGHTS for QEMUFile Steve Sistare
2024-10-07 16:06 ` Peter Xu
2024-10-07 16:35 ` Daniel P. Berrangé
2024-10-07 18:12 ` Peter Xu
2024-09-30 19:40 ` [PATCH V2 08/13] migration: VMSTATE_FD Steve Sistare
2024-10-07 16:36 ` Peter Xu
2024-10-07 19:31 ` Steven Sistare
2024-09-30 19:40 ` [PATCH V2 09/13] migration: cpr-transfer save and load Steve Sistare
2024-10-07 16:47 ` Peter Xu
2024-10-07 19:31 ` Steven Sistare
2024-10-08 15:36 ` Peter Xu
2024-09-30 19:40 ` [PATCH V2 10/13] migration: cpr-uri parameter Steve Sistare
2024-10-07 16:49 ` Peter Xu
2024-09-30 19:40 ` [PATCH V2 11/13] migration: cpr-uri option Steve Sistare
2024-10-07 16:50 ` Peter Xu
2024-09-30 19:40 ` [PATCH V2 12/13] migration: split qmp_migrate Steve Sistare
2024-10-07 19:18 ` Peter Xu
2024-09-30 19:40 ` [PATCH V2 13/13] migration: cpr-transfer mode Steve Sistare
2024-10-07 19:44 ` Peter Xu
2024-10-07 20:39 ` Steven Sistare
2024-10-08 15:45 ` Peter Xu
2024-10-08 19:12 ` Steven Sistare
2024-10-08 19:38 ` Peter Xu
2024-10-08 18:28 ` Fabiano Rosas
2024-10-08 18:47 ` Peter Xu
2024-10-08 19:11 ` Fabiano Rosas
2024-10-08 19:33 ` Steven Sistare
2024-10-08 19:48 ` Peter Xu
2024-10-09 18:43 ` Steven Sistare
2024-10-09 19:06 ` Peter Xu
2024-10-09 19:59 ` Peter Xu
2024-10-09 20:18 ` Steven Sistare
2024-10-09 20:57 ` Peter Xu
2024-10-09 22:08 ` Fabiano Rosas
2024-10-10 20:05 ` Steven Sistare
2024-10-09 20:09 ` Steven Sistare
2024-10-09 20:36 ` Peter Xu
2024-10-10 20:06 ` Steven Sistare
2024-10-10 21:23 ` Peter Xu
2024-10-24 21:12 ` Steven Sistare
2024-10-25 13:55 ` Peter Xu
2024-10-25 15:04 ` Steven Sistare
2024-10-08 19:29 ` Steven Sistare
2024-10-08 14:33 ` [PATCH V2 00/13] Live update: cpr-transfer Vladimir Sementsov-Ogievskiy
2024-10-08 21:13 ` Steven Sistare
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2143f803-439e-4b8b-ae92-07caa913d646@redhat.com \
--to=david@redhat.com \
--cc=armbru@redhat.com \
--cc=berrange@redhat.com \
--cc=eduardo@habkost.net \
--cc=farosas@suse.de \
--cc=imammedo@redhat.com \
--cc=marcel.apfelbaum@gmail.com \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=philmd@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=steven.sistare@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).