qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Steven Sistare <steven.sistare@oracle.com>
Cc: qemu-devel@nongnu.org, Fabiano Rosas <farosas@suse.de>,
	David Hildenbrand <david@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Eduardo Habkost <eduardo@habkost.net>,
	Marcel Apfelbaum <marcel.apfelbaum@gmail.com>,
	Philippe Mathieu-Daude <philmd@linaro.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Daniel P. Berrange" <berrange@redhat.com>,
	Markus Armbruster <armbru@redhat.com>
Subject: Re: [PATCH V1 17/26] machine: memfd-alloc option
Date: Wed, 29 May 2024 15:14:17 -0400	[thread overview]
Message-ID: <Zld-iWfa3_yEWgn6@x1n> (raw)
In-Reply-To: <79a8023d-2e19-4d80-821d-a03818a5372e@oracle.com>

On Wed, May 29, 2024 at 01:31:38PM -0400, Steven Sistare wrote:
> > > diff --git a/system/memory.c b/system/memory.c
> > > index 49f1cb2..ca04a0e 100644
> > > --- a/system/memory.c
> > > +++ b/system/memory.c
> > > @@ -1552,8 +1552,9 @@ bool memory_region_init_ram_nomigrate(MemoryRegion *mr,
> > >                                         uint64_t size,
> > >                                         Error **errp)
> > >   {
> > > +    uint32_t flags = current_machine->memfd_alloc ? RAM_SHARED : 0;
> > 
> > If there's a machine option to "use memfd for allocations", then it's
> > shared mem... Hmm..
> > 
> > It is a bit confusing to me in quite a few levels:
> > 
> >    - Why memory allocation method will be defined by a machine property,
> >      even if we have memory-backend-* which should cover everything?
> 
> Some memory regions are implicitly created, and have no explicit representation
> on the qemu command line.  memfd-alloc affects those.
> 
> More generally, memfd-alloc affects all ramblock allocations that are
> not explicitly represented by memory-backend object.  Thus the simple
> command line "qemu -m 1G" does not explicitly describe an object, so it
> goes through the anonymous allocation path, and is affected by memfd-alloc.

Can we simply now allow "qemu -m 1G" to work for cpr-exec?  AFAIU that's
what we do with cpr-reboot: we ask the user to specify the right things to
make other thing work.  Otherwise it won't.

> 
> Internally, create_default_memdev does create a memory-backend object.
> That is what my doc comment above refers to:
>   Any associated memory-backend objects are created with share=on
> 
> An explicit "qemu -object memory-backend-*" is not affected by memfd-alloc.
> 
> The qapi comments in patch "migration: cpr-exec mode" attempt to say all that:
> 
> +#     Memory backend objects must have the share=on attribute, and
> +#     must be mmap'able in the new QEMU process.  For example,
> +#     memory-backend-file is acceptable, but memory-backend-ram is
> +#     not.
> +#
> +#     The VM must be started with the '-machine memfd-alloc=on'
> +#     option.  This causes implicit ram blocks -- those not explicitly
> +#     described by a memory-backend object -- to be allocated by
> +#     mmap'ing a memfd.  Examples include VGA, ROM, and even guest
> +#     RAM when it is specified without a memory-backend object.

VGA is IIRC 16MB chunk, ROM is even smaller.  If the user specifies -object
memory-backend-file,share=on propertly, these should be the only outliers?

Are these important enough for the downtime?  Can we put them into the
migrated image alongside with the rest device states?

> 
> >    - Even if we have such a machine property, why setting "memfd" will
> >      always imply shared?  why not private?  After all it's not called
> >      "memfd-shared-alloc", and we can create private mappings using
> >      e.g. memory-backend-memfd,share=off.
> 
> There is no use case for memfd-alloc with share=off, so no point IMO in
> making the option more verbose.

Unfortunately this fact doesn't make the property easier to understand. :-(

> For cpr, the mapping with all its modifications must be visible to new
> qemu when qemu mmaps it.

So this might be the important part - do you mean migrating
VGA/ROM/... small ramblocks won't work (besides any performance concerns)?
Could you elaborate?

Cpr-reboot already introduced lots of tricky knobs to QEMU.  We may need to
restrict that specialty to minimal, making the interfacing as clear as
possible, or (at least migration) maintainers will start to be soon scared
and running away, if such proposal was not shot down.

In short, I hope when we introduce new knobs for cpr, we shouldn't always
keep cpr-* modes in mind, but consider whenever the user can use it without
cpr-*.  I'm not sure whether it'll be always possible, but we should try.

Thanks,

-- 
Peter Xu



  reply	other threads:[~2024-05-29 19:14 UTC|newest]

Thread overview: 122+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-29 15:55 [PATCH V1 00/26] Live update: cpr-exec Steve Sistare
2024-04-29 15:55 ` [PATCH V1 01/26] oslib: qemu_clear_cloexec Steve Sistare
2024-05-06 23:27   ` Fabiano Rosas
2024-05-07  8:56     ` Daniel P. Berrangé
2024-05-07 13:54       ` Fabiano Rosas
2024-04-29 15:55 ` [PATCH V1 02/26] vl: helper to request re-exec Steve Sistare
2024-04-29 15:55 ` [PATCH V1 03/26] migration: SAVEVM_FOREACH Steve Sistare
2024-05-06 23:17   ` Fabiano Rosas
2024-05-13 19:27     ` Steven Sistare
2024-05-27 18:14       ` Peter Xu
2024-04-29 15:55 ` [PATCH V1 04/26] migration: delete unused parameter mis Steve Sistare
2024-05-06 21:50   ` Fabiano Rosas
2024-05-27 18:02   ` Peter Xu
2024-04-29 15:55 ` [PATCH V1 05/26] migration: precreate vmstate Steve Sistare
2024-05-07 21:02   ` Fabiano Rosas
2024-05-13 19:28     ` Steven Sistare
2024-05-24 13:56   ` Fabiano Rosas
2024-05-27 18:16   ` Peter Xu
2024-05-28 15:09     ` Steven Sistare via
2024-05-29 18:39       ` Peter Xu
2024-05-30 17:04         ` Steven Sistare via
2024-04-29 15:55 ` [PATCH V1 06/26] migration: precreate vmstate for exec Steve Sistare
2024-05-06 23:34   ` Fabiano Rosas
2024-05-13 19:28     ` Steven Sistare
2024-05-13 21:21       ` Fabiano Rosas
2024-04-29 15:55 ` [PATCH V1 07/26] migration: VMStateId Steve Sistare
2024-05-07 21:03   ` Fabiano Rosas
2024-05-27 18:20   ` Peter Xu
2024-05-28 15:10     ` Steven Sistare via
2024-05-28 17:44       ` Peter Xu
2024-05-29 17:30         ` Steven Sistare via
2024-05-29 18:53           ` Peter Xu
2024-05-30 17:11             ` Steven Sistare via
2024-05-30 18:03               ` Peter Xu
2024-04-29 15:55 ` [PATCH V1 08/26] migration: vmstate_info_void_ptr Steve Sistare
2024-05-07 21:33   ` Fabiano Rosas
2024-05-27 18:31   ` Peter Xu
2024-05-28 15:10     ` Steven Sistare via
2024-05-28 18:21       ` Peter Xu
2024-05-29 17:30         ` Steven Sistare via
2024-04-29 15:55 ` [PATCH V1 09/26] migration: vmstate_register_named Steve Sistare
2024-05-09 14:19   ` Fabiano Rosas
2024-05-09 14:32     ` Fabiano Rosas
2024-05-13 19:29       ` Steven Sistare
2024-04-29 15:55 ` [PATCH V1 10/26] migration: vmstate_unregister_named Steve Sistare
2024-04-29 15:55 ` [PATCH V1 11/26] migration: vmstate_register at init time Steve Sistare
2024-04-29 15:55 ` [PATCH V1 12/26] migration: vmstate factory object Steve Sistare
2024-04-29 15:55 ` [PATCH V1 13/26] physmem: ram_block_create Steve Sistare
2024-05-13 18:37   ` Fabiano Rosas
2024-05-13 19:30     ` Steven Sistare
2024-04-29 15:55 ` [PATCH V1 14/26] physmem: hoist guest_memfd creation Steve Sistare
2024-04-29 15:55 ` [PATCH V1 15/26] physmem: hoist host memory allocation Steve Sistare
2024-04-29 15:55 ` [PATCH V1 16/26] physmem: set ram block idstr earlier Steve Sistare
2024-04-29 15:55 ` [PATCH V1 17/26] machine: memfd-alloc option Steve Sistare
2024-05-28 21:12   ` Peter Xu
2024-05-29 17:31     ` Steven Sistare via
2024-05-29 19:14       ` Peter Xu [this message]
2024-05-30 17:11         ` Steven Sistare via
2024-05-30 18:14           ` Peter Xu
2024-05-31 19:32             ` Steven Sistare via
2024-06-03 21:48               ` Peter Xu
2024-06-04  7:13                 ` Daniel P. Berrangé
2024-06-04 15:58                   ` Peter Xu
2024-06-04 16:14                     ` David Hildenbrand
2024-06-04 16:41                       ` Peter Xu
2024-06-04 17:16                         ` David Hildenbrand
2024-06-03 10:17       ` Daniel P. Berrangé
2024-06-03 11:59         ` Steven Sistare via
2024-04-29 15:55 ` [PATCH V1 18/26] migration: cpr-exec-args parameter Steve Sistare
2024-05-02 12:23   ` Markus Armbruster
2024-05-02 16:00     ` Steven Sistare
2024-05-21  8:13   ` Daniel P. Berrangé
2024-04-29 15:55 ` [PATCH V1 19/26] physmem: preserve ram blocks for cpr Steve Sistare
2024-05-28 21:44   ` Peter Xu
2024-05-29 17:31     ` Steven Sistare via
2024-05-29 19:25       ` Peter Xu
2024-05-30 17:12         ` Steven Sistare via
2024-05-30 18:39           ` Peter Xu
2024-05-31 19:32             ` Steven Sistare via
2024-06-03 22:29               ` Peter Xu
2024-04-29 15:55 ` [PATCH V1 20/26] migration: cpr-exec mode Steve Sistare
2024-05-02 12:23   ` Markus Armbruster
2024-05-02 16:00     ` Steven Sistare
2024-05-03  6:26       ` Markus Armbruster
2024-05-21  8:20   ` Daniel P. Berrangé
2024-05-24 14:58   ` Fabiano Rosas
2024-05-27 18:54     ` Steven Sistare via
2024-04-29 15:55 ` [PATCH V1 21/26] migration: migrate_add_blocker_mode Steve Sistare
2024-05-09 17:47   ` Fabiano Rosas
2024-04-29 15:55 ` [PATCH V1 22/26] migration: ram block cpr-exec blockers Steve Sistare
2024-05-09 18:01   ` Fabiano Rosas
2024-05-13 19:29     ` Steven Sistare
2024-04-29 15:55 ` [PATCH V1 23/26] migration: misc " Steve Sistare
2024-05-09 18:05   ` Fabiano Rosas
2024-05-24 12:40   ` Fabiano Rosas
2024-05-27 19:02     ` Steven Sistare via
2024-04-29 15:55 ` [PATCH V1 24/26] seccomp: cpr-exec blocker Steve Sistare
2024-05-09 18:16   ` Fabiano Rosas
2024-05-10  7:54   ` Daniel P. Berrangé
2024-05-13 19:29     ` Steven Sistare
2024-05-21  7:14       ` Daniel P. Berrangé
2024-04-29 15:55 ` [PATCH V1 25/26] migration: fix mismatched GPAs during cpr-exec Steve Sistare
2024-05-09 18:39   ` Fabiano Rosas
2024-04-29 15:55 ` [PATCH V1 26/26] migration: only-migratable-modes Steve Sistare
2024-05-09 19:14   ` Fabiano Rosas
2024-05-13 19:48     ` Steven Sistare
2024-05-13 21:57       ` Fabiano Rosas
2024-05-21  8:05   ` Daniel P. Berrangé
2024-05-02 16:13 ` cpr-exec doc (was Re: [PATCH V1 00/26] Live update: cpr-exec) Steven Sistare
2024-05-02 18:15   ` Peter Xu
2024-05-20 18:30 ` [PATCH V1 00/26] Live update: cpr-exec Steven Sistare
2024-05-20 22:28   ` Fabiano Rosas
2024-05-21  2:31     ` Peter Xu
2024-05-21 11:46       ` Steven Sistare
2024-05-27 17:45         ` Peter Xu
2024-05-28 15:10           ` Steven Sistare via
2024-05-28 16:42             ` Peter Xu
2024-05-30 17:17               ` Steven Sistare via
2024-05-30 19:23                 ` Peter Xu
2024-05-24 13:02 ` Fabiano Rosas
2024-05-24 14:07   ` Steven Sistare
2024-05-27 18:07 ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zld-iWfa3_yEWgn6@x1n \
    --to=peterx@redhat.com \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=david@redhat.com \
    --cc=eduardo@habkost.net \
    --cc=farosas@suse.de \
    --cc=imammedo@redhat.com \
    --cc=marcel.apfelbaum@gmail.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=steven.sistare@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).