From: Peter Xu <peterx@redhat.com>
To: Steven Sistare <steven.sistare@oracle.com>
Cc: qemu-devel@nongnu.org, Fabiano Rosas <farosas@suse.de>,
	David Hildenbrand <david@redhat.com>,
	Marcel Apfelbaum <marcel.apfelbaum@gmail.com>,
	Eduardo Habkost <eduardo@habkost.net>,
	Philippe Mathieu-Daude <philmd@linaro.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Daniel P. Berrange" <berrange@redhat.com>,
	Markus Armbruster <armbru@redhat.com>
Subject: Re: [PATCH V5 15/23] migration: cpr-transfer mode
Date: Thu, 2 Jan 2025 14:57:18 -0500
Message-ID: <Z3bvnlQ955dWzc-n@x1n>
In-Reply-To: <72eaea07-ccfc-4134-84c5-1bc044f7ddae@oracle.com>

On Thu, Jan 02, 2025 at 02:21:13PM -0500, Steven Sistare wrote:
> On 12/24/2024 2:24 PM, Peter Xu wrote:
> > On Tue, Dec 24, 2024 at 08:17:00AM -0800, Steve Sistare wrote:
> > > Add the cpr-transfer migration mode, which allows the user to transfer
> > > a guest to a new QEMU instance on the same host with minimal guest pause
> > > time, by preserving guest RAM in place, albeit with new virtual addresses
> > > in new QEMU, and by preserving device file descriptors.  Pages that were
> > > locked in memory for DMA in old QEMU remain locked in new QEMU, because the
> > > descriptor of the device that locked them remains open.
> > > 
> > > cpr-transfer preserves memory and device descriptors by sending them to
> > > new QEMU over a unix domain socket using SCM_RIGHTS.  Such CPR state cannot
> > > be sent over the normal migration channel, because devices and backends
> > > are created prior to reading the channel, so this mode sends CPR state
> > > over a second "cpr" migration channel.  New QEMU reads the cpr channel
> > > prior to creating devices or backends.  The user specifies the cpr channel
> > > in the channel arguments on the outgoing side, and in a second -incoming
> > > command-line parameter on the incoming side.
> > > 
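
(For readers unfamiliar with SCM_RIGHTS: a minimal sketch of passing one
descriptor over a connected unix socket, using only the POSIX
sendmsg/recvmsg calls.  This is not the code from the patch; send_fd and
recv_fd are hypothetical names chosen for illustration.)

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for exactly one descriptor. */
typedef union {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
} FdCtl;

/* Send fd over a connected unix socket; returns 0 on success, -1 on error. */
static int send_fd(int sock, int fd)
{
    char byte = 0;                      /* at least one data byte is sent */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    struct msghdr msg;
    struct cmsghdr *cmsg;
    FdCtl ctl;

    assert(fd >= 0);
    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor; returns the new fd, or -1 on error. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    struct msghdr msg;
    struct cmsghdr *cmsg;
    FdCtl ctl;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The received descriptor is a new number in the receiving process that
refers to the same open file, which is why the channel address must be a
type that can carry ancillary data.
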
> > > The user must start old QEMU with the '-machine aux-ram-share=on' option,
> > > which allows anonymous memory to be transferred in place to the new process
> > > by transferring a memory descriptor for each ram block.  Memory-backend
> > > objects must have the share=on attribute, but memory-backend-epc is not
> > > supported.
> > > 
> > > The user starts new QEMU on the same host as old QEMU, with command-line
> > > arguments to create the same machine, plus the -incoming option for the
> > > main migration channel, like normal live migration.  In addition, the user
> > > adds a second -incoming option with channel type "cpr".  The CPR channel
> > > address must be a type, such as unix socket, that supports SCM_RIGHTS.
> > > 
> > > To initiate CPR, the user issues a migrate command to old QEMU, adding
> > > a second migration channel of type "cpr" in the channels argument.
> > > Old QEMU stops the VM, saves state to the migration channels, and enters
> > > the postmigrate state.  New QEMU mmap's memory descriptors, and execution
> > > resumes.
> > > 
> > > The implementation splits qmp_migrate into start and finish functions.
> > > Start sends CPR state to new QEMU, which responds by closing the CPR
> > > channel.  Old QEMU detects the HUP then calls finish, which connects the
> > > main migration channel.
> > > 
> > > In summary, the usage is:
> > > 
> > >    qemu-system-$arch -machine aux-ram-share=on ...
> > > 
> > >    start new QEMU with "-incoming <main-uri> -incoming <cpr-channel>"
> > > 
> > >    Issue commands to old QEMU:
> > >      migrate_set_parameter mode cpr-transfer
> > > 
> > >      {"execute": "migrate", ...
> > >          {"channel-type": "main"...}, {"channel-type": "cpr"...} ... }
> > > 
> > > Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> > 
> > Feel free to take:
> > 
> > Reviewed-by: Peter Xu <peterx@redhat.com>
> > 
> > I still have a few trivial comments.
> > 
> > [...]
> > 
> > > diff --git a/migration/cpr.c b/migration/cpr.c
> > > index 87bcfdb..584b0b9 100644
> > > --- a/migration/cpr.c
> > > +++ b/migration/cpr.c
> > > @@ -45,7 +45,7 @@ static const VMStateDescription vmstate_cpr_fd = {
> > >           VMSTATE_UINT32(namelen, CprFd),
> > >           VMSTATE_VBUFFER_ALLOC_UINT32(name, CprFd, 0, NULL, namelen),
> > >           VMSTATE_INT32(id, CprFd),
> > 
> > Could you remind me again on when id!=0 will start to be used?
> 
> Each of vfio, iommufd, chardev, and tap will use id != 0.

I don't remember the details of the planned future series, but I'll just
mention that integer IDs can be error-prone across device hotplug/unplug.

QEMU has a known bug even now with some devices (e.g. slirp network
backends): if the src QEMU originally has two devices (e.g. id=1,2) and
device id=1 is unplugged (leaving id=2), a later migration can fail because
the dest QEMU, which starts with only one device, has only id=1, so the IDs
do not match.
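
To make the failure mode concrete, here is a toy model of it.  This is not
QEMU code; DevTable and the devtable_*/ids_match helpers are hypothetical
names, and the only assumption is that each device takes its id from a
per-process counter at creation time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEVS 8

/* Toy model (not QEMU code): ids come from a per-process counter. */
typedef struct {
    int ids[MAX_DEVS];
    size_t count;
    int next_id;
} DevTable;

static void devtable_init(DevTable *t)
{
    t->count = 0;
    t->next_id = 1;
}

/* Plug a device: it takes the next counter value as its id. */
static int devtable_plug(DevTable *t)
{
    assert(t->count < MAX_DEVS);
    t->ids[t->count] = t->next_id;
    t->count++;
    return t->next_id++;
}

/* Unplug by id; the ids of the remaining devices do not change. */
static bool devtable_unplug(DevTable *t, int id)
{
    for (size_t i = 0; i < t->count; i++) {
        if (t->ids[i] == id) {
            t->ids[i] = t->ids[t->count - 1];
            t->count--;
            return true;
        }
    }
    return false;
}

static bool devtable_has(const DevTable *t, int id)
{
    for (size_t i = 0; i < t->count; i++) {
        if (t->ids[i] == id) {
            return true;
        }
    }
    return false;
}

/* True if every id the source would send also exists on the destination. */
static bool ids_match(const DevTable *src, const DevTable *dst)
{
    for (size_t i = 0; i < src->count; i++) {
        if (!devtable_has(dst, src->ids[i])) {
            return false;
        }
    }
    return true;
}
```

Plugging two devices on the source, unplugging id=1, and plugging a single
device on a fresh destination leaves the source holding id=2 and the
destination holding id=1, so ids_match() returns false even though both
sides describe the same single-device machine.
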

As I recall, PCIe frontend devices are not prone to this issue.  That
should depend on the device having ->get_id() (qdev_get_dev_path?) properly
implemented, so that it generates a globally unique ID that is not affected
by the order in which devices are realized / created.

It could boil down to how the IDs are allocated: anything allocated on the
fly may not work well if there's no solid topology information to derive
it from.
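
As a rough sketch of the alternative (again a toy, not QEMU code): if the
saved state is keyed by a string derived from topology rather than by a
counter, lookup no longer depends on creation order.  SavedFd,
find_fd_by_path and the path strings below are hypothetical and only
illustrate the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model: a preserved descriptor keyed by a topology-derived name. */
typedef struct {
    const char *path;   /* hypothetical stable key, e.g. from topology */
    int fd;             /* preserved descriptor */
} SavedFd;

/* Return the fd saved under path, or -1 if there is no such entry. */
static int find_fd_by_path(const SavedFd *saved, size_t n, const char *path)
{
    assert(path != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(saved[i].path, path) == 0) {
            return saved[i].fd;
        }
    }
    return -1;
}
```

With this keying, a source that unplugged its first backend still saves the
survivor under the same name the destination will ask for, regardless of
how many devices either side created before it.
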

I wonder whether CPR can be prone to this too when using IDs, just FYI.
It might be a good idea to avoid integer IDs if possible.  But you
definitely have the best picture of the whole thing, so this may or may
not apply.

Thanks,

-- 
Peter Xu


