From: Peter Xu <peterx@redhat.com>
To: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>
Cc: "Fabiano Rosas" <farosas@suse.de>,
"Alex Williamson" <alex.williamson@redhat.com>,
"Cédric Le Goater" <clg@redhat.com>,
"Eric Blake" <eblake@redhat.com>,
"Markus Armbruster" <armbru@redhat.com>,
"Daniel P . Berrangé" <berrange@redhat.com>,
"Avihai Horon" <avihaih@nvidia.com>,
"Joao Martins" <joao.m.martins@oracle.com>,
qemu-devel@nongnu.org
Subject: Re: [PATCH v3 16/24] migration/multifd: Send final SYNC only after device state is complete
Date: Thu, 5 Dec 2024 14:02:37 -0500 [thread overview]
Message-ID: <Z1H4zS_TXZtVJOhw@x1n> (raw)
In-Reply-To: <945bab06-b6e6-449e-b810-7800b996ba83@maciej.szmigiero.name>
On Tue, Nov 26, 2024 at 10:22:42PM +0100, Maciej S. Szmigiero wrote:
> On 26.11.2024 21:52, Fabiano Rosas wrote:
> > "Maciej S. Szmigiero" <mail@maciej.szmigiero.name> writes:
> >
> > > From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
> > >
> > > Currently, ram_save_complete() sends a final SYNC multifd packet near this
> > > function end, after sending all of the remaining RAM data.
> > >
> > > On the receive side, this SYNC packet will cause multifd channel threads
> > > to block, waiting for the final sem_sync posting in
> > > multifd_recv_terminate_threads().
> > >
> > > However, multifd_recv_terminate_threads() won't be called until the
> > > migration is complete, which causes a problem if multifd channels are
> > > still required for transferring device state data after RAM transfer is
> > > complete but before finishing the migration process.
> > >
> > > Defer sending the final SYNC packet to the end of sending of
> > > post-switchover iterable data instead if device state transfer is possible.
> > >
> > > Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
> >
> > Reviewed-by: Fabiano Rosas <farosas@suse.de>
> >
> > I wonder whether we could just defer the sync for the !device_state case
> > as well.
> >
>
> AFAIK this should work, just wanted to be extra cautious with bit
> stream timing changes in case there's for example some race in an
> older QEMU version.
I see the issue, but maybe we don't even need this patch..
When I was working on commit 637280aeb2 previously, I forgot that the SYNC
messages are together with the FLUSH which got removed. It means now in
complete() we will sent SYNCs always, but always without FLUSH.
On new binaries, it means SYNCs are not collected properly on dest threads
so it'll hang all threads there.
So yeah, at least from that part it's me to blame..
I think maybe VFIO doesn't need to change the generic path to sync, because
logically speaking VFIO can also use multifd_send_sync_main() in its own
complete() hook to flush everything. Here the trick is such sync doesn't
need to be attached to any message (either SYNC or FLUSH, that only RAM
uses). The sync is about "sync against all sender threads", just like what
we do exactly with mapped-ram. Mapped-ram tricked that path with a
use_packet check in sender thread, however for VFIO we could already expose
a new parameter to multifd_send_sync_main() saying "let's only sync
threads".
I sent two small patches here:
https://lore.kernel.org/r/20241205185303.897010-1-peterx@redhat.com
The 1st patch should fix the SYNC message hang for 637280aeb2 that I did.
The 2nd patch introduced the flag that I said. I think after that applied
VFIO should be able to sync directly with:
multifd_send_sync_main(MULTIFD_SYNC_THREADS);
Then maybe we don't need this patch anymore. Please have a look.
PS: the two patches could be ready to merge already even before VFIO, if
they're properly reviewed and acked.
Thanks,
--
Peter Xu
next prev parent reply other threads:[~2024-12-05 19:03 UTC|newest]
Thread overview: 140+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-17 19:19 [PATCH v3 00/24] Multifd 🔀 device state transfer support with VFIO consumer Maciej S. Szmigiero
2024-11-17 19:19 ` [PATCH v3 01/24] migration: Clarify that {load, save}_cleanup handlers can run without setup Maciej S. Szmigiero
2024-11-25 19:08 ` Fabiano Rosas
2024-11-26 16:25 ` [PATCH v3 01/24] migration: Clarify that {load,save}_cleanup " Cédric Le Goater
2024-11-17 19:19 ` [PATCH v3 02/24] thread-pool: Remove thread_pool_submit() function Maciej S. Szmigiero
2024-11-25 19:13 ` Fabiano Rosas
2024-11-26 16:25 ` Cédric Le Goater
2024-12-04 19:24 ` Peter Xu
2024-12-06 21:11 ` Maciej S. Szmigiero
2024-11-17 19:19 ` [PATCH v3 03/24] thread-pool: Rename AIO pool functions to *_aio() and data types to *Aio Maciej S. Szmigiero
2024-11-25 19:15 ` Fabiano Rosas
2024-11-26 16:26 ` Cédric Le Goater
2024-12-04 19:26 ` Peter Xu
2024-11-17 19:19 ` [PATCH v3 04/24] thread-pool: Implement generic (non-AIO) pool support Maciej S. Szmigiero
2024-11-25 19:41 ` Fabiano Rosas
2024-11-25 19:55 ` Maciej S. Szmigiero
2024-11-25 20:51 ` Fabiano Rosas
2024-11-26 19:25 ` Cédric Le Goater
2024-11-26 21:21 ` Maciej S. Szmigiero
2024-11-26 19:29 ` Cédric Le Goater
2024-11-26 21:22 ` Maciej S. Szmigiero
2024-12-05 13:10 ` Cédric Le Goater
2024-11-28 10:08 ` Avihai Horon
2024-11-28 12:11 ` Maciej S. Szmigiero
2024-12-04 20:04 ` Peter Xu
2024-11-17 19:20 ` [PATCH v3 05/24] migration: Add MIG_CMD_SWITCHOVER_START and its load handler Maciej S. Szmigiero
2024-11-25 19:46 ` Fabiano Rosas
2024-11-26 19:37 ` Cédric Le Goater
2024-11-26 21:22 ` Maciej S. Szmigiero
2024-12-04 21:29 ` Peter Xu
2024-12-05 19:46 ` Zhang Chen
2024-12-06 18:24 ` Maciej S. Szmigiero
2024-12-06 22:12 ` Peter Xu
2024-12-09 1:43 ` Zhang Chen
2024-11-17 19:20 ` [PATCH v3 06/24] migration: Add qemu_loadvm_load_state_buffer() and its handler Maciej S. Szmigiero
2024-12-04 21:32 ` Peter Xu
2024-12-06 21:12 ` Maciej S. Szmigiero
2024-11-17 19:20 ` [PATCH v3 07/24] migration: Document the BQL behavior of load SaveVMHandlers Maciej S. Szmigiero
2024-12-04 21:38 ` Peter Xu
2024-12-06 18:40 ` Maciej S. Szmigiero
2024-12-06 22:15 ` Peter Xu
2024-11-17 19:20 ` [PATCH v3 08/24] migration: Add thread pool of optional load threads Maciej S. Szmigiero
2024-11-25 19:58 ` Fabiano Rosas
2024-11-27 9:13 ` Cédric Le Goater
2024-11-27 20:16 ` Maciej S. Szmigiero
2024-12-04 22:48 ` Peter Xu
2024-12-05 16:15 ` Peter Xu
2024-12-10 23:05 ` Maciej S. Szmigiero
2024-12-10 23:05 ` Maciej S. Szmigiero
2024-12-12 16:38 ` Peter Xu
2024-12-12 22:53 ` Maciej S. Szmigiero
2024-12-16 16:29 ` Peter Xu
2024-12-16 23:15 ` Maciej S. Szmigiero
2024-12-17 14:50 ` Peter Xu
2024-11-28 10:26 ` Avihai Horon
2024-11-28 12:11 ` Maciej S. Szmigiero
2024-12-04 22:43 ` Peter Xu
2024-12-10 23:05 ` Maciej S. Szmigiero
2024-12-12 16:55 ` Peter Xu
2024-12-12 22:53 ` Maciej S. Szmigiero
2024-12-16 16:33 ` Peter Xu
2024-12-16 23:15 ` Maciej S. Szmigiero
2024-11-17 19:20 ` [PATCH v3 09/24] migration/multifd: Split packet into header and RAM data Maciej S. Szmigiero
2024-11-26 14:34 ` Fabiano Rosas
2024-12-05 15:29 ` Peter Xu
2024-11-17 19:20 ` [PATCH v3 10/24] migration/multifd: Device state transfer support - receive side Maciej S. Szmigiero
2024-12-05 16:06 ` Peter Xu
2024-12-06 21:12 ` Maciej S. Szmigiero
2024-12-06 21:57 ` Peter Xu
2024-11-17 19:20 ` [PATCH v3 11/24] migration/multifd: Make multifd_send() thread safe Maciej S. Szmigiero
2024-12-05 16:17 ` Peter Xu
2024-12-06 21:12 ` Maciej S. Szmigiero
2024-11-17 19:20 ` [PATCH v3 12/24] migration/multifd: Add an explicit MultiFDSendData destructor Maciej S. Szmigiero
2024-12-05 16:23 ` Peter Xu
2024-11-17 19:20 ` [PATCH v3 13/24] migration/multifd: Device state transfer support - send side Maciej S. Szmigiero
2024-11-26 19:58 ` Fabiano Rosas
2024-11-26 21:22 ` Maciej S. Szmigiero
2024-11-17 19:20 ` [PATCH v3 14/24] migration/multifd: Make MultiFDSendData a struct Maciej S. Szmigiero
2024-11-17 19:20 ` [PATCH v3 15/24] migration/multifd: Add migration_has_device_state_support() Maciej S. Szmigiero
2024-11-26 20:05 ` Fabiano Rosas
2024-11-28 10:33 ` Avihai Horon
2024-11-28 12:12 ` Maciej S. Szmigiero
2024-12-05 16:44 ` Peter Xu
2024-11-17 19:20 ` [PATCH v3 16/24] migration/multifd: Send final SYNC only after device state is complete Maciej S. Szmigiero
2024-11-26 20:52 ` Fabiano Rosas
2024-11-26 21:22 ` Maciej S. Szmigiero
2024-12-05 19:02 ` Peter Xu [this message]
2024-12-10 23:05 ` Maciej S. Szmigiero
2024-12-11 13:20 ` Peter Xu
2024-11-17 19:20 ` [PATCH v3 17/24] migration: Add save_live_complete_precopy_thread handler Maciej S. Szmigiero
2024-11-29 14:03 ` Cédric Le Goater
2024-11-29 17:14 ` Maciej S. Szmigiero
2024-11-17 19:20 ` [PATCH v3 18/24] vfio/migration: Don't run load cleanup if load setup didn't run Maciej S. Szmigiero
2024-11-29 14:08 ` Cédric Le Goater
2024-11-29 17:15 ` Maciej S. Szmigiero
2024-12-03 15:09 ` Avihai Horon
2024-12-10 23:04 ` Maciej S. Szmigiero
2024-12-12 14:30 ` Avihai Horon
2024-12-12 22:52 ` Maciej S. Szmigiero
2024-12-19 9:19 ` Cédric Le Goater
2024-11-17 19:20 ` [PATCH v3 19/24] vfio/migration: Add x-migration-multifd-transfer VFIO property Maciej S. Szmigiero
2024-11-29 14:11 ` Cédric Le Goater
2024-11-29 17:15 ` Maciej S. Szmigiero
2024-12-19 9:37 ` Cédric Le Goater
2024-11-17 19:20 ` [PATCH v3 20/24] vfio/migration: Add load_device_config_state_start trace event Maciej S. Szmigiero
2024-11-29 14:26 ` Cédric Le Goater
2024-11-17 19:20 ` [PATCH v3 21/24] vfio/migration: Convert bytes_transferred counter to atomic Maciej S. Szmigiero
2024-11-17 19:20 ` [PATCH v3 22/24] vfio/migration: Multifd device state transfer support - receive side Maciej S. Szmigiero
2024-12-02 17:56 ` Cédric Le Goater
2024-12-10 23:04 ` Maciej S. Szmigiero
2024-12-19 14:13 ` Cédric Le Goater
2024-12-09 9:13 ` Avihai Horon
2024-12-10 23:06 ` Maciej S. Szmigiero
2024-12-12 14:33 ` Avihai Horon
2024-11-17 19:20 ` [PATCH v3 23/24] migration/qemu-file: Define g_autoptr() cleanup function for QEMUFile Maciej S. Szmigiero
2024-11-26 21:01 ` Fabiano Rosas
2024-12-05 19:49 ` Peter Xu
2024-11-17 19:20 ` [PATCH v3 24/24] vfio/migration: Multifd device state transfer support - send side Maciej S. Szmigiero
2024-12-09 9:28 ` Avihai Horon
2024-12-10 23:06 ` Maciej S. Szmigiero
2024-12-12 11:10 ` Cédric Le Goater
2024-12-12 22:52 ` Maciej S. Szmigiero
2024-12-13 11:08 ` Cédric Le Goater
2024-12-13 18:25 ` Maciej S. Szmigiero
2024-12-12 14:54 ` Avihai Horon
2024-12-12 22:53 ` Maciej S. Szmigiero
2024-12-16 17:33 ` Peter Xu
2024-12-19 9:50 ` Cédric Le Goater
2024-12-04 19:10 ` [PATCH v3 00/24] Multifd 🔀 device state transfer support with VFIO consumer Peter Xu
2024-12-06 18:03 ` Maciej S. Szmigiero
2024-12-06 22:20 ` Peter Xu
2024-12-10 23:06 ` Maciej S. Szmigiero
2024-12-12 17:35 ` Peter Xu
2024-12-19 7:55 ` Yanghang Liu
2024-12-19 8:53 ` Cédric Le Goater
2024-12-19 13:00 ` Yanghang Liu
2024-12-05 21:27 ` Cédric Le Goater
2024-12-05 21:42 ` Peter Xu
2024-12-06 10:24 ` Cédric Le Goater
2024-12-06 18:44 ` Maciej S. Szmigiero
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z1H4zS_TXZtVJOhw@x1n \
--to=peterx@redhat.com \
--cc=alex.williamson@redhat.com \
--cc=armbru@redhat.com \
--cc=avihaih@nvidia.com \
--cc=berrange@redhat.com \
--cc=clg@redhat.com \
--cc=eblake@redhat.com \
--cc=farosas@suse.de \
--cc=joao.m.martins@oracle.com \
--cc=mail@maciej.szmigiero.name \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).