From: Peter Xu <peterx@redhat.com>
To: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>
Cc: "Fabiano Rosas" <farosas@suse.de>,
"Alex Williamson" <alex.williamson@redhat.com>,
"Cédric Le Goater" <clg@redhat.com>,
"Eric Blake" <eblake@redhat.com>,
"Markus Armbruster" <armbru@redhat.com>,
"Daniel P. Berrangé" <berrange@redhat.com>,
"Avihai Horon" <avihaih@nvidia.com>,
"Joao Martins" <joao.m.martins@oracle.com>,
qemu-devel@nongnu.org
Subject: Re: [PATCH v4 19/33] migration: Add save_live_complete_precopy_thread handler
Date: Wed, 5 Feb 2025 10:55:59 -0500
Message-ID: <Z6OKDxw1hnNKCjJn@x1.local>
In-Reply-To: <c107d827-6913-4af5-8a63-c71000060fec@maciej.szmigiero.name>
On Wed, Feb 05, 2025 at 12:53:21PM +0100, Maciej S. Szmigiero wrote:
> On 4.02.2025 21:34, Peter Xu wrote:
> > On Tue, Feb 04, 2025 at 08:32:15PM +0100, Maciej S. Szmigiero wrote:
> > > On 4.02.2025 18:54, Peter Xu wrote:
> > > > On Thu, Jan 30, 2025 at 11:08:40AM +0100, Maciej S. Szmigiero wrote:
> > > > > +static int multifd_device_state_save_thread(void *opaque)
> > > > > +{
> > > > > +    struct MultiFDDSSaveThreadData *data = opaque;
> > > > > +    int ret;
> > > > > +
> > > > > +    ret = data->hdlr(data->idstr, data->instance_id, &send_threads_abort,
> > > > > +                     data->handler_opaque);
> > > >
> > > > I thought we discussed somewhere and the plan was we could use Error** here
> > > > to report errors. Would that still make sense, or maybe I lost some
> > > > context?
> > >
> > > That was about *load* threads, here these are *save* threads.
> >
> > Ah OK.
> >
> > >
> > > Save handlers do not return an Error value, neither save_live_iterate, nor
> > > save_live_complete_precopy or save_state does so.
> >
> > Let's try to make new APIs work with Error* if possible.
>
> Let's assume that these threads return an Error object.
>
> What's qemu_savevm_state_complete_precopy_iterable() supposed to do with it?
IIUC it's not about qemu_savevm_state_complete_precopy_iterable() in this
context, as the Error* will be used in one of the threads of the pool, not
in the migration thread.

The goal is to be able to set the Error* with migrate_set_error(), so that
when migration fails, query-migrate can return the error to libvirt;
migration always tries to remember the first error hit whenever possible.

It's multifd_device_state_save_thread() that would call migrate_set_error(),
not the migration thread.  qemu_savevm_state_complete_*() are indeed not
ready to pass Errors, but they are not in the discussed stack.
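For example, a rough sketch of what I mean, assuming the handler is changed
to take the per-thread struct (as I suggest further below) plus an Error **,
so the exact signature here is illustrative only:

===8<===
static int multifd_device_state_save_thread(void *opaque)
{
    struct MultiFDDSSaveThreadData *data = opaque;
    Error *local_err = NULL;
    int ret;

    ret = data->hdlr(data, &local_err);
    if (ret && local_err) {
        /*
         * Remember the first error hit in the migration state, so that
         * query-migrate can later report it to libvirt.
         */
        migrate_set_error(migrate_get_current(), local_err);
        error_free(local_err);
    }

    return ret;
}
===8<===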
> Both it and its caller qemu_savevm_state_complete_precopy() only handle int
> errors.
>
> qemu_savevm_state_complete_precopy() in turn has 4 callers, half of which (2)
> also would need to be enlightened with Error handling somehow.
Right, we don't need to touch those, as explained above.
Generally speaking, IMHO it's always good to add new migration code with
Error* reports rather than plain retvals, even if the new code is in the
migration thread stack.  It makes it easier to switch future code over to
Error* too.
>
> >
> > >
> > > > Meanwhile, I still feel uneasy on having these globals (send_threads_abort,
> > > > send_threads_ret). Can we make MultiFDDSSaveThreadData the only interface
> > > > between migration and the threads impl? So I wonder if it can be:
> > > >
> > > >     ret = data->hdlr(data);
> > > >
> > > > With extended struct like this (I added thread_error and thread_quit):
> > > >
> > > > struct MultiFDDSSaveThreadData {
> > > >     SaveLiveCompletePrecopyThreadHandler hdlr;
> > > >     char *idstr;
> > > >     uint32_t instance_id;
> > > >     void *handler_opaque;
> > > >     /*
> > > >      * Should be NULL when struct passed over to thread, the thread should
> > > >      * set this if the handler would return false.  It must be kept NULL if
> > > >      * the handler returned true / success.
> > > >      */
> > > >     Error *thread_error;
> > >
> > > As I mentioned above, these handlers do not generally return Error type,
> > > so this would need to be an *int;
> > >
> > > >     /*
> > > >      * Migration core would set this when it wants to notify a thread to
> > > >      * quit, for example, when an error occurred in other threads, or
> > > >      * migration is cancelled by the user.
> > > >      */
> > > >     bool thread_quit;
> > >
> > > ^ I guess that was supposed to be a pointer too (*thread_quit).
> >
> > It's my intention to make this bool, to make everything managed per-thread.
>
> But that's unnecessary since this flag is common to all these threads.
One bool would be enough, but then you'll need to export another API for
VFIO to use.  I suppose that's OK too.
Some context on the multifd threads and how that's handled there: we
started with one "quit" flag per thread struct, but then switched to a
single bool exactly as you said, see commit 15f3f21d598148.
If you want to stick with one bool, that's okay too; you can export
something similar in misc.h, e.g. multifd_device_state_save_thread_quitting(),
so that we can avoid passing the "quit" in either as a handler parameter or
as a per-thread flag.
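Something along these lines (just a sketch; the accessor name is the one I
suggested above, the rest is illustrative):

===8<===
/* include/migration/misc.h */
bool multifd_device_state_save_thread_quitting(void);

/* migration/multifd-device-state.c */
static bool send_threads_abort;

bool multifd_device_state_save_thread_quitting(void)
{
    return qatomic_read(&send_threads_abort);
}
===8<===

Then the device handler can simply poll that helper instead of being handed
an abort_flag pointer.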
>
> > It's actually what we do with multifd, these are a bunch of extra threads
> > to differentiate from the "IO threads" / "multifd threads".
> >
> > >
> > > > };
> > > >
> > > > Then if any multifd_device_state_save_thread() failed, for example, it
> > > > should notify all threads to quit by setting thread_quit, instead of
> > > > relying on yet another global variable to show migration needs to quit.
> > >
> > > multifd_abort_device_state_save_threads() needs to access
> > > send_threads_abort too.
> >
> > This may need to become something like:
> >
> >     QLIST_FOREACH() {
> >         MultiFDDSSaveThreadData *data = ...;
> >         data->thread_quit = true;
> >     }
>
> At the most basic level that's turning O(1) operation into O(n).
>
> Besides, it creates a question of who now owns these MultiFDDSSaveThreadData
> structures - they could be owned by either thread pool or the
> multifd_device_state code.
I think they should be owned by migration, and with this idea they will
need to stay around until we're done waiting for the thread pool to finish
its work, so migration core needs to free them.
>
> Currently the ownership is simple - the multifd_device_state code
> allocates such per-thread structure in multifd_spawn_device_state_save_thread()
> and immediately passes its ownership to the thread pool which
> takes care to free it once it no longer needs it.
Right, this is another reason why I think having migration own these
structs is better.  We used to have dangling-task issues when we shifted
ownership of something to the mainloop and then lost track of it (e.g. the
TLS handshake gsources).  Those were pretty hard to debug when they hung,
because migration core had nothing left to link back to the hung tasks.
I think we should start from having migration core able to reach these
thread-based tasks when needed.  Migration also has control of the thread
pool, which makes that easier.  The thread pool is still fairly simple, so
we may still need to be able to reference per-task info separately.
>
> Now, with the list implementation if the thread pool were to free
> that MultiFDDSSaveThreadData it would also need to release it from
> the list.
>
> Which in turn would need appropriate locking around this removal
> operation and probably also each time the list is iterated over.
>
> On the other hand if the multifd_device_state code were to own
> that MultiFDDSSaveThreadData then it would linger around until
> multifd_device_state_send_cleanup() cleans it up even though its
> associated thread might be long gone.
Do you see a problem with it?  It actually sounds good to me, and pretty
easy to understand.

So migration creates these MultiFDDSSaveThreadData, then creates threads to
enqueue them, then waits for all threads to complete, then frees these
structs.
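Roughly like this (a sketch only; it assumes MultiFDDSSaveThreadData gains
a QLIST_ENTRY link, and the free helper name is made up):

===8<===
static QLIST_HEAD(, MultiFDDSSaveThreadData) send_threads_data =
    QLIST_HEAD_INITIALIZER(send_threads_data);

/* Spawn: migration allocates and keeps ownership of the per-task struct */
data = g_new0(MultiFDDSSaveThreadData, 1);
...
QLIST_INSERT_HEAD(&send_threads_data, data, node);

/* Join: wait for the pool, then collect errors and free everything */
thread_pool_wait(send_threads);
QLIST_FOREACH_SAFE(data, &send_threads_data, node, data_next) {
    ...
    QLIST_REMOVE(data, node);
    multifd_dss_save_thread_data_free(data);
}
===8<===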
>
> > We may want to double check qmp 'migrate_cancel' will work when save
> > threads are running, but this can also be done for later.
>
> > >
> > > And multifd_join_device_state_save_threads() needs to access
> > > send_threads_ret.
> >
> > Then this one becomes:
> >
> >     thread_pool_wait(send_threads);
> >     QLIST_FOREACH() {
> >         MultiFDDSSaveThreadData *data = ...;
> >         if (data->thread_error) {
> >             return false;
> >         }
> >     }
> >     return true;
>
> Same here, having a common error return would save us from having
> to iterate over a list (or having a list in the first place).
IMHO perf isn't an issue here.  It's a slow path, the number of threads is
small, and the loop is cheap.  I'd prefer to prioritize cleanliness in this
case.

Otherwise, do you have any suggestion for how we could report an Error*
from the threads?
>
> > >
> > > These variables ultimately will have to be stored somewhere since
> > > there can be multiple save threads and so multiple instances of
> > > MultiFDDSSaveThreadData.
> > >
> > > So these need to be stored somewhere where
> > > multifd_spawn_device_state_save_thread() can reach them to assign
> > > their addresses to MultiFDDSSaveThreadData members.
> >
> > Then multifd_spawn_device_state_save_thread() will need to manage the
> > qlist, making sure migration core remembers what jobs it submitted. It
> > sounds good to have that bookkeeping when I think about it, instead of
> > throw the job to the thread pool and forget it..
>
> It's not "forgetting" about the job but rather letting thread pool
> manage it - I think thread pool was introduced so these details
> (thread management) are abstracted from the migration code.
> Now they would be effectively duplicated in the migration code.
Migration is still managing those as long as you have send_threads_abort,
isn't it?  The thread pool doesn't yet have an API to say "let's quit all
the tasks"; otherwise I'd be OK with using the pool API instead of having
thread_quit.
>
> > >
> > > However, at that point multifd_device_state_save_thread() can
> > > access them too so it does not need to have them passed via
> > > MultiFDDSSaveThreadData.
> > >
> > > However, nothing prevents putting send_threads* variables
> > > into a global struct (with internal linkage - "static", just as
> > > these separate ones are) if you like such construct more.
> >
> > This should be better than the current global vars indeed, but less
> > favoured if the per-thread way could work above.
>
> You still need that list to be a global variable,
> so it's the same amount of global variables as just putting
> the existing variables in a struct (which could be even allocated
> in multifd_device_state_send_setup() and deallocated in
> multifd_device_state_send_cleanup() for extra memory savings).
Yes, this works for me.

I think you got me wrong on "not allowing the introduction of global
variables".  I'm OK with it, but please still consider:

  - Putting it under some existing global object rather than having
    separate global variables all over the place

  - Having Error reports
And I still think we can change:
typedef int (*SaveLiveCompletePrecopyThreadHandler)(char *idstr,
                                                    uint32_t instance_id,
                                                    bool *abort_flag,
                                                    void *opaque);

To:

typedef int (*SaveLiveCompletePrecopyThreadHandler)(MultiFDDSSaveThreadData*);
No matter what.
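A handler with the new signature would then look roughly like this (the
VFIO handler name and body here are purely illustrative, reusing the struct
fields sketched above):

===8<===
static int vfio_save_complete_precopy_thread(MultiFDDSSaveThreadData *data)
{
    /* data->handler_opaque would carry the VFIODevice, data->idstr its id */

    while (!data->thread_quit) {
        /*
         * ... read the next chunk of device state and push it through
         * multifd, returning 0 once the device reports it is done ...
         */
    }

    /* Another thread failed, or migration got cancelled */
    error_setg(&data->thread_error, "%s: device state save thread aborted",
               data->idstr);
    return -ECANCELED;
}
===8<===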
>
> These variables are having internal linkage limited to (relatively
> small) multifd-device-state.c, so it's not like they are polluting
> namespace in some major migration translation unit.
If someone proposed introducing 100 global vars in multifd-device-state.c,
I'd firmly push back on that.
If it's one global var, I'm OK.
What if it's 5?
===8<===
static QemuMutex queue_job_mutex;
static ThreadPool *send_threads;
static int send_threads_ret;
static bool send_threads_abort;
static MultiFDSendData *device_state_send;
===8<===
I think I should call a stop at some point, and that's what happened here..

Please consider introducing something like multifd_send_device_state so we
can avoid anyone randomly adding static global vars in the future.
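E.g. something like below (a sketch only; the field list is just the
current set of file-scope variables folded into one object):

===8<===
typedef struct {
    QemuMutex queue_job_mutex;
    ThreadPool *send_threads;
    int send_threads_ret;
    bool send_threads_abort;
    MultiFDSendData *device_state_send;
} MultiFDSendDeviceState;

static MultiFDSendDeviceState *multifd_send_device_state;
===8<===

It could then be allocated in multifd_device_state_send_setup() and freed
in multifd_device_state_send_cleanup(), as you suggested.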
>
> Taking into consideration having to manage an extra data structure
> (list), needing more code to do so, having worse algorithms I don't
> really see a point of using that list.
>
> (This is orthogonal to whether the thread return type is changed to
> Error which could be easily done on the existing save threads pool
> implementation).
My bet is that changing to a list is just as easy (10-20 LOC?).  If not, I
can try to provide the diff on top of your patch.

I'm also not strictly asking for a list, but for anything that makes the
API cleaner (fewer globals, better error reports, etc.).
Thanks,
--
Peter Xu