From: Peter Xu <peterx@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>,
"Maciej S. Szmigiero" <mail@maciej.szmigiero.name>
Cc: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>,
"Fabiano Rosas" <farosas@suse.de>,
"Alex Williamson" <alex.williamson@redhat.com>,
"Cédric Le Goater" <clg@redhat.com>,
"Eric Blake" <eblake@redhat.com>,
"Markus Armbruster" <armbru@redhat.com>,
"Daniel P . Berrangé" <berrange@redhat.com>,
"Avihai Horon" <avihaih@nvidia.com>,
"Joao Martins" <joao.m.martins@oracle.com>,
qemu-devel@nongnu.org, "Paolo Bonzini" <pbonzini@redhat.com>
Subject: Re: [PATCH v2 05/17] thread-pool: Implement non-AIO (generic) pool support
Date: Mon, 9 Sep 2024 12:45:50 -0400 [thread overview]
Message-ID: <Zt8mPlBLO9FUHgxG@x1n> (raw)
In-Reply-To: <CAJSP0QXURSS4cHj0i6xy27HMbtF2D4ckL4fwDk5rHA3vFFtHUg@mail.gmail.com>
Hi, Stefan, Maciej,
Sorry to be slow on responding.
On Tue, Sep 03, 2024 at 03:04:54PM -0400, Stefan Hajnoczi wrote:
> On Tue, 3 Sept 2024 at 12:54, Maciej S. Szmigiero
> <mail@maciej.szmigiero.name> wrote:
> >
> > On 3.09.2024 15:55, Stefan Hajnoczi wrote:
> > > On Tue, 27 Aug 2024 at 13:58, Maciej S. Szmigiero
> > > <mail@maciej.szmigiero.name> wrote:
> > >>
> > >> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
> > >>
> > >> Migration code wants to manage device data sending threads in one place.
> > >>
> > >> QEMU has an existing thread pool implementation, however it was limited
> > >> to queuing AIO operations only and essentially had a 1:1 mapping between
> > >> the current AioContext and the ThreadPool in use.
> > >>
> > >> Implement what is necessary to queue generic (non-AIO) work on a ThreadPool
> > >> too.
> > >>
> > >> This brings a few new operations on a pool:
> > >> * thread_pool_set_minmax_threads() explicitly sets the minimum and maximum
> > >> thread count in the pool.
> > >>
> > >> * thread_pool_join() operation waits until all the submitted work requests
> > >> have finished.
> > >>
> > >> * thread_pool_poll() lets the new thread and / or thread completion bottom
> > >> halves run (if they are indeed scheduled to be run).
> > >> It is useful for thread pool users that need to launch or terminate new
> > >> threads without returning to the QEMU main loop.
> > >
> > > Did you consider glib's GThreadPool?
> > > https://docs.gtk.org/glib/struct.ThreadPool.html
> > >
> > > QEMU's thread pool is integrated into the QEMU event loop. If your
> > > goal is to bypass the QEMU event loop, then you may as well use the
> > > glib API instead.
> > >
> > > thread_pool_join() and thread_pool_poll() will lead to code that
> > > blocks the event loop. QEMU's aio_poll() and nested event loops in
> > > general are a source of hangs and re-entrancy bugs. I would prefer not
> > > introducing these issues in the QEMU ThreadPool API.
> > >
> >
> > Unfortunately, the problem with the migration code is that it is
> > synchronous - it does not return to the main event loop until the
> > migration is done.
> >
> > So the only way to handle things that need working event loop is to
> > pump it manually from inside the migration code.
> >
> > The reason why I used the QEMU thread pool in the first place in this
> > patch set version is because Peter asked me to do so during the review
> > of its previous iteration [1].
> >
> > Peter also asked me previously to move to QEMU synchronization
> > primitives from using the Glib ones in the early version of this
> > patch set [2].
> >
> > I personally would rather use something common to many applications,
> > well tested and with more pairs of eyes looking at it rather to
> > re-invent things in QEMU.
> >
> > Looking at GThreadPool it seems that it lacks ability to wait until
> > all queued work have finished, so this would need to be open-coded
> > in the migration code.
> >
> > @Peter, what's your opinion on using Glib's thread pool instead of
> > QEMU's one, considering the above things?
>
> I'll add a bit more about my thinking:
>
> Using QEMU's event-driven model is usually preferred because it makes
> integrating with the rest of QEMU easy and avoids having lots of
> single-purpose threads that are hard to observe/manage (e.g. through
> the QMP monitor).
>
> When there is a genuine need to spawn a thread and write synchronous
> code (e.g. a blocking ioctl(2) call or something CPU-intensive), then
Right, AFAIU this is the current use case for VFIO, and anything beyond in
migration context, where we want to use genuine threads with no need to
integrate with the main even loop.
Currently the VFIO workfn should read() the VFIO fd in a blocked way, then
dump them to multifd threads (further dump to migration channels), during
which it can wait on a semaphore.
> it's okay to do that. Use QEMUBH, EventNotifier, or other QEMU APIs to
> synchronize between event loop threads and special-purpose synchronous
> threads.
>
> I haven't looked at the patch series enough to have an opinion about
> whether this use case needs a special-purpose thread or not. I am
> assuming it really needs to be a special-purpose thread. Peter and you
> could discuss that further if you want.
>
> I agree with Peter's request to use QEMU's synchronization primitives.
> They do not depend on the event loop so they can be used outside the
> event loop.
>
> The issue I'm raising with this patch is that adding new join()/poll()
> APIs that shouldn't be called from the event loop is bug-prone. It
> will make the QEMU ThreadPool code harder to understand and maintain
> because now there are two different contexts where different subsets
> of this API can be used and mixing them leads to problems. To me the
> non-event loop case is beyond the scope of QEMU's ThreadPool. I have
> CCed Paolo, who wrote the thread pool in its current form in case he
> wants to participate in the discussion.
>
> Using glib's ThreadPool solves the issue while still reusing an
> existing thread pool implementation. Waiting for all work to complete
> can be done using QemuSemaphore.
Right. It's a pity that g_thread_pool_unprocessed() only monitors
unqueuing of tasks, and looks like there's no g_thread_pool_flush().
Indeed the current thread poll is very aio-centric, and if we worry about
misuse of the APIs we can switch to glib's threadpool. Sorry Maciej, looks
like I routed you to a direction that I didn't see the side effects..
I think the fundamental request from my side (on behalf of migration) is we
should avoid a specific vmstate handler managing threads on its own. E.g.,
any future devices (vdpa, vcpu, etc.) that may also be able to offload
save() processes concurrently to threads (just like what VFIO can already
do right now) should share the same pool of threads. As long as that can
be achieved I am ok.
Thanks,
--
Peter Xu
next prev parent reply other threads:[~2024-09-09 16:46 UTC|newest]
Thread overview: 128+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-27 17:54 [PATCH v2 00/17] Multifd 🔀 device state transfer support with VFIO consumer Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 01/17] vfio/migration: Add save_{iterate, complete_precopy}_started trace events Maciej S. Szmigiero
2024-09-05 13:08 ` [PATCH v2 01/17] vfio/migration: Add save_{iterate,complete_precopy}_started " Avihai Horon
2024-09-09 18:04 ` Maciej S. Szmigiero
2024-09-11 14:50 ` Avihai Horon
2024-08-27 17:54 ` [PATCH v2 02/17] migration/ram: Add load start trace event Maciej S. Szmigiero
2024-08-28 18:44 ` Fabiano Rosas
2024-08-28 20:21 ` Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 03/17] migration/multifd: Zero p->flags before starting filling a packet Maciej S. Szmigiero
2024-08-28 18:50 ` Fabiano Rosas
2024-09-09 15:41 ` Peter Xu
2024-08-27 17:54 ` [PATCH v2 04/17] thread-pool: Add a DestroyNotify parameter to thread_pool_submit{, _aio)() Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 05/17] thread-pool: Implement non-AIO (generic) pool support Maciej S. Szmigiero
2024-09-02 22:07 ` Fabiano Rosas
2024-09-03 12:02 ` Maciej S. Szmigiero
2024-09-03 14:26 ` Fabiano Rosas
2024-09-03 18:14 ` Maciej S. Szmigiero
2024-09-03 13:55 ` Stefan Hajnoczi
2024-09-03 16:54 ` Maciej S. Szmigiero
2024-09-03 19:04 ` Stefan Hajnoczi
2024-09-09 16:45 ` Peter Xu [this message]
2024-09-09 18:38 ` Maciej S. Szmigiero
2024-09-09 19:12 ` Peter Xu
2024-09-09 19:16 ` Maciej S. Szmigiero
2024-09-09 19:24 ` Peter Xu
2024-08-27 17:54 ` [PATCH v2 06/17] migration: Add save_live_complete_precopy_{begin, end} handlers Maciej S. Szmigiero
2024-08-28 19:03 ` [PATCH v2 06/17] migration: Add save_live_complete_precopy_{begin,end} handlers Fabiano Rosas
2024-09-05 13:45 ` Avihai Horon
2024-09-09 17:59 ` Peter Xu
2024-09-09 18:32 ` Maciej S. Szmigiero
2024-09-09 19:08 ` Peter Xu
2024-09-09 19:32 ` Peter Xu
2024-09-19 19:48 ` Maciej S. Szmigiero
2024-09-19 19:47 ` Maciej S. Szmigiero
2024-09-19 20:54 ` Peter Xu
2024-09-20 15:22 ` Maciej S. Szmigiero
2024-09-20 16:08 ` Peter Xu
2024-09-09 18:05 ` Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 07/17] migration: Add qemu_loadvm_load_state_buffer() and its handler Maciej S. Szmigiero
2024-08-30 19:05 ` Fabiano Rosas
2024-09-05 14:15 ` Avihai Horon
2024-09-09 18:05 ` Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 08/17] migration: Add load_finish handler and associated functions Maciej S. Szmigiero
2024-08-30 19:28 ` Fabiano Rosas
2024-09-05 15:13 ` Avihai Horon
2024-09-09 18:05 ` Maciej S. Szmigiero
2024-09-09 20:03 ` Peter Xu
2024-09-19 19:49 ` Maciej S. Szmigiero
2024-09-19 21:11 ` Peter Xu
2024-09-20 15:23 ` Maciej S. Szmigiero
2024-09-20 16:45 ` Peter Xu
2024-09-26 22:34 ` Maciej S. Szmigiero
2024-09-27 0:53 ` Peter Xu
2024-09-30 19:25 ` Maciej S. Szmigiero
2024-09-30 21:57 ` Peter Xu
2024-10-01 20:41 ` Maciej S. Szmigiero
2024-10-01 21:30 ` Peter Xu
2024-10-02 20:11 ` Maciej S. Szmigiero
2024-10-02 21:25 ` Peter Xu
2024-10-03 20:34 ` Maciej S. Szmigiero
2024-10-03 21:17 ` Peter Xu
2024-08-27 17:54 ` [PATCH v2 09/17] migration/multifd: Device state transfer support - receive side Maciej S. Szmigiero
2024-08-30 20:22 ` Fabiano Rosas
2024-09-02 20:12 ` Maciej S. Szmigiero
2024-09-03 14:42 ` Fabiano Rosas
2024-09-03 18:41 ` Maciej S. Szmigiero
2024-09-09 19:52 ` Peter Xu
2024-09-19 19:49 ` Maciej S. Szmigiero
2024-09-05 16:47 ` Avihai Horon
2024-09-09 18:05 ` Maciej S. Szmigiero
2024-09-12 8:13 ` Avihai Horon
2024-09-12 13:52 ` Fabiano Rosas
2024-09-19 19:59 ` Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 10/17] migration/multifd: Convert multifd_send()::next_channel to atomic Maciej S. Szmigiero
2024-08-30 18:13 ` Fabiano Rosas
2024-09-02 20:11 ` Maciej S. Szmigiero
2024-09-03 15:01 ` Fabiano Rosas
2024-09-03 20:04 ` Maciej S. Szmigiero
2024-09-10 14:13 ` Peter Xu
2024-08-27 17:54 ` [PATCH v2 11/17] migration/multifd: Add an explicit MultiFDSendData destructor Maciej S. Szmigiero
2024-08-30 13:12 ` Fabiano Rosas
2024-08-27 17:54 ` [PATCH v2 12/17] migration/multifd: Device state transfer support - send side Maciej S. Szmigiero
2024-08-29 0:41 ` Fabiano Rosas
2024-08-29 20:03 ` Maciej S. Szmigiero
2024-08-30 13:02 ` Fabiano Rosas
2024-09-09 19:40 ` Peter Xu
2024-09-19 19:50 ` Maciej S. Szmigiero
2024-09-10 19:48 ` Peter Xu
2024-09-12 18:43 ` Fabiano Rosas
2024-09-13 0:23 ` Peter Xu
2024-09-13 13:21 ` Fabiano Rosas
2024-09-13 14:19 ` Peter Xu
2024-09-13 15:04 ` Fabiano Rosas
2024-09-13 15:22 ` Peter Xu
2024-09-13 18:26 ` Fabiano Rosas
2024-09-17 15:39 ` Peter Xu
2024-09-17 17:07 ` Cédric Le Goater
2024-09-17 17:50 ` Peter Xu
2024-09-19 19:51 ` Maciej S. Szmigiero
2024-09-19 19:49 ` Maciej S. Szmigiero
2024-09-19 21:17 ` Peter Xu
2024-09-20 15:23 ` Maciej S. Szmigiero
2024-09-20 17:09 ` Peter Xu
2024-09-10 16:06 ` Peter Xu
2024-09-19 19:49 ` Maciej S. Szmigiero
2024-09-19 21:18 ` Peter Xu
2024-08-27 17:54 ` [PATCH v2 13/17] migration/multifd: Add migration_has_device_state_support() Maciej S. Szmigiero
2024-08-30 18:55 ` Fabiano Rosas
2024-09-02 20:11 ` Maciej S. Szmigiero
2024-09-03 15:09 ` Fabiano Rosas
2024-08-27 17:54 ` [PATCH v2 14/17] migration: Add save_live_complete_precopy_thread handler Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 15/17] vfio/migration: Multifd device state transfer support - receive side Maciej S. Szmigiero
2024-09-09 8:55 ` Avihai Horon
2024-09-09 18:06 ` Maciej S. Szmigiero
2024-09-12 8:20 ` Avihai Horon
2024-09-12 8:45 ` Cédric Le Goater
2024-08-27 17:54 ` [PATCH v2 16/17] vfio/migration: Add x-migration-multifd-transfer VFIO property Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 17/17] vfio/migration: Multifd device state transfer support - send side Maciej S. Szmigiero
2024-09-09 11:41 ` Avihai Horon
2024-09-09 18:07 ` Maciej S. Szmigiero
2024-09-12 8:26 ` Avihai Horon
2024-09-12 8:57 ` Cédric Le Goater
2024-08-28 20:46 ` [PATCH v2 00/17] Multifd 🔀 device state transfer support with VFIO consumer Fabiano Rosas
2024-08-28 21:58 ` Maciej S. Szmigiero
2024-08-29 0:51 ` Fabiano Rosas
2024-08-29 20:02 ` Maciej S. Szmigiero
2024-10-11 13:58 ` Cédric Le Goater
2024-10-15 21:12 ` Maciej S. Szmigiero
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zt8mPlBLO9FUHgxG@x1n \
--to=peterx@redhat.com \
--cc=alex.williamson@redhat.com \
--cc=armbru@redhat.com \
--cc=avihaih@nvidia.com \
--cc=berrange@redhat.com \
--cc=clg@redhat.com \
--cc=eblake@redhat.com \
--cc=farosas@suse.de \
--cc=joao.m.martins@oracle.com \
--cc=mail@maciej.szmigiero.name \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).