qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>
Cc: "Fabiano Rosas" <farosas@suse.de>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"Cédric Le Goater" <clg@redhat.com>,
	"Eric Blake" <eblake@redhat.com>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"Avihai Horon" <avihaih@nvidia.com>,
	"Joao Martins" <joao.m.martins@oracle.com>,
	qemu-devel@nongnu.org
Subject: Re: [PATCH v2 08/17] migration: Add load_finish handler and associated functions
Date: Thu, 3 Oct 2024 17:17:10 -0400	[thread overview]
Message-ID: <Zv8J1uAWIxzCgErh@x1n> (raw)
In-Reply-To: <6240036b-f8b8-4592-8f80-21ee6d3eaa1e@maciej.szmigiero.name>

On Thu, Oct 03, 2024 at 10:34:28PM +0200, Maciej S. Szmigiero wrote:
> To be clear, these loading threads are mostly blocking I/O threads, NOT
> compute threads.
> This means that the usual "rule of thumb" that the count of threads should
> not exceed the total number of logical CPUs does NOT apply to them.
> 
> They are similar to what glibc uses under the hood to simulate POSIX AIO
> (aio_read(), aio_write()), to implement an async DNS resolver (getaddrinfo_a())
> and what Glib's GIO uses to simulate its own async file operations.
> Using helper threads for turning blocking I/O into "AIO" is a pretty common
> thing.

Fair enough.  Yes I could be over-cautious due to the previous experience
on managing all kinds of migration threads.

> 
> To show that these loading threads mostly spend their time sleeping (waiting
> for I/O) I made a quick patch at [1] tracing how much time they spend waiting
> for incoming buffers and how much time they spend waiting for these buffers
> to be loaded into the device.
> 
> The results (without patch [2] described later) are like this:
> > 5919@1727974993.403280:vfio_load_state_device_buffer_start  (0000:af:00.2)
> > 5921@1727974993.407932:vfio_load_state_device_buffer_start  (0000:af:00.4)
> > 5922@1727974993.407964:vfio_load_state_device_buffer_start  (0000:af:00.5)
> > 5920@1727974993.408480:vfio_load_state_device_buffer_start  (0000:af:00.3)
> > 5920@1727974993.666843:vfio_load_state_device_buffer_end  (0000:af:00.3) wait 43 ms load 217 ms
> > 5921@1727974993.686005:vfio_load_state_device_buffer_end  (0000:af:00.4) wait 75 ms load 206 ms
> > 5919@1727974993.686054:vfio_load_state_device_buffer_end  (0000:af:00.2) wait 69 ms load 210 ms
> > 5922@1727974993.689919:vfio_load_state_device_buffer_end  (0000:af:00.5) wait 79 ms load 204 ms
> 
> Summing up:
> 0000:af:00.2 total loading time 283 ms, wait 69 ms load 210 ms
> 0000:af:00.3 total loading time 258 ms, wait 43 ms load 217 ms
> 0000:af:00.4 total loading time 278 ms, wait 75 ms load 206 ms
> 0000:af:00.5 total loading time 282 ms, wait 79 ms load 204 ms
> 
> In other words, these threads spend ~100% of their total runtime waiting
> for I/O, 70%-75% of that time waiting for buffers to get loaded into their
> target device.
> 
> So having more threads here won't negatively affect the host CPU
> consumption since these threads barely use the host CPU at all.
> Also, their count is capped at the number of VFIO devices in the VM.
> 
> I also did a quick test with the same config as usual: 4 VFs, 6 multifd
> channels, but with patch at [2] simulating forced coupling of loading
> threads to multifd receive channel threads.
> 
> With this patch load_state_buffer() handler will return to the multifd
> channel thread only when the loading thread finishes loading available
> buffers and is about to wait for the next buffers to arrive - just as
> loading buffers directly from these channel threads would do.
> 
> The resulting lowest downtime from 115 live migration runs was 1295ms -
> that's 21% worse than 1068ms of downtime with these loading threads running
> on their own.
> 
> I expect that this performance penalty to get even worse with more VFs
> than 4.
> 
> So no, we can't load buffers directly from multifd channel receive threads.

6 channels can be a bit less in this test case with 4 VFs, but indeed
adding such dependency on number of multifd threads isn't as good either, I
agree.  I'm ok as long as VFIO reviewers are fine.

> 
> > PS: I'd suggest if you really need those threads it should still be managed
> > by migration framework like the src thread pool.  Sorry I'm pretty stubborn
> > on this, especially after I notice we have query-migrationthreads API just
> > recently.. even if now I'm not sure whether we should remove that API.  I
> > assume that shouldn't need much change, even if necessary.
> 
> I can certainly make these loading threads managed in a thread pool if that's
> easier for you.

Yes, if you want to use separate thread it'll be great to match on the src
thread model with similar pool.  I hope the pool interface you have is
easily applicable on both sides.

Thanks,

-- 
Peter Xu



  reply	other threads:[~2024-10-03 21:17 UTC|newest]

Thread overview: 128+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-27 17:54 [PATCH v2 00/17] Multifd 🔀 device state transfer support with VFIO consumer Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 01/17] vfio/migration: Add save_{iterate, complete_precopy}_started trace events Maciej S. Szmigiero
2024-09-05 13:08   ` [PATCH v2 01/17] vfio/migration: Add save_{iterate,complete_precopy}_started " Avihai Horon
2024-09-09 18:04     ` Maciej S. Szmigiero
2024-09-11 14:50       ` Avihai Horon
2024-08-27 17:54 ` [PATCH v2 02/17] migration/ram: Add load start trace event Maciej S. Szmigiero
2024-08-28 18:44   ` Fabiano Rosas
2024-08-28 20:21     ` Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 03/17] migration/multifd: Zero p->flags before starting filling a packet Maciej S. Szmigiero
2024-08-28 18:50   ` Fabiano Rosas
2024-09-09 15:41   ` Peter Xu
2024-08-27 17:54 ` [PATCH v2 04/17] thread-pool: Add a DestroyNotify parameter to thread_pool_submit{, _aio)() Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 05/17] thread-pool: Implement non-AIO (generic) pool support Maciej S. Szmigiero
2024-09-02 22:07   ` Fabiano Rosas
2024-09-03 12:02     ` Maciej S. Szmigiero
2024-09-03 14:26       ` Fabiano Rosas
2024-09-03 18:14         ` Maciej S. Szmigiero
2024-09-03 13:55   ` Stefan Hajnoczi
2024-09-03 16:54     ` Maciej S. Szmigiero
2024-09-03 19:04       ` Stefan Hajnoczi
2024-09-09 16:45         ` Peter Xu
2024-09-09 18:38           ` Maciej S. Szmigiero
2024-09-09 19:12             ` Peter Xu
2024-09-09 19:16               ` Maciej S. Szmigiero
2024-09-09 19:24                 ` Peter Xu
2024-08-27 17:54 ` [PATCH v2 06/17] migration: Add save_live_complete_precopy_{begin, end} handlers Maciej S. Szmigiero
2024-08-28 19:03   ` [PATCH v2 06/17] migration: Add save_live_complete_precopy_{begin,end} handlers Fabiano Rosas
2024-09-05 13:45   ` Avihai Horon
2024-09-09 17:59     ` Peter Xu
2024-09-09 18:32       ` Maciej S. Szmigiero
2024-09-09 19:08         ` Peter Xu
2024-09-09 19:32           ` Peter Xu
2024-09-19 19:48             ` Maciej S. Szmigiero
2024-09-19 19:47           ` Maciej S. Szmigiero
2024-09-19 20:54             ` Peter Xu
2024-09-20 15:22               ` Maciej S. Szmigiero
2024-09-20 16:08                 ` Peter Xu
2024-09-09 18:05     ` Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 07/17] migration: Add qemu_loadvm_load_state_buffer() and its handler Maciej S. Szmigiero
2024-08-30 19:05   ` Fabiano Rosas
2024-09-05 14:15   ` Avihai Horon
2024-09-09 18:05     ` Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 08/17] migration: Add load_finish handler and associated functions Maciej S. Szmigiero
2024-08-30 19:28   ` Fabiano Rosas
2024-09-05 15:13   ` Avihai Horon
2024-09-09 18:05     ` Maciej S. Szmigiero
2024-09-09 20:03   ` Peter Xu
2024-09-19 19:49     ` Maciej S. Szmigiero
2024-09-19 21:11       ` Peter Xu
2024-09-20 15:23         ` Maciej S. Szmigiero
2024-09-20 16:45           ` Peter Xu
2024-09-26 22:34             ` Maciej S. Szmigiero
2024-09-27  0:53               ` Peter Xu
2024-09-30 19:25                 ` Maciej S. Szmigiero
2024-09-30 21:57                   ` Peter Xu
2024-10-01 20:41                     ` Maciej S. Szmigiero
2024-10-01 21:30                       ` Peter Xu
2024-10-02 20:11                         ` Maciej S. Szmigiero
2024-10-02 21:25                           ` Peter Xu
2024-10-03 20:34                             ` Maciej S. Szmigiero
2024-10-03 21:17                               ` Peter Xu [this message]
2024-08-27 17:54 ` [PATCH v2 09/17] migration/multifd: Device state transfer support - receive side Maciej S. Szmigiero
2024-08-30 20:22   ` Fabiano Rosas
2024-09-02 20:12     ` Maciej S. Szmigiero
2024-09-03 14:42       ` Fabiano Rosas
2024-09-03 18:41         ` Maciej S. Szmigiero
2024-09-09 19:52       ` Peter Xu
2024-09-19 19:49         ` Maciej S. Szmigiero
2024-09-05 16:47   ` Avihai Horon
2024-09-09 18:05     ` Maciej S. Szmigiero
2024-09-12  8:13       ` Avihai Horon
2024-09-12 13:52         ` Fabiano Rosas
2024-09-19 19:59           ` Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 10/17] migration/multifd: Convert multifd_send()::next_channel to atomic Maciej S. Szmigiero
2024-08-30 18:13   ` Fabiano Rosas
2024-09-02 20:11     ` Maciej S. Szmigiero
2024-09-03 15:01       ` Fabiano Rosas
2024-09-03 20:04         ` Maciej S. Szmigiero
2024-09-10 14:13   ` Peter Xu
2024-08-27 17:54 ` [PATCH v2 11/17] migration/multifd: Add an explicit MultiFDSendData destructor Maciej S. Szmigiero
2024-08-30 13:12   ` Fabiano Rosas
2024-08-27 17:54 ` [PATCH v2 12/17] migration/multifd: Device state transfer support - send side Maciej S. Szmigiero
2024-08-29  0:41   ` Fabiano Rosas
2024-08-29 20:03     ` Maciej S. Szmigiero
2024-08-30 13:02       ` Fabiano Rosas
2024-09-09 19:40         ` Peter Xu
2024-09-19 19:50           ` Maciej S. Szmigiero
2024-09-10 19:48     ` Peter Xu
2024-09-12 18:43       ` Fabiano Rosas
2024-09-13  0:23         ` Peter Xu
2024-09-13 13:21           ` Fabiano Rosas
2024-09-13 14:19             ` Peter Xu
2024-09-13 15:04               ` Fabiano Rosas
2024-09-13 15:22                 ` Peter Xu
2024-09-13 18:26                   ` Fabiano Rosas
2024-09-17 15:39                     ` Peter Xu
2024-09-17 17:07                   ` Cédric Le Goater
2024-09-17 17:50                     ` Peter Xu
2024-09-19 19:51                       ` Maciej S. Szmigiero
2024-09-19 19:49       ` Maciej S. Szmigiero
2024-09-19 21:17         ` Peter Xu
2024-09-20 15:23           ` Maciej S. Szmigiero
2024-09-20 17:09             ` Peter Xu
2024-09-10 16:06   ` Peter Xu
2024-09-19 19:49     ` Maciej S. Szmigiero
2024-09-19 21:18       ` Peter Xu
2024-08-27 17:54 ` [PATCH v2 13/17] migration/multifd: Add migration_has_device_state_support() Maciej S. Szmigiero
2024-08-30 18:55   ` Fabiano Rosas
2024-09-02 20:11     ` Maciej S. Szmigiero
2024-09-03 15:09       ` Fabiano Rosas
2024-08-27 17:54 ` [PATCH v2 14/17] migration: Add save_live_complete_precopy_thread handler Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 15/17] vfio/migration: Multifd device state transfer support - receive side Maciej S. Szmigiero
2024-09-09  8:55   ` Avihai Horon
2024-09-09 18:06     ` Maciej S. Szmigiero
2024-09-12  8:20       ` Avihai Horon
2024-09-12  8:45         ` Cédric Le Goater
2024-08-27 17:54 ` [PATCH v2 16/17] vfio/migration: Add x-migration-multifd-transfer VFIO property Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 17/17] vfio/migration: Multifd device state transfer support - send side Maciej S. Szmigiero
2024-09-09 11:41   ` Avihai Horon
2024-09-09 18:07     ` Maciej S. Szmigiero
2024-09-12  8:26       ` Avihai Horon
2024-09-12  8:57         ` Cédric Le Goater
2024-08-28 20:46 ` [PATCH v2 00/17] Multifd 🔀 device state transfer support with VFIO consumer Fabiano Rosas
2024-08-28 21:58   ` Maciej S. Szmigiero
2024-08-29  0:51     ` Fabiano Rosas
2024-08-29 20:02       ` Maciej S. Szmigiero
2024-10-11 13:58 ` Cédric Le Goater
2024-10-15 21:12   ` Maciej S. Szmigiero

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zv8J1uAWIxzCgErh@x1n \
    --to=peterx@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=armbru@redhat.com \
    --cc=avihaih@nvidia.com \
    --cc=berrange@redhat.com \
    --cc=clg@redhat.com \
    --cc=eblake@redhat.com \
    --cc=farosas@suse.de \
    --cc=joao.m.martins@oracle.com \
    --cc=mail@maciej.szmigiero.name \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).