From: Peter Xu <peterx@redhat.com>
To: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>
Cc: "Fabiano Rosas" <farosas@suse.de>,
"Alex Williamson" <alex.williamson@redhat.com>,
"Cédric Le Goater" <clg@redhat.com>,
"Eric Blake" <eblake@redhat.com>,
"Markus Armbruster" <armbru@redhat.com>,
"Daniel P. Berrangé" <berrange@redhat.com>,
"Avihai Horon" <avihaih@nvidia.com>,
"Joao Martins" <joao.m.martins@oracle.com>,
qemu-devel@nongnu.org
Subject: Re: [PATCH v2 12/17] migration/multifd: Device state transfer support - send side
Date: Fri, 20 Sep 2024 13:09:49 -0400 [thread overview]
Message-ID: <Zu2sXYUmsrxUndtd@x1n> (raw)
In-Reply-To: <dbb9ce96-100b-4a95-8ce1-f0b8b5041046@maciej.szmigiero.name>
On Fri, Sep 20, 2024 at 05:23:20PM +0200, Maciej S. Szmigiero wrote:
> On 19.09.2024 23:17, Peter Xu wrote:
> > On Thu, Sep 19, 2024 at 09:49:43PM +0200, Maciej S. Szmigiero wrote:
> > > On 10.09.2024 21:48, Peter Xu wrote:
> > > > On Wed, Aug 28, 2024 at 09:41:17PM -0300, Fabiano Rosas wrote:
> > > > > > +size_t multifd_device_state_payload_size(void)
> > > > > > +{
> > > > > > + return sizeof(MultiFDDeviceState_t);
> > > > > > +}
> > > > >
> > > > > This will not be necessary because the payload size is the same as the
> > > > > data type. We only need it for the special case where the MultiFDPages_t
> > > > > is smaller than the total ram payload size.
> > > >
> > > > Today I was thinking maybe we should really clean this up, as the current
> > > > multifd_send_data_alloc() is indeed too tricky (blame me.. who requested
> > > > that more or less). Knowing that VFIO can use dynamic buffers with ->idstr
> > > > and ->buf (I was thinking it could be buf[1M].. but I was wrong...) made
> > > > that feeling stronger.
> > > >
> > > > I think we should change it now perhaps, otherwise we'll need to introduce
> > > > other helpers to e.g. reset the device buffers, and that's not only slow
> > > > but also not good looking, IMO.
> > > >
> > > > So I went ahead with the idea in previous discussion, that I managed to
> > > > change the SendData union into struct; the memory consumption is not super
> > > > important yet, IMHO, but we should still stick with the object model where
> > > > multifd enqueue thread switch buffer with multifd, as it still sounds a
> > > > sane way to do.
> > > >
> > > > Then when that patch is ready, I further tried to make VFIO reuse multifd
> > > > buffers just like what we do with MultiFDPages_t->offset[]: in RAM code we
> > > > don't allocate it every time we enqueue.
> > > >
> > > > I hope it'll also work for VFIO. VFIO has a specialty on being able to
> > > > dump the config space so it's more complex (and I noticed Maciej's current
> > > > design requires the final chunk of VFIO config data be migrated in one
> > > > packet.. that is also part of the complexity there). So I allowed that
> > > > part to allocate a buffer but only that. IOW, I made some API (see below)
> > > > that can either reuse preallocated buffer, or use a separate one only for
> > > > the final bulk.
> > > >
> > > > In short, could both of you have a look at what I came up with below? I
> > > > did that in patches because I think it's too much to comment, so patches
> > > > may work better. No concern if any of below could be good changes to you,
> > > > then either Maciej can squash whatever into existing patches (and I feel
> > > > like some existing patches in this series can go away with below design),
> > > > or I can post pre-requisite patch but only if any of you prefer that.
> > > >
> > > > Anyway, let me know, the patches apply on top of this whole series applied
> > > > first.
> > > >
> > > > I also wonder whether there can be any perf difference already (I tested
> > > > all multifd qtest with below, but no VFIO I can run), perhaps not that
> > > > much, but just to mention below should avoid both buffer allocations and
> > > > one round of copy (so VFIO read() directly writes to the multifd buffers
> > > > now).
> > >
> > > I am not against making MultiFDSendData a struct and maybe introducing
> > > some pre-allocated buffer.
> > >
> > > But to be honest, that manual memory management with having to remember
> > > to call multifd_device_state_finish() on error paths as in your
> > > proposed patch 3 really invites memory leaks.
> > >
> > > Will think about some other way to have a reusable buffer.
> >
> > Sure. That's patch 3, and I suppose then it looks like patch 1 is still
> > OK in one way or another.
> >
> > >
> > > In terms of not making idstr copy (your proposed patch 2) I am not
> > > 100% sure that avoiding such tiny allocation really justifies the risk
> > > of possible use-after-free of a dangling pointer.
> >
> > Why there's risk? Someone strdup() on the stack? That only goes via VFIO
> > itself, so I thought it wasn't that complicated. But yeah as I said this
> > part (patch 2) is optional.
>
> I mean the risk here is somebody providing idstr that somehow gets free'd
> or overwritten before the device state buffer gets sent.
>
> With a static idstr that's obviously not an issue, but I see that, for example,
> vmstate_register_with_alias_id() generates idstr dynamically and this API
> is used by all qdevs that have a VMSD (in device_set_realized()).
>
> > > Not 100% against it either if you are confident that it will never happen.
> > >
> > > By the way, I guess it makes sense to carry these changes in the main patch
> > > set rather than as a separate changes?
> >
> > Whatever you prefer.
> >
> > I wrote those patches only because I thought maybe you'd like to run some
> > perf test to see whether they would help at all, and when the patches are
> > there it'll be much easier for you, then you can decide whether it's worth
> > intergrating already, or leave that for later.
> >
> > If not I'd say they're even lower priority, so feel free to stick with
> > whatever easier for you. I'm ok there.
> >
> > However it'll be always good we can still have patch 1 as I mentioned
> > before (as part of your series, if you won't disagree), to make the
> > SendData interface slightly cleaner and easier to follow.
> >
>
> Will try to include these patches in my patch set if they don't cause any
> downtime regressions.
Thanks, note that it's not my request to have patch 2-3. :) Please take
your own judgement.
Again, I'd run a round of perf (only if my patches can directly run through
without heavy debugging on top of yours), if nothing shows a benefit I'd go
with what you have right now, and we drop patch 2-3 - they're not justified
if they provide zero perf benefit.
--
Peter Xu
next prev parent reply other threads:[~2024-09-20 17:11 UTC|newest]
Thread overview: 128+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-27 17:54 [PATCH v2 00/17] Multifd 🔀 device state transfer support with VFIO consumer Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 01/17] vfio/migration: Add save_{iterate, complete_precopy}_started trace events Maciej S. Szmigiero
2024-09-05 13:08 ` [PATCH v2 01/17] vfio/migration: Add save_{iterate,complete_precopy}_started " Avihai Horon
2024-09-09 18:04 ` Maciej S. Szmigiero
2024-09-11 14:50 ` Avihai Horon
2024-08-27 17:54 ` [PATCH v2 02/17] migration/ram: Add load start trace event Maciej S. Szmigiero
2024-08-28 18:44 ` Fabiano Rosas
2024-08-28 20:21 ` Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 03/17] migration/multifd: Zero p->flags before starting filling a packet Maciej S. Szmigiero
2024-08-28 18:50 ` Fabiano Rosas
2024-09-09 15:41 ` Peter Xu
2024-08-27 17:54 ` [PATCH v2 04/17] thread-pool: Add a DestroyNotify parameter to thread_pool_submit{, _aio)() Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 05/17] thread-pool: Implement non-AIO (generic) pool support Maciej S. Szmigiero
2024-09-02 22:07 ` Fabiano Rosas
2024-09-03 12:02 ` Maciej S. Szmigiero
2024-09-03 14:26 ` Fabiano Rosas
2024-09-03 18:14 ` Maciej S. Szmigiero
2024-09-03 13:55 ` Stefan Hajnoczi
2024-09-03 16:54 ` Maciej S. Szmigiero
2024-09-03 19:04 ` Stefan Hajnoczi
2024-09-09 16:45 ` Peter Xu
2024-09-09 18:38 ` Maciej S. Szmigiero
2024-09-09 19:12 ` Peter Xu
2024-09-09 19:16 ` Maciej S. Szmigiero
2024-09-09 19:24 ` Peter Xu
2024-08-27 17:54 ` [PATCH v2 06/17] migration: Add save_live_complete_precopy_{begin, end} handlers Maciej S. Szmigiero
2024-08-28 19:03 ` [PATCH v2 06/17] migration: Add save_live_complete_precopy_{begin,end} handlers Fabiano Rosas
2024-09-05 13:45 ` Avihai Horon
2024-09-09 17:59 ` Peter Xu
2024-09-09 18:32 ` Maciej S. Szmigiero
2024-09-09 19:08 ` Peter Xu
2024-09-09 19:32 ` Peter Xu
2024-09-19 19:48 ` Maciej S. Szmigiero
2024-09-19 19:47 ` Maciej S. Szmigiero
2024-09-19 20:54 ` Peter Xu
2024-09-20 15:22 ` Maciej S. Szmigiero
2024-09-20 16:08 ` Peter Xu
2024-09-09 18:05 ` Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 07/17] migration: Add qemu_loadvm_load_state_buffer() and its handler Maciej S. Szmigiero
2024-08-30 19:05 ` Fabiano Rosas
2024-09-05 14:15 ` Avihai Horon
2024-09-09 18:05 ` Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 08/17] migration: Add load_finish handler and associated functions Maciej S. Szmigiero
2024-08-30 19:28 ` Fabiano Rosas
2024-09-05 15:13 ` Avihai Horon
2024-09-09 18:05 ` Maciej S. Szmigiero
2024-09-09 20:03 ` Peter Xu
2024-09-19 19:49 ` Maciej S. Szmigiero
2024-09-19 21:11 ` Peter Xu
2024-09-20 15:23 ` Maciej S. Szmigiero
2024-09-20 16:45 ` Peter Xu
2024-09-26 22:34 ` Maciej S. Szmigiero
2024-09-27 0:53 ` Peter Xu
2024-09-30 19:25 ` Maciej S. Szmigiero
2024-09-30 21:57 ` Peter Xu
2024-10-01 20:41 ` Maciej S. Szmigiero
2024-10-01 21:30 ` Peter Xu
2024-10-02 20:11 ` Maciej S. Szmigiero
2024-10-02 21:25 ` Peter Xu
2024-10-03 20:34 ` Maciej S. Szmigiero
2024-10-03 21:17 ` Peter Xu
2024-08-27 17:54 ` [PATCH v2 09/17] migration/multifd: Device state transfer support - receive side Maciej S. Szmigiero
2024-08-30 20:22 ` Fabiano Rosas
2024-09-02 20:12 ` Maciej S. Szmigiero
2024-09-03 14:42 ` Fabiano Rosas
2024-09-03 18:41 ` Maciej S. Szmigiero
2024-09-09 19:52 ` Peter Xu
2024-09-19 19:49 ` Maciej S. Szmigiero
2024-09-05 16:47 ` Avihai Horon
2024-09-09 18:05 ` Maciej S. Szmigiero
2024-09-12 8:13 ` Avihai Horon
2024-09-12 13:52 ` Fabiano Rosas
2024-09-19 19:59 ` Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 10/17] migration/multifd: Convert multifd_send()::next_channel to atomic Maciej S. Szmigiero
2024-08-30 18:13 ` Fabiano Rosas
2024-09-02 20:11 ` Maciej S. Szmigiero
2024-09-03 15:01 ` Fabiano Rosas
2024-09-03 20:04 ` Maciej S. Szmigiero
2024-09-10 14:13 ` Peter Xu
2024-08-27 17:54 ` [PATCH v2 11/17] migration/multifd: Add an explicit MultiFDSendData destructor Maciej S. Szmigiero
2024-08-30 13:12 ` Fabiano Rosas
2024-08-27 17:54 ` [PATCH v2 12/17] migration/multifd: Device state transfer support - send side Maciej S. Szmigiero
2024-08-29 0:41 ` Fabiano Rosas
2024-08-29 20:03 ` Maciej S. Szmigiero
2024-08-30 13:02 ` Fabiano Rosas
2024-09-09 19:40 ` Peter Xu
2024-09-19 19:50 ` Maciej S. Szmigiero
2024-09-10 19:48 ` Peter Xu
2024-09-12 18:43 ` Fabiano Rosas
2024-09-13 0:23 ` Peter Xu
2024-09-13 13:21 ` Fabiano Rosas
2024-09-13 14:19 ` Peter Xu
2024-09-13 15:04 ` Fabiano Rosas
2024-09-13 15:22 ` Peter Xu
2024-09-13 18:26 ` Fabiano Rosas
2024-09-17 15:39 ` Peter Xu
2024-09-17 17:07 ` Cédric Le Goater
2024-09-17 17:50 ` Peter Xu
2024-09-19 19:51 ` Maciej S. Szmigiero
2024-09-19 19:49 ` Maciej S. Szmigiero
2024-09-19 21:17 ` Peter Xu
2024-09-20 15:23 ` Maciej S. Szmigiero
2024-09-20 17:09 ` Peter Xu [this message]
2024-09-10 16:06 ` Peter Xu
2024-09-19 19:49 ` Maciej S. Szmigiero
2024-09-19 21:18 ` Peter Xu
2024-08-27 17:54 ` [PATCH v2 13/17] migration/multifd: Add migration_has_device_state_support() Maciej S. Szmigiero
2024-08-30 18:55 ` Fabiano Rosas
2024-09-02 20:11 ` Maciej S. Szmigiero
2024-09-03 15:09 ` Fabiano Rosas
2024-08-27 17:54 ` [PATCH v2 14/17] migration: Add save_live_complete_precopy_thread handler Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 15/17] vfio/migration: Multifd device state transfer support - receive side Maciej S. Szmigiero
2024-09-09 8:55 ` Avihai Horon
2024-09-09 18:06 ` Maciej S. Szmigiero
2024-09-12 8:20 ` Avihai Horon
2024-09-12 8:45 ` Cédric Le Goater
2024-08-27 17:54 ` [PATCH v2 16/17] vfio/migration: Add x-migration-multifd-transfer VFIO property Maciej S. Szmigiero
2024-08-27 17:54 ` [PATCH v2 17/17] vfio/migration: Multifd device state transfer support - send side Maciej S. Szmigiero
2024-09-09 11:41 ` Avihai Horon
2024-09-09 18:07 ` Maciej S. Szmigiero
2024-09-12 8:26 ` Avihai Horon
2024-09-12 8:57 ` Cédric Le Goater
2024-08-28 20:46 ` [PATCH v2 00/17] Multifd 🔀 device state transfer support with VFIO consumer Fabiano Rosas
2024-08-28 21:58 ` Maciej S. Szmigiero
2024-08-29 0:51 ` Fabiano Rosas
2024-08-29 20:02 ` Maciej S. Szmigiero
2024-10-11 13:58 ` Cédric Le Goater
2024-10-15 21:12 ` Maciej S. Szmigiero
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zu2sXYUmsrxUndtd@x1n \
--to=peterx@redhat.com \
--cc=alex.williamson@redhat.com \
--cc=armbru@redhat.com \
--cc=avihaih@nvidia.com \
--cc=berrange@redhat.com \
--cc=clg@redhat.com \
--cc=eblake@redhat.com \
--cc=farosas@suse.de \
--cc=joao.m.martins@oracle.com \
--cc=mail@maciej.szmigiero.name \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).