qemu-devel.nongnu.org archive mirror
From: Peter Xu <peterx@redhat.com>
To: "Cédric Le Goater" <clg@redhat.com>
Cc: Avihai Horon <avihaih@nvidia.com>,
	qemu-devel@nongnu.org,
	Alex Williamson <alex.williamson@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	Leonardo Bras <leobras@redhat.com>,
	Yanghang Liu <yanghliu@redhat.com>
Subject: Re: [PATCH 5/6] vfio/migration: Block VFIO migration with postcopy migration
Date: Wed, 30 Aug 2023 10:22:07 -0400
Message-ID: <ZO9Qj/tbHqZ/h34z@x1n>
In-Reply-To: <95a37158-3ded-3930-ebf9-e33df4416cec@redhat.com>

On Wed, Aug 30, 2023 at 01:17:55PM +0200, Cédric Le Goater wrote:
> On 8/30/23 12:12, Avihai Horon wrote:
> > 
> > On 30/08/2023 12:53, Cédric Le Goater wrote:
> > > On 8/30/23 11:21, Avihai Horon wrote:
> > > > 
> > > > On 30/08/2023 11:37, Cédric Le Goater wrote:
> > > > > On 8/30/23 09:01, Avihai Horon wrote:
> > > > > > 
> > > > > > On 29/08/2023 21:27, Peter Xu wrote:
> > > > > > > On Tue, Aug 29, 2023 at 07:20:47PM +0300, Avihai Horon wrote:
> > > > > > > > On 29/08/2023 17:53, Peter Xu wrote:
> > > > > > > > > On Mon, Aug 28, 2023 at 06:18:41PM +0300, Avihai Horon wrote:
> > > > > > > > > > diff --git a/migration/options.c b/migration/options.c
> > > > > > > > > > index 1d1e1321b0..e201053563 100644
> > > > > > > > > > --- a/migration/options.c
> > > > > > > > > > +++ b/migration/options.c
> > > > > > > > > > @@ -499,6 +499,11 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp)
> > > > > > > > > >                error_setg(errp, "Postcopy is not yet compatible with multifd");
> > > > > > > > > >                return false;
> > > > > > > > > >            }
> > > > > > > > > > +
> > > > > > > > > > +        if (migration_vfio_mig_active()) {
> > > > > > > > > > +            error_setg(errp, "Postcopy is not compatible with VFIO migration");
> > > > > > > > > > +            return false;
> > > > > > > > > > +        }
> > > > > > > > > Hmm.. this will add yet another vfio hard-coded line into migration/..
> > > > > > > > > 
> > > > > > > > > What will happen if the vfio device is hot plugged after enabling
> > > > > > > > > postcopy-ram here?
> > > > > > > > In that case a migration blocker will be added.
> > > > > > > > 
> > > > > > > > > Is it possible to do it in a generic way?
> > > > > > > > What comes to my mind is to let devices register a handler for a "caps
> > > > > > > > change" notification and allow them to object.
> > > > > > > > But maybe that's a bit of an overkill.
> > > > > > > This one also sounds better than hard-codes to me.
> > > > > > > 
> > > > > > > > > > I was thinking the only unified place to do such a check is when migration
> > > > > > > > > > starts: once we switch to SETUP, all caps are locked and no change is
> > > > > > > > > > allowed until it finishes or fails.
> > > > > > > > > 
> > > > > > > > > So, can we do this check inside vfio_save_setup(), allow vfio_save_setup()
> > > > > > > > > to fail the whole migration early?  For example, maybe we should have an
> > > > > > > > > Error** passed in, then if it fails it calls migrate_set_error, so
> > > > > > > > > reflected in query-migrate later too.
> > > > > > > > Yes, I think this could work and it will simplify things because we could
> > > > > > > > also drop the VFIO migration blockers code.
> > > > > > > > The downside is that the user will know migration is blocked only when he
> > > > > > > > tries to migrate, and migrate_caps_check() will not block setting postcopy
> > > > > > > > when a VFIO device is already attached.
> > > > > > > > I don't have a strong opinion here, so if it's fine by you and everyone
> > > > > > > > else, I could change that to what you suggested.
> > > > > > > Failing later would be fine in this case to me; my expectation is VFIO
> > > > > > > users should be advanced already anyway (as the whole solution is still
> > > > > > > > pretty involved compared to a generic VM migration) and shouldn't try to
> > > > > > > trigger that at all in real life.  IOW I'd expect this check will be there
> > > > > > > just for sanity, rather than being relied on to let people be aware of it
> > > > > > > by the error message.
> > > > > > 
> > > > > > Yes, I agree with you.
> > > > > > 
> > > > > > > 
> > > > > > > Meanwhile the blocker + caps check is slightly complicated to me to guard
> > > > > > > both sides.  So I'd vote for failing at the QMP command. But we can wait
> > > > > > > and see whether there's other votes.
> > > > > > 
> > > > > > Sure.
> > > > > > So I will do the checking in vfio_save_setup(), unless someone else has a better idea.
> > > > > 
> > > > > Just to recap for my understanding,
> > > > > 
> > > > > vfio_save_setup() would test migrate_postcopy_ram() and update a new
> > > > > 'Error *err' parameter of the .save_setup() op which would be taken
> > > > > into account in qemu_savevm_state_setup(). Is that correct ?
> > > > > 
> > > > Yes.
> > > > But I wonder if it would be simpler to call migrate_set_error() directly from vfio_save_setup() instead of adding "Error *err" argument to .save_setup() and changing all other users.
> > > > What do you prefer?
> > > 
> > > Well, with my downstreamer hat, I would prefer a simpler solution for the
> > > VFIO postcopy limitation first. That said, there is value in adding
> > > a 'Error *' parameter to the .save_setup() op and letting the top routine
> > > qemu_savevm_state_setup() propagate it. Other SaveVMHandlers could start
> > > using it. Even VFIO has multiple error_report() calls in vfio_save_setup()
> > > which could be propagated to the top callers.
> > > 
> > > Let's try that first. I will check your new series on top of 8.0
> > > 
> > OK, so just making sure, you want to add "Error *err" argument to .save_setup() first and see how it goes, right?
> 
> Yes. Sorry, that was not clear.

I just remembered one downside of failing in save_setup(): it won't fail the
QMP "migrate" command itself; the failure is only reflected in query-migrate
later.

If we want to make it even better (no strong opinion here as of now), we can
make the check separate from save_setup(), e.g. a new
SaveVMHandlers.save_prepare() hook, so that it can be called as early as
migrate_prepare() and fail the QMP command with a proper error.
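
To illustrate the save_prepare() idea, here is a minimal standalone sketch.
It is not QEMU code: Error, SaveVMHandlers, Device, MigState,
vfio_like_save_prepare(), mock_migrate_prepare() and mock_qmp_migrate() are
all simplified stand-ins made up for this example, and the hook signature is
only an assumption about what such a callback might look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for QEMU's Error object; the real type is richer. */
typedef struct Error {
    const char *msg;
} Error;

/* Hypothetical per-device hooks, loosely modelled on SaveVMHandlers. */
typedef struct SaveVMHandlers {
    /* Proposed hook: runs before the migration is actually started. */
    int (*save_prepare)(void *opaque, Error **errp);
} SaveVMHandlers;

typedef struct Device {
    const char *name;
    const SaveVMHandlers *ops;
    void *opaque;
} Device;

/* Mock migration state: only the capability that matters here. */
typedef struct MigState {
    bool postcopy_ram;
} MigState;

/* A VFIO-like device objects when postcopy-ram is enabled. */
static int vfio_like_save_prepare(void *opaque, Error **errp)
{
    static Error err = { "Postcopy is not compatible with VFIO migration" };
    const MigState *s = opaque;

    assert(errp != NULL);
    if (s->postcopy_ram) {
        *errp = &err;
        return -1;
    }
    return 0;
}

static const SaveVMHandlers vfio_like_ops = {
    .save_prepare = vfio_like_save_prepare,
};

/* Mock prepare step: ask every device, stop at the first objection. */
static bool mock_migrate_prepare(const Device *devs, size_t n, Error **errp)
{
    for (size_t i = 0; i < n; i++) {
        const SaveVMHandlers *ops = devs[i].ops;

        if (ops && ops->save_prepare &&
            ops->save_prepare(devs[i].opaque, errp) < 0) {
            return false;
        }
    }
    return true;
}

/* Mock "migrate" command: fails synchronously if prepare fails. */
static bool mock_qmp_migrate(const Device *devs, size_t n, Error **errp)
{
    if (!mock_migrate_prepare(devs, n, errp)) {
        return false; /* the caller of the command sees the error directly */
    }
    /* ... the migration would be started here (SETUP etc.) ... */
    return true;
}
```

With a device using vfio_like_ops and postcopy_ram set, mock_qmp_migrate()
returns false and fills in the error before anything is started, so the
command itself fails instead of the error showing up only in a later query.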

Thanks,

-- 
Peter Xu



Thread overview: 25+ messages
2023-08-28 15:18 [PATCH 0/6] vfio/migration: Block VFIO migration with postcopy and background snapshot Avihai Horon
2023-08-28 15:18 ` [PATCH 1/6] migration: Add migration prefix to functions in target.c Avihai Horon
2023-08-29 13:23   ` Cédric Le Goater
2023-08-29 14:04   ` Peter Xu
2023-08-29 15:59     ` Avihai Horon
2023-08-28 15:18 ` [PATCH 2/6] vfio/migration: Fail adding device with enable-migration=on and existing blocker Avihai Horon
2023-08-29 13:23   ` Cédric Le Goater
2023-08-28 15:18 ` [PATCH 3/6] vfio/migration: Add vfio_migratable_devices_num() Avihai Horon
2023-08-29 13:24   ` Cédric Le Goater
2023-08-28 15:18 ` [PATCH 4/6] vfio/migration: Change vfio_mig_active() semantics Avihai Horon
2023-08-28 15:18 ` [PATCH 5/6] vfio/migration: Block VFIO migration with postcopy migration Avihai Horon
2023-08-29 13:24   ` Cédric Le Goater
2023-08-29 15:52     ` Avihai Horon
2023-08-29 14:53   ` Peter Xu
2023-08-29 16:20     ` Avihai Horon
2023-08-29 18:27       ` Peter Xu
2023-08-30  7:01         ` Avihai Horon
2023-08-30  8:37           ` Cédric Le Goater
2023-08-30  9:21             ` Avihai Horon
2023-08-30  9:53               ` Cédric Le Goater
2023-08-30 10:12                 ` Avihai Horon
2023-08-30 11:17                   ` Cédric Le Goater
2023-08-30 14:22                     ` Peter Xu [this message]
2023-08-30 16:06                       ` Avihai Horon
2023-08-28 15:18 ` [PATCH 6/6] vfio/migration: Block VFIO migration with background snapshot Avihai Horon
