All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Cc: Steven Sistare <steven.sistare@oracle.com>,
	"Chaney, Ben" <bchaney@akamai.com>,
	Jason Wang <jasowang@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Stefano Garzarella <sgarzare@redhat.com>,
	Fabiano Rosas <farosas@suse.de>,
	Hamza Khan <hamza.khan@nutanix.com>,
	qemu-devel@nongnu.org
Subject: Re: [RFC V2 0/8] Live update: tap and vhost
Date: Wed, 10 Sep 2025 12:58:00 -0400	[thread overview]
Message-ID: <aMGuGDnjsfo8wBU-@x1.local> (raw)
In-Reply-To: <b742f0e7-3ee6-4824-9713-b45ba390df1e@yandex-team.ru>

On Wed, Sep 10, 2025 at 12:35:10AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > I wished devices could opt-in to provide its own model so that it is
> > > prepared to boot the QEMU without FDs being there and pause itself at that
> > > stage if a load would happen.
> > 
> > So, you suggest to postpone the initialization up to "start" even for "normal start"
> > of QEMU, to avoid these endless "if (we are in our special local-incoming/CPR mode)".
> > 
> > Actually, that's how normal migratable devices live: we don't have "if (incoming)" for
> > every step of initialization/start currently.
> > 
> > I'll see, could I apply the concept to TAP local migration series.
> 
> 
> Hmm, not so simple.
> 
> OK, my current series behave like this:
> 
> init:  if tap.local_incoming then do nothing else open(/dev/net/tun)
> 
> incoming migration: get fd, and continue initialization
> 
> 
> Assume, we want to avoid extra "if"s, and just postpone the initialization to vm start point, like
> 
> init: do nothing. set fd=-1
> 
> incmoing migration: get fd (if cap-fd-passing enabled)
> 
> start: open(), if fd==-1, continue initialization
> 
> 
> But that mean that we postpone possible errors up to start as well, when we cannot rollback the
> migration..

Yep, doesn't sound like a good idea.  We also don't want to slow down VM
starts.

> 
> 
> Alternatively, we can postpone open() to post-load.. But what for normal start of vm?
> 
> init: if INMIGRATE then do nothing, else open()
> 
> incoming: get fd (if cap-fd-passing)
> 
> post-load: open(), if fd==-1, continue initialization
> 
> start: if fd is still -1, open(), continue initialization
> 
> that avoids extra tap.local_incoming option, but:
> 
> - seems even more complicated
> - open() and some initialization is done in downtime, when we don't enable cap-fd-passing
> 
> 
> So, now I think, that my current approach with additional "local-incoming" per-device option is better.
> 
> What do you think?
> 
> 
> Probably I'm trying to optimize wrong "if". As "if local-incomging .." in generic layer is a lot
> more expensive than checking the options in device code.
> 
> But the idea is generic: for non-fd migration, we do as much initialization at start as possible,

AFAIU, the non-fd migrations works simply because the portion that VMSD
loads will always be over-writeable.  When it's not, a pre_load() or
post_load() would make it work.

> to get early errors and to decrease further downtime. For fd migration, we postpone fd-initialization
> up to post-load stage. So, we have "if"s in device code to handle it, and we have "if"s in generic
> code to support device, which doesn't still have fully initialized backend (no fds during init).

What I meant is, IMHO we should try to not use things like
cpr_is_incoming() too deep into the device stack, and we should use it as
less frequent as possible.

In many cases, IIUC it's because the current device emulation code is not
yet separating the FD installation (and also whatever that can be relevant
to the FD) from the realize() process.  Hence a quick way to make it work
is to add cpr_is_incoming() or similar helpers either to skip some process,
or do something different with an existing FD.

If we can have device emulation be prepared with such, in an ideal world
and just to show what I am thinking.. it could be:

  - realize()
    - realize_frontend()
    - if migration is incoming, and backend should be postponed (for fd
      loading, or maybe something else)?
      - ... realize_backend() postponed until post_load()...
    - else
      - realize_backend()

If all of the devices would support such split of realize() process
v.s. FDs / backends, _maybe_ we can remove all cpr_is_incoming() but move
it upper and upper until qdev code, like:

device_set_realized():
        if (migration_incoming_XXX() && dc->realize_prepare) {
            /*
             * This is only part of realize(), rest done in a separate VMSD
             * post_load().
             */
            dc->realize_prepare(dev, &local_err);
            if (local_err != NULL) {
                goto fail;
            }
        } else if (dc->realize) {
            dc->realize(dev, &local_err);
            if (local_err != NULL) {
                goto fail;
            }
        }

In general, that "whether is incoming fd migration" concept will be passed
down from higher the stack, rather than randomly checked very deep in
stack.  That should IMHO make code more maintenable.

But that's only my two cents.. so please take that with a grain of salt.  I
don't really know device code well to say.

Thanks,

-- 
Peter Xu



  reply	other threads:[~2025-09-10 16:59 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-17 18:39 [RFC V2 0/8] Live update: tap and vhost Steve Sistare
2025-07-17 18:39 ` [RFC V2 1/8] migration: stop vm earlier for cpr Steve Sistare
2025-07-17 18:39 ` [RFC V2 2/8] migration: cpr setup notifier Steve Sistare
2025-07-17 18:39 ` [RFC V2 3/8] vhost: reset vhost devices for cpr Steve Sistare
2025-08-27 11:29   ` Vladimir Sementsov-Ogievskiy
2025-08-27 18:38     ` Steven Sistare
2025-07-17 18:39 ` [RFC V2 4/8] cpr: delete all fds Steve Sistare
2025-07-17 18:39 ` [RFC V2 5/8] Revert "vhost-backend: remove vhost_kernel_reset_device()" Steve Sistare
2025-08-22 18:26   ` Steven Sistare
2025-07-17 18:39 ` [RFC V2 6/8] tap: common return label Steve Sistare
2025-07-17 18:39 ` [RFC V2 7/8] tap: cpr support Steve Sistare
2025-07-17 18:39 ` [RFC V2 8/8] tap: postload fix for cpr Steve Sistare
2025-07-18  8:48 ` [RFC V2 0/8] Live update: tap and vhost Lei Yang
2025-07-18 17:31   ` Steven Sistare
2025-07-24  5:46   ` Lei Yang
2025-08-05 13:54 ` Fabiano Rosas
2025-08-05 19:53   ` Steven Sistare
2025-08-06 15:51     ` Peter Xu
2025-08-11 18:24     ` Steven Sistare
2025-08-23 21:53 ` Vladimir Sementsov-Ogievskiy
2025-08-28 15:48   ` Steven Sistare
2025-08-29 19:37     ` Steven Sistare
2025-09-01 11:44       ` Vladimir Sementsov-Ogievskiy
2025-09-02 15:33         ` Steven Sistare
2025-09-02 17:09           ` Vladimir Sementsov-Ogievskiy
2025-09-05 16:16             ` Peter Xu
2025-09-08  9:55               ` Vladimir Sementsov-Ogievskiy
2025-09-09 21:35                 ` Vladimir Sementsov-Ogievskiy
2025-09-10 16:58                   ` Peter Xu [this message]
2025-09-10 17:27                     ` Vladimir Sementsov-Ogievskiy
2025-10-30 16:52 ` Ben Chaney
2025-10-30 17:07   ` Peter Xu
2025-10-30 18:24     ` Chaney, Ben
  -- strict thread matches above, loose matches on Subject: below --
2025-08-18 15:04 Chaney, Ben
2025-08-22 18:26 ` Steven Sistare

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aMGuGDnjsfo8wBU-@x1.local \
    --to=peterx@redhat.com \
    --cc=bchaney@akamai.com \
    --cc=farosas@suse.de \
    --cc=hamza.khan@nutanix.com \
    --cc=jasowang@redhat.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=sgarzare@redhat.com \
    --cc=steven.sistare@oracle.com \
    --cc=vsementsov@yandex-team.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.