Re: [PATCH] vhost-user-fs: add capability to allow migration

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Anton Kuchin <antonkuchin@yandex-team.ru>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	qemu-devel@nongnu.org, virtio-fs@redhat.com,
	Markus Armbruster <armbru@redhat.com>,
	Eric Blake <eblake@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	yc-core@yandex-team.ru
Subject: Re: [PATCH] vhost-user-fs: add capability to allow migration
Date: Thu, 19 Jan 2023 19:00:40 +0000	[thread overview]
Message-ID: <Y8mTWMVb9d2frxcx@work-vm> (raw)
In-Reply-To: <f9993404-f8b8-7a23-37f8-530313783466@yandex-team.ru>

* Anton Kuchin (antonkuchin@yandex-team.ru) wrote:
> On 19/01/2023 14:51, Michael S. Tsirkin wrote:
> > On Sun, Jan 15, 2023 at 07:09:03PM +0200, Anton Kuchin wrote:
> > > Now any vhost-user-fs device makes VM unmigratable, that also prevents
> > > qemu update without stopping the VM. In most cases that makes sense
> > > because qemu has no way to transfer FUSE session state.
> > > 
> > > But we can give an option to orchestrator to override this if it can
> > > guarantee that state will be preserved (e.g. it uses migration to
> > > update qemu and dst will run on the same host as src and use the same
> > > socket endpoints).
> > > 
> > > This patch keeps default behavior that prevents migration with such devices
> > > but adds migration capability 'vhost-user-fs' to explicitly allow migration.
> > > 
> > > Signed-off-by: Anton Kuchin <antonkuchin@yandex-team.ru>
> > > ---
> > >   hw/virtio/vhost-user-fs.c | 25 ++++++++++++++++++++++++-
> > >   qapi/migration.json       |  7 ++++++-
> > >   2 files changed, 30 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
> > > index f5049735ac..13d920423e 100644
> > > --- a/hw/virtio/vhost-user-fs.c
> > > +++ b/hw/virtio/vhost-user-fs.c
> > > @@ -24,6 +24,7 @@
> > >   #include "hw/virtio/vhost-user-fs.h"
> > >   #include "monitor/monitor.h"
> > >   #include "sysemu/sysemu.h"
> > > +#include "migration/migration.h"
> > >   static const int user_feature_bits[] = {
> > >       VIRTIO_F_VERSION_1,
> > > @@ -298,9 +299,31 @@ static struct vhost_dev *vuf_get_vhost(VirtIODevice *vdev)
> > >       return &fs->vhost_dev;
> > >   }
> > > +static int vhost_user_fs_pre_save(void *opaque)
> > > +{
> > > +    MigrationState *s = migrate_get_current();
> > > +
> > > +    if (!s->enabled_capabilities[MIGRATION_CAPABILITY_VHOST_USER_FS]) {
> > > +        error_report("Migration of vhost-user-fs devices requires internal FUSE "
> > > +                     "state of backend to be preserved. If orchestrator can "
> > > +                     "guarantee this (e.g. dst connects to the same backend "
> > > +                     "instance or backend state is migrated) set 'vhost-user-fs' "
> > > +                     "migration capability to true to enable migration.");
> > Isn't it possible that some backends are same and some are not?
> > Shouldn't this be a device property then?
> If some are not the same it is not guaranteed that correct FUSE
> state is present there, so orchestrator shouldn't set the capability
> because this can result in destination devices being broken (they'll
> be fine after the remount in guest, but this is guest visible and is
> not acceptable).

Shouldn't this be something that's negotiated as a feature bit in the
vhost-user protocol - i.e. to say whether the device can do migration?

However, I think what you're saying is that a migration might only be
doable in some cases - e.g. a local migration - and that's hard for
either qemu or the daemon to discover by itself; so it's right that
an orchestrator enables it.
(I'm not sure the error_report message needs to be quite so verbose -
normally a one line clear message is enough that people can then look
up).

> I can imagine smart orchestrator and backend that can transfer
> internal FUSE state, but we are not there yet, and this would be
> their responsibility then to ensure endpoint compatibility between src
> and dst and set the capability (that's why I put "e.g." and "or" in
> the error description).

You also need the vhost-user device to be able to do the dirty bitmap
updates; is that being checked somewhere?

Dave

> > 
> > 
> > 
> > > +        return -1;
> > > +    }
> > > +
> > > +    return 0;
> > > +}
> > > +
> > >   static const VMStateDescription vuf_vmstate = {
> > >       .name = "vhost-user-fs",
> > > -    .unmigratable = 1,
> > > +    .minimum_version_id = 0,
> > > +    .version_id = 0,
> > > +    .fields = (VMStateField[]) {
> > > +        VMSTATE_VIRTIO_DEVICE,
> > > +        VMSTATE_END_OF_LIST()
> > > +    },
> > > +   .pre_save = vhost_user_fs_pre_save,
> > >   };
> > >   static Property vuf_properties[] = {
> > > diff --git a/qapi/migration.json b/qapi/migration.json
> > > index 88ecf86ac8..9a229ea884 100644
> > > --- a/qapi/migration.json
> > > +++ b/qapi/migration.json
> > > @@ -477,6 +477,11 @@
> > >   #                    will be handled faster.  This is a performance feature and
> > >   #                    should not affect the correctness of postcopy migration.
> > >   #                    (since 7.1)
> > > +# @vhost-user-fs: If enabled, the migration process will allow migration of
> > > +#                 vhost-user-fs devices, this should be enabled only when
> > > +#                 backend can preserve local FUSE state e.g. for qemu update
> > > +#                 when dst reconects to the same endpoints after migration.
> > > +#                 (since 8.0)
> > >   #
> > >   # Features:
> > >   # @unstable: Members @x-colo and @x-ignore-shared are experimental.
> > > @@ -492,7 +497,7 @@
> > >              'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate',
> > >              { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
> > >              'validate-uuid', 'background-snapshot',
> > > -           'zero-copy-send', 'postcopy-preempt'] }
> > > +           'zero-copy-send', 'postcopy-preempt', 'vhost-user-fs'] }
> > I kind of dislike that it's such a specific flag. Is only vhost-user-fs
> > ever going to be affected? Any way to put it in a way that is more generic?
> Here I agree with you: I would prefer less narrow naming too. But I
> didn't manage to come up with one. Looks like many other vhost-user
> devices could benefit from this so maybe "vhost-user-stateless" or
> something like this would be better.
> I'm not sure that other types of devices could handle reconnect to
> the old endpoint as easy as vhost-user-fs, but anyway the support for
> this flag needs to be implemented for each device individually.
> What do you think? Any ideas would be appreciated.
> 
> > 
> > 
> > >   ##
> > >   # @MigrationCapabilityStatus:
> > > -- 
> > > 2.34.1
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

next prev parent reply	other threads:[~2023-01-19 19:01 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-15 17:09 [PATCH] vhost-user-fs: add capability to allow migration Anton Kuchin
2023-01-18 15:52 ` Stefan Hajnoczi
2023-01-19 12:43   ` Anton Kuchin
2023-01-19 14:30     ` Stefan Hajnoczi
2023-01-19 15:29       ` Anton Kuchin
2023-01-19 16:02         ` Stefan Hajnoczi
2023-01-19 16:58           ` Anton Kuchin
2023-01-19 20:40             ` Stefan Hajnoczi
2023-02-01 14:26             ` Juan Quintela
2023-02-02  0:54               ` Anton Kuchin
2023-02-02  9:59                 ` Juan Quintela
2023-02-10 14:09                   ` Anton Kuchin
2023-02-10 16:08                     ` Juan Quintela
2023-02-16 21:00                       ` Stefan Hajnoczi
2023-01-19 12:51 ` Michael S. Tsirkin
2023-01-19 13:45   ` Anton Kuchin
2023-01-19 19:00     ` Dr. David Alan Gilbert [this message]
2023-01-19 20:47       ` Anton Kuchin
2023-01-20 13:58     ` Michael S. Tsirkin
2023-01-20 17:37       ` Anton Kuchin
2023-01-22  8:16         ` Michael S. Tsirkin
2023-01-22 12:36           ` Anton Kuchin
2023-01-22 14:46             ` Michael S. Tsirkin
2023-01-22 16:09               ` Anton Kuchin
2023-01-22 16:17                 ` Michael S. Tsirkin
2023-01-23 14:09                   ` Stefan Hajnoczi
2023-01-23 15:52                     ` Anton Kuchin
2023-01-23 19:49                       ` Stefan Hajnoczi
2023-01-23 21:00                         ` Michael S. Tsirkin
2023-01-23 21:56                           ` Stefan Hajnoczi
2023-01-23 18:27                   ` Dr. David Alan Gilbert
2023-01-23 19:53                     ` Stefan Hajnoczi
2023-01-24  1:46                       ` Stefan Hajnoczi
2023-01-24  9:50                         ` Dr. David Alan Gilbert
2023-01-24 12:48                           ` Stefan Hajnoczi
2023-02-01 14:37                             ` Juan Quintela
2023-01-25 19:46 ` Stefan Hajnoczi
2023-01-26 14:20   ` Anton Kuchin
2023-01-26 15:13     ` Stefan Hajnoczi
2023-01-26 15:21       ` Anton Kuchin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y8mTWMVb9d2frxcx@work-vm \
    --to=dgilbert@redhat.com \
    --cc=antonkuchin@yandex-team.ru \
    --cc=armbru@redhat.com \
    --cc=eblake@redhat.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=virtio-fs@redhat.com \
    --cc=yc-core@yandex-team.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).