All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Anton Kuchin <antonkuchin@yandex-team.ru>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	qemu-devel@nongnu.org, virtio-fs@redhat.com,
	Markus Armbruster <armbru@redhat.com>,
	Eric Blake <eblake@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	yc-core@yandex-team.ru
Subject: Re: [Virtio-fs] [PATCH] vhost-user-fs: add capability to allow migration
Date: Thu, 19 Jan 2023 19:00:40 +0000	[thread overview]
Message-ID: <Y8mTWMVb9d2frxcx@work-vm> (raw)
In-Reply-To: <f9993404-f8b8-7a23-37f8-530313783466@yandex-team.ru>

* Anton Kuchin (antonkuchin@yandex-team.ru) wrote:
> On 19/01/2023 14:51, Michael S. Tsirkin wrote:
> > On Sun, Jan 15, 2023 at 07:09:03PM +0200, Anton Kuchin wrote:
> > > Now any vhost-user-fs device makes VM unmigratable, that also prevents
> > > qemu update without stopping the VM. In most cases that makes sense
> > > because qemu has no way to transfer FUSE session state.
> > > 
> > > But we can give an option to orchestrator to override this if it can
> > > guarantee that state will be preserved (e.g. it uses migration to
> > > update qemu and dst will run on the same host as src and use the same
> > > socket endpoints).
> > > 
> > > This patch keeps default behavior that prevents migration with such devices
> > > but adds migration capability 'vhost-user-fs' to explicitly allow migration.
> > > 
> > > Signed-off-by: Anton Kuchin <antonkuchin@yandex-team.ru>
> > > ---
> > >   hw/virtio/vhost-user-fs.c | 25 ++++++++++++++++++++++++-
> > >   qapi/migration.json       |  7 ++++++-
> > >   2 files changed, 30 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
> > > index f5049735ac..13d920423e 100644
> > > --- a/hw/virtio/vhost-user-fs.c
> > > +++ b/hw/virtio/vhost-user-fs.c
> > > @@ -24,6 +24,7 @@
> > >   #include "hw/virtio/vhost-user-fs.h"
> > >   #include "monitor/monitor.h"
> > >   #include "sysemu/sysemu.h"
> > > +#include "migration/migration.h"
> > >   static const int user_feature_bits[] = {
> > >       VIRTIO_F_VERSION_1,
> > > @@ -298,9 +299,31 @@ static struct vhost_dev *vuf_get_vhost(VirtIODevice *vdev)
> > >       return &fs->vhost_dev;
> > >   }
> > > +static int vhost_user_fs_pre_save(void *opaque)
> > > +{
> > > +    MigrationState *s = migrate_get_current();
> > > +
> > > +    if (!s->enabled_capabilities[MIGRATION_CAPABILITY_VHOST_USER_FS]) {
> > > +        error_report("Migration of vhost-user-fs devices requires internal FUSE "
> > > +                     "state of backend to be preserved. If orchestrator can "
> > > +                     "guarantee this (e.g. dst connects to the same backend "
> > > +                     "instance or backend state is migrated) set 'vhost-user-fs' "
> > > +                     "migration capability to true to enable migration.");
> > Isn't it possible that some backends are same and some are not?
> > Shouldn't this be a device property then?
> If some are not the same it is not guaranteed that correct FUSE
> state is present there, so orchestrator shouldn't set the capability
> because this can result in destination devices being broken (they'll
> be fine after the remount in guest, but this is guest visible and is
> not acceptable).

Shouldn't this be something that's negotiated as a feature bit in the
vhost-user protocol - i.e. to say whether the device can do migration?

However, I think what you're saying is that a migration might only be
doable in some cases - e.g. a local migration - and that's hard for
either qemu or the daemon to discover by itself; so it's right that
an orchestrator enables it.
(I'm not sure the error_report message needs to be quite so verbose -
normally a one line clear message is enough that people can then look
up).

> I can imagine smart orchestrator and backend that can transfer
> internal FUSE state, but we are not there yet, and this would be
> their responsibility then to ensure endpoint compatibility between src
> and dst and set the capability (that's why I put "e.g." and "or" in
> the error description).

You also need the vhost-user device to be able to do the dirty bitmap
updates; is that being checked somewhere?

Dave

> > 
> > 
> > 
> > > +        return -1;
> > > +    }
> > > +
> > > +    return 0;
> > > +}
> > > +
> > >   static const VMStateDescription vuf_vmstate = {
> > >       .name = "vhost-user-fs",
> > > -    .unmigratable = 1,
> > > +    .minimum_version_id = 0,
> > > +    .version_id = 0,
> > > +    .fields = (VMStateField[]) {
> > > +        VMSTATE_VIRTIO_DEVICE,
> > > +        VMSTATE_END_OF_LIST()
> > > +    },
> > > +   .pre_save = vhost_user_fs_pre_save,
> > >   };
> > >   static Property vuf_properties[] = {
> > > diff --git a/qapi/migration.json b/qapi/migration.json
> > > index 88ecf86ac8..9a229ea884 100644
> > > --- a/qapi/migration.json
> > > +++ b/qapi/migration.json
> > > @@ -477,6 +477,11 @@
> > >   #                    will be handled faster.  This is a performance feature and
> > >   #                    should not affect the correctness of postcopy migration.
> > >   #                    (since 7.1)
> > > +# @vhost-user-fs: If enabled, the migration process will allow migration of
> > > +#                 vhost-user-fs devices, this should be enabled only when
> > > +#                 backend can preserve local FUSE state e.g. for qemu update
> > > +#                 when dst reconects to the same endpoints after migration.
> > > +#                 (since 8.0)
> > >   #
> > >   # Features:
> > >   # @unstable: Members @x-colo and @x-ignore-shared are experimental.
> > > @@ -492,7 +497,7 @@
> > >              'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate',
> > >              { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
> > >              'validate-uuid', 'background-snapshot',
> > > -           'zero-copy-send', 'postcopy-preempt'] }
> > > +           'zero-copy-send', 'postcopy-preempt', 'vhost-user-fs'] }
> > I kind of dislike that it's such a specific flag. Is only vhost-user-fs
> > ever going to be affected? Any way to put it in a way that is more generic?
> Here I agree with you: I would prefer less narrow naming too. But I
> didn't manage to come up with one. Looks like many other vhost-user
> devices could benefit from this so maybe "vhost-user-stateless" or
> something like this would be better.
> I'm not sure that other types of devices could handle reconnect to
> the old endpoint as easy as vhost-user-fs, but anyway the support for
> this flag needs to be implemented for each device individually.
> What do you think? Any ideas would be appreciated.
> 
> > 
> > 
> > >   ##
> > >   # @MigrationCapabilityStatus:
> > > -- 
> > > 2.34.1
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


WARNING: multiple messages have this Message-ID (diff)
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Anton Kuchin <antonkuchin@yandex-team.ru>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	qemu-devel@nongnu.org, virtio-fs@redhat.com,
	Markus Armbruster <armbru@redhat.com>,
	Eric Blake <eblake@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	yc-core@yandex-team.ru
Subject: Re: [PATCH] vhost-user-fs: add capability to allow migration
Date: Thu, 19 Jan 2023 19:00:40 +0000	[thread overview]
Message-ID: <Y8mTWMVb9d2frxcx@work-vm> (raw)
In-Reply-To: <f9993404-f8b8-7a23-37f8-530313783466@yandex-team.ru>

* Anton Kuchin (antonkuchin@yandex-team.ru) wrote:
> On 19/01/2023 14:51, Michael S. Tsirkin wrote:
> > On Sun, Jan 15, 2023 at 07:09:03PM +0200, Anton Kuchin wrote:
> > > Now any vhost-user-fs device makes VM unmigratable, that also prevents
> > > qemu update without stopping the VM. In most cases that makes sense
> > > because qemu has no way to transfer FUSE session state.
> > > 
> > > But we can give an option to orchestrator to override this if it can
> > > guarantee that state will be preserved (e.g. it uses migration to
> > > update qemu and dst will run on the same host as src and use the same
> > > socket endpoints).
> > > 
> > > This patch keeps default behavior that prevents migration with such devices
> > > but adds migration capability 'vhost-user-fs' to explicitly allow migration.
> > > 
> > > Signed-off-by: Anton Kuchin <antonkuchin@yandex-team.ru>
> > > ---
> > >   hw/virtio/vhost-user-fs.c | 25 ++++++++++++++++++++++++-
> > >   qapi/migration.json       |  7 ++++++-
> > >   2 files changed, 30 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/hw/virtio/vhost-user-fs.c b/hw/virtio/vhost-user-fs.c
> > > index f5049735ac..13d920423e 100644
> > > --- a/hw/virtio/vhost-user-fs.c
> > > +++ b/hw/virtio/vhost-user-fs.c
> > > @@ -24,6 +24,7 @@
> > >   #include "hw/virtio/vhost-user-fs.h"
> > >   #include "monitor/monitor.h"
> > >   #include "sysemu/sysemu.h"
> > > +#include "migration/migration.h"
> > >   static const int user_feature_bits[] = {
> > >       VIRTIO_F_VERSION_1,
> > > @@ -298,9 +299,31 @@ static struct vhost_dev *vuf_get_vhost(VirtIODevice *vdev)
> > >       return &fs->vhost_dev;
> > >   }
> > > +static int vhost_user_fs_pre_save(void *opaque)
> > > +{
> > > +    MigrationState *s = migrate_get_current();
> > > +
> > > +    if (!s->enabled_capabilities[MIGRATION_CAPABILITY_VHOST_USER_FS]) {
> > > +        error_report("Migration of vhost-user-fs devices requires internal FUSE "
> > > +                     "state of backend to be preserved. If orchestrator can "
> > > +                     "guarantee this (e.g. dst connects to the same backend "
> > > +                     "instance or backend state is migrated) set 'vhost-user-fs' "
> > > +                     "migration capability to true to enable migration.");
> > Isn't it possible that some backends are same and some are not?
> > Shouldn't this be a device property then?
> If some are not the same it is not guaranteed that correct FUSE
> state is present there, so orchestrator shouldn't set the capability
> because this can result in destination devices being broken (they'll
> be fine after the remount in guest, but this is guest visible and is
> not acceptable).

Shouldn't this be something that's negotiated as a feature bit in the
vhost-user protocol - i.e. to say whether the device can do migration?

However, I think what you're saying is that a migration might only be
doable in some cases - e.g. a local migration - and that's hard for
either qemu or the daemon to discover by itself; so it's right that
an orchestrator enables it.
(I'm not sure the error_report message needs to be quite so verbose -
normally a one line clear message is enough that people can then look
up).

> I can imagine smart orchestrator and backend that can transfer
> internal FUSE state, but we are not there yet, and this would be
> their responsibility then to ensure endpoint compatibility between src
> and dst and set the capability (that's why I put "e.g." and "or" in
> the error description).

You also need the vhost-user device to be able to do the dirty bitmap
updates; is that being checked somewhere?

Dave

> > 
> > 
> > 
> > > +        return -1;
> > > +    }
> > > +
> > > +    return 0;
> > > +}
> > > +
> > >   static const VMStateDescription vuf_vmstate = {
> > >       .name = "vhost-user-fs",
> > > -    .unmigratable = 1,
> > > +    .minimum_version_id = 0,
> > > +    .version_id = 0,
> > > +    .fields = (VMStateField[]) {
> > > +        VMSTATE_VIRTIO_DEVICE,
> > > +        VMSTATE_END_OF_LIST()
> > > +    },
> > > +   .pre_save = vhost_user_fs_pre_save,
> > >   };
> > >   static Property vuf_properties[] = {
> > > diff --git a/qapi/migration.json b/qapi/migration.json
> > > index 88ecf86ac8..9a229ea884 100644
> > > --- a/qapi/migration.json
> > > +++ b/qapi/migration.json
> > > @@ -477,6 +477,11 @@
> > >   #                    will be handled faster.  This is a performance feature and
> > >   #                    should not affect the correctness of postcopy migration.
> > >   #                    (since 7.1)
> > > +# @vhost-user-fs: If enabled, the migration process will allow migration of
> > > +#                 vhost-user-fs devices, this should be enabled only when
> > > +#                 backend can preserve local FUSE state e.g. for qemu update
> > > +#                 when dst reconects to the same endpoints after migration.
> > > +#                 (since 8.0)
> > >   #
> > >   # Features:
> > >   # @unstable: Members @x-colo and @x-ignore-shared are experimental.
> > > @@ -492,7 +497,7 @@
> > >              'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate',
> > >              { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
> > >              'validate-uuid', 'background-snapshot',
> > > -           'zero-copy-send', 'postcopy-preempt'] }
> > > +           'zero-copy-send', 'postcopy-preempt', 'vhost-user-fs'] }
> > I kind of dislike that it's such a specific flag. Is only vhost-user-fs
> > ever going to be affected? Any way to put it in a way that is more generic?
> Here I agree with you: I would prefer less narrow naming too. But I
> didn't manage to come up with one. Looks like many other vhost-user
> devices could benefit from this so maybe "vhost-user-stateless" or
> something like this would be better.
> I'm not sure that other types of devices could handle reconnect to
> the old endpoint as easy as vhost-user-fs, but anyway the support for
> this flag needs to be implemented for each device individually.
> What do you think? Any ideas would be appreciated.
> 
> > 
> > 
> > >   ##
> > >   # @MigrationCapabilityStatus:
> > > -- 
> > > 2.34.1
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



  reply	other threads:[~2023-01-19 19:00 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-15 17:09 [Virtio-fs] [PATCH] vhost-user-fs: add capability to allow migration Anton Kuchin
2023-01-15 17:09 ` Anton Kuchin
2023-01-18 15:52 ` [Virtio-fs] " Stefan Hajnoczi
2023-01-18 15:52   ` Stefan Hajnoczi
2023-01-19 12:43   ` [Virtio-fs] " Anton Kuchin
2023-01-19 12:43     ` Anton Kuchin
2023-01-19 14:30     ` [Virtio-fs] " Stefan Hajnoczi
2023-01-19 14:30       ` Stefan Hajnoczi
2023-01-19 15:29       ` [Virtio-fs] " Anton Kuchin
2023-01-19 15:29         ` Anton Kuchin
2023-01-19 16:02         ` [Virtio-fs] " Stefan Hajnoczi
2023-01-19 16:02           ` Stefan Hajnoczi
2023-01-19 16:58           ` [Virtio-fs] " Anton Kuchin
2023-01-19 16:58             ` Anton Kuchin
2023-01-19 20:40             ` [Virtio-fs] " Stefan Hajnoczi
2023-01-19 20:40               ` Stefan Hajnoczi
2023-02-01 14:26             ` [Virtio-fs] " Juan Quintela
2023-02-01 14:26               ` Juan Quintela
2023-02-02  0:54               ` [Virtio-fs] " Anton Kuchin
2023-02-02  0:54                 ` Anton Kuchin
2023-02-02  9:59                 ` [Virtio-fs] " Juan Quintela
2023-02-02  9:59                   ` Juan Quintela
2023-02-10 14:09                   ` [Virtio-fs] " Anton Kuchin
2023-02-10 14:09                     ` Anton Kuchin
2023-02-10 16:08                     ` [Virtio-fs] " Juan Quintela
2023-02-10 16:08                       ` Juan Quintela
2023-02-16 21:00                       ` [Virtio-fs] " Stefan Hajnoczi
2023-02-16 21:00                         ` Stefan Hajnoczi
2023-01-19 12:51 ` [Virtio-fs] " Michael S. Tsirkin
2023-01-19 12:51   ` Michael S. Tsirkin
2023-01-19 13:45   ` [Virtio-fs] " Anton Kuchin
2023-01-19 13:45     ` Anton Kuchin
2023-01-19 19:00     ` Dr. David Alan Gilbert [this message]
2023-01-19 19:00       ` Dr. David Alan Gilbert
2023-01-19 20:47       ` [Virtio-fs] " Anton Kuchin
2023-01-19 20:47         ` Anton Kuchin
2023-01-20 13:58     ` [Virtio-fs] " Michael S. Tsirkin
2023-01-20 13:58       ` Michael S. Tsirkin
2023-01-20 17:37       ` [Virtio-fs] " Anton Kuchin
2023-01-20 17:37         ` Anton Kuchin
2023-01-22  8:16         ` [Virtio-fs] " Michael S. Tsirkin
2023-01-22  8:16           ` Michael S. Tsirkin
2023-01-22 12:36           ` [Virtio-fs] " Anton Kuchin
2023-01-22 12:36             ` Anton Kuchin
2023-01-22 14:46             ` [Virtio-fs] " Michael S. Tsirkin
2023-01-22 14:46               ` Michael S. Tsirkin
2023-01-22 16:09               ` [Virtio-fs] " Anton Kuchin
2023-01-22 16:09                 ` Anton Kuchin
2023-01-22 16:17                 ` [Virtio-fs] " Michael S. Tsirkin
2023-01-22 16:17                   ` Michael S. Tsirkin
2023-01-23 14:09                   ` [Virtio-fs] " Stefan Hajnoczi
2023-01-23 14:09                     ` Stefan Hajnoczi
2023-01-23 15:52                     ` [Virtio-fs] " Anton Kuchin
2023-01-23 15:52                       ` Anton Kuchin
2023-01-23 19:49                       ` [Virtio-fs] " Stefan Hajnoczi
2023-01-23 19:49                         ` Stefan Hajnoczi
2023-01-23 21:00                         ` [Virtio-fs] " Michael S. Tsirkin
2023-01-23 21:00                           ` Michael S. Tsirkin
2023-01-23 21:56                           ` [Virtio-fs] " Stefan Hajnoczi
2023-01-23 21:56                             ` Stefan Hajnoczi
2023-01-23 18:27                   ` [Virtio-fs] " Dr. David Alan Gilbert
2023-01-23 18:27                     ` Dr. David Alan Gilbert
2023-01-23 19:53                     ` [Virtio-fs] " Stefan Hajnoczi
2023-01-23 19:53                       ` Stefan Hajnoczi
2023-01-24  1:46                       ` [Virtio-fs] " Stefan Hajnoczi
2023-01-24  1:46                         ` Stefan Hajnoczi
2023-01-24  9:50                         ` [Virtio-fs] " Dr. David Alan Gilbert
2023-01-24  9:50                           ` Dr. David Alan Gilbert
2023-01-24 12:48                           ` [Virtio-fs] " Stefan Hajnoczi
2023-01-24 12:48                             ` Stefan Hajnoczi
2023-02-01 14:37                             ` [Virtio-fs] " Juan Quintela
2023-02-01 14:37                               ` Juan Quintela
2023-01-25 19:46 ` [Virtio-fs] " Stefan Hajnoczi
2023-01-25 19:46   ` Stefan Hajnoczi
2023-01-26 14:20   ` [Virtio-fs] " Anton Kuchin
2023-01-26 14:20     ` Anton Kuchin
2023-01-26 15:13     ` [Virtio-fs] " Stefan Hajnoczi
2023-01-26 15:13       ` Stefan Hajnoczi
2023-01-26 15:21       ` [Virtio-fs] " Anton Kuchin
2023-01-26 15:21         ` Anton Kuchin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y8mTWMVb9d2frxcx@work-vm \
    --to=dgilbert@redhat.com \
    --cc=antonkuchin@yandex-team.ru \
    --cc=armbru@redhat.com \
    --cc=eblake@redhat.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=stefanha@redhat.com \
    --cc=virtio-fs@redhat.com \
    --cc=yc-core@yandex-team.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.