Re: [Virtio-fs] [PATCH] vhost-user-fs: add capability to allow migration

public inbox for virtio-fs@lists.linux.dev
 help / color / mirror / Atom feed

From: Stefan Hajnoczi <stefanha@redhat.com>
To: Anton Kuchin <antonkuchin@yandex-team.ru>
Cc: Stefan Hajnoczi <stefanha@gmail.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	qemu-devel@nongnu.org, virtio-fs@redhat.com,
	Markus Armbruster <armbru@redhat.com>,
	Eric Blake <eblake@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	Juan Quintela <quintela@redhat.com>,
	yc-core@yandex-team.ru
Subject: Re: [Virtio-fs] [PATCH] vhost-user-fs: add capability to allow migration
Date: Mon, 23 Jan 2023 14:49:39 -0500	[thread overview]
Message-ID: <Y87k0wBnHuf5Oktp@fedora> (raw)
In-Reply-To: <21b87a0d-99b1-2755-00de-d1201d85a63e@yandex-team.ru>

[-- Attachment #1: Type: text/plain, Size: 5770 bytes --]

On Mon, Jan 23, 2023 at 05:52:17PM +0200, Anton Kuchin wrote:
> 
> On 23/01/2023 16:09, Stefan Hajnoczi wrote:
> > On Sun, 22 Jan 2023 at 11:18, Michael S. Tsirkin <mst@redhat.com> wrote:
> > > On Sun, Jan 22, 2023 at 06:09:40PM +0200, Anton Kuchin wrote:
> > > > On 22/01/2023 16:46, Michael S. Tsirkin wrote:
> > > > > On Sun, Jan 22, 2023 at 02:36:04PM +0200, Anton Kuchin wrote:
> > > > > > > > This flag should be set when qemu don't need to worry about any
> > > > > > > > external state stored in vhost-user daemons during migration:
> > > > > > > > don't fail migration, just pack generic virtio device states to
> > > > > > > > migration stream and orchestrator guarantees that the rest of the
> > > > > > > > state will be present at the destination to restore full context and
> > > > > > > > continue running.
> > > > > > > Sorry  I still do not get it.  So fundamentally, why do we need this property?
> > > > > > > vhost-user-fs is not created by default that we'd then need opt-in to
> > > > > > > the special "migrateable" case.
> > > > > > > That's why I said it might make some sense as a device property as qemu
> > > > > > > tracks whether device is unplugged for us.
> > > > > > > 
> > > > > > > But as written, if you are going to teach the orchestrator about
> > > > > > > vhost-user-fs and its special needs, just teach it when to migrate and
> > > > > > > where not to migrate.
> > > > > > > 
> > > > > > > Either we describe the special situation to qemu and let qemu
> > > > > > > make an intelligent decision whether to allow migration,
> > > > > > > or we trust the orchestrator. And if it's the latter, then 'migrate'
> > > > > > > already says orchestrator decided to migrate.
> > > > > > The problem I'm trying to solve is that most of vhost-user devices
> > > > > > now block migration in qemu. And this makes sense since qemu can't
> > > > > > extract and transfer backend daemon state. But this prevents us from
> > > > > > updating qemu executable via local migration. So this flag is
> > > > > > intended more as a safety check that says "I know what I'm doing".
> > > > > > 
> > > > > > I agree that it is not really necessary if we trust the orchestrator
> > > > > > to request migration only when the migration can be performed in a
> > > > > > safe way. But changing the current behavior of vhost-user-fs from
> > > > > > "always blocks migration" to "migrates partial state whenever
> > > > > > orchestrator requests it" seems a little  dangerous and can be
> > > > > > misinterpreted as full support for migration in all cases.
> > > > > It's not really different from block is it? orchestrator has to arrange
> > > > > for backend migration. I think we just assumed there's no use-case where
> > > > > this is practical for vhost-user-fs so we blocked it.
> > > > > But in any case it's orchestrator's responsibility.
> > > > Yes, you are right. So do you think we should just drop the blocker
> > > > without adding a new flag?
> > > I'd be inclined to. I am curious what do dgilbert and stefanha think though.
> > If the migration blocker is removed, what happens when a user attempts
> > to migrate with a management tool and/or vhost-user-fs server
> > implementation that don't support migration?
> 
> There will be no matching fuse-session in destination endpoint so all
> requests to this fs will fail until it is remounted from guest to
> send new FUSE_INIT message that does session setup.

The point of the migration blocker is to prevent breaking running
guests. Situations where a migration completes but results in a broken
guest are problematic for users (especially when they are not logged in
to guests and able to fix them interactively).

If a command-line option is set to override the blocker, that's fine.
But there needs to be a blocker by default if external knowledge is
required to decide whether or not it's safe to migrate.

> > 
> > Anton: Can you explain how stateless migration will work on the
> > vhost-user-fs back-end side? Is it reusing vhost-user reconnect
> > functionality or introducing a new mode for stateless migration? I
> > guess the vhost-user-fs back-end implementation is required to
> > implement VHOST_F_LOG_ALL so dirty memory can be tracked and drain all
> > in-flight requests when vrings are stopped?
> 
> It reuses existing vhost-user reconnect code to resubmit inflight
> requests.
> Sure, backend needs to support this feature - presence of required
> features is checked by generic vhost and vhost-user code during init
> and if something is missing migration blocker is assigned to the
> device (not a static one in vmstate that I remove in this patch, but
> other per-device kind of blocker).

This is not enough detail. Please post the QEMU patches before we commit
to a user-visible vhost-user-fs command-line parameter.

I think what you're trying is a new approach that can be made to work.
However, both vhost-user and migration are fragile and you have not
explained how it will work. I don't have confidence in merging this
incrementally because I'm afraid of committing to user-visible or
vhost-user protocol behavior that turns out to be broken just a little
while later.

The kind of detail I was hoping to hear was, for example, how
vhost_user_blk_device_realize() blocks and tries to reconnect 3 times.
Does this approach work for stateless migration? The destination QEMU is
launched before the source QEMU disconnects from the vhost-user UNIX
domain socket, so I guess the destination QEMU cannot connect in the
current version of vhost-user reconnect as implemented by QEMU's
vhost-user-blk device. Have you come up with a new handover protocol?

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

next prev parent reply	other threads:[~2023-01-23 19:49 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-15 17:09 [Virtio-fs] [PATCH] vhost-user-fs: add capability to allow migration Anton Kuchin
2023-01-18 15:52 ` Stefan Hajnoczi
2023-01-19 12:43   ` Anton Kuchin
2023-01-19 14:30     ` Stefan Hajnoczi
2023-01-19 15:29       ` Anton Kuchin
2023-01-19 16:02         ` Stefan Hajnoczi
2023-01-19 16:58           ` Anton Kuchin
2023-01-19 20:40             ` Stefan Hajnoczi
2023-02-01 14:26             ` Juan Quintela
2023-02-02  0:54               ` Anton Kuchin
2023-02-02  9:59                 ` Juan Quintela
2023-02-10 14:09                   ` Anton Kuchin
2023-02-10 16:08                     ` Juan Quintela
2023-02-16 21:00                       ` Stefan Hajnoczi
2023-01-19 12:51 ` Michael S. Tsirkin
2023-01-19 13:45   ` Anton Kuchin
2023-01-19 19:00     ` Dr. David Alan Gilbert
2023-01-19 20:47       ` Anton Kuchin
2023-01-20 13:58     ` Michael S. Tsirkin
2023-01-20 17:37       ` Anton Kuchin
2023-01-22  8:16         ` Michael S. Tsirkin
2023-01-22 12:36           ` Anton Kuchin
2023-01-22 14:46             ` Michael S. Tsirkin
2023-01-22 16:09               ` Anton Kuchin
2023-01-22 16:17                 ` Michael S. Tsirkin
2023-01-23 14:09                   ` Stefan Hajnoczi
2023-01-23 15:52                     ` Anton Kuchin
2023-01-23 19:49                       ` Stefan Hajnoczi [this message]
2023-01-23 21:00                         ` Michael S. Tsirkin
2023-01-23 21:56                           ` Stefan Hajnoczi
2023-01-23 18:27                   ` Dr. David Alan Gilbert
2023-01-23 19:53                     ` Stefan Hajnoczi
2023-01-24  1:46                       ` Stefan Hajnoczi
2023-01-24  9:50                         ` Dr. David Alan Gilbert
2023-01-24 12:48                           ` Stefan Hajnoczi
2023-02-01 14:37                             ` Juan Quintela
2023-01-25 19:46 ` Stefan Hajnoczi
2023-01-26 14:20   ` Anton Kuchin
2023-01-26 15:13     ` Stefan Hajnoczi
2023-01-26 15:21       ` Anton Kuchin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y87k0wBnHuf5Oktp@fedora \
    --to=stefanha@redhat.com \
    --cc=antonkuchin@yandex-team.ru \
    --cc=armbru@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=eblake@redhat.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=stefanha@gmail.com \
    --cc=virtio-fs@redhat.com \
    --cc=yc-core@yandex-team.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox