qemu-devel.nongnu.org archive mirror
From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: kwolf@redhat.com,
	"Denis Plotnikov" <den-plotnikov@yandex-team.ru>,
	"Eduardo Habkost" <ehabkost@redhat.com>,
	qemu-block@nongnu.org, qemu-devel@nongnu.org,
	raphael.norwitz@nutanix.com,
	"Roman Kagan" <rvkagan@yandex-team.ru>,
	yc-core@yandex-team.ru, pbonzini@redhat.com,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>
Subject: Re: [PATCH v0 0/2] virtio-blk and vhost-user-blk cross-device migration
Date: Wed, 6 Oct 2021 04:17:36 -0400
Message-ID: <20211006041419-mutt-send-email-mst@kernel.org>
In-Reply-To: <YV1ZuizhQ5gO9nd6@work-vm>

On Wed, Oct 06, 2021 at 09:09:30AM +0100, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (mst@redhat.com) wrote:
> > On Tue, Oct 05, 2021 at 12:10:08PM -0400, Eduardo Habkost wrote:
> > > On Tue, Oct 05, 2021 at 03:01:05PM +0100, Dr. David Alan Gilbert wrote:
> > > > * Michael S. Tsirkin (mst@redhat.com) wrote:
> > > > > On Tue, Oct 05, 2021 at 02:18:40AM +0300, Roman Kagan wrote:
> > > > > > On Mon, Oct 04, 2021 at 11:11:00AM -0400, Michael S. Tsirkin wrote:
> > > > > > > On Mon, Oct 04, 2021 at 06:07:29PM +0300, Denis Plotnikov wrote:
> > > > > > > > It might be useful for the cases when a slow block layer should be replaced
> > > > > > > > with a more performant one on a running VM without stopping it, i.e. with very
> > > > > > > > low downtime, comparable to that of a migration.
> > > > > > > > 
> > > > > > > > It's possible to achieve that for two reasons:
> > > > > > > > 
> > > > > > > > 1. The VMStates of "virtio-blk" and "vhost-user-blk" are almost the same.
> > > > > > > >    They consist of the identical VMSTATE_VIRTIO_DEVICE and differ from
> > > > > > > >    each other only in the values of the migration service fields.
> > > > > > > > 2. The device driver used in the guest is the same: virtio-blk
> > > > > > > > 
> > > > > > > > In the series, cross-migration is achieved by adding a new type.
> > > > > > > > The new type uses the virtio-blk VMState instead of the vhost-user-blk specific
> > > > > > > > VMState; it also implements migration save/load callbacks to be compatible
> > > > > > > > with the migration stream produced by the "virtio-blk" device.
> > > > > > > > 
> > > > > > > > Adding the new type instead of modifying the existing one is convenient.
> > > > > > > > It makes it easy to distinguish the new virtio-blk-compatible vhost-user-blk
> > > > > > > > device from the existing non-compatible one using qemu machinery without any
> > > > > > > > other modifications. That gives all the variety of qemu device related
> > > > > > > > constraints out of the box.
> > > > > > > 
> > > > > > > Hmm I'm not sure I understand. What is the advantage for the user?
> > > > > > > What if vhost-user-blk became an alias for vhost-user-virtio-blk?
> > > > > > > We could add some hacks to make it compatible for old machine types.
> > > > > > 
> > > > > > The point is that virtio-blk and vhost-user-blk are not
> > > > > > migration-compatible ATM.  OTOH they are the same device from the guest
> > > > > > POV so there's nothing fundamentally preventing the migration between
> > > > > > the two.  In particular, we see it as a means to switch between the
> > > > > > storage backend transports via live migration without disrupting the
> > > > > > guest.
> > > > > > 
> > > > > > Migration-wise virtio-blk and vhost-user-blk have in common
> > > > > > 
> > > > > > - the content of the VMState -- VMSTATE_VIRTIO_DEVICE
> > > > > > 
> > > > > > The two differ in
> > > > > > 
> > > > > > - the name and the version of the VMStateDescription
> > > > > > 
> > > > > > - virtio-blk has an extra migration section (via .save/.load callbacks
> > > > > >   on VirtioDeviceClass) containing requests in flight
> > > > > > 
> > > > > > It looks like to become migration-compatible with virtio-blk,
> > > > > > vhost-user-blk has to start using VMStateDescription of virtio-blk and
> > > > > > provide compatible .save/.load callbacks.  It isn't entirely obvious how
> > > > > > to make this machine-type-dependent, so we came up with a simpler idea
> > > > > > of defining a new device that shares most of the implementation with the
> > > > > > original vhost-user-blk except for the migration stuff.  We're certainly
> > > > > > open to suggestions on how to reconcile this under a single
> > > > > > vhost-user-blk device, as this would be more user-friendly indeed.
> > > > > > 
> > > > > > We considered using a class property for this and defining the
> > > > > > respective compat clause, but IIUC the class constructors (where .vmsd
> > > > > > and .save/.load are defined) are not supposed to depend on class
> > > > > > properties.
> > > > > > 
> > > > > > Thanks,
> > > > > > Roman.
> > > > > 
> > > > > So the question is how to make vmsd depend on machine type.
> > > > > CC Eduardo who poked at this kind of compat stuff recently,
> > > > > paolo who looked at qom things most recently and dgilbert
> > > > > for advice on migration.
> > > > 
> > > > I don't think I've seen anyone change vmsd name dependent on machine
> > > > type; making fields appear/disappear is easy - that just ends up as a
> > > > property on the device that's checked;  I guess if that property is
> > > > global (rather than per instance) then you can check it in
> > > > vhost_user_blk_class_init and swing the dc->vmsd pointer?
> > > 
> > > class_init can be called very early during QEMU initialization,
> > > so it's too early to make decisions based on machine type.
> > > 
> > > Making a specific vmsd appear/disappear based on machine
> > > configuration or state is "easy", by implementing
> > > VMStateDescription.needed.  But this would require registering
> > > both vmsds (one of them would need to be registered manually
> > > instead of using DeviceClass.vmsd).
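
The ".needed" approach described above can be modeled in a few lines of
plain C. This is a toy sketch under stated assumptions, not QEMU code: the
struct only mimics the needed callback of VMStateDescription, and ToyVMSD,
ToyDevice, use_virtio_blk_format and sections_to_save() are hypothetical
names introduced for illustration. Both descriptions are registered, and
the two needed callbacks are written so that exactly one is emitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a VMStateDescription: only a name and a .needed hook. */
typedef struct ToyVMSD {
    const char *name;
    bool (*needed)(void *opaque);   /* NULL means "always needed" */
} ToyVMSD;

/* Hypothetical device state; the flag would come from a compat property. */
typedef struct ToyDevice {
    bool use_virtio_blk_format;
} ToyDevice;

static bool virtio_blk_format_needed(void *opaque)
{
    const ToyDevice *dev = opaque;
    return dev->use_virtio_blk_format;
}

static bool legacy_format_needed(void *opaque)
{
    /* Exact complement, so the two sections are mutually exclusive. */
    return !virtio_blk_format_needed(opaque);
}

/* Both descriptions are registered; .needed decides which one is used. */
static const ToyVMSD registered[] = {
    { "vhost-user-blk", legacy_format_needed },
    { "virtio-blk",     virtio_blk_format_needed },
};

/* Collect the sections that would be written for dev; returns the count. */
static size_t sections_to_save(ToyDevice *dev, const ToyVMSD **out, size_t max)
{
    size_t n = 0;

    assert(dev != NULL && out != NULL);
    for (size_t i = 0; i < sizeof(registered) / sizeof(registered[0]); i++) {
        const ToyVMSD *v = &registered[i];
        if (n < max && (v->needed == NULL || v->needed(dev))) {
            out[n++] = v;
        }
    }
    return n;
}
```

The sketch leaves out the part called subtle above: how the second
description gets registered when it does not go through DeviceClass.vmsd.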
> > > 
> > > I don't remember what are the consequences of not using
> > > DeviceClass.vmsd to register a vmsd, I only remember it was
> > > subtle.  See commit b170fce3dd06 ("cpu: Register
> > > VMStateDescription through CPUState") and related threads.  CCing
> > > Philippe, who might remember the details here.
> > > 
> > > If that's an important use case, I would suggest allowing devices
> > > to implement a DeviceClass.get_vmsd method, which would override
> > > DeviceClass.vmsd if necessary.  Is the problem we're trying to
> > > address worth the additional complexity?
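
A minimal sketch of what the suggested get_vmsd hook could look like,
assuming a simplified object model. Nothing here is existing QEMU API:
get_vmsd is only a proposal in this thread, and ToyVMStateDescription,
ToyDeviceClass, ToyDeviceState, virtio_blk_compat and toy_device_get_vmsd()
are hypothetical stand-ins. The names and version numbers in the two
descriptions are illustrative values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the real QEMU types; only what the sketch needs. */
typedef struct ToyVMStateDescription {
    const char *name;
    int version_id;
} ToyVMStateDescription;

typedef struct ToyDeviceState ToyDeviceState;

typedef struct ToyDeviceClass {
    /* Static default, fixed at class_init time. */
    const ToyVMStateDescription *vmsd;
    /* Hypothetical per-instance override; may be NULL. */
    const ToyVMStateDescription *(*get_vmsd)(ToyDeviceState *dev);
} ToyDeviceClass;

struct ToyDeviceState {
    const ToyDeviceClass *klass;
    bool virtio_blk_compat;   /* hypothetical property set via machine compat */
};

/* Illustrative values only. */
static const ToyVMStateDescription toy_vmstate_vhost_user_blk = {
    "vhost-user-blk", 1
};
static const ToyVMStateDescription toy_vmstate_virtio_blk = {
    "virtio-blk", 2
};

/* Device-specific hook: runs at realize/migration time, not class_init. */
static const ToyVMStateDescription *toy_vhost_user_blk_get_vmsd(ToyDeviceState *dev)
{
    return dev->virtio_blk_compat ? &toy_vmstate_virtio_blk : NULL;
}

/* Core lookup: prefer the hook's answer, fall back to the static vmsd. */
static const ToyVMStateDescription *toy_device_get_vmsd(ToyDeviceState *dev)
{
    assert(dev != NULL && dev->klass != NULL);
    if (dev->klass->get_vmsd != NULL) {
        const ToyVMStateDescription *v = dev->klass->get_vmsd(dev);
        if (v != NULL) {
            return v;
        }
    }
    return dev->klass->vmsd;
}
```

Because the decision is taken per instance, after properties have been
applied, it sidesteps the problem that class_init runs too early to know
the machine type.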
> > 
> > The tricky part is that we generally don't support migration when
> > the command line is different on source and destination ...
> 
> The reality has always been a bit more subtle than that.
> For example, it's fine if the path to a block device is different on the
> source and destination; or if it's accessed by iSCSI on the destination
> say.  As long as what the guest sees, and the migration stream carries
> are the same, then in principle it's OK - but that does start getting
> trickier; also it would probably get interesting to let libvirt know
> that this combo is OK.

I agree, but that's not the same as specifying a different
device. Yes, internally they are compatible, but
this is a detail users/tools generally won't be able to
figure out.

> > So maybe the actual answer is that vhost-user-blk should really
> > be a drive supplied to a virtio blk device, not a device
> > itself?
> > This way it's sane, and also matches what we do e.g. for net.
> 
> Hmm a bit of a fudge; it's not quite the same as a drive is it; there's
> almost another layer split in there.
> 
> Dave

We can make it something else, not "drive=". Maybe simply "vhost-user=" ?
The point is that if we promise it looks the same to the guest,
it should be the same -device.
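
To illustrate why keeping the same -device helps tools, here is a toy
sketch in plain C. It assumes a management tool that only compares the
device type on source and destination and treats the backend as a host-side
detail. ToyDeviceArgs and guest_visible_match() are hypothetical names, and
a "vhost-user" backend property on virtio-blk is only the idea floated
above, not an existing option.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified view of one -device argument. */
typedef struct ToyDeviceArgs {
    const char *device;        /* device type, e.g. "virtio-blk-pci" */
    const char *backend_prop;  /* "drive", or the floated "vhost-user" */
    const char *backend_id;    /* id of the backend object on this host */
} ToyDeviceArgs;

/*
 * A tool can decide compatibility from the device type alone:
 * the backend property and id are allowed to differ between hosts.
 */
static bool guest_visible_match(const ToyDeviceArgs *src,
                                const ToyDeviceArgs *dst)
{
    assert(src != NULL && dst != NULL);
    return strcmp(src->device, dst->device) == 0;
}
```

With two distinct device types, the same check fails even though the
devices are compatible internally, which is the detail users and tools
cannot be expected to figure out.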


> > -- 
> > MST
> > 
> -- 
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


