From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: John G Johnson <john.g.johnson@oracle.com>,
mtsirkin@redhat.com, quintela@redhat.com,
Jason Wang <jasowang@redhat.com>,
Felipe Franciosi <felipe@nutanix.com>,
Kirti Wankhede <kwankhede@nvidia.com>,
qemu-devel@nongnu.org,
Alex Williamson <alex.williamson@redhat.com>,
Thanos Makatos <thanos.makatos@nutanix.com>,
Paolo Bonzini <pbonzini@redhat.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: VFIO Migration
Date: Tue, 3 Nov 2020 15:23:03 +0000
Message-ID: <20201103152303.GN205187@redhat.com>
In-Reply-To: <20201103150508.GB253848@stefanha-x1.localdomain>
On Tue, Nov 03, 2020 at 03:05:08PM +0000, Stefan Hajnoczi wrote:
> On Tue, Nov 03, 2020 at 11:39:29AM +0000, Daniel P. Berrangé wrote:
> > On Mon, Nov 02, 2020 at 11:11:53AM +0000, Stefan Hajnoczi wrote:
> > > Overview
> > > --------
> > > The purpose of device states is to save the device at a point in time and then
> > > restore the device back to the saved state later. This is more challenging than
> > > it first appears.
> > >
> > > The process of saving a device state and loading it later is called
> > > *migration*. The state may be loaded by the same device that saved it or by a
> > > new instance of the device, possibly running on a different computer.
> > >
> > > It must be possible to migrate to a newer implementation of the device
> > > as well as to an older implementation of the device. This allows users
> > > to upgrade and roll back their systems.
> > >
> > > Migration can fail if loading the device state is not possible. It should fail
> > > early with a clear error message. It must not appear to complete but leave the
> > > device inoperable due to a migration problem.
> >
> > I think there needs to be an additional requirement.
> >
> > It must be possible for a management application to query the supported
> > versions, independently of execution of a migration operation.
> >
> > This is important to large scale data center / cloud management applications
> > because before initiating a migration they need to *automatically* select
> > a target host with a high level of confidence that it will be compatible with
> > the source host.
> >
> > Today QEMU migration compatibility is largely determined by the machine
> > type version. Apps can query the supported machine types for a host to
> > check whether it is compatible. Similarly they will query CPU model
> > features to check compatibility.
> >
> > Validation and error checking at time of migration is of course still
> > required, but the goal should be that an mgmt application will *NEVER*
> > hit these errors because they will have pre-selected a host that is
> > known to be compatible based on reported versions that are supported.
>
> Okay. What do you think of the following?
>
> [
>   {
>     "model": "https://qemu.org/devices/e1000e",
>     "params": [
>       "rss",
>       ...more configuration parameters...
>     ],
>     "versions": [
>       {
>         "name": "1",
>         "params": []
>       },
>       {
>         "name": "2",
>         "params": ["rss=on"]
>       },
>       ...more versions...
>     ]
>   },
>   ...more device models...
> ]
>
> The management tool can generate the configuration parameter list by
> expanding a version into its params.
>
> Configuration parameter types and input ranges need more thought. For
> example, version 1 of the device might not have rx-table-size (it's
> effectively 0). Version 2 introduces rx-table-size and sets it to 32.
> Version 3 raises the value to 64. In addition, the user can set a custom
> value like rx-table-size=48. I haven't defined the rules for this yet,
> but it's clear there needs to be a way to extend configuration
> parameters.
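
The expansion of a version alias into its parameters, followed by user
overrides, can be sketched roughly as below. This is only an illustration:
the VERSIONS table, the string values, and the helper name expand_version
are assumptions built around the rx-table-size example above, and the
rules for parameter types and ranges are still undefined in the proposal.

```python
# Hypothetical version table for one device model, mirroring the
# rx-table-size example above. Not a real QEMU or VFIO data structure.
VERSIONS = {
    "1": {},                        # rx-table-size absent (effectively 0)
    "2": {"rx-table-size": "32"},
    "3": {"rx-table-size": "64"},
}


def expand_version(version, overrides=None):
    """Expand a version alias into parameters, then apply user overrides."""
    if version not in VERSIONS:
        raise ValueError(f"unsupported version: {version!r}")
    params = dict(VERSIONS[version])   # copy so the table is not mutated
    params.update(overrides or {})     # explicit settings win over the alias
    return params


print(expand_version("3"))
print(expand_version("3", {"rx-table-size": "48"}))
```

An unknown version raises an error here, in line with the requirement
that unsupported configurations fail early rather than silently.
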
>
> To check migration compatibility:
> 1. Verify that the device model URL matches the JSON data[n].model
> field.
> 2. For every configuration parameter name from the source device,
> check that it is contained within the JSON data[n].params list.

I'm not convinced that this makes sense. A matching set of parameter
names + values does not imply that the migration data stream is
actually compatible.

i.e. implementations may need to change the internal migration data
stream to fix bugs, without adding/removing a config parameter.
The migration version string alone expresses data stream compatibility.

This is similar to how 2 QEMU command lines can have an identical set
of configuration parameters, aside from the machine type version,
and thus be migration *incompatible*.

Basically the version string should be considered an opaque blob
that expresses compatibility on its own.

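
A minimal sketch of a check that treats the version string as opaque
might look like the following. The DeviceState class, the can_migrate
helper, and the version string "2.1" are all hypothetical names made up
for illustration; nothing here is an existing QEMU or libvirt API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceState:
    """Hypothetical description of a source device instance."""
    model: str
    version: str        # opaque: compared for equality only, never parsed
    params: frozenset   # e.g. frozenset({"rss=on"}); not used for the check


def can_migrate(source, dest_model, dest_versions):
    """Return True if the destination advertises the source's version string."""
    return source.model == dest_model and source.version in dest_versions


MODEL = "https://qemu.org/devices/e1000e"
src_a = DeviceState(MODEL, "2", frozenset({"rss=on"}))
# Same parameters, but a different (made-up) version string, standing in
# for a data stream change such as a bug fix.
src_b = DeviceState(MODEL, "2.1", frozenset({"rss=on"}))

dest_versions = {"1", "2"}
print(can_migrate(src_a, MODEL, dest_versions))
print(can_migrate(src_b, MODEL, dest_versions))
```

Both sources carry identical parameters, yet only the first is accepted,
because the decision rests on the version string alone.
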
> > > VFIO Implementation
> > > -------------------
> > > The following applies both to kernel VFIO/mdev drivers and vfio-user device
> > > backends.
> > >
> > > Devices are instantiated based on a version and/or configuration parameters:
> > > * ``version=1`` - use the device configuration aliased by version 1
> > > * ``version=2,rx-filter-size=64`` - use version 2 and override ``rx-filter-size``
> > > * ``rx-filter-size=0`` - directly set configuration parameters without using a version
> > >
> > > Device creation fails if the version and/or configuration parameters are not
> > > supported.
> > >
> > > There must be a mechanism to query the "latest" configuration for a device
> > > model. It may simply report the ``version=5`` where 5 is the latest version but
> > > it could also report all configuration parameters instead of using a version
> > > alias.
> >
> > The mechanism needs to be able to report all supported version strings,
> > not simply the latest version string. I think we need to specify the
> > actual mechanism to do this query too, because we can't end up in a place
> > where there's a different approach to queries for each device type.
>
> Makes sense.
>
> Stefan
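
To illustrate how a management application could use such a query, here
is a rough sketch that pre-selects target hosts from the reported version
strings. The SUPPORTED table stands in for the result of one uniform
query per host; its layout, the host names, and the candidate_hosts
helper are assumptions, since the actual query mechanism is not yet
specified.

```python
# Hypothetical result of one uniform query, keyed by host and device model.
# Each value is the full set of supported version strings, not just the latest.
SUPPORTED = {
    "host-a": {"https://qemu.org/devices/e1000e": {"1", "2", "3"}},
    "host-b": {"https://qemu.org/devices/e1000e": {"1", "2"}},
    "host-c": {},
}


def candidate_hosts(supported, devices):
    """Return hosts supporting every (model, version) pair the guest uses."""
    return sorted(
        host
        for host, models in supported.items()
        if all(version in models.get(model, set()) for model, version in devices)
    )


guest_devices = [("https://qemu.org/devices/e1000e", "3")]
print(candidate_hosts(SUPPORTED, guest_devices))
```

The selection runs before any migration is attempted, so errors at
migration time remain a safety net rather than the primary check.
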
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|