qemu-devel.nongnu.org archive mirror
* [RFC v2] VFIO Migration
@ 2020-11-05 15:09 Stefan Hajnoczi
  2020-11-05 15:52 ` Cornelia Huck
  2020-11-05 19:37 ` Alex Williamson
  0 siblings, 2 replies; 5+ messages in thread
From: Stefan Hajnoczi @ 2020-11-05 15:09 UTC (permalink / raw)
  To: qemu-devel
  Cc: John G Johnson, Tian, Kevin, mtsirkin, Daniel P. Berrangé,
	quintela, Jason Wang, Zeng, Xin, Kirti Wankhede,
	Dr. David Alan Gilbert, Yan Zhao, Alex Williamson, Paolo Bonzini,
	Gerd Hoffmann, Felipe Franciosi, Christophe de Dinechin,
	Thanos Makatos


v2 (big change, please reread everything):
* Replace URIs with Go-style <domain>/<path> strings
* Replace configuration parameters with migration parameters. The semantics are
  different; they only describe migration compatibility and do not capture all
  device configuration. This makes it easier to explain the purpose of
  parameters and also logically separates device instantiation from migration
  compatibility checking.
* Describe how to achieve subsection semantics
* Add hint that device internal state should be as general as possible to allow
  different device implementations
* Drop versions, they added complexity and aren't necessary for the migration
  compatibility check
* Add first draft VFIO/mdev sysfs attr interface

VFIO Migration
==============
This document describes how to ensure migration compatibility for VFIO devices,
including VFIO/mdev and vfio-user devices.

Overview
--------
VFIO devices can save and load a *device state*. Saving a device state produces
a snapshot of a VFIO device's state that can be loaded again at a later point
in time to resume the device from the snapshot.

The process of saving a device state and loading it later is called
*migration*. The device state may be loaded by the same device instance that
saved it or by a new instance, possibly running on a different machine.

A VFIO/mdev driver together with the physical device provides the functionality
of a device. Alternatively, a vfio-user device emulation program can provide
the functionality of a device. These are called *device implementations*.

The device implementation where a migration originates is called the *source*
and the device implementation that a migration targets is called the
*destination*.

This document describes how to establish whether or not migration compatibility
exists between the source and destination. When compatibility has been
established, the probability of migrating successfully is high and a successful
migration does not leave the device inoperable due to silent migration
problems.

Migration Parameters
--------------------
*Migration parameters* are used to describe characteristics that must match
between source and destination to achieve migration compatibility.

The first implementation of a simple device may not require migration
parameters if the source and destination are always compatible. As the device
evolves, the source and destination may differ and migration parameters are
required to express this variation. More complex devices may require migration
parameters from the start due to optional functionality that is not guaranteed
to be present in both source and destination.

A migration parameter consists of a name and a value. The name is a UTF-8
string that does not contain equals ('='), forward slash ('/'), or whitespace
characters. The value is a UTF-8 string that does not contain whitespace
characters.

The meaning of the migration parameter and its possible values are specific to
the device, but values are generally based on one of the following types:
* Boolean (on/off)
* Integers (0, 1, 2, ...)
* Enumerations (red, green, blue, ...)
* Character strings

Migration parameters are conventionally formatted as <name>=<value> strings.
Examples include my-feature=on and num-queues=4.
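
The character rules above can be checked mechanically. The following is a
minimal Python sketch, not part of the proposal: the function name
parse_migration_param is hypothetical, and rejecting an empty name is an
assumption because the text does not say whether one is allowed.

```python
from typing import Tuple


def parse_migration_param(text: str) -> Tuple[str, str]:
    """Split a '<name>=<value>' string and check the character rules."""
    # Splitting at the first '=' guarantees the name itself contains no '='.
    name, sep, value = text.partition("=")
    if not sep:
        raise ValueError(f"missing '=' in {text!r}")
    # Assumption: an empty name is rejected (the text does not say).
    if not name or "/" in name or any(c.isspace() for c in name):
        raise ValueError(f"invalid parameter name {name!r}")
    # The value only has to be free of whitespace.
    if any(c.isspace() for c in value):
        raise ValueError(f"invalid parameter value {value!r}")
    return name, value
```

For example, parse_migration_param("num-queues=4") yields the pair
("num-queues", "4"), while "bad/name=1" raises ValueError.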

The absence of a migration parameter must have the same effect as before the
migration parameter was introduced. For example, if my-feature=on|off is added
to control the availability of a new device feature, then my-feature=off is
equivalent to omitting the migration parameter.
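
One way to honor this rule when comparing parameter lists is to drop
parameters that are explicitly set to their default before comparing. The
sketch below assumes the defaults are known to the caller as a dictionary;
how defaults are discovered is not defined by this document, and the name
normalize_params is hypothetical.

```python
from typing import Dict


def normalize_params(params: Dict[str, str],
                     defaults: Dict[str, str]) -> Dict[str, str]:
    """Drop parameters that are explicitly set to their default value."""
    # A parameter equal to its default is equivalent to an omitted one,
    # so both spellings normalize to the same dictionary.
    return {name: value for name, value in params.items()
            if defaults.get(name) != value}
```

With defaults {"my-feature": "off"}, the lists {"my-feature": "off"} and {}
normalize to the same empty dictionary, so they compare equal.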

Hardware Interface Compatibility
--------------------------------
VFIO devices have a *hardware interface* consisting of device regions and
interrupts. Aspects of the hardware interface can vary between device
implementations and require migration parameters to express migration
compatibility requirements.

Examples of migration parameters include:
* Feature availability - feature bitmasks, hardware revision numbers, etc. If
  the destination may lack support for optional features or hardware interface
  revisions, then migration parameters are required.
* Functionality - hardware register blocks that are only present on certain
  device instances. If there are multiple device sub-models that have
  different hardware interfaces then migration parameters are required.

These examples demonstrate aspects of the hardware interface that must not
change unexpectedly. Were they to differ between source and destination, the
chance of device driver malfunction would be high because the layout of the
hardware interface would change or assumptions the device driver makes about
available functionality would be violated. Migration parameters are used to
preserve the hardware interface across migration and explicitly represent
variations between device implementations.

Hardware interfaces sometimes support reporting an event when a change occurs.
In those cases it may be possible to support visible changes in the hardware
interface across migration. In most other cases migration must not result in a
visible change in the hardware interface.

Migration parameters are not necessary for read-only values exposed through the
hardware interface, such as MAC address EEPROMs or serial numbers, so long as
all device implementations can be configured with the same range of input
values for these read-only values. This is possible because migration
parameters do not capture the full configuration of the device, only aspects
that affect migration compatibility.

Device configuration that is not visible through the hardware interface, such
as a host file system path of a disk image file or the physical network port
assigned to a network card, usually does not require migration parameters
because those values are not visible through the hardware interface and can be
changed without breaking migration compatibility.

The disk image file may indirectly affect the hardware interface, for example
by constraining the device's block size. In this case a block-size=N migration
parameter is required to ensure migration compatibility, but the host file
system path of the disk image file still does not require a migration
parameter.

Device State Representation
---------------------------
Device state contains both data accessible through the device's hardware
interface and device-internal state needed to restore device operation.

The contents of hardware registers are usually included in the device state if
they can change at runtime. Hardware registers with constant or computed data
may not need to be part of the device state provided that device
implementations can produce the necessary data.

Device-internal state includes the portion of the device's state that cannot be
reconstructed from the hardware interface alone. Defining device-internal state
in the most general way instead of exposing device implementation details
allows for flexibility in the future. For example, device implementations often
maintain a ring index, which is not available through the hardware interface,
to keep track of which ring elements have already been consumed. The ring index
must be included in the device state so that the destination can resume
processing from the correct point in the ring. Representing this as an index
into the ring in the hardware interface is more general than adding device
implementation-specific request tracking data structures into the device state.

The *device state representation* defines the binary data layout of the device
state. The device state representation is specific to each device and is beyond
the scope of this document, but aspects pertaining to migration compatibility
are discussed here.

Each change to the device state representation that affects migration
compatibility requires a migration parameter. When a new field is added to the
device state representation then a new migration parameter must be added to
reflect this change. Often a single migration parameter expresses both a change
to the hardware interface and the device state representation. It is also
possible to change the device state representation without changing the
hardware interface, for example when some state was forgotten while designing
the previous device state representation.

The device state representation may support extra data that can be safely
ignored by old device implementations. In this case migration compatibility is
unaffected and a migration parameter is not required to indicate such extra
data has been added.

Device Models
-------------
The combination of the hardware interface, device state representation, and
migration parameter definitions is called a *device model*. Device models are
identified by a unique UTF-8 string starting with a domain name and followed by
path components separated with forward slashes ('/'). Examples include
vendor-a.com/my-nic, gitlab.com/user/my-device, virtio-spec.org/pci/virtio-net,
and qemu.org/pci/10ec/8139.
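
A device model string can be split into its domain and path components as
sketched below. This is only an illustration: the function name
split_device_model is hypothetical, and requiring at least one non-empty path
component is an assumption drawn from the examples rather than a stated rule.
The domain name syntax is deliberately not validated.

```python
from typing import List, Tuple


def split_device_model(model: str) -> Tuple[str, List[str]]:
    """Split '<domain>/<path>' into the domain and its path components."""
    domain, *path = model.split("/")
    # Assumption: every example has a domain and at least one component.
    if not domain or not path or any(not part for part in path):
        raise ValueError(f"malformed device model string {model!r}")
    if any(c.isspace() for c in model):
        raise ValueError(f"whitespace in device model string {model!r}")
    return domain, path
```

For qemu.org/pci/10ec/8139 this returns the domain qemu.org and the path
components pci, 10ec, and 8139.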

The unique device model string is not changed as the device evolves. Instead,
migration parameters are added to express variations in a device.

The device model is not tied to a specific device implementation. The same
device model could be implemented as a VFIO/mdev driver or as a vfio-user device
emulation program.

Multiple device implementations can support the same device model. Doing so
means that the device implementations can offer migration compatibility because
they support the same hardware interface, device state representation, and
migration parameters.

Multiple device models can exist for the same hardware interface, each with a
different device state representation and migration parameters. This makes it
possible to fork and independently develop device models.

Device models can evolve over time as the hardware interface and device state
representation change. The corresponding migration parameters ensure that
migration compatibility can be established between device implementations.

Orchestrating Migrations
------------------------
The following steps must be followed to migrate devices:

1. Check that the source and destination support the same device model.

2. Check that the destination supports the migration parameter list from the
   source.

3. Configure the destination so it is prepared to load the device state. This
   may involve instantiating a new device instance or resetting an existing
   device instance to a configuration that is compatible with the source.

   The migration parameter list may be used as part of this configuration, but
   note that not all of the configuration is captured in the migration
   parameter list. For example, the physical network port for a network card or
   the host file system path for a disk image file is typically not captured in
   the migration parameters and must be provided through other means.

4. Save the device state on the source and load it on the destination.

5. If migration succeeds then the destination resumes operation and the source
   must not resume operation. If the migration fails then the source resumes
   operation and the destination must not resume operation.

Note that these steps impose a conservative bound on device states that can be
migrated successfully. Not all migration parameters may be strictly
required to match on the source and destination devices. For example, if the
device's hardware interface has not yet been initialized then changes to the
advertised features may not yet affect the device driver. However, accurately
representing runtime constraints is complex and risks introducing migration
bugs, so no attempt is made to support them.
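
Steps 1 and 2 amount to a pure compatibility check that can run before any
device is touched. The Python sketch below illustrates that check under
stated assumptions: the Implementation class and its "supported" mapping
(parameter name to accepted values) are hypothetical, since this document
does not yet define how supported parameters are enumerated.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Implementation:
    """Hypothetical description of a destination device implementation."""
    model: str
    # Assumption: for each parameter name, the set of accepted values.
    supported: Dict[str, Set[str]] = field(default_factory=dict)


def can_migrate(src_model: str, src_params: Dict[str, str],
                dst: Implementation) -> bool:
    """Return True if steps 1 and 2 of the orchestration succeed."""
    # Step 1: source and destination must support the same device model.
    if src_model != dst.model:
        return False
    # Step 2: the destination must support every source parameter value.
    for name, value in src_params.items():
        if value not in dst.supported.get(name, set()):
            return False
    return True
```

Only when this check passes would a management tool go on to configure the
destination (step 3) and transfer the device state (step 4).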

VFIO/mdev Devices
-----------------
TODO this is a first draft, more thought needed around enumerating supported
parameters, representing default values, etc

The following mdev type sysfs attrs are available for managing device
instances:

  /sys/.../<parent-device>/mdev_supported_types/<type-id>/
      create - writing a UUID to this file instantiates a device
      migration/ - migration related files
          model - unique device model string, e.g. vendor-a.com/my-nic

Device models supported by an mdev driver can be enumerated by reading the
migration/model attr for each <type-id>.

The following mdev device sysfs attrs relate to a specific device instance:

  /sys/.../<parent-device>/<uuid>/
      mdev_type/ - symlink to mdev type sysfs attrs, e.g. to fetch migration/model
      migration/ - migration related files
          applied - Write "1" to apply current migration parameter values or
                    "0" to reset migration parameter values to their defaults.
                    Parameters can only be applied or reset while the mdev is
                    not opened.
          params/ - migration parameters
              <my-param> - read/write migration parameter "my-param"
              ...

When the device is created the migration/applied attr is "0". Migration
parameters are accessible in migration/params/ and read 0 bytes because they
are at their default values. At this point, opening the mdev device will fail
because migration parameters must be applied first. Migration parameters can be
set to the desired values or left at their defaults. "1" must be written to
migration/applied before opening the mdev device.
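
The sequence above could be driven from a management tool roughly as follows.
This is a sketch against the draft attribute layout only; the interface is
not implemented anywhere, the function name apply_migration_params is
hypothetical, and mdev_dir stands for the /sys/.../<parent-device>/<uuid>/
directory of an mdev that has already been created.

```python
from pathlib import Path
from typing import Dict


def apply_migration_params(mdev_dir: Path, params: Dict[str, str]) -> bool:
    """Write each parameter, then apply; False if the device rejects them."""
    migration = mdev_dir / "migration"
    try:
        # Parameters left out of 'params' stay at their defaults.
        for name, value in params.items():
            (migration / "params" / name).write_text(value)
        # Writing "1" applies the values; the mdev can be opened afterwards.
        (migration / "applied").write_text("1")
    except OSError:
        # A failed write means the parameters are not supported.
        return False
    return True
```

If the function returns False part of the way through, some parameters may
already have been written; writing "0" to migration/applied resets them to
their defaults.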

If writing to a migration/params/<param> attr or setting migration/applied to
"1" fails, then the device implementation does not support the migration
parameters.

An open mdev device typically does not allow migration parameters to be changed
at runtime. However, certain migration/params attrs may allow writes at
runtime. Usually these migration parameters only affect the device state
representation and not the hardware interface. This makes it possible to
upgrade or downgrade the device state representation at runtime so that
migration is possible to newer or older device implementations.

An existing mdev device instance can be reused by closing the mdev device and
writing "0" to migration/applied. This resets parameters to their defaults so
that a new list of migration parameters can be applied.

The migration parameter list for an mdev device that is in operation can be
read from migration/params/. Parameters that read 0 bytes are at their default
value.
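
Reading the list back could look like the sketch below, which maps a 0-byte
read to None to mean "default". The function name read_migration_params is
hypothetical, and stripping a trailing newline is an assumption about how the
attrs would be formatted; it is harmless because values cannot contain
whitespace.

```python
from pathlib import Path
from typing import Dict, Optional


def read_migration_params(mdev_dir: Path) -> Dict[str, Optional[str]]:
    """Return the parameter list; None marks a parameter at its default."""
    params: Dict[str, Optional[str]] = {}
    for attr in sorted((mdev_dir / "migration" / "params").iterdir()):
        # Assumption: a trailing newline, if any, is not part of the value.
        text = attr.read_text().strip()
        # An attr that reads 0 bytes is at its default value.
        params[attr.name] = text or None
    return params
```

The non-None entries form the migration parameter list that would be checked
against the destination before migrating.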

vfio-user Devices
-----------------
TODO use FUSE to mimic VFIO/mdev sysfs (probably can't due to security
concerns, use UNIX domain socket RPC instead)?


* Re: [RFC v2] VFIO Migration
  2020-11-05 15:09 [RFC v2] VFIO Migration Stefan Hajnoczi
@ 2020-11-05 15:52 ` Cornelia Huck
  2020-11-10  9:37   ` Stefan Hajnoczi
  2020-11-05 19:37 ` Alex Williamson
  1 sibling, 1 reply; 5+ messages in thread
From: Cornelia Huck @ 2020-11-05 15:52 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: John G Johnson, Tian, Kevin, mtsirkin, Daniel P. Berrangé,
	quintela, Jason Wang, Felipe Franciosi, Zeng, Xin, qemu-devel,
	Dr. David Alan Gilbert, Kirti Wankhede, Thanos Makatos,
	Alex Williamson, Gerd Hoffmann, Paolo Bonzini,
	Christophe de Dinechin, Yan Zhao

On Thu, 5 Nov 2020 15:09:02 +0000
Stefan Hajnoczi <stefanha@redhat.com> wrote:

(...)

<did not fully read through the v1 thread, so apologies if I missed
something>

> VFIO/mdev Devices
> -----------------
> TODO this is a first draft, more thought needed around enumerating supported
> parameters, representing default values, etc
> 
> The following mdev type sysfs attrs are available for managing device
> instances:
> 
>   /sys/.../<parent-device>/mdev_supported_types/<type-id>/
>       create - writing a UUID to this file instantiates a device
>       migration/ - migration related files
>           model - unique device model string, e.g. vendor-a.com/my-nic
> 
> Device models supported by an mdev driver can be enumerated by reading the
> migration/model attr for each <type-id>.

IIUC, we're grouping together all users of a specific mdev_type, but
support a variety of sub-configurations? Does that include parameters
or not? If not, shouldn't we already be covered by mdev_type?

> 
> The following mdev device sysfs attrs relate to a specific device instance:
> 
>   /sys/.../<parent-device>/<uuid>/
>       mdev_type/ - symlink to mdev type sysfs attrs, e.g. to fetch migration/model
>       migration/ - migration related files
>           applied - Write "1" to apply current migration parameter values or
>                     "0" to reset migration parameter values to their defaults.
>                     Parameters can only be applied or reset while the mdev is
>                     not opened.
>           params/ - migration parameters
>               <my-param> - read/write migration parameter "my-param"
>               ...
> 
> When the device is created the migration/applied attr is "0". Migration
> parameters are accessible in migration/params/ and read 0 bytes because they
> are at their default values. At this point, opening the mdev device will fail
> because migration parameters must be applied first. Migration parameters can be
> set to the desired values or left at their defaults. "1" must be written to
> migration/applied before opening the mdev device.
> 
> If writing to a migration/params/<param> attr or setting migration/applied to
> "1" fails, then the device implementation does not support the migration
> parameters.
> 
> An open mdev device typically does not allow migration parameters to be changed
> at runtime. However, certain migration/params attrs may allow writes at
> runtime. Usually these migration parameters only affect the device state
> representation and not the hardware interface. This makes it possible to
> upgrade or downgrade the device state representation at runtime so that
> migration is possible to newer or older device implementations.
> 
> An existing mdev device instance can be reused by closing the mdev device and
> writing "0" to migration/applied. This resets parameters to their defaults so
> that a new list of migration parameters can be applied.
> 
> The migration parameter list for an mdev device that is in operation can be
> read from migration/params/. Parameters that read 0 bytes are at their default
> value.

I'm trying to figure out what that means for the mdevs I'm most
familiar with, ccw and ap. Both of them currently support a single
mdev_type.

For ccw, there are some things that I could imagine as parameters, like
the device number, or channel paths. Maybe we could include the channel
path type (FICON, ...) in the migration device model? We should not
include device numbers etc. in the device model.

For ap, we have matrices covering tuples (APQNs) derived from a
cross-product of card/domains configured via sysfs attributes. I think
later modification of these is desired. I think we also might be able
to mix-and-match different types within the same matrix, so not sure if
we can put these into any device model. In fact, I'm a bit at a loss as to
what the device model for ap would look like (other than simply
'matrix'). Can we deal with dynamic parameters?

(...)




* Re: [RFC v2] VFIO Migration
  2020-11-05 15:09 [RFC v2] VFIO Migration Stefan Hajnoczi
  2020-11-05 15:52 ` Cornelia Huck
@ 2020-11-05 19:37 ` Alex Williamson
  2020-11-10  9:52   ` Stefan Hajnoczi
  1 sibling, 1 reply; 5+ messages in thread
From: Alex Williamson @ 2020-11-05 19:37 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: John G Johnson, Tian, Kevin, mtsirkin, Daniel P. Berrangé,
	quintela, Jason Wang, Zeng,  Xin, qemu-devel,
	Dr. David Alan Gilbert, Yan Zhao, Kirti Wankhede, Paolo Bonzini,
	Gerd Hoffmann, Felipe Franciosi, Christophe de Dinechin,
	Thanos Makatos

On Thu, 5 Nov 2020 15:09:02 +0000
Stefan Hajnoczi <stefanha@redhat.com> wrote:

> v2 (big change, please reread everything):
> * Replace URIs with Go-style <domain>/<path> strings
> * Replace configuration parameters with migration parameters. The semantics are
>   different; they only describe migration compatibility and do not capture all
>   device configuration. This makes it easier to explain the purpose of
>   parameters and also logically separates device instantiation from migration
>   compatibility checking.
> * Describe how to achieve subsection semantics
> * Add hint that device internal state should be as general as possible to allow
>   different device implementations
> * Drop versions, they added complexity and aren't necessary for the migration
>   compatibility check
> * Add first draft VFIO/mdev sysfs attr interface
> 
> VFIO Migration
> ==============
> This document describes how to ensure migration compatibility for VFIO devices,
> including VFIO/mdev and vfio-user devices.
> 
> Overview
> --------
> VFIO devices can save and load a *device state*. Saving a device state produces
> a snapshot of a VFIO device's state that can be loaded again at a later point
> in time to resume the device from the snapshot.
> 
> The process of saving a device state and loading it later is called
> *migration*. The device state may be loaded by the same device instance that
> saved it or by a new instance, possibly running on a different machine.
> 
> A VFIO/mdev driver together with the physical device provides the functionality
> of a device. Alternatively, a vfio-user device emulation program can provide
> the functionality of a device. These are called *device implementations*.
> 
> The device implementation where a migration originates is called the *source*
> and the device implementation that a migration targets is called the
> *destination*.
> 
> This document describes how to establish whether or not migration compatibility
> exists between the source and destination. When compatibility has been
> established, the probability of migrating successfully is high and a successful
> migration does not leave the device inoperable due to silent migration
> problems.
> 
> Migration Parameters
> --------------------
> *Migration parameters* are used to describe characteristics that must match
> between source and destination to achieve migration compatibility.
> 
> The first implementation of a simple device may not require migration
> parameters if the source and destination are always compatible. As the device
> evolves, the source and destination may differ and migration parameters are
> required to express this variation. More complex devices may require migration
> parameters from the start due to optional functionality that is not guaranteed
> to be present in both source and destination.
> 
> A migration parameter consists of a name and a value. The name is a UTF-8
> string that does not contain equals ('='), forward slash ('/'), or whitespace
> characters. The value is a UTF-8 string that does not contain whitespace
> characters.
> 
> The meaning of the migration parameter and its possible values are specific to
> the device, but values are generally based on one of the following types:
> * Boolean (on/off)
> * Integers (0, 1, 2, ...)
> * Enumerations (red, green, blue, ...)
> * Character strings
> 
> Migration parameters are conventionally formatted as <name>=<value> strings.
> Examples include my-feature=on and num-queues=4.
> 
> The absence of a migration parameter must have the same effect as before the
> migration parameter was introduced. For example, if my-feature=on|off is added
> to control the availability of a new device feature, then my-feature=off is
> equivalent to omitting the migration parameter.
> 
> Hardware Interface Compatibility
> --------------------------------
> VFIO devices have a *hardware interface* consisting of device regions and
> interrupts. Aspects of the hardware interface can vary between device
> implementations and require migration parameters to express migration
> compatibility requirements.
> 
> Examples of migration parameters include:
> * Feature availability - feature bitmasks, hardware revision numbers, etc. If
>   the destination may lack support for optional features or hardware interface
>   revisions, then migration parameters are required.
> * Functionality - hardware register blocks that are only present on certain
>   device instances. If there are multiple device sub-models that have
>   different hardware interfaces then migration parameters are required.
> 
> These examples demonstrate aspects of the hardware interface that must not
> change unexpectedly. Were they to differ between source and destination, the
> chance of device driver malfunction would be high because the layout of the
> hardware interface would change or assumptions the device driver makes about
> available functionality would be violated. Migration parameters are used to
> preserve the hardware interface across migration and explicitly represent
> variations between device implementations.
> 
> Hardware interfaces sometimes support reporting an event when a change occurs.
> In those cases it may be possible to support visible changes in the hardware
> interface across migration. In most other cases migration must not result in a
> visible change in the hardware interface.
> 
> Migration parameters are not necessary for read-only values exposed through the
> hardware interface, such as MAC address EEPROMs or serial numbers, so long as
> all device implementations can be configured with the same range of input
> values for these read-only values. This is possible because migration
> parameters do not capture the full configuration of the device, only aspects
> that affect migration compatibility.
> 
> Device configuration that is not visible through the hardware interface, such
> as a host file system path of a disk image file or the physical network port
> assigned to a network card, usually does not require migration parameters
> because those values are not visible through the hardware interface and can be
> changed without breaking migration compatibility.
> 
> The disk image file may indirectly affect the hardware interface, for example
> by constraining the device's block size. In this case a block-size=N migration
> parameter is required to ensure migration compatibility, but the host file
> system path of the disk image file still does not require a migration
> parameter.
> 

I'm not sure what the above section defined.  We refer to these as
migration parameters, just as in the previous section, but are they
read-only and must match exactly?


> Device State Representation
> ---------------------------
> Device state contains both data accessible through the device's hardware
> interface and device-internal state needed to restore device operation.
> 
> The contents of hardware registers are usually included in the device state if
> they can change at runtime. Hardware registers with constant or computed data
> may not need to be part of the device state provided that device
> implementations can produce the necessary data.
> 
> Device-internal state includes the portion of the device's state that cannot be
> reconstructed from the hardware interface alone. Defining device-internal state
> in the most general way instead of exposing device implementation details
> allows for flexibility in the future. For example, device implementations often
> maintain a ring index, which is not available through the hardware interface,
> to keep track of which ring elements have already been consumed. The ring index
> must be included in the device state so that the destination can resume
> processing from the correct point in the ring. Representing this as an index
> into the ring in the hardware interface is more general than adding device
> implementation-specific request tracking data structures into the device state.
> 
> The *device state representation* defines the binary data layout of the device
> state. The device state representation is specific to each device and is beyond
> the scope of this document, but aspects pertaining to migration compatibility
> are discussed here.
> 
> Each change to the device state representation that affects migration
> compatibility requires a migration parameter. When a new field is added to the
> device state representation then a new migration parameter must be added to
> reflect this change. Often a single migration parameter expresses both a change
> to the hardware interface and the device state representation. It is also
> possible to change the device state representation without changing the
> hardware interface, for example when some state was forgotten while designing
> the previous device state representation.
> 
> The device state representation may support extra data that can be safely
> ignored by old device implementations. In this case migration compatibility is
> unaffected and a migration parameter is not required to indicate such extra
> data has been added.
> 
> Device Models
> -------------
> The combination of the hardware interface, device state representation, and
> migration parameter definitions is called a *device model*. Device models are
> identified by a unique UTF-8 string starting with a domain name and followed by
> path components separated with forward slashes ('/'). Examples include
> vendor-a.com/my-nic, gitlab.com/user/my-device, virtio-spec.org/pci/virtio-net,
> and qemu.org/pci/10ec/8139.
> 
> The unique device model string is not changed as the device evolves. Instead,
> migration parameters are added to express variations in a device.
> 
> The device model is not tied to a specific device implementation. The same
> device model could be implemented as a VFIO/mdev driver or as a vfio-user device
> emulation program.
> 
> Multiple device implementations can support the same device model. Doing so
> means that the device implementations can offer migration compatibility because
> they support the same hardware interface, device state representation, and
> migration parameters.
> 
> Multiple device models can exist for the same hardware interface, each with a
> different device state representation and migration parameters. This makes it
> possible to fork and independently develop device models.
> 
> Device models can evolve over time as the hardware interface and device state
> representation change. The corresponding migration parameters ensure that
> migration compatibility can be established between device implementations.
> 
> Orchestrating Migrations
> ------------------------
> The following steps must be followed to migrate devices:
> 
> 1. Check that the source and destination support the same device model.
> 
> 2. Check that the destination supports the migration parameter list from the
>    source.
> 
> 3. Configure the destination so it is prepared to load the device state. This
>    may involve instantiating a new device instance or resetting an existing
>    device instance to a configuration that is compatible with the source.
> 
>    The migration parameter list may be used as part of this configuration, but
>    note that not all of the configuration is captured in the migration
>    parameter list. For example, the physical network port for a network card or
>    the host file system path for a disk image file is typically not captured in
>    the migration parameters and must be provided through other means.
> 
> 4. Save the device state on the source and load it on the destination.
> 
> 5. If migration succeeds then the destination resumes operation and the source
>    must not resume operation. If the migration fails then the source resumes
>    operation and the destination must not resume operation.
> 
> Note that these steps impose a conservative bound on device states that can be
> migrated successfully. Not all configuration parameters may be strictly
> required to match on the source and destination devices. For example, if the
> device's hardware interface has not yet been initialized then changes to the
> advertised features may not yet affect the device driver. However, accurately
> representing runtime constraints is complex and risks introducing migration
> bugs, so no attempt is made to support them.
> 
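
Steps 1 and 2 above can be sketched roughly as follows. This is only an
illustration: can_migrate() and the dst_supported map (parameter name ->
set of accepted values) are hypothetical, since the RFC does not yet say
how supported values are enumerated.

```python
from typing import Dict, Set


def can_migrate(src_model: str, src_params: Dict[str, str],
                dst_model: str, dst_supported: Dict[str, Set[str]]) -> bool:
    """Hypothetical pre-migration check covering steps 1 and 2 only."""
    # Step 1: source and destination must support the same device model.
    if src_model != dst_model:
        return False
    # Step 2: every parameter in the source's list must be known to the
    # destination, and the destination must accept the source's value.
    for name, value in src_params.items():
        if value not in dst_supported.get(name, set()):
            return False
    return True
```

Steps 3 to 5 (configuring the destination, transferring the device state,
and deciding which side resumes) are left out because they depend on the
device and the management tool.
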
> VFIO/mdev Devices
> -----------------
> TODO this is a first draft, more thought needed around enumerating supported
> parameters, representing default values, etc
> 
> The following mdev type sysfs attrs are available for managing device
> instances:
> 
>   /sys/.../<parent-device>/mdev_supported_types/<type-id>/
>       create - writing a UUID to this file instantiates a device
>       migration/ - migration related files
>           model - unique device model string, e.g. vendor-a.com/my-nic
> 
> Device models supported by an mdev driver can be enumerated by reading the
> migration/model attr for each <type-id>.
> 
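
For illustration, enumerating the models could look something like the
sketch below. It assumes the draft layout quoted above (which is a proposal,
not an existing kernel interface); enumerate_device_models() and its
parent_device argument are made-up names.

```python
from pathlib import Path
from typing import Dict


def enumerate_device_models(parent_device: Path) -> Dict[str, str]:
    """Map each mdev <type-id> to the contents of its migration/model attr."""
    models = {}
    types_dir = parent_device / "mdev_supported_types"
    for type_dir in sorted(types_dir.iterdir()):
        model_attr = type_dir / "migration" / "model"
        # Types without the (proposed) migration/model attr are skipped.
        if model_attr.is_file():
            # sysfs attrs usually end with a newline, so strip it.
            models[type_dir.name] = model_attr.read_text().strip()
    return models
```
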
> The following mdev device sysfs attrs relate to a specific device instance:
> 
>   /sys/.../<parent-device>/<uuid>/
>       mdev_type/ - symlink to mdev type sysfs attrs, e.g. to fetch migration/model
>       migration/ - migration related files
>           applied - Write "1" to apply current migration parameter values or
>                     "0" to reset migration parameter values to their defaults.
>                     Parameters can only be applied or reset while the mdev is
>                     not opened.


This seems problematic. Why aren't parameters applied on write so that
userspace can learn which values are bad?


>           params/ - migration parameters
>               <my-param> - read/write migration parameter "my-param"
>               ...


Where do we learn the type and possibly valid values for a parameter?


> When the device is created the migration/applied attr is "0". Migration
> parameters are accessible in migration/params/ and read 0 bytes because they
> are at their default values.  At this point opening the mdev device will fail
> because migration parameters must be applied first. Migration parameters can be
> set to the desired values or left at their defaults. "1" must be written to
> migration/applied before opening the mdev device.


This breaks existing users: there cannot be a new requirement to apply
parameters or manipulate a new sysfs attribute before a device is
usable.  Besides, shouldn't default values always be acceptable?  This
presents a pretty high barrier for new features too, since there will
always be a step where userspace must know about and actively enable
that feature.  That puts vendors in a difficult situation: either they
break migration by creating a new device model which enables features
by default or they need to go to extraordinary lengths to get userspace
to enable new features.  Is there intended to be a policy where all
parameters are enabled if we're not trying to match an existing device?
How would a value be determined where the parameter is not binary?


> 
> If writing to a migration/params/<param> attr or setting migration/applied to
> "1" fails, then the device implementation does not support the migration
> parameters.


s/parameter/value/  If the parameter is not supported, the attribute
shouldn't be present, right?  It might also be a resource issue that
prevents a value from being applied; errno might provide insight into
which it is.
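
A rough sketch of what that could look like from userspace. The errno
mapping here is purely an assumption for illustration (nothing in the draft
defines which errno means what), and write_param() is a made-up helper:

```python
import errno
import os


def write_param(attr_path: str, value: str) -> str:
    """Try to set one migration parameter and return a short diagnosis."""
    try:
        fd = os.open(attr_path, os.O_WRONLY)
    except FileNotFoundError:
        # Attribute absent: the parameter itself is not supported.
        return "parameter not supported (attr absent)"
    try:
        os.write(fd, value.encode())
    except OSError as e:
        # ASSUMPTION: EINVAL signals a bad value, while ENOSPC/ENOMEM/EBUSY
        # signal a resource problem. The draft does not specify this.
        if e.errno == errno.EINVAL:
            return "value not supported"
        if e.errno in (errno.ENOSPC, errno.ENOMEM, errno.EBUSY):
            return "resource issue"
        return "failed: " + os.strerror(e.errno)
    finally:
        os.close(fd)
    return "ok"
```
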

> 
> An open mdev device typically does not allow migration parameters to be changed
> at runtime. However, certain migration/params attrs may allow writes at
> runtime. Usually these migration parameters only affect the device state
> representation and not the hardware interface. This makes it possible to
> upgrade or downgrade the device state representation at runtime so that
> migration is possible to newer or older device implementations.


Who does this and when?  How do we determine which are runtime and what
are acceptable values?  This seems really hard to orchestrate.

 
> An existing mdev device instance can be reused by closing the mdev device and
> writing "0" to migration/applied. This resets parameters to their defaults so
> that a new list of migration parameters can be applied.


Nope, can't make new requirements for re-use of an mdev device either.
I would expect an mdev device to retain its configuration for the next
use; userspace can reset parameters as necessary or remove and recreate
the device.  Thanks,

Alex

 
> The migration parameter list for an mdev device that is in operation can be
> read from migration/params/. Parameters that read 0 bytes are at their default
> value.
> 
> vfio-user Devices
> -----------------
> TODO use FUSE to mimic VFIO/mdev sysfs (probably can't due to security
> concerns, use UNIX domain socket RPC instead)?




* Re: [RFC v2] VFIO Migration
  2020-11-05 15:52 ` Cornelia Huck
@ 2020-11-10  9:37   ` Stefan Hajnoczi
  0 siblings, 0 replies; 5+ messages in thread
From: Stefan Hajnoczi @ 2020-11-10  9:37 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: John G Johnson, Tian, Kevin, mtsirkin, Daniel P. Berrangé,
	quintela, Jason Wang, Felipe Franciosi, Zeng, Xin, qemu-devel,
	Dr. David Alan Gilbert, Kirti Wankhede, Thanos Makatos,
	Alex Williamson, Gerd Hoffmann, Paolo Bonzini,
	Christophe de Dinechin, Yan Zhao

On Thu, Nov 05, 2020 at 04:52:20PM +0100, Cornelia Huck wrote:
> On Thu, 5 Nov 2020 15:09:02 +0000
> Stefan Hajnoczi <stefanha@redhat.com> wrote:
> 
> (...)
> 
> <did not fully read through the v1 thread, so apologies if I missed
> something>
> 
> > VFIO/mdev Devices
> > -----------------
> > TODO this is a first draft, more thought needed around enumerating supported
> > parameters, representing default values, etc
> > 
> > The following mdev type sysfs attrs are available for managing device
> > instances:
> > 
> >   /sys/.../<parent-device>/mdev_supported_types/<type-id>/
> >       create - writing a UUID to this file instantiates a device
> >       migration/ - migration related files
> >           model - unique device model string, e.g. vendor-a.com/my-nic
> > 
> > Device models supported by an mdev driver can be enumerated by reading the
> > migration/model attr for each <type-id>.
> 
> IIUC, we're grouping together all users of a specific mdev_type, but
> support a variety of sub-configurations? Does that include parameters
> or not? If not, shouldn't we already be covered by mdev_type?

I will include an explanation of how mdev types relate to migration
parameters in the next revision of this document.

> > 
> > The following mdev device sysfs attrs relate to a specific device instance:
> > 
> >   /sys/.../<parent-device>/<uuid>/
> >       mdev_type/ - symlink to mdev type sysfs attrs, e.g. to fetch migration/model
> >       migration/ - migration related files
> >           applied - Write "1" to apply current migration parameter values or
> >                     "0" to reset migration parameter values to their defaults.
> >                     Parameters can only be applied or reset while the mdev is
> >                     not opened.
> >           params/ - migration parameters
> >               <my-param> - read/write migration parameter "my-param"
> >               ...
> > 
> > When the device is created the migration/applied attr is "0". Migration
> > parameters are accessible in migration/params/ and read 0 bytes because they
> > are at their default values.  At this point opening the mdev device will fail
> > because migration parameters must be applied first. Migration parameters can be
> > set to the desired values or left at their defaults. "1" must be written to
> > migration/applied before opening the mdev device.
> > 
> > If writing to a migration/params/<param> attr or setting migration/applied to
> > "1" fails, then the device implementation does not support the migration
> > parameters.
> > 
> > An open mdev device typically does not allow migration parameters to be changed
> > at runtime. However, certain migration/params attrs may allow writes at
> > runtime. Usually these migration parameters only affect the device state
> > representation and not the hardware interface. This makes it possible to
> > upgrade or downgrade the device state representation at runtime so that
> > migration is possible to newer or older device implementations.
> > 
> > An existing mdev device instance can be reused by closing the mdev device and
> > writing "0" to migration/applied. This resets parameters to their defaults so
> > that a new list of migration parameters can be applied.
> > 
> > The migration parameter list for an mdev device that is in operation can be
> > read from migration/params/. Parameters that read 0 bytes are at their default
> > value.
> 
> I'm trying to figure out what that means for the mdevs I'm most
> familiar with, ccw and ap. Both of them currently support a single
> mdev_type.
> 
> For ccw, there are some things that I could imagine as parameters, like
> the device number, or channel paths. Maybe we could include the channel
> path type (FICON, ...) in the migration device model? We should not
> include device numbers etc. in the device model.

That sounds good. Usually the host-specifics (which host device number
is being passed through) are not guest-visible and shouldn't be
migration parameters. Anything that affects the guest-visible hardware
interface or device state representation needs to be a migration
parameter.

> For ap, we have matrices covering tuples (APQNs) derived from a
> cross-product of card/domains configured via sysfs attributes. I think
> later modification of these is desired. I think we also might be able
> to mix-and-match different types within the same matrix, so not sure if
> we can put these into any device model. In fact, I'm a bit at a loss as to
> what the device model for ap would look like (other than simply
> 'matrix'). Can we deal with dynamic parameters?

Migration parameters are static. If you might need migration parameters
foo1, foo2, foo3, foo4, foo5 at runtime then they can be defined
statically but default to off.
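
As a sketch of what "defined statically but default to off" could mean in
practice (the "on"/"off" values, PARAM_DEFAULTS and effective_params() are
only illustrative names, not part of the proposal):

```python
from typing import Dict

# Hypothetical static parameter definitions for one device model.
# Every parameter the model may ever need is declared up front.
PARAM_DEFAULTS: Dict[str, str] = {
    name: "off" for name in ("foo1", "foo2", "foo3", "foo4", "foo5")
}


def effective_params(overrides: Dict[str, str]) -> Dict[str, str]:
    """Return the full parameter list for one device instance."""
    # Parameters cannot be invented at runtime; only defined ones are valid.
    unknown = set(overrides) - set(PARAM_DEFAULTS)
    if unknown:
        raise KeyError(f"undefined migration parameters: {sorted(unknown)}")
    return {**PARAM_DEFAULTS, **overrides}
```

An instance that needs foo2 would then be created with
effective_params({"foo2": "on"}), while the set of parameter names itself
never changes.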

Also, this migration compatibility scheme is progressive rather than a
binary "full compatibility checking" vs "no compatibility checking"
choice. QEMU relies on the user or management tool to set up compatible
source and destination with a few sanity checks in cases where QEMU
developers thought it was helpful. So QEMU is somewhere in the middle of
the spectrum. I'm not trying to force anyone to express everything in
migration parameters. Public device models (e.g. if we define one for
virtio-net-pci) will probably be towards the "full compatibility
checking" side of the spectrum so that variations between device
implementations can be detected and handled. A proprietary device model
might be fine with just a single hardware-revision=N parameter that is
incremented every time something changes. It can be as simple or complex
as needed.

Stefan


* Re: [RFC v2] VFIO Migration
  2020-11-05 19:37 ` Alex Williamson
@ 2020-11-10  9:52   ` Stefan Hajnoczi
  0 siblings, 0 replies; 5+ messages in thread
From: Stefan Hajnoczi @ 2020-11-10  9:52 UTC (permalink / raw)
  To: Alex Williamson
  Cc: John G Johnson, Tian, Kevin, mtsirkin, Daniel P. Berrangé,
	quintela, Jason Wang, Zeng, Xin, qemu-devel,
	Dr. David Alan Gilbert, Yan Zhao, Kirti Wankhede, Paolo Bonzini,
	Gerd Hoffmann, Felipe Franciosi, Christophe de Dinechin,
	Thanos Makatos

On Thu, Nov 05, 2020 at 12:37:08PM -0700, Alex Williamson wrote:
> On Thu, 5 Nov 2020 15:09:02 +0000
> Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > The disk image file may indirectly affect the hardware interface, for example
> > by constraining the device's block size. In this case a block-size=N migration
> > parameter is required to ensure migration compatibility, but the host file
> > system path of the disk image file still does not require a migration
> > parameter.
> > 
> 
> I'm not sure what the above section defined.  We refer to these as
> migration parameters, just as in the previous section, but are they
> read-only and must match exactly?

I will try to clarify this in the next revision. In this example
block-size=N is determined by the properties of the physical block
device. The device can only be migrated to a destination with the same
block size. The block-size=N migration parameter expresses this
constraint.

> > Device State Representation
> > ---------------------------
> > Device state contains both data accessible through the device's hardware
> > interface and device-internal state needed to restore device operation.
> > 
> > The contents of hardware registers are usually included in the device state if
> > they can change at runtime. Hardware registers with constant or computed data
> > may not need to be part of the device state provided that device
> > implementations can produce the necessary data.
> > 
> > Device-internal state includes the portion of the device's state that cannot be
> > reconstructed from the hardware interface alone. Defining device-internal state
> > in the most general way instead of exposing device implementation details
> > allows for flexibility in the future. For example, device implementations often
> > maintain a ring index, which is not available through the hardware interface,
> > to keep track of which ring elements have already been consumed. The ring index
> > must be included in the device state so that the destination can resume
> > processing from the correct point in the ring. Representing this as an index
> > into the ring in the hardware interface is more general than adding device
> > implementation-specific request tracking data structures into the device state.
> > 
> > The *device state representation* defines the binary data layout of the device
> > state. The device state representation is specific to each device and is beyond
> > the scope of this document, but aspects pertaining to migration compatibility
> > are discussed here.
> > 
> > Each change to the device state representation that affects migration
> > compatibility requires a migration parameter. When a new field is added to the
> > device state representation then a new migration parameter must be added to
> > reflect this change. Often a single migration parameter expresses both a change
> > to the hardware interface and the device state representation. It is also
> > possible to change the device state representation without changing the
> > hardware interface, for example when some state was forgotten while designing
> > the previous device state representation.
> > 
> > The device state representation may support extra data that can be safely
> > ignored by old device implementations. In this case migration compatibility is
> > unaffected and a migration parameter is not required to indicate such extra
> > data has been added.
> > 
> > Device Models
> > -------------
> > The combination of the hardware interface, device state representation, and
> > migration parameter definitions is called a *device model*. Device models are
> > identified by a unique UTF-8 string starting with a domain name and followed by
> > path components separated with slashes ('/'). Examples include
> > vendor-a.com/my-nic, gitlab.com/user/my-device, virtio-spec.org/pci/virtio-net,
> > and qemu.org/pci/10ec/8139.
> > 
> > The unique device model string is not changed as the device evolves. Instead,
> > migration parameters are added to express variations in a device.
> > 
> > The device model is not tied to a specific device implementation. The same
> > device model could be implemented as a VFIO/mdev driver or as a vfio-user device
> > emulation program.
> > 
> > Multiple device implementations can support the same device model. Doing so
> > means that the device implementations can offer migration compatibility because
> > they support the same hardware interface, device state representation, and
> > migration parameters.
> > 
> > Multiple device models can exist for the same hardware interface, each with a
> > different device state representation and migration parameters. This makes it
> > possible to fork and independently develop device models.
> > 
> > Device models can evolve over time as the hardware interface and device state
> > representation change. The corresponding migration parameters ensure that
> > migration compatibility can be established between device implementations.
> > 
> > Orchestrating Migrations
> > ------------------------
> > The following steps must be followed to migrate devices:
> > 
> > 1. Check that the source and destination support the same device model.
> > 
> > 2. Check that the destination supports the migration parameter list from the
> >    source.
> > 
> > 3. Configure the destination so it is prepared to load the device state. This
> >    may involve instantiating a new device instance or resetting an existing
> >    device instance to a configuration that is compatible with the source.
> > 
> >    The migration parameter list may be used as part of this configuration, but
> >    note that not all of the configuration is captured in the migration
> >    parameter list. For example, the physical network port for a network card or
> >    the host file system path for a disk image file is typically not captured in
> >    the migration parameters and must be provided through other means.
> > 
> > 4. Save the device state on the source and load it on the destination.
> > 
> > 5. If migration succeeds then the destination resumes operation and the source
> >    must not resume operation. If the migration fails then the source resumes
> >    operation and the destination must not resume operation.
> > 
> > Note that these steps impose a conservative bound on device states that can be
> > migrated successfully. Not all configuration parameters may be strictly
> > required to match on the source and destination devices. For example, if the
> > device's hardware interface has not yet been initialized then changes to the
> > advertised features may not yet affect the device driver. However, accurately
> > representing runtime constraints is complex and risks introducing migration
> > bugs, so no attempt is made to support them.
> > 
> > VFIO/mdev Devices
> > -----------------
> > TODO this is a first draft, more thought needed around enumerating supported
> > parameters, representing default values, etc
> > 
> > The following mdev type sysfs attrs are available for managing device
> > instances:
> > 
> >   /sys/.../<parent-device>/mdev_supported_types/<type-id>/
> >       create - writing a UUID to this file instantiates a device
> >       migration/ - migration related files
> >           model - unique device model string, e.g. vendor-a.com/my-nic
> > 
> > Device models supported by an mdev driver can be enumerated by reading the
> > migration/model attr for each <type-id>.
> > 
> > The following mdev device sysfs attrs relate to a specific device instance:
> > 
> >   /sys/.../<parent-device>/<uuid>/
> >       mdev_type/ - symlink to mdev type sysfs attrs, e.g. to fetch migration/model
> >       migration/ - migration related files
> >           applied - Write "1" to apply current migration parameter values or
> >                     "0" to reset migration parameter values to their defaults.
> >                     Parameters can only be applied or reset while the mdev is
> >                     not opened.
> 
> 
> This seems problematic. Why aren't parameters applied on write so that
> userspace can learn which values are bad?

I found a way to get rid of the "applied" sysfs attr. Will fix in the
next revision.

> >           params/ - migration parameters
> >               <my-param> - read/write migration parameter "my-param"
> >               ...
> 
> 
> Where do we learn the type and possibly valid values for a parameter?

The next revision will add that information.

> > When the device is created the migration/applied attr is "0". Migration
> > parameters are accessible in migration/params/ and read 0 bytes because they
> > are at their default values.  At this point opening the mdev device will fail
> > because migration parameters must be applied first. Migration parameters can be
> > set to the desired values or left at their defaults. "1" must be written to
> > migration/applied before opening the mdev device.
> 
> 
> This breaks existing users: there cannot be a new requirement to apply
> parameters or manipulate a new sysfs attribute before a device is
> usable.  Besides, shouldn't default values always be acceptable?  This
> presents a pretty high barrier for new features too, since there will
> always be a step where userspace must know about and actively enable
> that feature.  That puts vendors in a difficult situation: either they
> break migration by creating a new device model which enables features
> by default or they need to go to extraordinary lengths to get userspace
> to enable new features.  Is there intended to be a policy where all
> parameters are enabled if we're not trying to match an existing device?
> How would a value be determined where the parameter is not binary?

Good points, the next revision will solve this so the device is created
with the latest supported migration parameter values by default instead
of the oldest/most compatible ones.

> > If writing to a migration/params/<param> attr or setting migration/applied to
> > "1" fails, then the device implementation does not support the migration
> > parameters.
> 
> 
> s/parameter/value/  If the parameter is not supported, the attribute
> shouldn't be present, right?  It might also be a resource issue that
> prevents a value from being applied; errno might provide insight into
> which it is.

Yes, will fix.

> > An open mdev device typically does not allow migration parameters to be changed
> > at runtime. However, certain migration/params attrs may allow writes at
> > runtime. Usually these migration parameters only affect the device state
> > representation and not the hardware interface. This makes it possible to
> > upgrade or downgrade the device state representation at runtime so that
> > migration is possible to newer or older device implementations.
> 
> 
> Who does this and when?  How do we determine which are runtime and what
> are acceptable values?  This seems really hard to orchestrate.

Modifying a device at runtime is an explicit operation. The user needs
to know what they are doing. I'm not sure if trying to define metadata
is useful since it cannot be done without an understanding of the
migration parameter's effect.

> > An existing mdev device instance can be reused by closing the mdev device and
> > writing "0" to migration/applied. This resets parameters to their defaults so
> > that a new list of migration parameters can be applied.
> 
> 
> Nope, can't make new requirements for re-use of an mdev device either.
> I would expect an mdev device to retain its configuration for the next
> use; userspace can reset parameters as necessary or remove and recreate
> the device.  Thanks,

Will fix in the next revision.

Stefan
