Linux s390 Architecture development
* [PATCH v3 00/15] s390/vfio-ap: Add live guest migration support
@ 2026-06-30 10:41 Anthony Krowiak
From: Anthony Krowiak @ 2026-06-30 10:41 UTC
  To: linux-s390, linux-kernel, kvm
  Cc: jjherne, borntraeger, mjrosato, pasic, alex, kwankhede, fiuczy,
	pbonzini, frankja, imbrenda, agordeev, hca, gor

This patch series implements live guest migration support for KVM guests
with s390 AP (Adjunct Processor) devices passed through via the VFIO
mediated device framework.

Background
~~~~~~~~~~

The vfio-ap device driver differs from typical VFIO device drivers in that
it does not virtualize a physical device. Instead, it manages AP
configuration metadata identifying the AP adapters, domains, and control
domains to which a guest will be granted access. These AP resources are
configured by assigning them to a vfio-ap mediated device via its sysfs
assignment interfaces. When the fd for the VFIO device is opened by
userspace, the vfio_ap device driver sets the guest's AP configuration
from the metadata stored with the mediated device. As such, the AP devices
are not accessed directly through the vfio_ap driver, so the driver has no
internal AP device state to migrate. What it does migrate is the AP
configuration metadata of the source guest.

Implementation Approach
~~~~~~~~~~~~~~~~~~~~~~~

This series implements the VFIO migration protocol using the STOP_COPY
migration flow. The key aspects are:

1. On transition of the migration state from STOP to STOP_COPY
   - The vfio_ap device driver creates a filestream for userspace to use to
     read the guest's AP configuration from the mdev

2. During the STOP_COPY phase
   - Userspace uses the filestream created in #1 to read the source guest's
     AP configuration
   - The vfio_ap device driver copies the source guest's AP configuration
     information to userspace

3. On transition of the migration state from STOP to RESUMING
   - The vfio_ap device driver creates a filestream for userspace to use to
     write the source guest's AP configuration information so it can be
     restored to the mdev on the destination host.

4. During the RESUMING phase
   - Userspace uses the filestream created in #3 to send the source guest's
     AP configuration information to the vfio_ap device driver on the
     destination host.
   - The vfio_ap device driver first verifies the source guest's AP
     configuration is compatible with the destination host's.
   - The driver restores the AP configuration to the mdev on the
     destination host, which automatically hot plugs the AP resources
     identified therein.

5. Documentation
   - Add live guest migration chapter to vfio-ap.rst

Compatibility Validation
~~~~~~~~~~~~~~~~~~~~~~~~

The series validates that the source and destination AP configurations
are compatible. For each queue, the following characteristics are
compared:

- AP type (target must be same or newer than source)
- Installed facilities (APSC, APQKM, AP4KC, SLCF)
- Operating mode (CCA, Accelerator, XCP)
- APXA facility setting
- Classification (native vs stateless functions)
- Queue usability (binding/associated state)

When incompatibilities are detected, migration fails with detailed error
messages identifying the specific queue and characteristic that caused
the failure.
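
The per-queue check described above could look something like the
sketch below. It is a userspace illustration under stated assumptions:
struct queue_info, its fields, enum mismatch and check_queue() are
hypothetical and are not the structures defined by this series. It
assumes that a numerically larger AP type means a newer type and that
all other characteristics must be equal.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-queue description; field names are illustrative only. */
struct queue_info {
	unsigned int ap_type;        /* assumed: larger value == newer type */
	unsigned int facilities;     /* bitmask of installed facilities */
	unsigned int mode;           /* operating mode */
	bool apxa;                   /* APXA facility setting */
	unsigned int classification; /* native vs stateless functions */
	unsigned int usability;      /* binding/associated state */
};

/* Identifies the first characteristic that makes the queues incompatible. */
enum mismatch {
	MATCH = 0,
	MISMATCH_AP_TYPE,
	MISMATCH_FACILITIES,
	MISMATCH_MODE,
	MISMATCH_APXA,
	MISMATCH_CLASSIFICATION,
	MISMATCH_USABILITY,
};

/* Compare one source queue against the same queue on the destination. */
static enum mismatch check_queue(const struct queue_info *src,
				 const struct queue_info *dst)
{
	assert(src && dst);

	/* The target type may be the same as or newer than the source type. */
	if (dst->ap_type < src->ap_type)
		return MISMATCH_AP_TYPE;
	if (dst->facilities != src->facilities)
		return MISMATCH_FACILITIES;
	if (dst->mode != src->mode)
		return MISMATCH_MODE;
	if (dst->apxa != src->apxa)
		return MISMATCH_APXA;
	if (dst->classification != src->classification)
		return MISMATCH_CLASSIFICATION;
	if (dst->usability != src->usability)
		return MISMATCH_USABILITY;

	return MATCH;
}
```

Returning the first mismatching characteristic, rather than a plain
boolean, is what would let a caller report both the queue and the
reason for the failure.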

Configuration Management
~~~~~~~~~~~~~~~~~~~~~~~~

This implementation does not prevent configuration changes during
migration. Configuration stability is an orchestration-layer
responsibility, consistent with other VFIO device types. The driver's
role is to validate configurations and provide clear diagnostics when
incompatibilities are detected, enabling orchestration tools to implement
appropriate policies.

Change log v2 => v3:
~~~~~~~~~~~~~~~~~~~
Patch 1:
* Removed this patch because the hardware info will be retrieved using
  the PQAP(TAPQ) command when needed, in case changes have been made to
  the host's AP configuration since the mdev was probed.
* Removed #include "uapi/linux/vfio_ap.h" from vfio_ap_migration.c
  file

Patch 3:
* Removed vfio_ap_migration_file class and replaced with file pointers as
  there is no longer a need to store AP configuration information which
  will now be retrieved from the mdev (during STOP_COPY phase) or stored
  with the mdev (during RESUMING phase)
* Added padding fields to struct vfio_ap_queue_info & struct vfio_ap_config
  to ensure alignment

Patch 4:
* vfio_ap_release_mig_files function now only needs to fput the files used
  during the STOP_COPY and RESUMING phases of migration since there are no
  longer vfio_ap_migration_file objects containing the file pointers as
  well as AP configuration data.
* Removed #include "uapi/linux/vfio_ap.h" from vfio_ap_migration.c
  file
* Set the struct vfio_device migration_flags and mig_ops fields from mdev
  probe callback prior to call to vfio_ap_init_migration_capabilities
  function so the VFIO core doesn't see mig_ops as NULL during the
  registration.
* In the vfio_ap_mdev_open_device function in vfio_ap_ops.c, call
  vfio_ap_release_migration_data if the vfio_ap_mdev_set_kvm function
  fails. This cleans up matrix_mdev->mig_data and prevents leaking the
  allocation if multiple opens are executed.

Patch 5:
* Combined the state transition checks from STOP_COPY to STOP and RESUMING
  to STOP into one if statement since the handling is the same.
* Squashed patch 16 into this patch

Patch 6:
* No longer copies the AP configuration from the mdev to a
  vfio_ap_migration_file structure (removed) and simply opens a file stream
  for STOP_COPY phase of migration.

Patch 7:
* No longer copies a vfio_ap_config object retrieved from a
  vfio_ap_migration_file structure (removed) to userspace.
* Now creates a vfio_ap_config object and obtains the hardware information
  stored for each APQN assigned to the mdev via the PQAP(TAPQ) instruction,
  then copies the vfio_ap_config object to userspace
* Removed dev_err messages from validate_save_read_parms function to
  avoid kernel log spamming.
* Add check to validate_save_read_parms function to verify that matrix_mdev
  is stored as private data of the file object and that
  matrix_mdev->mig_data is not NULL since we access it in the read
  callback.
* Release matrix_dev->mdevs_lock while copying AP configuration data to
  userspace in vfio_ap_save_read function

Patch 8:
* No longer creates a struct vfio_ap_migration_data object (removed) in
  which to store the AP configuration data sent from userspace during the
  RESUMING phase of migration.
* Now, merely creates and returns the filestream for userspace to use to
  restore the source guest's data on the destination

Patch 9:
* Complete redesign to better comport with the VFIO migration framework
  whereby the vfio device state of the source device is restored to the
  destination device. Prior to this change, the destination guest's AP
  configuration was acquired at startup from the vfio-ap mediated device.
  The vfio_ap driver would only verify that the source and destination
  AP configurations are compatible.
  - The source guest's state (AP configuration) is read from the file
    stream and validated
  - If valid, it is stored directly in the ap_matrix_mdev object and hot
    plugged into the running guest

Patches 10 and 11:
* Squashed patches 10 and 11 together as the state transitions from
  STOP_COPY to STOP and RESUMING to STOP are handled the same way (also see
  changes for Patch 5 above).

Patch 16:
* Updated vfio-ap.rst to include text related to design changes

Anthony Krowiak (15):
  s390/vfio-ap: Provide function to get the number of queues assigned to
    mdev
  s390/vfio-ap: Data structures for facilitating vfio device migration
  s390/vfio-ap: Initialize/release vfio device migration data
  s390/vfio-ap: Reset migration state in VFIO_DEVICE_RESET ioctl handler
  s390-vfio-ap: Callback to get/set vfio device mig state during guest
    migration
  s390/vfio-ap: Transition guest migration state from STOP to STOP_COPY
  s390/vfio-ap: File ops called to save the vfio device migration state
  s390/vfio-ap: Transition device migration state from STOP to RESUMING
  s390/vfio-ap: Add method to set a new guest AP configuration
  s390/vfio-ap: File ops called to resume the vfio device migration
  s390/vfio-ap: Transition device migration state to STOP
  s390/vfio-ap: Transition device migration state from STOP to RUNNING
    and vice versa
  s390/vfio-ap: Callback to get the size of data to be migrated during
    guest migration
  s390/vfio-ap: Add 'migratable' feature to sysfs 'features' attribute
  s390/vfio-ap: Add live guest migration chapter to vfio-ap.rst

 Documentation/arch/s390/vfio-ap.rst     |  514 +++++++--
 drivers/s390/crypto/Makefile            |    2 +-
 drivers/s390/crypto/vfio_ap_drv.c       |    4 +-
 drivers/s390/crypto/vfio_ap_migration.c | 1373 +++++++++++++++++++++++
 drivers/s390/crypto/vfio_ap_ops.c       |  462 ++++++--
 drivers/s390/crypto/vfio_ap_private.h   |   72 ++
 6 files changed, 2225 insertions(+), 202 deletions(-)
 create mode 100644 drivers/s390/crypto/vfio_ap_migration.c

--
2.53.0

