qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device
@ 2019-07-08 19:17 Kirti Wankhede
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 01/13] vfio: KABI for migration interface Kirti Wankhede
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: Kirti Wankhede @ 2019-07-08 19:17 UTC (permalink / raw)
  To: alex.williamson, cjia
  Cc: Zhengxiao.zx, kevin.tian, yi.l.liu, yan.y.zhao, eskultet,
	ziye.yang, qemu-devel, cohuck, shuangtai.tst, dgilbert,
	zhi.a.wang, mlevitsk, pasic, aik, Kirti Wankhede, eauger, felipe,
	jonathan.davies, changpeng.liu, Ken.Xue

Add migration support for VFIO device

This Patch set include patches as below:
- Define KABI for VFIO device for migration support.
- Added save and restore functions for PCI configuration space
- Generic migration functionality for VFIO device.
  * This patch set adds functionality only for PCI devices, but can be
    extended to other VFIO devices.
  * Added all the basic functions required for pre-copy, stop-and-copy and
    resume phases of migration.
  * Added state change notifier and from that notifier function, VFIO
    device's state changed is conveyed to VFIO device driver.
  * During save setup phase and resume/load setup phase, migration region
    is queried and is used to read/write VFIO device data.
  * .save_live_pending and .save_live_iterate are implemented to use QEMU's
    functionality of iteration during pre-copy phase.
  * In .save_live_complete_precopy, that is in stop-and-copy phase,
    iteration to read data from VFIO device driver is implemented till pending
    bytes returned by driver are not zero.
  * Added function to get dirty pages bitmap for the pages which are used by
    driver.
- Add vfio_listerner_log_sync to mark dirty pages.
- Make VFIO PCI device migration capable. If migration region is not provided by
  driver, migration is blocked.

Below is the flow of state change for live migration where states in brackets
represent VM state, migration state and VFIO device state as:
    (VM state, MIGRATION_STATUS, VFIO_DEVICE_STATE)

Live migration save path:
        QEMU normal running state
        (RUNNING, _NONE, _RUNNING)
                        |
    migrate_init spawns migration_thread.
    (RUNNING, _SETUP, _RUNNING|_SAVING)
    Migration thread then calls each device's .save_setup()
                        |
    (RUNNING, _ACTIVE, _RUNNING|_SAVING)
    If device is active, get pending bytes by .save_live_pending()
    if pending bytes >= threshold_size,  call save_live_iterate()
    Data of VFIO device for pre-copy phase is copied.
    Iterate till pending bytes converge and are less than threshold
                        |
    On migration completion, vCPUs stops and calls .save_live_complete_precopy
    for each active device. VFIO device is then transitioned in
     _SAVING state.
    (FINISH_MIGRATE, _DEVICE, _SAVING)
    For VFIO device, iterate in  .save_live_complete_precopy  until
    pending data is 0.
    (FINISH_MIGRATE, _DEVICE, _STOPPED)
                        |
    (FINISH_MIGRATE, _COMPLETED, STOPPED)
    Migraton thread schedule cleanup bottom half and exit

Live migration resume path:
    Incomming migration calls .load_setup for each device
    (RESTORE_VM, _ACTIVE, STOPPED)
                        |
    For each device, .load_state is called for that device section data
                        |
    At the end, called .load_cleanup for each device and vCPUs are started.
                        |
        (RUNNING, _NONE, _RUNNING)

Note that:
- Migration post copy is not supported.

v4 -> v5:
- Added decriptive comment about the sequence of access of members of structure
  vfio_device_migration_info to be followed based on Alex's suggestion
- Updated get dirty pages sequence.
- As per Cornelia Huck's suggestion, added callbacks to VFIODeviceOps to
  get_object, save_config and load_config.
- Fixed multiple nit picks.
- Tested live migration with multiple vfio device assigned to a VM.

v3 -> v4:
- Added one more bit for _RESUMING flag to be set explicitly.
- data_offset field is read-only for user space application.
- data_size is read for every iteration before reading data from migration, that
  is removed assumption that data will be till end of migration region.
- If vendor driver supports mappable sparsed region, map those region during
  setup state of save/load, similarly unmap those from cleanup routines.
- Handles race condition that causes data corruption in migration region during
  save device state by adding mutex and serialiaing save_buffer and
  get_dirty_pages routines.
- Skip called get_dirty_pages routine for mapped MMIO region of device.
- Added trace events.
- Splitted into multiple functional patches.

v2 -> v3:
- Removed enum of VFIO device states. Defined VFIO device state with 2 bits.
- Re-structured vfio_device_migration_info to keep it minimal and defined action
  on read and write access on its members.

v1 -> v2:
- Defined MIGRATION region type and sub-type which should be used with region
  type capability.
- Re-structured vfio_device_migration_info. This structure will be placed at 0th
  offset of migration region.
- Replaced ioctl with read/write for trapped part of migration region.
- Added both type of access support, trapped or mmapped, for data section of the
  region.
- Moved PCI device functions to pci file.
- Added iteration to get dirty page bitmap until bitmap for all requested pages
  are copied.

Thanks,
Kirti

Kirti Wankhede (13):
  vfio: KABI for migration interface
  vfio: Add function to unmap VFIO region
  vfio: Add vfio_get_object callback to VFIODeviceOps
  vfio: Add save and load functions for VFIO PCI devices
  vfio: Add migration region initialization and finalize function
  vfio: Add VM state change handler to know state of VM
  vfio: Add migration state change notifier
  vfio: Register SaveVMHandlers for VFIO device
  vfio: Add save state functions to SaveVMHandlers
  vfio: Add load state functions to SaveVMHandlers
  vfio: Add function to get dirty page list
  vfio: Add vfio_listerner_log_sync to mark dirty pages
  vfio: Make vfio-pci device migration capable.

 hw/vfio/Makefile.objs         |   2 +-
 hw/vfio/common.c              |  55 +++
 hw/vfio/migration.c           | 874 ++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/pci.c                 | 137 ++++++-
 hw/vfio/trace-events          |  19 +
 include/hw/vfio/vfio-common.h |  25 ++
 linux-headers/linux/vfio.h    | 166 ++++++++
 7 files changed, 1271 insertions(+), 7 deletions(-)
 create mode 100644 hw/vfio/migration.c

-- 
2.7.0



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v5 01/13] vfio: KABI for migration interface
  2019-07-08 19:17 [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device Kirti Wankhede
@ 2019-07-08 19:17 ` Kirti Wankhede
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 02/13] vfio: Add function to unmap VFIO region Kirti Wankhede
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kirti Wankhede @ 2019-07-08 19:17 UTC (permalink / raw)
  To: alex.williamson, cjia
  Cc: Zhengxiao.zx, kevin.tian, yi.l.liu, yan.y.zhao, eskultet,
	ziye.yang, qemu-devel, cohuck, shuangtai.tst, dgilbert,
	zhi.a.wang, mlevitsk, pasic, aik, Kirti Wankhede, eauger, felipe,
	jonathan.davies, changpeng.liu, Ken.Xue

- Defined MIGRATION region type and sub-type.
- Used 3 bits to define VFIO device states.
    Bit 0 => _RUNNING
    Bit 1 => _SAVING
    Bit 2 => _RESUMING
    Combination of these bits defines VFIO device's state during migration
    _STOPPED => All bits 0 indicates VFIO device stopped.
    _RUNNING => Normal VFIO device running state.
    _SAVING | _RUNNING => vCPUs are running, VFIO device is running but start
                          saving state of device i.e. pre-copy state
    _SAVING  => vCPUs are stoppped, VFIO device should be stopped, and
                          save device state,i.e. stop-n-copy state
    _RESUMING => VFIO device resuming state.
    _SAVING | _RESUMING => Invalid state if _SAVING and _RESUMING bits are set
- Defined vfio_device_migration_info structure which will be placed at 0th
  offset of migration region to get/set VFIO device related information.
  Defined members of structure and usage on read/write access:
    * device_state: (read/write)
        To convey VFIO device state to be transitioned to. Only 3 bits are used
        as of now.
    * pending bytes: (read only)
        To get pending bytes yet to be migrated for VFIO device.
    * data_offset: (read only)
        To get data offset in migration from where data exist during _SAVING
        and from where data should be written by user space application during
         _RESUMING state
    * data_size: (read/write)
        To get and set size of data copied in migration region during _SAVING
        and _RESUMING state.
    * start_pfn, page_size, total_pfns: (write only)
        To get bitmap of dirty pages from vendor driver from given
        start address for total_pfns.
    * copied_pfns: (read only)
        To get number of pfns bitmap copied in migration region.
        Vendor driver should copy the bitmap with bits set only for
        pages to be marked dirty in migration region. Vendor driver
        should return 0 if there are 0 pages dirty in requested
        range. Vendor driver should return -1 to mark all pages in the section
        as dirty

Migration region looks like:
 ------------------------------------------------------------------
|vfio_device_migration_info|    data section                      |
|                          |     ///////////////////////////////  |
 ------------------------------------------------------------------
 ^                              ^                              ^
 offset 0-trapped part        data_offset                 data_size

Data section is always followed by vfio_device_migration_info
structure in the region, so data_offset will always be none-0.
Offset from where data is copied is decided by kernel driver, data
section can be trapped or mapped depending on how kernel driver
defines data section. If mmapped, then data_offset should be page
aligned, where as initial section which contain
vfio_device_migration_info structure might not end at offset which
is page aligned.

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
---
 linux-headers/linux/vfio.h | 166 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 166 insertions(+)

diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 24f505199f83..6696a4600545 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -372,6 +372,172 @@ struct vfio_region_gfx_edid {
  */
 #define VFIO_REGION_SUBTYPE_IBM_NVLINK2_ATSD	(1)
 
+/* Migration region type and sub-type */
+#define VFIO_REGION_TYPE_MIGRATION	        (2)
+#define VFIO_REGION_SUBTYPE_MIGRATION	        (1)
+
+/**
+ * Structure vfio_device_migration_info is placed at 0th offset of
+ * VFIO_REGION_SUBTYPE_MIGRATION region to get/set VFIO device related migration
+ * information. Field accesses from this structure are only supported at their
+ * native width and alignment, otherwise should return error.
+ *
+ * device_state: (read/write)
+ *      To indicate vendor driver the state VFIO device should be transitioned
+ *      to. If device state transition fails, write on this field return error.
+ *      It consists of 3 bits:
+ *      - If bit 0 set, indicates _RUNNING state. When its reset, that indicates
+ *        _STOPPED state. When device is changed to _STOPPED, driver should stop
+ *        device before write() returns.
+ *      - If bit 1 set, indicates _SAVING state.
+ *      - If bit 2 set, indicates _RESUMING state.
+ *      _SAVING and _RESUMING set at the same time is invalid state.
+ *
+ * pending bytes: (read only)
+ *      Number of pending bytes yet to be migrated from vendor driver
+ *
+ * data_offset: (read only)
+ *      User application should read data_offset in migration region from where
+ *      user application should read device data during _SAVING state or write
+ *      device data during _RESUMING state or read dirty pages bitmap. See below
+ *      for detail of sequence to be followed.
+ *
+ * data_size: (read/write)
+ *      User application should read data_size to get size of data copied in
+ *      migration region during _SAVING state and write size of data copied in
+ *      migration region during _RESUMING state.
+ *
+ * start_pfn: (write only)
+ *      Start address pfn to get bitmap of dirty pages from vendor driver duing
+ *      _SAVING state.
+ *
+ * page_size: (write only)
+ *      User application should write the page_size of pfn.
+ *
+ * total_pfns: (write only)
+ *      Total pfn count from start_pfn for which dirty bitmap is requested.
+ *
+ * copied_pfns: (read only)
+ *      pfn count for which dirty bitmap is copied to migration region.
+ *      Vendor driver should copy the bitmap with bits set only for pages to be
+ *      marked dirty in migration region.
+ *      - Vendor driver should return VFIO_DEVICE_DIRTY_PFNS_NONE if none of the
+ *        pages are dirty in requested range or rest of the range.
+ *      - Vendor driver should return VFIO_DEVICE_DIRTY_PFNS_ALL to mark all
+ *        pages dirty in the given section.
+ *      - Vendor driver should return pfn count for which bitmap is written in
+ *        the region.
+ *
+ * Migration region looks like:
+ *  ------------------------------------------------------------------
+ * |vfio_device_migration_info|    data section                      |
+ * |                          |     ///////////////////////////////  |
+ * ------------------------------------------------------------------
+ *   ^                              ^                              ^
+ *  offset 0-trapped part        data_offset                 data_size
+ *
+ * Data section is always followed by vfio_device_migration_info structure
+ * in the region, so data_offset will always be none-0. Offset from where data
+ * is copied is decided by kernel driver, data section can be trapped or
+ * mapped depending on how kernel driver defines data section. If mmapped,
+ * then data_offset should be page aligned, where as initial section which
+ * contain vfio_device_migration_info structure might not end at offset which
+ * is page aligned.
+ * Data_offset can be same or different for device data and dirty page bitmap.
+ * Vendor driver should decide whether to partition data section and how to
+ * partition the data section. Vendor driver should return data_offset
+ * accordingly.
+ *
+ * Sequence to be followed:
+ * In _SAVING|_RUNNING device state or pre-copy phase:
+ * a. read pending_bytes. If pending_bytes > 0, go through below steps.
+ * b. read data_offset, indicates kernel driver to write data to staging buffer
+ *    which is mmapped.
+ * c. read data_size, amount of data in bytes written by vendor driver in
+ *    migration region.
+ * d. if data section is trapped, read from data_offset of data_size.
+ * e. if data section is mmaped, read data_size bytes from mmaped buffer from
+ *    data_offset in the migration region.
+ * f. Write data_size and data to file stream.
+ * g. iterate through steps a to f while (pending_bytes > 0)
+ *
+ * In _SAVING device state or stop-and-copy phase:
+ * a. read config space of device and save to migration file stream. This
+ *    doesn't need to be from vendor driver. Any other special config state
+ *    from driver can be saved as data in following iteration.
+ * b. read pending_bytes.
+ * c. read data_offset, indicates kernel driver to write data to staging
+ *    buffer which is mmapped.
+ * d. read data_size, amount of data in bytes written by vendor driver in
+ *    migration region.
+ * e. if data section is trapped, read from data_offset of data_size.
+ * f. if data section is mmaped, read data_size bytes from mmaped buffer from
+ *    data_offset in the migration region.
+ * g. Write data_size and data to file stream
+ * h. iterate through steps b to g while (pending_bytes > 0)
+ *
+ * When data region is mapped, its user's responsibility to read data from
+ * data_offset of data_size before moving to next steps.
+ *
+ * Dirty page tracking is part of RAM copy state, where vendor driver
+ * provides the bitmap of pages which are dirtied by vendor driver through
+ * migration region and as part of RAM copy those pages gets copied to file
+ * stream.
+ *
+ * To get dirty page bitmap:
+ * a. write start_pfn, page_size and total_pfns.
+ * b. read copied_pfns.
+ *     - Vendor driver should return VFIO_DEVICE_DIRTY_PFNS_NONE if driver
+ *       doesn't have any page to report dirty in given range or rest of the
+ *       range. Exit loop.
+ *     - Vendor driver should return VFIO_DEVICE_DIRTY_PFNS_ALL to mark all
+ *       pages dirty for given range. Mark all pages in the range as dirty and
+ *       exit the loop.
+ *     - Vendor driver should return copied_pfns and provide bitmap for
+ *       copied_pfn, which means that bitmap copied for given range contains
+ *       information for all pages where some bits are 0s and some are 1s.
+ * c. read data_offset, where vendor driver has written bitmap.
+ * d. read bitmap from the region or mmaped part of the region.
+ * e. Iterate through steps a to d while (total copied_pfns < total_pfns)
+ *
+ * In _RESUMING device state:
+ * - Load device config state.
+ * - While end of data for this device is not reached, repeat below steps:
+ *      - read data_size from file stream, read data from file stream of
+ *        data_size.
+ *      - read data_offset from where User application should write data.
+ *          if region is mmaped, write data of data_size to mmaped region.
+ *      - write data_size.
+ *          In case of mmapped region, write on data_size indicates kernel
+ *          driver that data is written in staging buffer.
+ *      - if region is trapped, write data of data_size from data_offset.
+ *
+ * For user application, data is opaque. User should write data in the same
+ * order as received.
+ */
+
+struct vfio_device_migration_info {
+        __u32 device_state;         /* VFIO device state */
+#define VFIO_DEVICE_STATE_RUNNING   (1 << 0)
+#define VFIO_DEVICE_STATE_SAVING    (1 << 1)
+#define VFIO_DEVICE_STATE_RESUMING  (1 << 2)
+#define VFIO_DEVICE_STATE_MASK      (VFIO_DEVICE_STATE_RUNNING | \
+                                     VFIO_DEVICE_STATE_SAVING | \
+                                     VFIO_DEVICE_STATE_RESUMING)
+#define VFIO_DEVICE_STATE_INVALID   (VFIO_DEVICE_STATE_SAVING | \
+                                     VFIO_DEVICE_STATE_RESUMING)
+        __u32 reserved;
+        __u64 pending_bytes;
+        __u64 data_offset;
+        __u64 data_size;
+        __u64 start_pfn;
+        __u64 page_size;
+        __u64 total_pfns;
+        __u64 copied_pfns;
+#define VFIO_DEVICE_DIRTY_PFNS_NONE     (0)
+#define VFIO_DEVICE_DIRTY_PFNS_ALL      (~0ULL)
+} __attribute__((packed));
+
 /*
  * The MSIX mappable capability informs that MSIX data of a BAR can be mmapped
  * which allows direct access to non-MSIX registers which happened to be within
-- 
2.7.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v5 02/13] vfio: Add function to unmap VFIO region
  2019-07-08 19:17 [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device Kirti Wankhede
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 01/13] vfio: KABI for migration interface Kirti Wankhede
@ 2019-07-08 19:17 ` Kirti Wankhede
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 03/13] vfio: Add vfio_get_object callback to VFIODeviceOps Kirti Wankhede
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kirti Wankhede @ 2019-07-08 19:17 UTC (permalink / raw)
  To: alex.williamson, cjia
  Cc: Zhengxiao.zx, kevin.tian, yi.l.liu, yan.y.zhao, eskultet,
	ziye.yang, qemu-devel, cohuck, shuangtai.tst, dgilbert,
	zhi.a.wang, mlevitsk, pasic, aik, Kirti Wankhede, eauger, felipe,
	jonathan.davies, changpeng.liu, Ken.Xue

This function is used in follwing patch in this series.
Migration region is mmaped when migration starts and will be unmapped when
migration is complete.

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
---
 hw/vfio/common.c              | 20 ++++++++++++++++++++
 hw/vfio/trace-events          |  1 +
 include/hw/vfio/vfio-common.h |  1 +
 3 files changed, 22 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index a859298fdad9..de74dae8d6a6 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -964,6 +964,26 @@ int vfio_region_mmap(VFIORegion *region)
     return 0;
 }
 
+void vfio_region_unmap(VFIORegion *region)
+{
+    int i;
+
+    if (!region->mem) {
+        return;
+    }
+
+    for (i = 0; i < region->nr_mmaps; i++) {
+        trace_vfio_region_unmap(memory_region_name(&region->mmaps[i].mem),
+                                region->mmaps[i].offset,
+                                region->mmaps[i].offset +
+                                region->mmaps[i].size - 1);
+        memory_region_del_subregion(region->mem, &region->mmaps[i].mem);
+        munmap(region->mmaps[i].mmap, region->mmaps[i].size);
+        object_unparent(OBJECT(&region->mmaps[i].mem));
+        region->mmaps[i].mmap = NULL;
+    }
+}
+
 void vfio_region_exit(VFIORegion *region)
 {
     int i;
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index b1ef55a33ffd..8cdc27946cb8 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -111,6 +111,7 @@ vfio_region_mmap(const char *name, unsigned long offset, unsigned long end) "Reg
 vfio_region_exit(const char *name, int index) "Device %s, region %d"
 vfio_region_finalize(const char *name, int index) "Device %s, region %d"
 vfio_region_mmaps_set_enabled(const char *name, bool enabled) "Region %s mmaps enabled: %d"
+vfio_region_unmap(const char *name, unsigned long offset, unsigned long end) "Region %s unmap [0x%lx - 0x%lx]"
 vfio_region_sparse_mmap_header(const char *name, int index, int nr_areas) "Device %s region %d: %d sparse mmap entries"
 vfio_region_sparse_mmap_entry(int i, unsigned long start, unsigned long end) "sparse entry %d [0x%lx - 0x%lx]"
 vfio_get_dev_region(const char *name, int index, uint32_t type, uint32_t subtype) "%s index %d, %08x/%0x8"
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 9107bd41c030..93493891ba40 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -171,6 +171,7 @@ int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region,
                       int index, const char *name);
 int vfio_region_mmap(VFIORegion *region);
 void vfio_region_mmaps_set_enabled(VFIORegion *region, bool enabled);
+void vfio_region_unmap(VFIORegion *region);
 void vfio_region_exit(VFIORegion *region);
 void vfio_region_finalize(VFIORegion *region);
 void vfio_reset_handler(void *opaque);
-- 
2.7.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v5 03/13] vfio: Add vfio_get_object callback to VFIODeviceOps
  2019-07-08 19:17 [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device Kirti Wankhede
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 01/13] vfio: KABI for migration interface Kirti Wankhede
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 02/13] vfio: Add function to unmap VFIO region Kirti Wankhede
@ 2019-07-08 19:17 ` Kirti Wankhede
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 04/13] vfio: Add save and load functions for VFIO PCI devices Kirti Wankhede
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kirti Wankhede @ 2019-07-08 19:17 UTC (permalink / raw)
  To: alex.williamson, cjia
  Cc: Zhengxiao.zx, kevin.tian, yi.l.liu, yan.y.zhao, eskultet,
	ziye.yang, qemu-devel, cohuck, shuangtai.tst, dgilbert,
	zhi.a.wang, mlevitsk, pasic, aik, Kirti Wankhede, eauger, felipe,
	jonathan.davies, changpeng.liu, Ken.Xue

Hook vfio_get_object callback for PCI devices.

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
Suggested-by: Cornelia Huck <cohuck@redhat.com>
---
 hw/vfio/pci.c                 | 8 ++++++++
 include/hw/vfio/vfio-common.h | 1 +
 2 files changed, 9 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index d7a4e1875c05..de0d286fc9dd 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2388,10 +2388,18 @@ static void vfio_pci_compute_needs_reset(VFIODevice *vbasedev)
     }
 }
 
+static Object *vfio_pci_get_object(VFIODevice *vbasedev)
+{
+    VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+
+    return OBJECT(vdev);
+}
+
 static VFIODeviceOps vfio_pci_ops = {
     .vfio_compute_needs_reset = vfio_pci_compute_needs_reset,
     .vfio_hot_reset_multi = vfio_pci_hot_reset_multi,
     .vfio_eoi = vfio_intx_eoi,
+    .vfio_get_object = vfio_pci_get_object,
 };
 
 int vfio_populate_vga(VFIOPCIDevice *vdev, Error **errp)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 93493891ba40..771b6d59a3db 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -119,6 +119,7 @@ struct VFIODeviceOps {
     void (*vfio_compute_needs_reset)(VFIODevice *vdev);
     int (*vfio_hot_reset_multi)(VFIODevice *vdev);
     void (*vfio_eoi)(VFIODevice *vdev);
+    Object *(*vfio_get_object)(VFIODevice *vdev);
 };
 
 typedef struct VFIOGroup {
-- 
2.7.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v5 04/13] vfio: Add save and load functions for VFIO PCI devices
  2019-07-08 19:17 [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device Kirti Wankhede
                   ` (2 preceding siblings ...)
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 03/13] vfio: Add vfio_get_object callback to VFIODeviceOps Kirti Wankhede
@ 2019-07-08 19:17 ` Kirti Wankhede
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 05/13] vfio: Add migration region initialization and finalize function Kirti Wankhede
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kirti Wankhede @ 2019-07-08 19:17 UTC (permalink / raw)
  To: alex.williamson, cjia
  Cc: Zhengxiao.zx, kevin.tian, yi.l.liu, yan.y.zhao, eskultet,
	ziye.yang, qemu-devel, cohuck, shuangtai.tst, dgilbert,
	zhi.a.wang, mlevitsk, pasic, aik, Kirti Wankhede, eauger, felipe,
	jonathan.davies, changpeng.liu, Ken.Xue

These functions save and restore PCI device specific data - config
space of PCI device.
Tested save and restore with MSI and MSIX type.

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
---
 hw/vfio/pci.c                 | 114 ++++++++++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-common.h |   2 +
 2 files changed, 116 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index de0d286fc9dd..5fe4f8076cac 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2395,11 +2395,125 @@ static Object *vfio_pci_get_object(VFIODevice *vbasedev)
     return OBJECT(vdev);
 }
 
+static void vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f)
+{
+    VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+    PCIDevice *pdev = &vdev->pdev;
+    uint16_t pci_cmd;
+    int i;
+
+    for (i = 0; i < PCI_ROM_SLOT; i++) {
+        uint32_t bar;
+
+        bar = pci_default_read_config(pdev, PCI_BASE_ADDRESS_0 + i * 4, 4);
+        qemu_put_be32(f, bar);
+    }
+
+    qemu_put_be32(f, vdev->interrupt);
+    if (vdev->interrupt == VFIO_INT_MSI) {
+        uint32_t msi_flags, msi_addr_lo, msi_addr_hi = 0, msi_data;
+        bool msi_64bit;
+
+        msi_flags = pci_default_read_config(pdev, pdev->msi_cap + PCI_MSI_FLAGS,
+                                            2);
+        msi_64bit = (msi_flags & PCI_MSI_FLAGS_64BIT);
+
+        msi_addr_lo = pci_default_read_config(pdev,
+                                         pdev->msi_cap + PCI_MSI_ADDRESS_LO, 4);
+        qemu_put_be32(f, msi_addr_lo);
+
+        if (msi_64bit) {
+            msi_addr_hi = pci_default_read_config(pdev,
+                                             pdev->msi_cap + PCI_MSI_ADDRESS_HI,
+                                             4);
+        }
+        qemu_put_be32(f, msi_addr_hi);
+
+        msi_data = pci_default_read_config(pdev,
+                pdev->msi_cap + (msi_64bit ? PCI_MSI_DATA_64 : PCI_MSI_DATA_32),
+                2);
+        qemu_put_be32(f, msi_data);
+    } else if (vdev->interrupt == VFIO_INT_MSIX) {
+        uint16_t offset;
+
+        /* save enable bit and maskall bit */
+        offset = pci_default_read_config(pdev,
+                                       pdev->msix_cap + PCI_MSIX_FLAGS + 1, 2);
+        qemu_put_be16(f, offset);
+        msix_save(pdev, f);
+    }
+    pci_cmd = pci_default_read_config(pdev, PCI_COMMAND, 2);
+    qemu_put_be16(f, pci_cmd);
+}
+
+static void vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f)
+{
+    VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
+    PCIDevice *pdev = &vdev->pdev;
+    uint32_t interrupt_type;
+    uint32_t msi_flags, msi_addr_lo, msi_addr_hi = 0, msi_data;
+    uint16_t pci_cmd;
+    bool msi_64bit;
+    int i;
+
+    /* retore pci bar configuration */
+    pci_cmd = pci_default_read_config(pdev, PCI_COMMAND, 2);
+    vfio_pci_write_config(pdev, PCI_COMMAND,
+                        pci_cmd & (!(PCI_COMMAND_IO | PCI_COMMAND_MEMORY)), 2);
+    for (i = 0; i < PCI_ROM_SLOT; i++) {
+        uint32_t bar = qemu_get_be32(f);
+
+        vfio_pci_write_config(pdev, PCI_BASE_ADDRESS_0 + i * 4, bar, 4);
+    }
+    vfio_pci_write_config(pdev, PCI_COMMAND,
+                          pci_cmd | PCI_COMMAND_IO | PCI_COMMAND_MEMORY, 2);
+
+    interrupt_type = qemu_get_be32(f);
+
+    if (interrupt_type == VFIO_INT_MSI) {
+        /* restore msi configuration */
+        msi_flags = pci_default_read_config(pdev,
+                                            pdev->msi_cap + PCI_MSI_FLAGS, 2);
+        msi_64bit = (msi_flags & PCI_MSI_FLAGS_64BIT);
+
+        vfio_pci_write_config(pdev, pdev->msi_cap + PCI_MSI_FLAGS,
+                              msi_flags & (!PCI_MSI_FLAGS_ENABLE), 2);
+
+        msi_addr_lo = qemu_get_be32(f);
+        vfio_pci_write_config(pdev, pdev->msi_cap + PCI_MSI_ADDRESS_LO,
+                              msi_addr_lo, 4);
+
+        msi_addr_hi = qemu_get_be32(f);
+        if (msi_64bit) {
+            vfio_pci_write_config(pdev, pdev->msi_cap + PCI_MSI_ADDRESS_HI,
+                                  msi_addr_hi, 4);
+        }
+        msi_data = qemu_get_be32(f);
+        vfio_pci_write_config(pdev,
+                pdev->msi_cap + (msi_64bit ? PCI_MSI_DATA_64 : PCI_MSI_DATA_32),
+                msi_data, 2);
+
+        vfio_pci_write_config(pdev, pdev->msi_cap + PCI_MSI_FLAGS,
+                              msi_flags | PCI_MSI_FLAGS_ENABLE, 2);
+    } else if (interrupt_type == VFIO_INT_MSIX) {
+        uint16_t offset = qemu_get_be16(f);
+
+        /* load enable bit and maskall bit */
+        vfio_pci_write_config(pdev, pdev->msix_cap + PCI_MSIX_FLAGS + 1,
+                              offset, 2);
+        msix_load(pdev, f);
+    }
+    pci_cmd = qemu_get_be16(f);
+    vfio_pci_write_config(pdev, PCI_COMMAND, pci_cmd, 2);
+}
+
 static VFIODeviceOps vfio_pci_ops = {
     .vfio_compute_needs_reset = vfio_pci_compute_needs_reset,
     .vfio_hot_reset_multi = vfio_pci_hot_reset_multi,
     .vfio_eoi = vfio_intx_eoi,
     .vfio_get_object = vfio_pci_get_object,
+    .vfio_save_config = vfio_pci_save_config,
+    .vfio_load_config = vfio_pci_load_config,
 };
 
 int vfio_populate_vga(VFIOPCIDevice *vdev, Error **errp)
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 771b6d59a3db..ee72bd984a36 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -120,6 +120,8 @@ struct VFIODeviceOps {
     int (*vfio_hot_reset_multi)(VFIODevice *vdev);
     void (*vfio_eoi)(VFIODevice *vdev);
     Object *(*vfio_get_object)(VFIODevice *vdev);
+    void (*vfio_save_config)(VFIODevice *vdev, QEMUFile *f);
+    void (*vfio_load_config)(VFIODevice *vdev, QEMUFile *f);
 };
 
 typedef struct VFIOGroup {
-- 
2.7.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v5 05/13] vfio: Add migration region initialization and finalize function
  2019-07-08 19:17 [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device Kirti Wankhede
                   ` (3 preceding siblings ...)
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 04/13] vfio: Add save and load functions for VFIO PCI devices Kirti Wankhede
@ 2019-07-08 19:17 ` Kirti Wankhede
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 06/13] vfio: Add VM state change handler to know state of VM Kirti Wankhede
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kirti Wankhede @ 2019-07-08 19:17 UTC (permalink / raw)
  To: alex.williamson, cjia
  Cc: Zhengxiao.zx, kevin.tian, yi.l.liu, yan.y.zhao, eskultet,
	ziye.yang, qemu-devel, cohuck, shuangtai.tst, dgilbert,
	zhi.a.wang, mlevitsk, pasic, aik, Kirti Wankhede, eauger, felipe,
	jonathan.davies, changpeng.liu, Ken.Xue

- Migration functions are implemented for VFIO_DEVICE_TYPE_PCI device in this
  patch series.
- VFIO device supports migration or not is decided based of migration region
  query. If migration region query is successful and migration region
  initialization is successful then migration is supported else migration is
  blocked.

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
---
 hw/vfio/Makefile.objs         |   2 +-
 hw/vfio/migration.c           | 145 ++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/trace-events          |   3 +
 include/hw/vfio/vfio-common.h |  14 ++++
 4 files changed, 163 insertions(+), 1 deletion(-)
 create mode 100644 hw/vfio/migration.c

diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
index abad8b818c9b..36033d1437c5 100644
--- a/hw/vfio/Makefile.objs
+++ b/hw/vfio/Makefile.objs
@@ -1,4 +1,4 @@
-obj-y += common.o spapr.o
+obj-y += common.o spapr.o migration.o
 obj-$(CONFIG_VFIO_PCI) += pci.o pci-quirks.o display.o
 obj-$(CONFIG_VFIO_CCW) += ccw.o
 obj-$(CONFIG_VFIO_PLATFORM) += platform.o
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
new file mode 100644
index 000000000000..a2cfbd5af2e1
--- /dev/null
+++ b/hw/vfio/migration.c
@@ -0,0 +1,145 @@
+/*
+ * Migration support for VFIO devices
+ *
+ * Copyright NVIDIA, Inc. 2019
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2. See
+ * the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include <linux/vfio.h>
+
+#include "hw/vfio/vfio-common.h"
+#include "cpu.h"
+#include "migration/migration.h"
+#include "migration/qemu-file.h"
+#include "migration/register.h"
+#include "migration/blocker.h"
+#include "migration/misc.h"
+#include "qapi/error.h"
+#include "exec/ramlist.h"
+#include "exec/ram_addr.h"
+#include "pci.h"
+#include "trace.h"
+
+static void vfio_migration_region_exit(VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+
+    if (!migration) {
+        return;
+    }
+
+    if (migration->region.buffer.size) {
+        vfio_region_exit(&migration->region.buffer);
+        vfio_region_finalize(&migration->region.buffer);
+    }
+}
+
+static int vfio_migration_region_init(VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    Object *obj = NULL;
+    int ret = -EINVAL;
+
+    if (!migration) {
+        return ret;
+    }
+
+    if (!vbasedev->ops || !vbasedev->ops->vfio_get_object) {
+        return ret;
+    }
+
+    obj = vbasedev->ops->vfio_get_object(vbasedev);
+    if (!obj) {
+        return ret;
+    }
+
+    ret = vfio_region_setup(obj, vbasedev, &migration->region.buffer,
+                            migration->region.index, "migration");
+    if (ret) {
+        error_report("%s: Failed to setup VFIO migration region %d: %s",
+                     vbasedev->name, migration->region.index, strerror(-ret));
+        goto err;
+    }
+
+    if (!migration->region.buffer.size) {
+        ret = -EINVAL;
+        error_report("%s: Invalid region size of VFIO migration region %d: %s",
+                     vbasedev->name, migration->region.index, strerror(-ret));
+        goto err;
+    }
+
+    return 0;
+
+err:
+    vfio_migration_region_exit(vbasedev);
+    return ret;
+}
+
+static int vfio_migration_init(VFIODevice *vbasedev,
+                               struct vfio_region_info *info)
+{
+    int ret;
+
+    vbasedev->migration = g_new0(VFIOMigration, 1);
+    vbasedev->migration->region.index = info->index;
+
+    ret = vfio_migration_region_init(vbasedev);
+    if (ret) {
+        error_report("%s: Failed to initialise migration region",
+                     vbasedev->name);
+        return ret;
+    }
+
+    return 0;
+}
+
+/* ---------------------------------------------------------------------- */
+
+int vfio_migration_probe(VFIODevice *vbasedev, Error **errp)
+{
+    struct vfio_region_info *info;
+    Error *local_err = NULL;
+    int ret;
+
+    ret = vfio_get_dev_region_info(vbasedev, VFIO_REGION_TYPE_MIGRATION,
+                                   VFIO_REGION_SUBTYPE_MIGRATION, &info);
+    if (ret) {
+        goto add_blocker;
+    }
+
+    ret = vfio_migration_init(vbasedev, info);
+    if (ret) {
+        goto add_blocker;
+    }
+
+    trace_vfio_migration_probe(vbasedev->name, info->index);
+    return 0;
+
+add_blocker:
+    error_setg(&vbasedev->migration_blocker,
+               "VFIO device doesn't support migration");
+    ret = migrate_add_blocker(vbasedev->migration_blocker, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        error_free(vbasedev->migration_blocker);
+    }
+    return ret;
+}
+
+void vfio_migration_finalize(VFIODevice *vbasedev)
+{
+    if (!vbasedev->migration) {
+        return;
+    }
+
+    if (vbasedev->migration_blocker) {
+        migrate_del_blocker(vbasedev->migration_blocker);
+        error_free(vbasedev->migration_blocker);
+    }
+
+    vfio_migration_region_exit(vbasedev);
+    g_free(vbasedev->migration);
+}
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 8cdc27946cb8..191a726a1312 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -143,3 +143,6 @@ vfio_display_edid_link_up(void) ""
 vfio_display_edid_link_down(void) ""
 vfio_display_edid_update(uint32_t prefx, uint32_t prefy) "%ux%u"
 vfio_display_edid_write_error(void) ""
+
+# migration.c
+vfio_migration_probe(char *name, uint32_t index) " (%s) Region %d"
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index ee72bd984a36..152da3f8d6f3 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -57,6 +57,15 @@ typedef struct VFIORegion {
     uint8_t nr; /* cache the region number for debug */
 } VFIORegion;
 
+typedef struct VFIOMigration {
+    struct {
+        VFIORegion buffer;
+        uint32_t index;
+    } region;
+    uint64_t pending_bytes;
+    QemuMutex lock;
+} VFIOMigration;
+
 typedef struct VFIOAddressSpace {
     AddressSpace *as;
     QLIST_HEAD(, VFIOContainer) containers;
@@ -113,6 +122,8 @@ typedef struct VFIODevice {
     unsigned int num_irqs;
     unsigned int num_regions;
     unsigned int flags;
+    VFIOMigration *migration;
+    Error *migration_blocker;
 } VFIODevice;
 
 struct VFIODeviceOps {
@@ -204,4 +215,7 @@ int vfio_spapr_create_window(VFIOContainer *container,
 int vfio_spapr_remove_window(VFIOContainer *container,
                              hwaddr offset_within_address_space);
 
+int vfio_migration_probe(VFIODevice *vbasedev, Error **errp);
+void vfio_migration_finalize(VFIODevice *vbasedev);
+
 #endif /* HW_VFIO_VFIO_COMMON_H */
-- 
2.7.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v5 06/13] vfio: Add VM state change handler to know state of VM
  2019-07-08 19:17 [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device Kirti Wankhede
                   ` (4 preceding siblings ...)
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 05/13] vfio: Add migration region initialization and finalize function Kirti Wankhede
@ 2019-07-08 19:17 ` Kirti Wankhede
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 07/13] vfio: Add migration state change notifier Kirti Wankhede
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kirti Wankhede @ 2019-07-08 19:17 UTC (permalink / raw)
  To: alex.williamson, cjia
  Cc: Zhengxiao.zx, kevin.tian, yi.l.liu, yan.y.zhao, eskultet,
	ziye.yang, qemu-devel, cohuck, shuangtai.tst, dgilbert,
	zhi.a.wang, mlevitsk, pasic, aik, Kirti Wankhede, eauger, felipe,
	jonathan.davies, changpeng.liu, Ken.Xue

VM state change handler gets called on change in VM's state. This is used to set
VFIO device state to _RUNNING.
VM state change handler, migration state change handler and log_sync listener
are called asynchronously, which sometimes lead to data corruption in migration
region. Initialised mutex that is used to serialize operations on migration data
region during saving state.

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
---
 hw/vfio/migration.c           | 64 +++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/trace-events          |  2 ++
 include/hw/vfio/vfio-common.h |  4 +++
 3 files changed, 70 insertions(+)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index a2cfbd5af2e1..c01f08b659d0 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -78,6 +78,60 @@ err:
     return ret;
 }
 
+static int vfio_migration_set_state(VFIODevice *vbasedev, uint32_t state)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    VFIORegion *region = &migration->region.buffer;
+    uint32_t device_state;
+    int ret = 0;
+
+    device_state = (state & VFIO_DEVICE_STATE_MASK) |
+                   (vbasedev->device_state & ~VFIO_DEVICE_STATE_MASK);
+
+    if ((device_state & VFIO_DEVICE_STATE_MASK) == VFIO_DEVICE_STATE_INVALID) {
+        return -EINVAL;
+    }
+
+    ret = pwrite(vbasedev->fd, &device_state, sizeof(device_state),
+                 region->fd_offset + offsetof(struct vfio_device_migration_info,
+                                              device_state));
+    if (ret < 0) {
+        error_report("%s: Failed to set device state %d %s",
+                     vbasedev->name, ret, strerror(errno));
+        return ret;
+    }
+
+    vbasedev->device_state = device_state;
+    trace_vfio_migration_set_state(vbasedev->name, device_state);
+    return 0;
+}
+
+static void vfio_vmstate_change(void *opaque, int running, RunState state)
+{
+    VFIODevice *vbasedev = opaque;
+
+    if ((vbasedev->vm_running != running)) {
+        int ret;
+        uint32_t dev_state;
+
+        if (running) {
+            dev_state = VFIO_DEVICE_STATE_RUNNING;
+        } else {
+            dev_state = (vbasedev->device_state & VFIO_DEVICE_STATE_MASK) &
+                     ~VFIO_DEVICE_STATE_RUNNING;
+        }
+
+        ret = vfio_migration_set_state(vbasedev, dev_state);
+        if (ret) {
+            error_report("%s: Failed to set device state 0x%x",
+                         vbasedev->name, dev_state);
+        }
+        vbasedev->vm_running = running;
+        trace_vfio_vmstate_change(vbasedev->name, running, RunState_str(state),
+                                  dev_state);
+    }
+}
+
 static int vfio_migration_init(VFIODevice *vbasedev,
                                struct vfio_region_info *info)
 {
@@ -93,6 +147,11 @@ static int vfio_migration_init(VFIODevice *vbasedev,
         return ret;
     }
 
+    qemu_mutex_init(&vbasedev->migration->lock);
+
+    vbasedev->vm_state = qemu_add_vm_change_state_handler(vfio_vmstate_change,
+                                                          vbasedev);
+
     return 0;
 }
 
@@ -135,11 +194,16 @@ void vfio_migration_finalize(VFIODevice *vbasedev)
         return;
     }
 
+    if (vbasedev->vm_state) {
+        qemu_del_vm_change_state_handler(vbasedev->vm_state);
+    }
+
     if (vbasedev->migration_blocker) {
         migrate_del_blocker(vbasedev->migration_blocker);
         error_free(vbasedev->migration_blocker);
     }
 
+    qemu_mutex_destroy(&vbasedev->migration->lock);
     vfio_migration_region_exit(vbasedev);
     g_free(vbasedev->migration);
 }
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 191a726a1312..3d15bacd031a 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -146,3 +146,5 @@ vfio_display_edid_write_error(void) ""
 
 # migration.c
 vfio_migration_probe(char *name, uint32_t index) " (%s) Region %d"
+vfio_migration_set_state(char *name, uint32_t state) " (%s) state %d"
+vfio_vmstate_change(char *name, int running, const char *reason, uint32_t dev_state) " (%s) running %d reason %s device state %d"
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 152da3f8d6f3..f6c70db3a9c1 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -29,6 +29,7 @@
 #ifdef CONFIG_LINUX
 #include <linux/vfio.h>
 #endif
+#include "sysemu/sysemu.h"
 
 #define VFIO_MSG_PREFIX "vfio %s: "
 
@@ -124,6 +125,9 @@ typedef struct VFIODevice {
     unsigned int flags;
     VFIOMigration *migration;
     Error *migration_blocker;
+    uint32_t device_state;
+    VMChangeStateEntry *vm_state;
+    int vm_running;
 } VFIODevice;
 
 struct VFIODeviceOps {
-- 
2.7.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v5 07/13] vfio: Add migration state change notifier
  2019-07-08 19:17 [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device Kirti Wankhede
                   ` (5 preceding siblings ...)
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 06/13] vfio: Add VM state change handler to know state of VM Kirti Wankhede
@ 2019-07-08 19:17 ` Kirti Wankhede
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 08/13] vfio: Register SaveVMHandlers for VFIO device Kirti Wankhede
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kirti Wankhede @ 2019-07-08 19:17 UTC (permalink / raw)
  To: alex.williamson, cjia
  Cc: Zhengxiao.zx, kevin.tian, yi.l.liu, yan.y.zhao, eskultet,
	ziye.yang, qemu-devel, cohuck, shuangtai.tst, dgilbert,
	zhi.a.wang, mlevitsk, pasic, aik, Kirti Wankhede, eauger, felipe,
	jonathan.davies, changpeng.liu, Ken.Xue

Added migration state change notifier to get notification on migration state
change. These states are translated to VFIO device state and conveyed to vendor
driver.

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
---
 hw/vfio/migration.c           | 54 +++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/trace-events          |  1 +
 include/hw/vfio/vfio-common.h |  1 +
 3 files changed, 56 insertions(+)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index c01f08b659d0..e4a89a6f9bc7 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -132,6 +132,53 @@ static void vfio_vmstate_change(void *opaque, int running, RunState state)
     }
 }
 
+static void vfio_migration_state_notifier(Notifier *notifier, void *data)
+{
+    MigrationState *s = data;
+    VFIODevice *vbasedev = container_of(notifier, VFIODevice, migration_state);
+    int ret;
+
+    trace_vfio_migration_state_notifier(vbasedev->name, s->state);
+
+    switch (s->state) {
+    case MIGRATION_STATUS_ACTIVE:
+        if (vbasedev->device_state & VFIO_DEVICE_STATE_RUNNING) {
+            if (vbasedev->vm_running) {
+                ret = vfio_migration_set_state(vbasedev,
+                          VFIO_DEVICE_STATE_RUNNING | VFIO_DEVICE_STATE_SAVING);
+                if (ret) {
+                    error_report("%s: Failed to set state RUNNING and SAVING",
+                                  vbasedev->name);
+                }
+            } else {
+                ret = vfio_migration_set_state(vbasedev,
+                                               VFIO_DEVICE_STATE_SAVING);
+                if (ret) {
+                    error_report("%s: Failed to set state STOP and SAVING",
+                                 vbasedev->name);
+                }
+            }
+        } else {
+            ret = vfio_migration_set_state(vbasedev,
+                                           VFIO_DEVICE_STATE_RESUMING);
+            if (ret) {
+                error_report("%s: Failed to set state RESUMING",
+                             vbasedev->name);
+            }
+        }
+        return;
+
+    case MIGRATION_STATUS_CANCELLING:
+    case MIGRATION_STATUS_CANCELLED:
+    case MIGRATION_STATUS_FAILED:
+        ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RUNNING);
+        if (ret) {
+            error_report("%s: Failed to set state RUNNING", vbasedev->name);
+        }
+        return;
+    }
+}
+
 static int vfio_migration_init(VFIODevice *vbasedev,
                                struct vfio_region_info *info)
 {
@@ -152,6 +199,9 @@ static int vfio_migration_init(VFIODevice *vbasedev,
     vbasedev->vm_state = qemu_add_vm_change_state_handler(vfio_vmstate_change,
                                                           vbasedev);
 
+    vbasedev->migration_state.notify = vfio_migration_state_notifier;
+    add_migration_state_change_notifier(&vbasedev->migration_state);
+
     return 0;
 }
 
@@ -194,6 +244,10 @@ void vfio_migration_finalize(VFIODevice *vbasedev)
         return;
     }
 
+    if (vbasedev->migration_state.notify) {
+        remove_migration_state_change_notifier(&vbasedev->migration_state);
+    }
+
     if (vbasedev->vm_state) {
         qemu_del_vm_change_state_handler(vbasedev->vm_state);
     }
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 3d15bacd031a..69503228f20e 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -148,3 +148,4 @@ vfio_display_edid_write_error(void) ""
 vfio_migration_probe(char *name, uint32_t index) " (%s) Region %d"
 vfio_migration_set_state(char *name, uint32_t state) " (%s) state %d"
 vfio_vmstate_change(char *name, int running, const char *reason, uint32_t dev_state) " (%s) running %d reason %s device state %d"
+vfio_migration_state_notifier(char *name, int state) " (%s) state %d"
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index f6c70db3a9c1..a022484d2636 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -128,6 +128,7 @@ typedef struct VFIODevice {
     uint32_t device_state;
     VMChangeStateEntry *vm_state;
     int vm_running;
+    Notifier migration_state;
 } VFIODevice;
 
 struct VFIODeviceOps {
-- 
2.7.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v5 08/13] vfio: Register SaveVMHandlers for VFIO device
  2019-07-08 19:17 [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device Kirti Wankhede
                   ` (6 preceding siblings ...)
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 07/13] vfio: Add migration state change notifier Kirti Wankhede
@ 2019-07-08 19:17 ` Kirti Wankhede
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 09/13] vfio: Add save state functions to SaveVMHandlers Kirti Wankhede
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kirti Wankhede @ 2019-07-08 19:17 UTC (permalink / raw)
  To: alex.williamson, cjia
  Cc: Zhengxiao.zx, kevin.tian, yi.l.liu, yan.y.zhao, eskultet,
	ziye.yang, qemu-devel, cohuck, shuangtai.tst, dgilbert,
	zhi.a.wang, mlevitsk, pasic, aik, Kirti Wankhede, eauger, felipe,
	jonathan.davies, changpeng.liu, Ken.Xue

Define flags to be used as delimeter in migration file stream.
Added .save_setup and .save_cleanup functions. Mapped & unmapped migration
region from these functions at source during saving or pre-copy phase.
Set VFIO device state depending on VM's state. During live migration, VM is
running when .save_setup is called, _SAVING | _RUNNING state is set for VFIO
device. During save-restore, VM is paused, _SAVING state is set for VFIO device.

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
---
 hw/vfio/migration.c  | 82 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 hw/vfio/trace-events |  2 ++
 2 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index e4a89a6f9bc7..0597a45fda2d 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -23,6 +23,17 @@
 #include "pci.h"
 #include "trace.h"
 
+/*
+ * Flags used as delimiter:
+ * 0xffffffff => MSB 32-bit all 1s
+ * 0xef10     => emulated (virtual) function IO
+ * 0x0000     => 16-bits reserved for flags
+ */
+#define VFIO_MIG_FLAG_END_OF_STATE      (0xffffffffef100001ULL)
+#define VFIO_MIG_FLAG_DEV_CONFIG_STATE  (0xffffffffef100002ULL)
+#define VFIO_MIG_FLAG_DEV_SETUP_STATE   (0xffffffffef100003ULL)
+#define VFIO_MIG_FLAG_DEV_DATA_STATE    (0xffffffffef100004ULL)
+
 static void vfio_migration_region_exit(VFIODevice *vbasedev)
 {
     VFIOMigration *migration = vbasedev->migration;
@@ -106,6 +117,74 @@ static int vfio_migration_set_state(VFIODevice *vbasedev, uint32_t state)
     return 0;
 }
 
+/* ---------------------------------------------------------------------- */
+
+static int vfio_save_setup(QEMUFile *f, void *opaque)
+{
+    VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
+    int ret;
+
+    qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE);
+
+    if (migration->region.buffer.mmaps) {
+        qemu_mutex_lock_iothread();
+        ret = vfio_region_mmap(&migration->region.buffer);
+        qemu_mutex_unlock_iothread();
+        if (ret) {
+            error_report("%s: Failed to mmap VFIO migration region %d: %s",
+                         vbasedev->name, migration->region.index,
+                         strerror(-ret));
+            return ret;
+        }
+    }
+
+    if (vbasedev->vm_running) {
+        ret = vfio_migration_set_state(vbasedev,
+                         VFIO_DEVICE_STATE_RUNNING | VFIO_DEVICE_STATE_SAVING);
+        if (ret) {
+            error_report("%s: Failed to set state RUNNING and SAVING",
+                         vbasedev->name);
+            return ret;
+        }
+    } else {
+        ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_SAVING);
+        if (ret) {
+            error_report("%s: Failed to set state STOP and SAVING",
+                         vbasedev->name);
+            return ret;
+        }
+    }
+
+    qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
+
+    ret = qemu_file_get_error(f);
+    if (ret) {
+        return ret;
+    }
+
+    trace_vfio_save_setup(vbasedev->name);
+    return 0;
+}
+
+static void vfio_save_cleanup(void *opaque)
+{
+    VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
+
+    if (migration->region.buffer.mmaps) {
+        vfio_region_unmap(&migration->region.buffer);
+    }
+    trace_vfio_save_cleanup(vbasedev->name);
+}
+
+static SaveVMHandlers savevm_vfio_handlers = {
+    .save_setup = vfio_save_setup,
+    .save_cleanup = vfio_save_cleanup,
+};
+
+/* ---------------------------------------------------------------------- */
+
 static void vfio_vmstate_change(void *opaque, int running, RunState state)
 {
     VFIODevice *vbasedev = opaque;
@@ -195,7 +274,8 @@ static int vfio_migration_init(VFIODevice *vbasedev,
     }
 
     qemu_mutex_init(&vbasedev->migration->lock);
-
+    register_savevm_live(vbasedev->dev, "vfio", -1, 1, &savevm_vfio_handlers,
+                         vbasedev);
     vbasedev->vm_state = qemu_add_vm_change_state_handler(vfio_vmstate_change,
                                                           vbasedev);
 
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 69503228f20e..4bb43f18f315 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -149,3 +149,5 @@ vfio_migration_probe(char *name, uint32_t index) " (%s) Region %d"
 vfio_migration_set_state(char *name, uint32_t state) " (%s) state %d"
 vfio_vmstate_change(char *name, int running, const char *reason, uint32_t dev_state) " (%s) running %d reason %s device state %d"
 vfio_migration_state_notifier(char *name, int state) " (%s) state %d"
+vfio_save_setup(char *name) " (%s)"
+vfio_save_cleanup(char *name) " (%s)"
-- 
2.7.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v5 09/13] vfio: Add save state functions to SaveVMHandlers
  2019-07-08 19:17 [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device Kirti Wankhede
                   ` (7 preceding siblings ...)
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 08/13] vfio: Register SaveVMHandlers for VFIO device Kirti Wankhede
@ 2019-07-08 19:17 ` Kirti Wankhede
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 10/13] vfio: Add load " Kirti Wankhede
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kirti Wankhede @ 2019-07-08 19:17 UTC (permalink / raw)
  To: alex.williamson, cjia
  Cc: Zhengxiao.zx, kevin.tian, yi.l.liu, yan.y.zhao, eskultet,
	ziye.yang, qemu-devel, cohuck, shuangtai.tst, dgilbert,
	zhi.a.wang, mlevitsk, pasic, aik, Kirti Wankhede, eauger, felipe,
	jonathan.davies, changpeng.liu, Ken.Xue

Added .save_live_pending, .save_live_iterate and .save_live_complete_precopy
functions. These functions handles pre-copy and stop-and-copy phase.

In _SAVING|_RUNNING device state or pre-copy phase:
- read pending_bytes
- read data_offset - indicates kernel driver to write data to staging
  buffer which is mmapped.
- read data_size - amount of data in bytes written by vendor driver in migration
  region.
- if data section is trapped, pread() from data_offset of data_size.
- if data section is mmaped, read mmaped buffer of data_size.
- Write data packet to file stream as below:
{VFIO_MIG_FLAG_DEV_DATA_STATE, data_size, actual data,
VFIO_MIG_FLAG_END_OF_STATE }

In _SAVING device state or stop-and-copy phase
a. read config space of device and save to migration file stream. This
   doesn't need to be from vendor driver. Any other special config state
   from driver can be saved as data in following iteration.
b. read pending_bytes
c. read data_offset - indicates kernel driver to write data to staging
   buffer which is mmapped.
d. read data_size - amount of data in bytes written by vendor driver in
   migration region.
e. if data section is trapped, pread() from data_offset of data_size.
f. if data section is mmaped, read mmaped buffer of data_size.
g. Write data packet as below:
   {VFIO_MIG_FLAG_DEV_DATA_STATE, data_size, actual data}
h. iterate through steps b to g while (pending_bytes > 0)
i. Write {VFIO_MIG_FLAG_END_OF_STATE}

When data region is mapped, its user's responsibility to read data from
data_offset of data_size before moving to next steps.

.save_live_iterate runs outside the iothread lock in the migration case, which
could race with asynchronous call to get dirty page list causing data corruption
in mapped migration region. Mutex added here to serial migration buffer read
operation.

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
---
 hw/vfio/migration.c  | 246 +++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/trace-events |   6 ++
 2 files changed, 252 insertions(+)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 0597a45fda2d..4e9b4cce230b 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -117,6 +117,138 @@ static int vfio_migration_set_state(VFIODevice *vbasedev, uint32_t state)
     return 0;
 }
 
+static void *find_data_region(VFIORegion *region,
+                              uint64_t data_offset,
+                              uint64_t data_size)
+{
+    void *ptr = NULL;
+    int i;
+
+    for (i = 0; i < region->nr_mmaps; i++) {
+        if ((data_offset >= region->mmaps[i].offset) &&
+            (data_offset < region->mmaps[i].offset + region->mmaps[i].size) &&
+            (data_size <= region->mmaps[i].size)) {
+            ptr = region->mmaps[i].mmap + (data_offset -
+                                           region->mmaps[i].offset);
+            break;
+        }
+    }
+    return ptr;
+}
+
+static int vfio_save_buffer(QEMUFile *f, VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    VFIORegion *region = &migration->region.buffer;
+    uint64_t data_offset = 0, data_size = 0;
+    int ret;
+
+    ret = pread(vbasedev->fd, &data_offset, sizeof(data_offset),
+                region->fd_offset + offsetof(struct vfio_device_migration_info,
+                                             data_offset));
+    if (ret != sizeof(data_offset)) {
+        error_report("%s: Failed to get migration buffer data offset %d",
+                     vbasedev->name, ret);
+        return -EINVAL;
+    }
+
+    ret = pread(vbasedev->fd, &data_size, sizeof(data_size),
+                region->fd_offset + offsetof(struct vfio_device_migration_info,
+                                             data_size));
+    if (ret != sizeof(data_size)) {
+        error_report("%s: Failed to get migration buffer data size %d",
+                     vbasedev->name, ret);
+        return -EINVAL;
+    }
+
+    if (data_size > 0) {
+        void *buf = NULL;
+        bool buffer_mmaped;
+
+        if (region->mmaps) {
+            buf = find_data_region(region, data_offset, data_size);
+        }
+
+        buffer_mmaped = (buf != NULL) ? true : false;
+
+        if (!buffer_mmaped) {
+            buf = g_try_malloc0(data_size);
+            if (!buf) {
+                error_report("%s: Error allocating buffer ", __func__);
+                return -ENOMEM;
+            }
+
+            ret = pread(vbasedev->fd, buf, data_size,
+                        region->fd_offset + data_offset);
+            if (ret != data_size) {
+                error_report("%s: Failed to get migration data %d",
+                             vbasedev->name, ret);
+                g_free(buf);
+                return -EINVAL;
+            }
+        }
+
+        qemu_put_be64(f, data_size);
+        qemu_put_buffer(f, buf, data_size);
+
+        if (!buffer_mmaped) {
+            g_free(buf);
+        }
+        migration->pending_bytes -= data_size;
+    } else {
+        qemu_put_be64(f, data_size);
+    }
+
+    trace_vfio_save_buffer(vbasedev->name, data_offset, data_size,
+                           migration->pending_bytes);
+
+    ret = qemu_file_get_error(f);
+    if (ret) {
+        return ret;
+    }
+
+    return data_size;
+}
+
+static int vfio_update_pending(VFIODevice *vbasedev)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    VFIORegion *region = &migration->region.buffer;
+    uint64_t pending_bytes = 0;
+    int ret;
+
+    ret = pread(vbasedev->fd, &pending_bytes, sizeof(pending_bytes),
+                region->fd_offset + offsetof(struct vfio_device_migration_info,
+                                             pending_bytes));
+    if ((ret < 0) || (ret != sizeof(pending_bytes))) {
+        error_report("%s: Failed to get pending bytes %d",
+                     vbasedev->name, ret);
+        migration->pending_bytes = 0;
+        return (ret < 0) ? ret : -EINVAL;
+    }
+
+    migration->pending_bytes = pending_bytes;
+    trace_vfio_update_pending(vbasedev->name, pending_bytes);
+    return 0;
+}
+
+static int vfio_save_device_config_state(QEMUFile *f, void *opaque)
+{
+    VFIODevice *vbasedev = opaque;
+
+    qemu_put_be64(f, VFIO_MIG_FLAG_DEV_CONFIG_STATE);
+
+    if (vbasedev->ops && vbasedev->ops->vfio_save_config) {
+        vbasedev->ops->vfio_save_config(vbasedev, f);
+    }
+
+    qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
+
+    trace_vfio_save_device_config_state(vbasedev->name);
+
+    return qemu_file_get_error(f);
+}
+
 /* ---------------------------------------------------------------------- */
 
 static int vfio_save_setup(QEMUFile *f, void *opaque)
@@ -178,9 +310,123 @@ static void vfio_save_cleanup(void *opaque)
     trace_vfio_save_cleanup(vbasedev->name);
 }
 
+static void vfio_save_pending(QEMUFile *f, void *opaque,
+                              uint64_t threshold_size,
+                              uint64_t *res_precopy_only,
+                              uint64_t *res_compatible,
+                              uint64_t *res_postcopy_only)
+{
+    VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
+    int ret;
+
+    ret = vfio_update_pending(vbasedev);
+    if (ret) {
+        return;
+    }
+
+    *res_precopy_only += migration->pending_bytes;
+
+    trace_vfio_save_pending(vbasedev->name, *res_precopy_only,
+                            *res_postcopy_only, *res_compatible);
+}
+
+static int vfio_save_iterate(QEMUFile *f, void *opaque)
+{
+    VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
+    int ret, data_size;
+
+    qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE);
+
+    qemu_mutex_lock(&migration->lock);
+    data_size = vfio_save_buffer(f, vbasedev);
+    qemu_mutex_unlock(&migration->lock);
+
+    if (data_size < 0) {
+        error_report("%s: vfio_save_buffer failed %s", vbasedev->name,
+                     strerror(errno));
+        return data_size;
+    }
+
+    qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
+
+    ret = qemu_file_get_error(f);
+    if (ret) {
+        return ret;
+    }
+
+    trace_vfio_save_iterate(vbasedev->name, data_size);
+    if (data_size == 0) {
+        /* indicates data finished, goto complete phase */
+        return 1;
+    }
+
+    return 0;
+}
+
+static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
+{
+    VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
+    int ret;
+
+    ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_SAVING);
+    if (ret) {
+        error_report("%s: Failed to set state STOP and SAVING",
+                     vbasedev->name);
+        return ret;
+    }
+
+    ret = vfio_save_device_config_state(f, opaque);
+    if (ret) {
+        return ret;
+    }
+
+    ret = vfio_update_pending(vbasedev);
+    if (ret) {
+        return ret;
+    }
+
+    while (migration->pending_bytes > 0) {
+        qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE);
+        ret = vfio_save_buffer(f, vbasedev);
+        if (ret < 0) {
+            error_report("%s: Failed to save buffer", vbasedev->name);
+            return ret;
+        } else if (ret == 0) {
+            break;
+        }
+
+        ret = vfio_update_pending(vbasedev);
+        if (ret) {
+            return ret;
+        }
+    }
+
+    qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
+
+    ret = qemu_file_get_error(f);
+    if (ret) {
+        return ret;
+    }
+
+    ret = vfio_migration_set_state(vbasedev, ~VFIO_DEVICE_STATE_MASK);
+    if (ret) {
+        error_report("%s: Failed to set state STOPPED", vbasedev->name);
+        return ret;
+    }
+
+    trace_vfio_save_complete_precopy(vbasedev->name);
+    return ret;
+}
+
 static SaveVMHandlers savevm_vfio_handlers = {
     .save_setup = vfio_save_setup,
     .save_cleanup = vfio_save_cleanup,
+    .save_live_pending = vfio_save_pending,
+    .save_live_iterate = vfio_save_iterate,
+    .save_live_complete_precopy = vfio_save_complete_precopy,
 };
 
 /* ---------------------------------------------------------------------- */
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 4bb43f18f315..4c16e787f5a1 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -151,3 +151,9 @@ vfio_vmstate_change(char *name, int running, const char *reason, uint32_t dev_st
 vfio_migration_state_notifier(char *name, int state) " (%s) state %d"
 vfio_save_setup(char *name) " (%s)"
 vfio_save_cleanup(char *name) " (%s)"
+vfio_save_buffer(char *name, uint64_t data_offset, uint64_t data_size, uint64_t pending) " (%s), Offset 0x%"PRIx64" size 0x%"PRIx64" pending 0x%"PRIx64
+vfio_update_pending(char *name, uint64_t pending) " (%s), pending 0x%"PRIx64
+vfio_save_device_config_state(char *name) " (%s)"
+vfio_save_pending(char *name, uint64_t precopy, uint64_t postcopy, uint64_t compatible) " (%s), precopy 0x%"PRIx64" postcopy 0x%x"PRIx64" compatible 0x%""PRIx64
+vfio_save_iterate(char *name, int data_size) " (%s) data_size %d"
+vfio_save_complete_precopy(char *name) " (%s)"
-- 
2.7.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v5 10/13] vfio: Add load state functions to SaveVMHandlers
  2019-07-08 19:17 [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device Kirti Wankhede
                   ` (8 preceding siblings ...)
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 09/13] vfio: Add save state functions to SaveVMHandlers Kirti Wankhede
@ 2019-07-08 19:17 ` Kirti Wankhede
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 11/13] vfio: Add function to get dirty page list Kirti Wankhede
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kirti Wankhede @ 2019-07-08 19:17 UTC (permalink / raw)
  To: alex.williamson, cjia
  Cc: Zhengxiao.zx, kevin.tian, yi.l.liu, yan.y.zhao, eskultet,
	ziye.yang, qemu-devel, cohuck, shuangtai.tst, dgilbert,
	zhi.a.wang, mlevitsk, pasic, aik, Kirti Wankhede, eauger, felipe,
	jonathan.davies, changpeng.liu, Ken.Xue

Flow during _RESUMING device state:
- If Vendor driver defines mappable region, mmap migration region.
- Load config state.
- For data packet, till VFIO_MIG_FLAG_END_OF_STATE is not reached
    - read data_size from packet, read buffer of data_size
    - read data_offset from where QEMU should write data.
        if region is mmaped, write data of data_size to mmaped region.
    - write data_size.
        In case of mmapped region, write to data_size indicates kernel
        driver that data is written in staging buffer.
    - if region is trapped, pwrite() data of data_size from data_offset.
- Repeat above until VFIO_MIG_FLAG_END_OF_STATE.
- Unmap migration region.

For user, data is opaque. User should write data in the same order as
received.

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
---
 hw/vfio/migration.c  | 162 +++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/trace-events |   4 ++
 2 files changed, 166 insertions(+)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 4e9b4cce230b..5fb4c5329ede 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -249,6 +249,26 @@ static int vfio_save_device_config_state(QEMUFile *f, void *opaque)
     return qemu_file_get_error(f);
 }
 
+static int vfio_load_device_config_state(QEMUFile *f, void *opaque)
+{
+    VFIODevice *vbasedev = opaque;
+    uint64_t data;
+
+    if (vbasedev->ops && vbasedev->ops->vfio_load_config) {
+        vbasedev->ops->vfio_load_config(vbasedev, f);
+    }
+
+    data = qemu_get_be64(f);
+    if (data != VFIO_MIG_FLAG_END_OF_STATE) {
+        error_report("%s: Failed loading device config space, "
+                     "end flag incorrect 0x%"PRIx64, vbasedev->name, data);
+        return -EINVAL;
+    }
+
+    trace_vfio_load_device_config_state(vbasedev->name);
+    return qemu_file_get_error(f);
+}
+
 /* ---------------------------------------------------------------------- */
 
 static int vfio_save_setup(QEMUFile *f, void *opaque)
@@ -421,12 +441,154 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
     return ret;
 }
 
+static int vfio_load_setup(QEMUFile *f, void *opaque)
+{
+    VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
+    int ret = 0;
+
+    if (migration->region.buffer.mmaps) {
+        ret = vfio_region_mmap(&migration->region.buffer);
+        if (ret) {
+            error_report("%s: Failed to mmap VFIO migration region %d: %s",
+                         vbasedev->name, migration->region.index,
+                         strerror(-ret));
+            return ret;
+        }
+    }
+
+    ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING);
+    if (ret) {
+        error_report("%s: Failed to set state RESUMING", vbasedev->name);
+    }
+    return ret;
+}
+
+static int vfio_load_cleanup(void *opaque)
+{
+    vfio_save_cleanup(opaque);
+    return 0;
+}
+
+static int vfio_load_state(QEMUFile *f, void *opaque, int version_id)
+{
+    VFIODevice *vbasedev = opaque;
+    VFIOMigration *migration = vbasedev->migration;
+    int ret = 0;
+    uint64_t data, data_size;
+
+    data = qemu_get_be64(f);
+    while (data != VFIO_MIG_FLAG_END_OF_STATE) {
+
+        trace_vfio_load_state(vbasedev->name, data);
+
+        switch (data) {
+        case VFIO_MIG_FLAG_DEV_CONFIG_STATE:
+        {
+            ret = vfio_load_device_config_state(f, opaque);
+            if (ret) {
+                return ret;
+            }
+            break;
+        }
+        case VFIO_MIG_FLAG_DEV_SETUP_STATE:
+        {
+            data = qemu_get_be64(f);
+            if (data == VFIO_MIG_FLAG_END_OF_STATE) {
+                return ret;
+            } else {
+                error_report("%s: SETUP STATE: EOS not found 0x%"PRIx64,
+                             vbasedev->name, data);
+                return -EINVAL;
+            }
+            break;
+        }
+        case VFIO_MIG_FLAG_DEV_DATA_STATE:
+        {
+            VFIORegion *region = &migration->region.buffer;
+            void *buf = NULL;
+            bool buffer_mmaped = false;
+            uint64_t data_offset = 0;
+
+            data_size = qemu_get_be64(f);
+            if (data_size == 0) {
+                break;
+            }
+
+            ret = pread(vbasedev->fd, &data_offset, sizeof(data_offset),
+                        region->fd_offset +
+                        offsetof(struct vfio_device_migration_info,
+                        data_offset));
+            if (ret != sizeof(data_offset)) {
+                error_report("%s:Failed to get migration buffer data offset %d",
+                             vbasedev->name, ret);
+                return -EINVAL;
+            }
+
+            if (region->mmaps) {
+                buf = find_data_region(region, data_offset, data_size);
+            }
+
+            buffer_mmaped = (buf != NULL) ? true : false;
+
+            if (!buffer_mmaped) {
+                buf = g_try_malloc0(data_size);
+                if (!buf) {
+                    error_report("%s: Error allocating buffer ", __func__);
+                    return -ENOMEM;
+                }
+            }
+
+            qemu_get_buffer(f, buf, data_size);
+
+            ret = pwrite(vbasedev->fd, &data_size, sizeof(data_size),
+                         region->fd_offset +
+                       offsetof(struct vfio_device_migration_info, data_size));
+            if (ret != sizeof(data_size)) {
+                error_report("%s: Failed to set migration buffer data size %d",
+                             vbasedev->name, ret);
+                if (!buffer_mmaped) {
+                    g_free(buf);
+                }
+                return -EINVAL;
+            }
+
+            if (!buffer_mmaped) {
+                ret = pwrite(vbasedev->fd, buf, data_size,
+                             region->fd_offset + data_offset);
+                g_free(buf);
+
+                if (ret != data_size) {
+                    error_report("%s: Failed to set migration buffer %d",
+                                 vbasedev->name, ret);
+                    return -EINVAL;
+                }
+            }
+            trace_vfio_load_state_device_data(vbasedev->name, data_offset,
+                                              data_size);
+            break;
+        }
+        }
+
+        ret = qemu_file_get_error(f);
+        if (ret) {
+            return ret;
+        }
+        data = qemu_get_be64(f);
+    }
+
+    return ret;
+}
+
 static SaveVMHandlers savevm_vfio_handlers = {
     .save_setup = vfio_save_setup,
     .save_cleanup = vfio_save_cleanup,
     .save_live_pending = vfio_save_pending,
     .save_live_iterate = vfio_save_iterate,
     .save_live_complete_precopy = vfio_save_complete_precopy,
+    .load_setup = vfio_load_setup,
+    .load_cleanup = vfio_load_cleanup,
+    .load_state = vfio_load_state,
 };
 
 /* ---------------------------------------------------------------------- */
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 4c16e787f5a1..e01c94e6ebd4 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -157,3 +157,7 @@ vfio_save_device_config_state(char *name) " (%s)"
 vfio_save_pending(char *name, uint64_t precopy, uint64_t postcopy, uint64_t compatible) " (%s), precopy 0x%"PRIx64" postcopy 0x%x"PRIx64" compatible 0x%""PRIx64
 vfio_save_iterate(char *name, int data_size) " (%s) data_size %d"
 vfio_save_complete_precopy(char *name) " (%s)"
+vfio_load_device_config_state(char *name) " (%s)"
+vfio_load_state(char *name, uint64_t data) " (%s) data 0x%"PRIx64
+vfio_load_state_device_data(char *name, uint64_t data_offset, uint64_t data_size) " (%s), Offset 0x%"PRIx64" size 0x%"PRIx64
+
-- 
2.7.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v5 11/13] vfio: Add function to get dirty page list
  2019-07-08 19:17 [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device Kirti Wankhede
                   ` (9 preceding siblings ...)
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 10/13] vfio: Add load " Kirti Wankhede
@ 2019-07-08 19:17 ` Kirti Wankhede
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 12/13] vfio: Add vfio_listerner_log_sync to mark dirty pages Kirti Wankhede
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kirti Wankhede @ 2019-07-08 19:17 UTC (permalink / raw)
  To: alex.williamson, cjia
  Cc: Zhengxiao.zx, kevin.tian, yi.l.liu, yan.y.zhao, eskultet,
	ziye.yang, qemu-devel, cohuck, shuangtai.tst, dgilbert,
	zhi.a.wang, mlevitsk, pasic, aik, Kirti Wankhede, eauger, felipe,
	jonathan.davies, changpeng.liu, Ken.Xue

Dirty page tracking (.log_sync) is part of RAM copying state, where
vendor driver provides the bitmap of pages which are dirtied by vendor
driver through migration region and as part of RAM copy, those pages
gets copied to file stream.

To get dirty page bitmap:
- write start address, page_size and pfn count.
- read count of pfns copied.
    - Vendor driver should return 0 if driver doesn't have any page to
      report dirty in given range.
    - Vendor driver should return -1 to mark all pages dirty for given range.
- read data_offset, where vendor driver has written bitmap.
- read bitmap from the region or mmaped part of the region.
- Iterate above steps till page bitmap for all requested pfns are copied.

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
---
 hw/vfio/migration.c           | 123 ++++++++++++++++++++++++++++++++++++++++++
 hw/vfio/trace-events          |   4 +-
 include/hw/vfio/vfio-common.h |   2 +
 3 files changed, 127 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 5fb4c5329ede..ca1a8c0f5f1f 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -269,6 +269,129 @@ static int vfio_load_device_config_state(QEMUFile *f, void *opaque)
     return qemu_file_get_error(f);
 }
 
+void vfio_get_dirty_page_list(VFIODevice *vbasedev,
+                              uint64_t start_pfn,
+                              uint64_t pfn_count,
+                              uint64_t page_size)
+{
+    VFIOMigration *migration = vbasedev->migration;
+    VFIORegion *region = &migration->region.buffer;
+    uint64_t count = 0;
+    int64_t copied_pfns = 0;
+    int64_t total_pfns = pfn_count;
+    int ret;
+
+    qemu_mutex_lock(&migration->lock);
+
+    while (total_pfns > 0) {
+        uint64_t bitmap_size, data_offset = 0;
+        uint64_t start = start_pfn + count;
+        void *buf = NULL;
+        bool buffer_mmaped = false;
+
+        ret = pwrite(vbasedev->fd, &start, sizeof(start),
+                 region->fd_offset + offsetof(struct vfio_device_migration_info,
+                                              start_pfn));
+        if (ret < 0) {
+            error_report("%s: Failed to set dirty pages start address %d %s",
+                         vbasedev->name, ret, strerror(errno));
+            goto dpl_unlock;
+        }
+
+        ret = pwrite(vbasedev->fd, &page_size, sizeof(page_size),
+                 region->fd_offset + offsetof(struct vfio_device_migration_info,
+                                              page_size));
+        if (ret < 0) {
+            error_report("%s: Failed to set dirty page size %d %s",
+                         vbasedev->name, ret, strerror(errno));
+            goto dpl_unlock;
+        }
+
+        ret = pwrite(vbasedev->fd, &total_pfns, sizeof(total_pfns),
+                 region->fd_offset + offsetof(struct vfio_device_migration_info,
+                                              total_pfns));
+        if (ret < 0) {
+            error_report("%s: Failed to set dirty page total pfns %d %s",
+                         vbasedev->name, ret, strerror(errno));
+            goto dpl_unlock;
+        }
+
+        /* Read copied dirty pfns */
+        ret = pread(vbasedev->fd, &copied_pfns, sizeof(copied_pfns),
+                region->fd_offset + offsetof(struct vfio_device_migration_info,
+                                             copied_pfns));
+        if (ret < 0) {
+            error_report("%s: Failed to get dirty pages bitmap count %d %s",
+                         vbasedev->name, ret, strerror(errno));
+            goto dpl_unlock;
+        }
+
+        if (copied_pfns == VFIO_DEVICE_DIRTY_PFNS_NONE) {
+            /*
+             * copied_pfns could be 0 if driver doesn't have any page to
+             * report dirty in given range
+             */
+            break;
+        } else if (copied_pfns == VFIO_DEVICE_DIRTY_PFNS_ALL) {
+            /* Mark all pages dirty for this range */
+            cpu_physical_memory_set_dirty_range(start_pfn * page_size,
+                                                pfn_count * page_size,
+                                                DIRTY_MEMORY_MIGRATION);
+            break;
+        }
+
+        bitmap_size = (BITS_TO_LONGS(copied_pfns) + 1) * sizeof(unsigned long);
+
+        ret = pread(vbasedev->fd, &data_offset, sizeof(data_offset),
+                region->fd_offset + offsetof(struct vfio_device_migration_info,
+                                             data_offset));
+        if (ret != sizeof(data_offset)) {
+            error_report("%s: Failed to get migration buffer data offset %d",
+                         vbasedev->name, ret);
+            goto dpl_unlock;
+        }
+
+        if (region->mmaps) {
+            buf = find_data_region(region, data_offset, bitmap_size);
+        }
+
+        buffer_mmaped = (buf != NULL) ? true : false;
+
+        if (!buffer_mmaped) {
+            buf = g_try_malloc0(bitmap_size);
+            if (!buf) {
+                error_report("%s: Error allocating buffer ", __func__);
+                goto dpl_unlock;
+            }
+
+            ret = pread(vbasedev->fd, buf, bitmap_size,
+                        region->fd_offset + data_offset);
+            if (ret != bitmap_size) {
+                error_report("%s: Failed to get dirty pages bitmap %d",
+                             vbasedev->name, ret);
+                g_free(buf);
+                goto dpl_unlock;
+            }
+        }
+
+        cpu_physical_memory_set_dirty_lebitmap((unsigned long *)buf,
+                                               (start_pfn + count) * page_size,
+                                                copied_pfns);
+        count      += copied_pfns;
+        total_pfns -= copied_pfns;
+
+        if (!buffer_mmaped) {
+            g_free(buf);
+        }
+    }
+
+    trace_vfio_get_dirty_page_list(vbasedev->name, start_pfn, pfn_count,
+                                   page_size);
+
+dpl_unlock:
+    qemu_mutex_unlock(&migration->lock);
+}
+
 /* ---------------------------------------------------------------------- */
 
 static int vfio_save_setup(QEMUFile *f, void *opaque)
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index e01c94e6ebd4..268260eb862d 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -159,5 +159,5 @@ vfio_save_iterate(char *name, int data_size) " (%s) data_size %d"
 vfio_save_complete_precopy(char *name) " (%s)"
 vfio_load_device_config_state(char *name) " (%s)"
 vfio_load_state(char *name, uint64_t data) " (%s) data 0x%"PRIx64
-vfio_load_state_device_data(char *name, uint64_t data_offset, uint64_t data_size) " (%s), Offset 0x%"PRIx64" size 0x%"PRIx64
-
+vfio_load_state_device_data(char *name, uint64_t data_offset, uint64_t data_size) " (%s), Offset 0x%"PRIx64" size 0x%"PRIx6
+vfio_get_dirty_page_list(char *name, uint64_t start, uint64_t pfn_count, uint64_t page_size) " (%s) start 0x%"PRIx64" pfn_count 0x%"PRIx64 " page size 0x%"PRIx64
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index a022484d2636..dc1b83a0b4ef 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -222,5 +222,7 @@ int vfio_spapr_remove_window(VFIOContainer *container,
 
 int vfio_migration_probe(VFIODevice *vbasedev, Error **errp);
 void vfio_migration_finalize(VFIODevice *vbasedev);
+void vfio_get_dirty_page_list(VFIODevice *vbasedev, uint64_t start_pfn,
+                               uint64_t pfn_count, uint64_t page_size);
 
 #endif /* HW_VFIO_VFIO_COMMON_H */
-- 
2.7.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v5 12/13] vfio: Add vfio_listerner_log_sync to mark dirty pages
  2019-07-08 19:17 [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device Kirti Wankhede
                   ` (10 preceding siblings ...)
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 11/13] vfio: Add function to get dirty page list Kirti Wankhede
@ 2019-07-08 19:17 ` Kirti Wankhede
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 13/13] vfio: Make vfio-pci device migration capable Kirti Wankhede
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: Kirti Wankhede @ 2019-07-08 19:17 UTC (permalink / raw)
  To: alex.williamson, cjia
  Cc: Zhengxiao.zx, kevin.tian, yi.l.liu, yan.y.zhao, eskultet,
	ziye.yang, qemu-devel, cohuck, shuangtai.tst, dgilbert,
	zhi.a.wang, mlevitsk, pasic, aik, Kirti Wankhede, eauger, felipe,
	jonathan.davies, changpeng.liu, Ken.Xue

vfio_listerner_log_sync gets list of dirty pages from vendor driver and mark
those pages dirty when in _SAVING state.
Return early for the RAM block section of mapped MMIO region.

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
---
 hw/vfio/common.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index de74dae8d6a6..d5ee35c95e76 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -36,6 +36,7 @@
 #include "sysemu/kvm.h"
 #include "trace.h"
 #include "qapi/error.h"
+#include "migration/migration.h"
 
 VFIOGroupList vfio_group_list =
     QLIST_HEAD_INITIALIZER(vfio_group_list);
@@ -794,9 +795,43 @@ static void vfio_listener_region_del(MemoryListener *listener,
     }
 }
 
+static void vfio_listerner_log_sync(MemoryListener *listener,
+                                    MemoryRegionSection *section)
+{
+    uint64_t start_addr, size, pfn_count;
+    VFIOGroup *group;
+    VFIODevice *vbasedev;
+
+    if (memory_region_is_ram_device(section->mr)) {
+        return;
+    }
+
+    QLIST_FOREACH(group, &vfio_group_list, next) {
+        QLIST_FOREACH(vbasedev, &group->device_list, next) {
+            if (vbasedev->device_state & VFIO_DEVICE_STATE_SAVING) {
+                continue;
+            } else {
+                return;
+            }
+        }
+    }
+
+    start_addr = TARGET_PAGE_ALIGN(section->offset_within_address_space);
+    size = int128_get64(section->size);
+    pfn_count = size >> TARGET_PAGE_BITS;
+
+    QLIST_FOREACH(group, &vfio_group_list, next) {
+        QLIST_FOREACH(vbasedev, &group->device_list, next) {
+            vfio_get_dirty_page_list(vbasedev, start_addr >> TARGET_PAGE_BITS,
+                                     pfn_count, TARGET_PAGE_SIZE);
+        }
+    }
+}
+
 static const MemoryListener vfio_memory_listener = {
     .region_add = vfio_listener_region_add,
     .region_del = vfio_listener_region_del,
+    .log_sync = vfio_listerner_log_sync,
 };
 
 static void vfio_listener_release(VFIOContainer *container)
-- 
2.7.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH v5 13/13] vfio: Make vfio-pci device migration capable.
  2019-07-08 19:17 [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device Kirti Wankhede
                   ` (11 preceding siblings ...)
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 12/13] vfio: Add vfio_listerner_log_sync to mark dirty pages Kirti Wankhede
@ 2019-07-08 19:17 ` Kirti Wankhede
  2019-07-08 20:49 ` [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device no-reply
  2019-07-08 22:15 ` no-reply
  14 siblings, 0 replies; 16+ messages in thread
From: Kirti Wankhede @ 2019-07-08 19:17 UTC (permalink / raw)
  To: alex.williamson, cjia
  Cc: Zhengxiao.zx, kevin.tian, yi.l.liu, yan.y.zhao, eskultet,
	ziye.yang, qemu-devel, cohuck, shuangtai.tst, dgilbert,
	zhi.a.wang, mlevitsk, pasic, aik, Kirti Wankhede, eauger, felipe,
	jonathan.davies, changpeng.liu, Ken.Xue

Call vfio_migration_probe() and vfio_migration_finalize() functions for
vfio-pci device to enable migration for vfio PCI device.
Removed vfio_pci_vmstate structure.

Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Neo Jia <cjia@nvidia.com>
---
 hw/vfio/pci.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 5fe4f8076cac..2ea17a814d55 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2852,6 +2852,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
     vdev->vbasedev.ops = &vfio_pci_ops;
     vdev->vbasedev.type = VFIO_DEVICE_TYPE_PCI;
     vdev->vbasedev.dev = DEVICE(vdev);
+    vdev->vbasedev.device_state = 0;
 
     tmp = g_strdup_printf("%s/iommu_group", vdev->vbasedev.sysfsdev);
     len = readlink(tmp, group_path, sizeof(group_path));
@@ -3112,6 +3113,12 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
         }
     }
 
+    ret = vfio_migration_probe(&vdev->vbasedev, errp);
+    if (ret) {
+            error_report("%s: Failed to setup for migration",
+                         vdev->vbasedev.name);
+    }
+
     vfio_register_err_notifier(vdev);
     vfio_register_req_notifier(vdev);
     vfio_setup_resetfn_quirk(vdev);
@@ -3131,6 +3138,7 @@ static void vfio_instance_finalize(Object *obj)
     VFIOPCIDevice *vdev = PCI_VFIO(obj);
     VFIOGroup *group = vdev->vbasedev.group;
 
+    vdev->vbasedev.device_state = 0;
     vfio_display_finalize(vdev);
     vfio_bars_finalize(vdev);
     g_free(vdev->emulated_config_bits);
@@ -3159,6 +3167,7 @@ static void vfio_exitfn(PCIDevice *pdev)
     }
     vfio_teardown_msi(vdev);
     vfio_bars_exit(vdev);
+    vfio_migration_finalize(&vdev->vbasedev);
 }
 
 static void vfio_pci_reset(DeviceState *dev)
@@ -3267,11 +3276,6 @@ static Property vfio_pci_dev_properties[] = {
     DEFINE_PROP_END_OF_LIST(),
 };
 
-static const VMStateDescription vfio_pci_vmstate = {
-    .name = "vfio-pci",
-    .unmigratable = 1,
-};
-
 static void vfio_pci_dev_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
@@ -3279,7 +3283,6 @@ static void vfio_pci_dev_class_init(ObjectClass *klass, void *data)
 
     dc->reset = vfio_pci_reset;
     dc->props = vfio_pci_dev_properties;
-    dc->vmsd = &vfio_pci_vmstate;
     dc->desc = "VFIO-based PCI device assignment";
     set_bit(DEVICE_CATEGORY_MISC, dc->categories);
     pdc->realize = vfio_realize;
-- 
2.7.0



^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device
  2019-07-08 19:17 [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device Kirti Wankhede
                   ` (12 preceding siblings ...)
  2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 13/13] vfio: Make vfio-pci device migration capable Kirti Wankhede
@ 2019-07-08 20:49 ` no-reply
  2019-07-08 22:15 ` no-reply
  14 siblings, 0 replies; 16+ messages in thread
From: no-reply @ 2019-07-08 20:49 UTC (permalink / raw)
  To: kwankhede
  Cc: cjia, aik, Zhengxiao.zx, shuangtai.tst, qemu-devel, kwankhede,
	eauger, yi.l.liu, eskultet, ziye.yang, mlevitsk, pasic, felipe,
	Ken.Xue, kevin.tian, yan.y.zhao, dgilbert, alex.williamson,
	changpeng.liu, cohuck, zhi.a.wang, jonathan.davies

Patchew URL: https://patchew.org/QEMU/1562613452-24969-1-git-send-email-kwankhede@nvidia.com/



Hi,

This series failed build test on s390x host. Please find the details below.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
# Testing script will be invoked under the git checkout with
# HEAD pointing to a commit that has the patches applied on top of "base"
# branch
set -e

echo
echo "=== ENV ==="
env

echo
echo "=== PACKAGES ==="
rpm -qa

echo
echo "=== UNAME ==="
uname -a

CC=$HOME/bin/cc
INSTALL=$PWD/install
BUILD=$PWD/build
mkdir -p $BUILD $INSTALL
SRC=$PWD
cd $BUILD
$SRC/configure --cc=$CC --prefix=$INSTALL
make -j4
# XXX: we need reliable clean up
# make check -j4 V=1
make install
=== TEST SCRIPT END ===

  CC      hw/virtio/trace.o
In file included from hw/vfio/trace.c:4:
hw/vfio/trace.h: In function ‘_nocheck__trace_vfio_save_pending’:
hw/vfio/trace.h:3655:126: error: stray ‘\’ in program
 3655 |         qemu_log("%d@%zu.%06zu:vfio_save_pending " " (%s), precopy 0x%"PRIx64" postcopy 0x%x"PRIx64" compatible 0x%""PRIx64 "\n",
      |                                                                                                                              ^
hw/vfio/trace.h:3655:51: error: expected ‘)’ before ‘n’
 3655 |         qemu_log("%d@%zu.%06zu:vfio_save_pending " " (%s), precopy 0x%"PRIx64" postcopy 0x%x"PRIx64" compatible 0x%""PRIx64 "\n",
      |                                                   ^                                                                           ~
      |                                                   )
hw/vfio/trace.h:3655:128: error: missing terminating " character [-Werror]
 3655 |         qemu_log("%d@%zu.%06zu:vfio_save_pending " " (%s), precopy 0x%"PRIx64" postcopy 0x%x"PRIx64" compatible 0x%""PRIx64 "\n",
      |                                                                                                                                ^
hw/vfio/trace.h:3655:128: error: missing terminating " character
 3655 |         qemu_log("%d@%zu.%06zu:vfio_save_pending " " (%s), precopy 0x%"PRIx64" postcopy 0x%x"PRIx64" compatible 0x%""PRIx64 "\n",
      |                                                                                                                                ^~
hw/vfio/trace.h:3655:20: error: format ‘%d’ expects a matching ‘int’ argument [-Werror=format=]
 3655 |         qemu_log("%d@%zu.%06zu:vfio_save_pending " " (%s), precopy 0x%"PRIx64" postcopy 0x%x"PRIx64" compatible 0x%""PRIx64 "\n",
      |                   ~^
      |                    |
      |                    int
hw/vfio/trace.h:3655:24: error: format ‘%zu’ expects a matching ‘size_t’ argument [-Werror=format=]
 3655 |         qemu_log("%d@%zu.%06zu:vfio_save_pending " " (%s), precopy 0x%"PRIx64" postcopy 0x%x"PRIx64" compatible 0x%""PRIx64 "\n",
      |                      ~~^
      |                        |
      |                        long unsigned int
hw/vfio/trace.h:3655:30: error: format ‘%zu’ expects a matching ‘size_t’ argument [-Werror=format=]
 3655 |         qemu_log("%d@%zu.%06zu:vfio_save_pending " " (%s), precopy 0x%"PRIx64" postcopy 0x%x"PRIx64" compatible 0x%""PRIx64 "\n",
      |                          ~~~~^
      |                              |
      |                              long unsigned int
hw/vfio/trace.h:3655:18: error: format ‘%s’ expects a matching ‘char *’ argument [-Werror=format=]
 3655 |         qemu_log("%d@%zu.%06zu:vfio_save_pending " " (%s), precopy 0x%"PRIx64" postcopy 0x%x"PRIx64" compatible 0x%""PRIx64 "\n",
      |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
hw/vfio/trace.h:3655:56: note: format string is defined here
---
      |                                                       ~^
      |                                                        |
      |                                                        char *
hw/vfio/trace.h:3655:18: error: format ‘%lx’ expects a matching ‘long unsigned int’ argument [-Werror=format=]
 3655 |         qemu_log("%d@%zu.%06zu:vfio_save_pending " " (%s), precopy 0x%"PRIx64" postcopy 0x%x"PRIx64" compatible 0x%""PRIx64 "\n",
      |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /var/tmp/patchew-tester-tmp-gaz1sa3c/src/include/qemu/osdep.h:103,
---
/usr/include/inttypes.h:121:34: note: format string is defined here
  121 | # define PRIx64  __PRI64_PREFIX "x"
In file included from hw/vfio/trace.c:4:
hw/vfio/trace.h:3655:18: error: format ‘%x’ expects a matching ‘unsigned int’ argument [-Werror=format=]
 3655 |         qemu_log("%d@%zu.%06zu:vfio_save_pending " " (%s), precopy 0x%"PRIx64" postcopy 0x%x"PRIx64" compatible 0x%""PRIx64 "\n",
      |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
hw/vfio/trace.h:3655:92: note: format string is defined here
---
      |                                                                                           ~^
      |                                                                                            |
      |                                                                                            unsigned int
hw/vfio/trace.h:3655:18: error: unknown conversion type character ‘P’ in format [-Werror=format=]
 3655 |         qemu_log("%d@%zu.%06zu:vfio_save_pending " " (%s), precopy 0x%"PRIx64" postcopy 0x%x"PRIx64" compatible 0x%""PRIx64 "\n",
      |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
hw/vfio/trace.h:3655:118: note: format string is defined here
 3655 |         qemu_log("%d@%zu.%06zu:vfio_save_pending " " (%s), precopy 0x%"PRIx64" postcopy 0x%x"PRIx64" compatible 0x%""PRIx64 "\n",
      |                                                                                                                      ^
hw/vfio/trace.h: In function ‘_nocheck__trace_vfio_load_state_device_data’:
hw/vfio/trace.h:3770:61: error: expected ‘)’ before ‘PRIx6’
 3770 |         qemu_log("%d@%zu.%06zu:vfio_load_state_device_data " " (%s), Offset 0x%"PRIx64" size 0x%"PRIx6 "\n",
      |                                                             ^                                    ~~~~~
      |                                                             )
hw/vfio/trace.h:3770:20: error: format ‘%d’ expects a matching ‘int’ argument [-Werror=format=]
 3770 |         qemu_log("%d@%zu.%06zu:vfio_load_state_device_data " " (%s), Offset 0x%"PRIx64" size 0x%"PRIx6 "\n",
      |                   ~^
      |                    |
      |                    int
hw/vfio/trace.h:3770:24: error: format ‘%zu’ expects a matching ‘size_t’ argument [-Werror=format=]
 3770 |         qemu_log("%d@%zu.%06zu:vfio_load_state_device_data " " (%s), Offset 0x%"PRIx64" size 0x%"PRIx6 "\n",
      |                      ~~^
      |                        |
      |                        long unsigned int
hw/vfio/trace.h:3770:30: error: format ‘%zu’ expects a matching ‘size_t’ argument [-Werror=format=]
 3770 |         qemu_log("%d@%zu.%06zu:vfio_load_state_device_data " " (%s), Offset 0x%"PRIx64" size 0x%"PRIx6 "\n",
      |                          ~~~~^
      |                              |
      |                              long unsigned int
hw/vfio/trace.h:3770:18: error: format ‘%s’ expects a matching ‘char *’ argument [-Werror=format=]
 3770 |         qemu_log("%d@%zu.%06zu:vfio_load_state_device_data " " (%s), Offset 0x%"PRIx64" size 0x%"PRIx6 "\n",
      |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
hw/vfio/trace.h:3770:66: note: format string is defined here
---
      |                                                                 ~^
      |                                                                  |
      |                                                                  char *
hw/vfio/trace.h:3770:18: error: format ‘%lx’ expects a matching ‘long unsigned int’ argument [-Werror=format=]
 3770 |         qemu_log("%d@%zu.%06zu:vfio_load_state_device_data " " (%s), Offset 0x%"PRIx64" size 0x%"PRIx6 "\n",
      |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /var/tmp/patchew-tester-tmp-gaz1sa3c/src/include/qemu/osdep.h:103,
---
/usr/include/inttypes.h:121:34: note: format string is defined here
  121 | # define PRIx64  __PRI64_PREFIX "x"
In file included from hw/vfio/trace.c:4:
hw/vfio/trace.h:3770:18: error: spurious trailing ‘%’ in format [-Werror=format=]
 3770 |         qemu_log("%d@%zu.%06zu:vfio_load_state_device_data " " (%s), Offset 0x%"PRIx64" size 0x%"PRIx6 "\n",
      |                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
hw/vfio/trace.h:3770:96: note: format string is defined here


The full log is available at
http://patchew.org/logs/1562613452-24969-1-git-send-email-kwankhede@nvidia.com/testing.s390x/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device
  2019-07-08 19:17 [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device Kirti Wankhede
                   ` (13 preceding siblings ...)
  2019-07-08 20:49 ` [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device no-reply
@ 2019-07-08 22:15 ` no-reply
  14 siblings, 0 replies; 16+ messages in thread
From: no-reply @ 2019-07-08 22:15 UTC (permalink / raw)
  To: kwankhede
  Cc: cjia, aik, Zhengxiao.zx, shuangtai.tst, qemu-devel, kwankhede,
	eauger, yi.l.liu, eskultet, ziye.yang, mlevitsk, pasic, felipe,
	Ken.Xue, kevin.tian, yan.y.zhao, dgilbert, alex.williamson,
	changpeng.liu, cohuck, zhi.a.wang, jonathan.davies

Patchew URL: https://patchew.org/QEMU/1562613452-24969-1-git-send-email-kwankhede@nvidia.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      block/qed-table.o
  CC      block/qed-cluster.o
In file included from hw/vfio/trace.c:4:
/tmp/qemu-test/build/hw/vfio/trace.h:3655:126: error: expected ')'
        qemu_log("%d@%zu.%06zu:vfio_save_pending " " (%s), precopy 0x%"PRIx64" postcopy 0x%x"PRIx64" compatible 0x%""PRIx64 "\n",
                                                                                                                             ^
/tmp/qemu-test/build/hw/vfio/trace.h:3655:17: note: to match this '('
        qemu_log("%d@%zu.%06zu:vfio_save_pending " " (%s), precopy 0x%"PRIx64" postcopy 0x%x"PRIx64" compatible 0x%""PRIx64 "\n",
                ^
/tmp/qemu-test/build/hw/vfio/trace.h:3655:128: error: missing terminating '"' character [-Werror,-Winvalid-pp-token]
        qemu_log("%d@%zu.%06zu:vfio_save_pending " " (%s), precopy 0x%"PRIx64" postcopy 0x%x"PRIx64" compatible 0x%""PRIx64 "\n",
                                                                                                                               ^
/tmp/qemu-test/build/hw/vfio/trace.h:3770:98: error: expected ')'
        qemu_log("%d@%zu.%06zu:vfio_load_state_device_data " " (%s), Offset 0x%"PRIx64" size 0x%"PRIx6 "\n",
                                                                                                 ^
/tmp/qemu-test/build/hw/vfio/trace.h:3770:17: note: to match this '('


The full log is available at
http://patchew.org/logs/1562613452-24969-1-git-send-email-kwankhede@nvidia.com/testing.asan/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-07-08 22:17 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-07-08 19:17 [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device Kirti Wankhede
2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 01/13] vfio: KABI for migration interface Kirti Wankhede
2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 02/13] vfio: Add function to unmap VFIO region Kirti Wankhede
2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 03/13] vfio: Add vfio_get_object callback to VFIODeviceOps Kirti Wankhede
2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 04/13] vfio: Add save and load functions for VFIO PCI devices Kirti Wankhede
2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 05/13] vfio: Add migration region initialization and finalize function Kirti Wankhede
2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 06/13] vfio: Add VM state change handler to know state of VM Kirti Wankhede
2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 07/13] vfio: Add migration state change notifier Kirti Wankhede
2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 08/13] vfio: Register SaveVMHandlers for VFIO device Kirti Wankhede
2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 09/13] vfio: Add save state functions to SaveVMHandlers Kirti Wankhede
2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 10/13] vfio: Add load " Kirti Wankhede
2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 11/13] vfio: Add function to get dirty page list Kirti Wankhede
2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 12/13] vfio: Add vfio_listerner_log_sync to mark dirty pages Kirti Wankhede
2019-07-08 19:17 ` [Qemu-devel] [PATCH v5 13/13] vfio: Make vfio-pci device migration capable Kirti Wankhede
2019-07-08 20:49 ` [Qemu-devel] [PATCH v5 00/13] Add migration support for VFIO device no-reply
2019-07-08 22:15 ` no-reply

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).