* [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration
@ 2025-10-21 22:41 Michał Winiarski
2025-10-21 22:41 ` [PATCH v2 01/26] drm/xe/pf: Remove GuC version check for migration support Michał Winiarski
` (25 more replies)
0 siblings, 26 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
Hi,
This is a second round of patches introducing support for Xe SR-IOV VF
migration.
I appreciate all of the review feedback and I hope I was able to address
all of the requested changes.
Outside of the usual small tweaks, one change that stands out and sends
some ripples through the series is the reorganization of the PF state
machine. Another important thing to note is the addition of a patch
that removes the need to use the Xe Kconfig debug flag in order to use
migration.
Cover letter from the previous revision:
Xe is a DRM driver supporting Intel GPUs. For SR-IOV capable devices,
it enables the creation of SR-IOV VFs.
This series adds an xe-vfio-pci driver variant that interacts with the
Xe driver to control VF device state and read/write migration data,
extending regular vfio-pci functionality with the VFIO migration
capability.
The driver doesn't expose PRE_COPY support, as currently supported
hardware lacks the capability to track dirty pages.
While the Xe driver already had the capability to manage VF device
state, management of migration data still needed to be implemented and
constitutes the majority of the series.
The migration data is processed asynchronously by the Xe driver, and is
organized into multiple migration data packet types representing the
hardware interfaces of the device (GGTT / MMIO / GuC FW / VRAM).
Since the VRAM can potentially be larger than the available system
memory, it is copied in multiple chunks. The metadata needed for
migration compatibility decisions is added as part of the descriptor
packet (currently limited to PCI device ID / revision).
The Xe driver abstracts away the internals of packet processing and
takes care of tracking the position within individual packets.
The API exported to VFIO mirrors the API that VFIO exposes to
userspace: a simple .read()/.write().
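For illustration, the chunked packet flow described above can be
sketched roughly like this; the header layout, the packet-type names,
and the read()-style callback are hypothetical stand-ins for the real
Xe/VFIO interface, not its actual definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical sketch only: each migration data packet carries a type
 * (descriptor / GGTT / MMIO / GuC FW / VRAM) and a payload size, and
 * large payloads such as VRAM are drained in fixed-size chunks through
 * a simple read()-style callback, so the whole payload never has to fit
 * in system memory at once.
 */
enum pkt_type { PKT_DESC, PKT_GGTT, PKT_MMIO, PKT_GUC, PKT_VRAM };

struct pkt_hdr {
	uint32_t type;	/* enum pkt_type */
	uint64_t size;	/* payload bytes that follow the header */
};

/* read()-like producer: copies up to len bytes at off, returns bytes produced */
typedef size_t (*mig_read_fn)(void *src, size_t off, void *buf, size_t len);

/* Drain one packet payload in chunks, tracking the position within it. */
static size_t drain_packet(const struct pkt_hdr *hdr, void *src,
			   mig_read_fn read_cb, void *out)
{
	size_t chunk = 4096, off = 0;

	while (off < hdr->size) {
		size_t want = hdr->size - off < chunk ? hdr->size - off : chunk;
		size_t got = read_cb(src, off, (char *)out + off, want);

		if (!got)
			break;	/* producer ran dry */
		off += got;
	}
	return off;	/* total payload bytes copied */
}
```

The real driver additionally multiplexes several packet types into one
stream; the sketch only shows the per-packet chunking and position
tracking that the cover letter describes.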
Note that some of the VF resources are not virtualized (e.g. GGTT - the
GFX device global virtual address space). This means that the VF driver
needs to be aware that a migration has occurred in order to properly
relocate (patching or re-emitting data that contains references to GGTT
addresses) before resuming operation.
The code to handle that is already present in upstream Linux and in
production VF drivers for other OSes.
v1 -> v2:
* Do not require debug flag to support migration on PTL/BMG
* Fix PCI class match on VFIO side
* Reorganized PF Control state machine (Michał Wajdeczko)
* Kerneldoc tidying (Michał Wajdeczko)
* Return NULL instead of -ENODATA for produce/consume (Michał Wajdeczko)
* guc_buf s/sync/sync_read (Matt Brost)
* Squash patch 03 (Matt Brost)
* Assert on PM ref instead of taking it (Matt Brost)
* Remove CCS completely (Matt Brost)
* Return ptr on guc_buf_sync_read (Michał Wajdeczko)
* Define default guc_buf size (Michał Wajdeczko)
* Drop CONFIG_PCI_IOV=n stubs where not needed (Michał Wajdeczko)
* And other, more minor changes
Lukasz Laguna (2):
drm/xe/pf: Add helper to retrieve VF's LMEM object
drm/xe/migrate: Add function to copy of VRAM data in chunks
Michał Winiarski (24):
drm/xe/pf: Remove GuC version check for migration support
drm/xe: Move migration support to device-level struct
drm/xe/pf: Add save/restore control state stubs and connect to debugfs
drm/xe/pf: Add data structures and handlers for migration rings
drm/xe/pf: Add helpers for migration data allocation / free
drm/xe/pf: Add support for encap/decap of bitstream to/from packet
drm/xe/pf: Add minimalistic migration descriptor
drm/xe/pf: Expose VF migration data size over debugfs
drm/xe: Add sa/guc_buf_cache sync interface
drm/xe: Allow the caller to pass guc_buf_cache size
drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF
migration
drm/xe/pf: Remove GuC migration data save/restore from GT debugfs
drm/xe/pf: Don't save GuC VF migration data on pause
drm/xe/pf: Switch VF migration GuC save/restore to struct migration
data
drm/xe/pf: Handle GuC migration data as part of PF control
drm/xe/pf: Add helpers for VF GGTT migration data handling
drm/xe/pf: Handle GGTT migration data as part of PF control
drm/xe/pf: Add helpers for VF MMIO migration data handling
drm/xe/pf: Handle MMIO migration data as part of PF control
drm/xe/pf: Handle VRAM migration data as part of PF control
drm/xe/pf: Add wait helper for VF FLR
drm/xe/pf: Enable SR-IOV VF migration for PTL and BMG
drm/xe/pf: Export helpers for VFIO
vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
MAINTAINERS | 7 +
drivers/gpu/drm/xe/Makefile | 4 +
drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c | 2 +-
drivers/gpu/drm/xe/xe_device.h | 5 +
drivers/gpu/drm/xe/xe_device_types.h | 2 +
drivers/gpu/drm/xe/xe_ggtt.c | 100 ++
drivers/gpu/drm/xe/xe_ggtt.h | 3 +
drivers/gpu/drm/xe/xe_ggtt_types.h | 2 +
drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 88 ++
drivers/gpu/drm/xe/xe_gt_sriov_pf.h | 6 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 75 ++
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h | 6 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 595 ++++++++++-
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h | 8 +
.../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 36 +-
drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 47 -
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 933 ++++++++++++++----
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 47 +-
.../drm/xe/xe_gt_sriov_pf_migration_types.h | 30 +-
drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h | 5 +-
drivers/gpu/drm/xe/xe_guc.c | 14 +-
drivers/gpu/drm/xe/xe_guc_buf.c | 19 +-
drivers/gpu/drm/xe/xe_guc_buf.h | 5 +-
drivers/gpu/drm/xe/xe_migrate.c | 134 ++-
drivers/gpu/drm/xe/xe_migrate.h | 8 +
drivers/gpu/drm/xe/xe_pci.c | 8 +-
drivers/gpu/drm/xe/xe_pci_types.h | 1 +
drivers/gpu/drm/xe/xe_sa.c | 21 +
drivers/gpu/drm/xe/xe_sa.h | 1 +
drivers/gpu/drm/xe/xe_sriov_migration_data.c | 540 ++++++++++
drivers/gpu/drm/xe/xe_sriov_migration_data.h | 38 +
drivers/gpu/drm/xe/xe_sriov_pf.c | 5 +
drivers/gpu/drm/xe/xe_sriov_pf_control.c | 128 +++
drivers/gpu/drm/xe/xe_sriov_pf_control.h | 5 +
drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 102 ++
drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 276 ++++++
drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 24 +
.../gpu/drm/xe/xe_sriov_pf_migration_types.h | 67 ++
drivers/gpu/drm/xe/xe_sriov_pf_types.h | 9 +
drivers/gpu/drm/xe/xe_sriov_vfio.c | 296 ++++++
drivers/vfio/pci/Kconfig | 2 +
drivers/vfio/pci/Makefile | 2 +
drivers/vfio/pci/xe/Kconfig | 12 +
drivers/vfio/pci/xe/Makefile | 3 +
drivers/vfio/pci/xe/main.c | 470 +++++++++
include/drm/intel/xe_sriov_vfio.h | 28 +
46 files changed, 3892 insertions(+), 327 deletions(-)
create mode 100644 drivers/gpu/drm/xe/xe_sriov_migration_data.c
create mode 100644 drivers/gpu/drm/xe/xe_sriov_migration_data.h
create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration.c
create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration.h
create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
create mode 100644 drivers/gpu/drm/xe/xe_sriov_vfio.c
create mode 100644 drivers/vfio/pci/xe/Kconfig
create mode 100644 drivers/vfio/pci/xe/Makefile
create mode 100644 drivers/vfio/pci/xe/main.c
create mode 100644 include/drm/intel/xe_sriov_vfio.h
--
2.50.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [PATCH v2 01/26] drm/xe/pf: Remove GuC version check for migration support
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-28 2:33 ` Tian, Kevin
2025-10-21 22:41 ` [PATCH v2 02/26] drm/xe: Move migration support to device-level struct Michał Winiarski
` (24 subsequent siblings)
25 siblings, 1 reply; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
Since commit 4eb0aab6e4434 ("drm/xe/guc: Bump minimum required GuC
version to v70.29.2"), the minimum GuC version required by the driver
is v70.29.2, which should already include everything that we need for
migration.
Remove the version check.
Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index 44cc612b0a752..a5bf327ef8889 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -384,9 +384,6 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
static bool pf_check_migration_support(struct xe_gt *gt)
{
- /* GuC 70.25 with save/restore v2 is required */
- xe_gt_assert(gt, GUC_FIRMWARE_VER(&gt->uc.guc) >= MAKE_GUC_VER(70, 25, 0));
-
/* XXX: for now this is for feature enabling only */
return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
}
--
2.50.1
^ permalink raw reply related [flat|nested] 72+ messages in thread
* [PATCH v2 02/26] drm/xe: Move migration support to device-level struct
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
2025-10-21 22:41 ` [PATCH v2 01/26] drm/xe/pf: Remove GuC version check for migration support Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-21 22:41 ` [PATCH v2 03/26] drm/xe/pf: Add save/restore control state stubs and connect to debugfs Michał Winiarski
` (23 subsequent siblings)
25 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
Upcoming changes will allow users to control VF state and obtain its
migration data at device-level granularity (rather than per tile/GT).
Change the data structures to reflect that and move the GT-level
migration init to happen after device-level init.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/xe/Makefile | 1 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 12 +-----
.../drm/xe/xe_gt_sriov_pf_migration_types.h | 3 --
drivers/gpu/drm/xe/xe_sriov_pf.c | 5 +++
drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 41 +++++++++++++++++++
drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 16 ++++++++
.../gpu/drm/xe/xe_sriov_pf_migration_types.h | 0
drivers/gpu/drm/xe/xe_sriov_pf_types.h | 6 +++
8 files changed, 71 insertions(+), 13 deletions(-)
create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration.c
create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration.h
create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 82c6b3d296769..89e5b26c27975 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -176,6 +176,7 @@ xe-$(CONFIG_PCI_IOV) += \
xe_sriov_pf.o \
xe_sriov_pf_control.o \
xe_sriov_pf_debugfs.o \
+ xe_sriov_pf_migration.o \
xe_sriov_pf_provision.o \
xe_sriov_pf_service.o \
xe_tile_sriov_pf_debugfs.o
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index a5bf327ef8889..ca28f45aaf481 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -13,6 +13,7 @@
#include "xe_guc.h"
#include "xe_guc_ct.h"
#include "xe_sriov.h"
+#include "xe_sriov_pf_migration.h"
/* Return: number of dwords saved/restored/required or a negative error code on failure */
static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
@@ -115,8 +116,7 @@ static int pf_send_guc_restore_vf_state(struct xe_gt *gt, unsigned int vfid,
static bool pf_migration_supported(struct xe_gt *gt)
{
- xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
- return gt->sriov.pf.migration.supported;
+ return xe_sriov_pf_migration_supported(gt_to_xe(gt));
}
static struct mutex *pf_migration_mutex(struct xe_gt *gt)
@@ -382,12 +382,6 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
}
#endif /* CONFIG_DEBUG_FS */
-static bool pf_check_migration_support(struct xe_gt *gt)
-{
- /* XXX: for now this is for feature enabling only */
- return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
-}
-
/**
* xe_gt_sriov_pf_migration_init() - Initialize support for VF migration.
* @gt: the &xe_gt
@@ -403,8 +397,6 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
xe_gt_assert(gt, IS_SRIOV_PF(xe));
- gt->sriov.pf.migration.supported = pf_check_migration_support(gt);
-
if (!pf_migration_supported(gt))
return 0;
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
index 1f3110b6d44fa..9d672feac5f04 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
@@ -30,9 +30,6 @@ struct xe_gt_sriov_state_snapshot {
* Used by the PF driver to maintain non-VF specific per-GT data.
*/
struct xe_gt_sriov_pf_migration {
- /** @supported: indicates whether the feature is supported */
- bool supported;
-
/** @snapshot_lock: protects all VFs snapshots */
struct mutex snapshot_lock;
};
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf.c b/drivers/gpu/drm/xe/xe_sriov_pf.c
index bc1ab9ee31d92..95743c7af8050 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf.c
@@ -15,6 +15,7 @@
#include "xe_sriov.h"
#include "xe_sriov_pf.h"
#include "xe_sriov_pf_helpers.h"
+#include "xe_sriov_pf_migration.h"
#include "xe_sriov_pf_service.h"
#include "xe_sriov_printk.h"
@@ -101,6 +102,10 @@ int xe_sriov_pf_init_early(struct xe_device *xe)
if (err)
return err;
+ err = xe_sriov_pf_migration_init(xe);
+ if (err)
+ return err;
+
xe_sriov_pf_service_init(xe);
return 0;
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
new file mode 100644
index 0000000000000..8c523c392f98b
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
@@ -0,0 +1,41 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include "xe_sriov.h"
+#include "xe_sriov_pf_migration.h"
+
+/**
+ * xe_sriov_pf_migration_supported() - Check if SR-IOV VF migration is supported by the device
+ * @xe: the &xe_device
+ *
+ * Return: true if migration is supported, false otherwise
+ */
+bool xe_sriov_pf_migration_supported(struct xe_device *xe)
+{
+ xe_assert(xe, IS_SRIOV_PF(xe));
+
+ return xe->sriov.pf.migration.supported;
+}
+
+static bool pf_check_migration_support(struct xe_device *xe)
+{
+ /* XXX: for now this is for feature enabling only */
+ return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
+}
+
+/**
+ * xe_sriov_pf_migration_init() - Initialize support for SR-IOV VF migration.
+ * @xe: the &xe_device
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_pf_migration_init(struct xe_device *xe)
+{
+ xe_assert(xe, IS_SRIOV_PF(xe));
+
+ xe->sriov.pf.migration.supported = pf_check_migration_support(xe);
+
+ return 0;
+}
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
new file mode 100644
index 0000000000000..d2b4a24165438
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_SRIOV_PF_MIGRATION_H_
+#define _XE_SRIOV_PF_MIGRATION_H_
+
+#include <linux/types.h>
+
+struct xe_device;
+
+int xe_sriov_pf_migration_init(struct xe_device *xe);
+bool xe_sriov_pf_migration_supported(struct xe_device *xe);
+
+#endif
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
new file mode 100644
index 0000000000000..e69de29bb2d1d
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
index c753cd59aed2b..24d22afeececa 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_types.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
@@ -39,6 +39,12 @@ struct xe_device_pf {
/** @provision: device level provisioning data. */
struct xe_sriov_pf_provision provision;
+ /** @migration: device level VF migration data */
+ struct {
+ /** @migration.supported: indicates whether VF migration feature is supported */
+ bool supported;
+ } migration;
+
/** @service: device level service data. */
struct xe_sriov_pf_service service;
--
2.50.1
^ permalink raw reply related [flat|nested] 72+ messages in thread
* [PATCH v2 03/26] drm/xe/pf: Add save/restore control state stubs and connect to debugfs
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
2025-10-21 22:41 ` [PATCH v2 01/26] drm/xe/pf: Remove GuC version check for migration support Michał Winiarski
2025-10-21 22:41 ` [PATCH v2 02/26] drm/xe: Move migration support to device-level struct Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-22 22:31 ` Michal Wajdeczko
2025-10-28 3:06 ` Tian, Kevin
2025-10-21 22:41 ` [PATCH v2 04/26] drm/xe/pf: Add data structures and handlers for migration rings Michał Winiarski
` (22 subsequent siblings)
25 siblings, 2 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
The states will be used by upcoming changes to produce (in case of save)
or consume (in case of restore) the VF migration data.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 248 ++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h | 6 +
.../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 14 +
drivers/gpu/drm/xe/xe_sriov_pf_control.c | 96 +++++++
drivers/gpu/drm/xe/xe_sriov_pf_control.h | 4 +
drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 38 +++
6 files changed, 406 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
index 2e6bd3d1fe1da..b770916e88e53 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
@@ -184,6 +184,12 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
CASE2STR(PAUSE_SAVE_GUC);
CASE2STR(PAUSE_FAILED);
CASE2STR(PAUSED);
+ CASE2STR(SAVE_WIP);
+ CASE2STR(SAVE_FAILED);
+ CASE2STR(SAVED);
+ CASE2STR(RESTORE_WIP);
+ CASE2STR(RESTORE_FAILED);
+ CASE2STR(RESTORED);
CASE2STR(RESUME_WIP);
CASE2STR(RESUME_SEND_RESUME);
CASE2STR(RESUME_FAILED);
@@ -208,6 +214,8 @@ static unsigned long pf_get_default_timeout(enum xe_gt_sriov_control_bits bit)
case XE_GT_SRIOV_STATE_FLR_WIP:
case XE_GT_SRIOV_STATE_FLR_RESET_CONFIG:
return 5 * HZ;
+ case XE_GT_SRIOV_STATE_RESTORE_WIP:
+ return 20 * HZ;
default:
return HZ;
}
@@ -329,6 +337,8 @@ static void pf_exit_vf_mismatch(struct xe_gt *gt, unsigned int vfid)
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_FAILED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUME_FAILED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_FLR_FAILED);
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED);
}
#define pf_enter_vf_state_machine_bug(gt, vfid) ({ \
@@ -359,6 +369,8 @@ static void pf_queue_vf(struct xe_gt *gt, unsigned int vfid)
static void pf_exit_vf_flr_wip(struct xe_gt *gt, unsigned int vfid);
static void pf_exit_vf_stop_wip(struct xe_gt *gt, unsigned int vfid);
+static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid);
+static void pf_exit_vf_restore_wip(struct xe_gt *gt, unsigned int vfid);
static void pf_exit_vf_pause_wip(struct xe_gt *gt, unsigned int vfid);
static void pf_exit_vf_resume_wip(struct xe_gt *gt, unsigned int vfid);
@@ -380,6 +392,8 @@ static void pf_exit_vf_wip(struct xe_gt *gt, unsigned int vfid)
pf_exit_vf_flr_wip(gt, vfid);
pf_exit_vf_stop_wip(gt, vfid);
+ pf_exit_vf_save_wip(gt, vfid);
+ pf_exit_vf_restore_wip(gt, vfid);
pf_exit_vf_pause_wip(gt, vfid);
pf_exit_vf_resume_wip(gt, vfid);
@@ -399,6 +413,8 @@ static void pf_enter_vf_ready(struct xe_gt *gt, unsigned int vfid)
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOPPED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUMED);
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED);
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED);
pf_exit_vf_mismatch(gt, vfid);
pf_exit_vf_wip(gt, vfid);
}
@@ -675,6 +691,8 @@ static void pf_enter_vf_resumed(struct xe_gt *gt, unsigned int vfid)
{
pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUMED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED);
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED);
pf_exit_vf_mismatch(gt, vfid);
pf_exit_vf_wip(gt, vfid);
}
@@ -753,6 +771,16 @@ int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid)
return -EPERM;
}
+ if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
+ xe_gt_sriov_dbg(gt, "VF%u save is in progress!\n", vfid);
+ return -EBUSY;
+ }
+
+ if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
+ xe_gt_sriov_dbg(gt, "VF%u restore is in progress!\n", vfid);
+ return -EBUSY;
+ }
+
if (!pf_enter_vf_resume_wip(gt, vfid)) {
xe_gt_sriov_dbg(gt, "VF%u resume already in progress!\n", vfid);
return -EALREADY;
@@ -776,6 +804,218 @@ int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid)
return -ECANCELED;
}
+static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
+{
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
+}
+
+static void pf_enter_vf_saved(struct xe_gt *gt, unsigned int vfid)
+{
+ if (!pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED))
+ pf_enter_vf_state_machine_bug(gt, vfid);
+
+ xe_gt_sriov_dbg(gt, "VF%u saved!\n", vfid);
+
+ pf_exit_vf_mismatch(gt, vfid);
+ pf_exit_vf_wip(gt, vfid);
+ pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
+}
+
+static bool pf_handle_vf_save(struct xe_gt *gt, unsigned int vfid)
+{
+ if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP))
+ return false;
+
+ pf_enter_vf_saved(gt, vfid);
+
+ return true;
+}
+
+static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
+{
+ if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
+ pf_enter_vf_wip(gt, vfid);
+ pf_queue_vf(gt, vfid);
+ return true;
+ }
+
+ return false;
+}
+
+/**
+ * xe_gt_sriov_pf_control_trigger_save_vf() - Start an SR-IOV VF migration data save sequence.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_control_trigger_save_vf(struct xe_gt *gt, unsigned int vfid)
+{
+ if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOPPED)) {
+ xe_gt_sriov_dbg(gt, "VF%u is stopped!\n", vfid);
+ return -EPERM;
+ }
+
+ if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED)) {
+ xe_gt_sriov_dbg(gt, "VF%u is not paused!\n", vfid);
+ return -EPERM;
+ }
+
+ if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
+ xe_gt_sriov_dbg(gt, "VF%u restore is in progress!\n", vfid);
+ return -EBUSY;
+ }
+
+ if (!pf_enter_vf_save_wip(gt, vfid)) {
+ xe_gt_sriov_dbg(gt, "VF%u save already in progress!\n", vfid);
+ return -EALREADY;
+ }
+
+ return 0;
+}
+
+/**
+ * xe_gt_sriov_pf_control_finish_save_vf() - Complete a VF migration data save sequence.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_control_finish_save_vf(struct xe_gt *gt, unsigned int vfid)
+{
+ if (!pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED)) {
+ pf_enter_vf_mismatch(gt, vfid);
+ return -EIO;
+ }
+
+ pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
+
+ return 0;
+}
+
+static void pf_exit_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
+{
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP);
+}
+
+static void pf_enter_vf_restored(struct xe_gt *gt, unsigned int vfid)
+{
+ if (!pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED))
+ pf_enter_vf_state_machine_bug(gt, vfid);
+
+ xe_gt_sriov_dbg(gt, "VF%u restored!\n", vfid);
+
+ pf_exit_vf_mismatch(gt, vfid);
+ pf_exit_vf_wip(gt, vfid);
+ pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
+}
+
+static bool pf_handle_vf_restore(struct xe_gt *gt, unsigned int vfid)
+{
+ if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
+ return false;
+
+ pf_enter_vf_restored(gt, vfid);
+
+ return true;
+}
+
+static bool pf_enter_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
+{
+ if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
+ pf_enter_vf_wip(gt, vfid);
+ pf_queue_vf(gt, vfid);
+ return true;
+ }
+
+ return false;
+}
+
+/**
+ * xe_gt_sriov_pf_control_trigger_restore_vf() - Start an SR-IOV VF migration data restore sequence.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_control_trigger_restore_vf(struct xe_gt *gt, unsigned int vfid)
+{
+ if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOPPED)) {
+ xe_gt_sriov_dbg(gt, "VF%u is stopped!\n", vfid);
+ return -EPERM;
+ }
+
+ if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED)) {
+ xe_gt_sriov_dbg(gt, "VF%u is not paused!\n", vfid);
+ return -EPERM;
+ }
+
+ if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
+ xe_gt_sriov_dbg(gt, "VF%u save is in progress!\n", vfid);
+ return -EBUSY;
+ }
+
+ if (!pf_enter_vf_restore_wip(gt, vfid)) {
+ xe_gt_sriov_dbg(gt, "VF%u restore already in progress!\n", vfid);
+ return -EALREADY;
+ }
+
+ return 0;
+}
+
+static int pf_wait_vf_restore_done(struct xe_gt *gt, unsigned int vfid)
+{
+ unsigned long timeout = pf_get_default_timeout(XE_GT_SRIOV_STATE_RESTORE_WIP);
+ int err;
+
+ err = pf_wait_vf_wip_done(gt, vfid, timeout);
+ if (err) {
+ xe_gt_sriov_notice(gt, "VF%u RESTORE didn't finish in %u ms (%pe)\n",
+ vfid, jiffies_to_msecs(timeout), ERR_PTR(err));
+ return err;
+ }
+
+ if (!pf_expect_vf_not_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED))
+ return -EIO;
+
+ return 0;
+}
+
+/**
+ * xe_gt_sriov_pf_control_finish_restore_vf() - Complete a VF migration data restore sequence.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_control_finish_restore_vf(struct xe_gt *gt, unsigned int vfid)
+{
+ int ret;
+
+ if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
+ ret = pf_wait_vf_restore_done(gt, vfid);
+ if (ret)
+ return ret;
+ }
+
+ if (!pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED)) {
+ pf_enter_vf_mismatch(gt, vfid);
+ return -EIO;
+ }
+
+ pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
+
+ return 0;
+}
+
/**
* DOC: The VF STOP state machine
*
@@ -817,6 +1057,8 @@ static void pf_enter_vf_stopped(struct xe_gt *gt, unsigned int vfid)
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUMED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED);
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED);
pf_exit_vf_mismatch(gt, vfid);
pf_exit_vf_wip(gt, vfid);
}
@@ -1461,6 +1703,12 @@ static bool pf_process_vf_state_machine(struct xe_gt *gt, unsigned int vfid)
if (pf_exit_vf_pause_save_guc(gt, vfid))
return true;
+ if (pf_handle_vf_save(gt, vfid))
+ return true;
+
+ if (pf_handle_vf_restore(gt, vfid))
+ return true;
+
if (pf_exit_vf_resume_send_resume(gt, vfid))
return true;
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
index 8a72ef3778d47..abc233f6302ed 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
@@ -14,8 +14,14 @@ struct xe_gt;
int xe_gt_sriov_pf_control_init(struct xe_gt *gt);
void xe_gt_sriov_pf_control_restart(struct xe_gt *gt);
+bool xe_gt_sriov_pf_control_check_vf_data_wip(struct xe_gt *gt, unsigned int vfid);
+
int xe_gt_sriov_pf_control_pause_vf(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_control_trigger_save_vf(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_control_finish_save_vf(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_control_trigger_restore_vf(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_control_finish_restore_vf(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_control_stop_vf(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_control_trigger_flr(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_control_sync_flr(struct xe_gt *gt, unsigned int vfid, bool sync);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
index c80b7e77f1ad2..e113dc98b33ce 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
@@ -31,6 +31,12 @@
* @XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC: indicates that the PF needs to save the VF GuC state.
* @XE_GT_SRIOV_STATE_PAUSE_FAILED: indicates that a VF pause operation has failed.
* @XE_GT_SRIOV_STATE_PAUSED: indicates that the VF is paused.
+ * @XE_GT_SRIOV_STATE_SAVE_WIP: indicates that VF save operation is in progress.
+ * @XE_GT_SRIOV_STATE_SAVE_FAILED: indicates that VF save operation has failed.
+ * @XE_GT_SRIOV_STATE_SAVED: indicates that VF data is saved.
+ * @XE_GT_SRIOV_STATE_RESTORE_WIP: indicates that VF restore operation is in progress.
+ * @XE_GT_SRIOV_STATE_RESTORE_FAILED: indicates that VF restore operation has failed.
+ * @XE_GT_SRIOV_STATE_RESTORED: indicates that VF data is restored.
* @XE_GT_SRIOV_STATE_RESUME_WIP: indicates that a VF resume operation is in progress.
* @XE_GT_SRIOV_STATE_RESUME_SEND_RESUME: indicates that the PF is about to send RESUME command.
* @XE_GT_SRIOV_STATE_RESUME_FAILED: indicates that a VF resume operation has failed.
@@ -63,6 +69,14 @@ enum xe_gt_sriov_control_bits {
XE_GT_SRIOV_STATE_PAUSE_FAILED,
XE_GT_SRIOV_STATE_PAUSED,
+ XE_GT_SRIOV_STATE_SAVE_WIP,
+ XE_GT_SRIOV_STATE_SAVE_FAILED,
+ XE_GT_SRIOV_STATE_SAVED,
+
+ XE_GT_SRIOV_STATE_RESTORE_WIP,
+ XE_GT_SRIOV_STATE_RESTORE_FAILED,
+ XE_GT_SRIOV_STATE_RESTORED,
+
XE_GT_SRIOV_STATE_RESUME_WIP,
XE_GT_SRIOV_STATE_RESUME_SEND_RESUME,
XE_GT_SRIOV_STATE_RESUME_FAILED,
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
index 416d00a03fbb7..8d8a01faf5291 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
@@ -149,3 +149,99 @@ int xe_sriov_pf_control_sync_flr(struct xe_device *xe, unsigned int vfid)
return 0;
}
+
+/**
+ * xe_sriov_pf_control_trigger_save_vf - Start a VF migration data SAVE sequence on all GTs.
+ * @xe: the &xe_device
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_pf_control_trigger_save_vf(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_gt *gt;
+ unsigned int id;
+ int ret;
+
+ for_each_gt(gt, xe, id) {
+ ret = xe_gt_sriov_pf_control_trigger_save_vf(gt, vfid);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+/**
+ * xe_sriov_pf_control_finish_save_vf - Complete a VF migration data SAVE sequence on all GTs.
+ * @xe: the &xe_device
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_pf_control_finish_save_vf(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_gt *gt;
+ unsigned int id;
+ int ret = 0;
+
+ for_each_gt(gt, xe, id) {
+ ret = xe_gt_sriov_pf_control_finish_save_vf(gt, vfid);
+ if (ret)
+ break;
+ }
+
+ return ret;
+}
+
+/**
+ * xe_sriov_pf_control_trigger_restore_vf - Start a VF migration data RESTORE sequence on all GTs.
+ * @xe: the &xe_device
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_pf_control_trigger_restore_vf(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_gt *gt;
+ unsigned int id;
+ int ret;
+
+ for_each_gt(gt, xe, id) {
+ ret = xe_gt_sriov_pf_control_trigger_restore_vf(gt, vfid);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+/**
+ * xe_sriov_pf_control_finish_restore_vf - Complete a VF migration data RESTORE sequence on all GTs.
+ * @xe: the &xe_device
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_pf_control_finish_restore_vf(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_gt *gt;
+ unsigned int id;
+ int ret = 0;
+
+ for_each_gt(gt, xe, id) {
+ ret = xe_gt_sriov_pf_control_finish_restore_vf(gt, vfid);
+ if (ret)
+ break;
+ }
+
+ return ret;
+}
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_sriov_pf_control.h
index 2d52d0ac1b28f..30318c1fba34e 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_control.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.h
@@ -13,5 +13,9 @@ int xe_sriov_pf_control_resume_vf(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_stop_vf(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_reset_vf(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_sync_flr(struct xe_device *xe, unsigned int vfid);
+int xe_sriov_pf_control_trigger_save_vf(struct xe_device *xe, unsigned int vfid);
+int xe_sriov_pf_control_finish_save_vf(struct xe_device *xe, unsigned int vfid);
+int xe_sriov_pf_control_trigger_restore_vf(struct xe_device *xe, unsigned int vfid);
+int xe_sriov_pf_control_finish_restore_vf(struct xe_device *xe, unsigned int vfid);
#endif
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
index a81aa05c55326..e0e6340c49106 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
@@ -136,11 +136,31 @@ static void pf_populate_pf(struct xe_device *xe, struct dentry *pfdent)
* │ │ ├── reset
* │ │ ├── resume
* │ │ ├── stop
+ * │ │ ├── save
+ * │ │ ├── restore
* │ │ :
* │ ├── vf2
* │ │ ├── ...
*/
+static int from_file_read_to_vf_call(struct seq_file *s,
+ int (*call)(struct xe_device *, unsigned int))
+{
+ struct dentry *dent = file_dentry(s->file)->d_parent;
+ struct xe_device *xe = extract_xe(dent);
+ unsigned int vfid = extract_vfid(dent);
+ int ret;
+
+ xe_pm_runtime_get(xe);
+ ret = call(xe, vfid);
+ xe_pm_runtime_put(xe);
+
+ if (ret < 0)
+ return ret;
+
+ return 0;
+}
+
static ssize_t from_file_write_to_vf_call(struct file *file, const char __user *userbuf,
size_t count, loff_t *ppos,
int (*call)(struct xe_device *, unsigned int))
@@ -179,10 +199,26 @@ static ssize_t OP##_write(struct file *file, const char __user *userbuf, \
} \
DEFINE_SHOW_STORE_ATTRIBUTE(OP)
+#define DEFINE_VF_CONTROL_ATTRIBUTE_RW(OP) \
+static int OP##_show(struct seq_file *s, void *unused) \
+{ \
+ return from_file_read_to_vf_call(s, \
+ xe_sriov_pf_control_finish_##OP); \
+} \
+static ssize_t OP##_write(struct file *file, const char __user *userbuf, \
+ size_t count, loff_t *ppos) \
+{ \
+ return from_file_write_to_vf_call(file, userbuf, count, ppos, \
+ xe_sriov_pf_control_trigger_##OP); \
+} \
+DEFINE_SHOW_STORE_ATTRIBUTE(OP)
+
DEFINE_VF_CONTROL_ATTRIBUTE(pause_vf);
DEFINE_VF_CONTROL_ATTRIBUTE(resume_vf);
DEFINE_VF_CONTROL_ATTRIBUTE(stop_vf);
DEFINE_VF_CONTROL_ATTRIBUTE(reset_vf);
+DEFINE_VF_CONTROL_ATTRIBUTE_RW(save_vf);
+DEFINE_VF_CONTROL_ATTRIBUTE_RW(restore_vf);
static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
{
@@ -190,6 +226,8 @@ static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
debugfs_create_file("resume", 0200, vfdent, xe, &resume_vf_fops);
debugfs_create_file("stop", 0200, vfdent, xe, &stop_vf_fops);
debugfs_create_file("reset", 0200, vfdent, xe, &reset_vf_fops);
+ debugfs_create_file("save", 0600, vfdent, xe, &save_vf_fops);
+ debugfs_create_file("restore", 0600, vfdent, xe, &restore_vf_fops);
}
static void pf_populate_with_tiles(struct xe_device *xe, struct dentry *dent, unsigned int vfid)
--
2.50.1
^ permalink raw reply related [flat|nested] 72+ messages in thread
* [PATCH v2 04/26] drm/xe/pf: Add data structures and handlers for migration rings
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (2 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 03/26] drm/xe/pf: Add save/restore control state stubs and connect to debugfs Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-22 22:06 ` Michal Wajdeczko
2025-10-21 22:41 ` [PATCH v2 05/26] drm/xe/pf: Add helpers for migration data allocation / free Michał Winiarski
` (21 subsequent siblings)
25 siblings, 1 reply; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
Migration data is queued in a per-GT ptr_ring to decouple the worker
responsible for handling the data transfer from the .read() and .write()
syscalls.
Add the data structures and handlers that will be used in future
commits.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 259 +++++++++++++++++-
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h | 6 +-
.../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 12 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 183 +++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 14 +
.../drm/xe/xe_gt_sriov_pf_migration_types.h | 11 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h | 3 +
drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 143 ++++++++++
drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 7 +
.../gpu/drm/xe/xe_sriov_pf_migration_types.h | 58 ++++
drivers/gpu/drm/xe/xe_sriov_pf_types.h | 3 +
11 files changed, 684 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
index b770916e88e53..cad73fdaee93c 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
@@ -19,6 +19,7 @@
#include "xe_guc_ct.h"
#include "xe_sriov.h"
#include "xe_sriov_pf_control.h"
+#include "xe_sriov_pf_migration.h"
#include "xe_sriov_pf_service.h"
#include "xe_tile.h"
@@ -185,9 +186,15 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
CASE2STR(PAUSE_FAILED);
CASE2STR(PAUSED);
CASE2STR(SAVE_WIP);
+ CASE2STR(SAVE_PROCESS_DATA);
+ CASE2STR(SAVE_WAIT_DATA);
+ CASE2STR(SAVE_DATA_DONE);
CASE2STR(SAVE_FAILED);
CASE2STR(SAVED);
CASE2STR(RESTORE_WIP);
+ CASE2STR(RESTORE_PROCESS_DATA);
+ CASE2STR(RESTORE_WAIT_DATA);
+ CASE2STR(RESTORE_DATA_DONE);
CASE2STR(RESTORE_FAILED);
CASE2STR(RESTORED);
CASE2STR(RESUME_WIP);
@@ -804,9 +811,50 @@ int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid)
return -ECANCELED;
}
+/**
+ * DOC: The VF SAVE state machine
+ *
+ * SAVE extends the PAUSED state.
+ *
+ * The VF SAVE state machine looks like::
+ *
+ * ....PAUSED....................................................
+ * : :
+ * : (O)<---------o :
+ * : | \ :
+ * : save (SAVED) (SAVE_FAILED) :
+ * : | ^ ^ :
+ * : | | | :
+ * : ....V...............o...........o......SAVE_WIP......... :
+ * : : | | | : :
+ * : : | empty | : :
+ * : : | | | : :
+ * : : | | | : :
+ * : : | DATA_DONE | : :
+ * : : | ^ | : :
+ * : : | | error : :
+ * : : | no_data / : :
+ * : : | / / : :
+ * : : | / / : :
+ * : : | / / : :
+ * : : o---------->PROCESS_DATA<----consume : :
+ * : : \ \ : :
+ * : : \ \ : :
+ * : : \ \ : :
+ * : : ring_full----->WAIT_DATA : :
+ * : : : :
+ * : :......................................................: :
+ * :............................................................:
+ *
+ * For the full state machine view, see `The VF state machine`_.
+ */
static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
{
- pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
+ if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
+ pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
+ pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA);
+ pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
+ }
}
static void pf_enter_vf_saved(struct xe_gt *gt, unsigned int vfid)
@@ -821,12 +869,39 @@ static void pf_enter_vf_saved(struct xe_gt *gt, unsigned int vfid)
pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
}
+static void pf_enter_vf_save_failed(struct xe_gt *gt, unsigned int vfid)
+{
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
+ pf_exit_vf_wip(gt, vfid);
+}
+
+static int pf_handle_vf_save_data(struct xe_gt *gt, unsigned int vfid)
+{
+ return 0;
+}
+
static bool pf_handle_vf_save(struct xe_gt *gt, unsigned int vfid)
{
- if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP))
+ int ret;
+
+ if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA))
return false;
- pf_enter_vf_saved(gt, vfid);
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA);
+ if (xe_gt_sriov_pf_migration_ring_full(gt, vfid)) {
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
+
+ return true;
+ }
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA);
+
+ ret = pf_handle_vf_save_data(gt, vfid);
+ if (ret == -EAGAIN)
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
+ else if (ret)
+ pf_enter_vf_save_failed(gt, vfid);
+ else
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
return true;
}
@@ -834,6 +909,7 @@ static bool pf_handle_vf_save(struct xe_gt *gt, unsigned int vfid)
static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
{
if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
pf_enter_vf_wip(gt, vfid);
pf_queue_vf(gt, vfid);
return true;
@@ -842,6 +918,36 @@ static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
return false;
}
+/**
+ * xe_gt_sriov_pf_control_check_save_data_done() - Check if all save migration data was produced.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: true if all save migration data was produced, false otherwise.
+ */
+bool xe_gt_sriov_pf_control_check_save_data_done(struct xe_gt *gt, unsigned int vfid)
+{
+ return pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
+}
+
+/**
+ * xe_gt_sriov_pf_control_process_save_data() - Queue VF save migration data processing.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ */
+void xe_gt_sriov_pf_control_process_save_data(struct xe_gt *gt, unsigned int vfid)
+{
+ if (xe_gt_sriov_pf_control_check_save_data_done(gt, vfid))
+ return;
+
+ if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA))
+ pf_queue_vf(gt, vfid);
+}
+
/**
* xe_gt_sriov_pf_control_trigger_save_vf() - Start an SR-IOV VF migration data save sequence.
* @gt: the &xe_gt
@@ -887,19 +993,62 @@ int xe_gt_sriov_pf_control_trigger_save_vf(struct xe_gt *gt, unsigned int vfid)
*/
int xe_gt_sriov_pf_control_finish_save_vf(struct xe_gt *gt, unsigned int vfid)
{
- if (!pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED)) {
- pf_enter_vf_mismatch(gt, vfid);
+ if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE)) {
+ xe_gt_sriov_err(gt, "VF%u save is still in progress!\n", vfid);
return -EIO;
}
pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
+ pf_enter_vf_saved(gt, vfid);
return 0;
}
+/**
+ * DOC: The VF RESTORE state machine
+ *
+ * RESTORE extends the PAUSED state.
+ *
+ * The VF RESTORE state machine looks like::
+ *
+ * ....PAUSED....................................................
+ * : :
+ * : (O)<---------o :
+ * : | \ :
+ * : restore (RESTORED) (RESTORE_FAILED) :
+ * : | ^ ^ :
+ * : | | | :
+ * : ....V...............o...........o......RESTORE_WIP...... :
+ * : : | | | : :
+ * : : | empty | : :
+ * : : | | | : :
+ * : : | | | : :
+ * : : | DATA_DONE | : :
+ * : : | ^ | : :
+ * : : | | error : :
+ * : : | trailer / : :
+ * : : | / / : :
+ * : : | / / : :
+ * : : | / / : :
+ * : : o---------->PROCESS_DATA<----produce : :
+ * : : \ \ : :
+ * : : \ \ : :
+ * : : \ \ : :
+ * : : ring_empty---->WAIT_DATA : :
+ * : : : :
+ * : :......................................................: :
+ * :............................................................:
+ *
+ * For the full state machine view, see `The VF state machine`_.
+ */
static void pf_exit_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
{
- pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP);
+ if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
+ pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA);
+ pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA);
+ pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_DATA_DONE);
+ }
}
static void pf_enter_vf_restored(struct xe_gt *gt, unsigned int vfid)
@@ -914,12 +1063,50 @@ static void pf_enter_vf_restored(struct xe_gt *gt, unsigned int vfid)
pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
}
+static void pf_enter_vf_restore_failed(struct xe_gt *gt, unsigned int vfid)
+{
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED);
+ pf_exit_vf_wip(gt, vfid);
+}
+
+static int
+pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid)
+{
+ struct xe_sriov_migration_data *data = xe_gt_sriov_pf_migration_restore_consume(gt, vfid);
+
+ xe_gt_assert(gt, data);
+
+ xe_gt_sriov_notice(gt, "Skipping VF%u unknown data type: %d\n", vfid, data->type);
+
+ return 0;
+}
+
static bool pf_handle_vf_restore(struct xe_gt *gt, unsigned int vfid)
{
- if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
+ int ret;
+
+ if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA))
return false;
- pf_enter_vf_restored(gt, vfid);
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA);
+ if (xe_gt_sriov_pf_migration_ring_empty(gt, vfid)) {
+ if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_DATA_DONE)) {
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA);
+ pf_enter_vf_restored(gt, vfid);
+
+ return true;
+ }
+
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA);
+ return true;
+ }
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA);
+
+ ret = pf_handle_vf_restore_data(gt, vfid);
+ if (ret)
+ pf_enter_vf_restore_failed(gt, vfid);
+ else
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA);
return true;
}
@@ -927,6 +1114,7 @@ static bool pf_handle_vf_restore(struct xe_gt *gt, unsigned int vfid)
static bool pf_enter_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
{
if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA);
pf_enter_vf_wip(gt, vfid);
pf_queue_vf(gt, vfid);
return true;
@@ -935,6 +1123,41 @@ static bool pf_enter_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
return false;
}
+/**
+ * xe_gt_sriov_pf_control_restore_data_done() - Indicate the end of VF migration data stream.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_control_restore_data_done(struct xe_gt *gt, unsigned int vfid)
+{
+ if (!pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_DATA_DONE)) {
+ pf_enter_vf_state_machine_bug(gt, vfid);
+ return -EIO;
+ }
+
+ return 0;
+}
+
+/**
+ * xe_gt_sriov_pf_control_process_restore_data() - Queue VF restore migration data processing.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ */
+void xe_gt_sriov_pf_control_process_restore_data(struct xe_gt *gt, unsigned int vfid)
+{
+ if (!pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
+ pf_enter_vf_state_machine_bug(gt, vfid);
+
+ if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA))
+ pf_queue_vf(gt, vfid);
+}
+
/**
* xe_gt_sriov_pf_control_trigger_restore_vf() - Start an SR-IOV VF migration data restore sequence.
* @gt: the &xe_gt
@@ -1000,11 +1223,9 @@ int xe_gt_sriov_pf_control_finish_restore_vf(struct xe_gt *gt, unsigned int vfid
{
int ret;
- if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
- ret = pf_wait_vf_restore_done(gt, vfid);
- if (ret)
- return ret;
- }
+ ret = pf_wait_vf_restore_done(gt, vfid);
+ if (ret)
+ return ret;
if (!pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED)) {
pf_enter_vf_mismatch(gt, vfid);
@@ -1703,9 +1924,21 @@ static bool pf_process_vf_state_machine(struct xe_gt *gt, unsigned int vfid)
if (pf_exit_vf_pause_save_guc(gt, vfid))
return true;
+ if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA)) {
+ xe_gt_sriov_dbg_verbose(gt, "VF%u in %s\n", vfid,
+ control_bit_to_string(XE_GT_SRIOV_STATE_SAVE_WAIT_DATA));
+ return false;
+ }
+
if (pf_handle_vf_save(gt, vfid))
return true;
+ if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA)) {
+ xe_gt_sriov_dbg_verbose(gt, "VF%u in %s\n", vfid,
+ control_bit_to_string(XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA));
+ return false;
+ }
+
if (pf_handle_vf_restore(gt, vfid))
return true;
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
index abc233f6302ed..6b1ab339e3b73 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
@@ -14,12 +14,14 @@ struct xe_gt;
int xe_gt_sriov_pf_control_init(struct xe_gt *gt);
void xe_gt_sriov_pf_control_restart(struct xe_gt *gt);
-bool xe_gt_sriov_pf_control_check_vf_data_wip(struct xe_gt *gt, unsigned int vfid);
-
int xe_gt_sriov_pf_control_pause_vf(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid);
+bool xe_gt_sriov_pf_control_check_save_data_done(struct xe_gt *gt, unsigned int vfid);
+void xe_gt_sriov_pf_control_process_save_data(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_control_trigger_save_vf(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_control_finish_save_vf(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_control_restore_data_done(struct xe_gt *gt, unsigned int vfid);
+void xe_gt_sriov_pf_control_process_restore_data(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_control_trigger_restore_vf(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_control_finish_restore_vf(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_control_stop_vf(struct xe_gt *gt, unsigned int vfid);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
index e113dc98b33ce..6e19a8ea88f0b 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
@@ -32,9 +32,15 @@
* @XE_GT_SRIOV_STATE_PAUSE_FAILED: indicates that a VF pause operation has failed.
* @XE_GT_SRIOV_STATE_PAUSED: indicates that the VF is paused.
* @XE_GT_SRIOV_STATE_SAVE_WIP: indicates that a VF save operation is in progress.
+ * @XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA: indicates that VF migration data is being produced.
+ * @XE_GT_SRIOV_STATE_SAVE_WAIT_DATA: indicates that the PF is waiting for space in the migration data ring.
+ * @XE_GT_SRIOV_STATE_SAVE_DATA_DONE: indicates that all migration data was produced by Xe.
* @XE_GT_SRIOV_STATE_SAVE_FAILED: indicates that a VF save operation has failed.
* @XE_GT_SRIOV_STATE_SAVED: indicates that the VF data is saved.
* @XE_GT_SRIOV_STATE_RESTORE_WIP: indicates that a VF restore operation is in progress.
+ * @XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA: indicates that VF migration data is being consumed.
+ * @XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA: indicates that the PF is waiting for data in the migration data ring.
+ * @XE_GT_SRIOV_STATE_RESTORE_DATA_DONE: indicates that all migration data was produced by the user.
* @XE_GT_SRIOV_STATE_RESTORE_FAILED: indicates that a VF restore operation has failed.
* @XE_GT_SRIOV_STATE_RESTORED: indicates that the VF data is restored.
* @XE_GT_SRIOV_STATE_RESUME_WIP: indicates that a VF resume operation is in progress.
@@ -70,10 +76,16 @@ enum xe_gt_sriov_control_bits {
XE_GT_SRIOV_STATE_PAUSED,
XE_GT_SRIOV_STATE_SAVE_WIP,
+ XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA,
+ XE_GT_SRIOV_STATE_SAVE_WAIT_DATA,
+ XE_GT_SRIOV_STATE_SAVE_DATA_DONE,
XE_GT_SRIOV_STATE_SAVE_FAILED,
XE_GT_SRIOV_STATE_SAVED,
XE_GT_SRIOV_STATE_RESTORE_WIP,
+ XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA,
+ XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA,
+ XE_GT_SRIOV_STATE_RESTORE_DATA_DONE,
XE_GT_SRIOV_STATE_RESTORE_FAILED,
XE_GT_SRIOV_STATE_RESTORED,
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index ca28f45aaf481..b6ffd982d6007 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -7,6 +7,7 @@
#include "abi/guc_actions_sriov_abi.h"
#include "xe_bo.h"
+#include "xe_gt_sriov_pf_control.h"
#include "xe_gt_sriov_pf_helpers.h"
#include "xe_gt_sriov_pf_migration.h"
#include "xe_gt_sriov_printk.h"
@@ -15,6 +16,17 @@
#include "xe_sriov.h"
#include "xe_sriov_pf_migration.h"
+#define XE_GT_SRIOV_PF_MIGRATION_RING_SIZE 5
+
+static struct xe_gt_sriov_migration_data *pf_pick_gt_migration(struct xe_gt *gt, unsigned int vfid)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid != PFID);
+ xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
+
+ return >->sriov.pf.vfs[vfid].migration;
+}
+
/* Return: number of dwords saved/restored/required or a negative error code on failure */
static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
u64 addr, u32 ndwords)
@@ -382,6 +394,162 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
}
#endif /* CONFIG_DEBUG_FS */
+/**
+ * xe_gt_sriov_pf_migration_ring_empty() - Check if a migration ring is empty.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * Return: true if the ring is empty, otherwise false.
+ */
+bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid)
+{
+ return ptr_ring_empty(&pf_pick_gt_migration(gt, vfid)->ring);
+}
+
+/**
+ * xe_gt_sriov_pf_migration_ring_full() - Check if a migration ring is full.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * Return: true if the ring is full, otherwise false.
+ */
+bool xe_gt_sriov_pf_migration_ring_full(struct xe_gt *gt, unsigned int vfid)
+{
+ return ptr_ring_full(&pf_pick_gt_migration(gt, vfid)->ring);
+}
+
+/**
+ * xe_gt_sriov_pf_migration_save_produce() - Add VF save data packet to migration ring.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ * @data: &xe_sriov_migration_data packet
+ *
+ * Called by the save migration data producer (PF SR-IOV Control worker) when
+ * processing migration data.
+ * Wakes up the save migration data consumer (userspace), which may be
+ * waiting for data when the ring is empty.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_migration_save_produce(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_migration_data *data)
+{
+ int ret;
+
+ ret = ptr_ring_produce(&pf_pick_gt_migration(gt, vfid)->ring, data);
+ if (ret)
+ return ret;
+
+ wake_up_all(xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid));
+
+ return 0;
+}
+
+/**
+ * xe_gt_sriov_pf_migration_restore_consume() - Get VF restore data packet from migration ring.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * Called by the restore migration data consumer (PF SR-IOV Control worker) when
+ * processing migration data.
+ * Wakes up the restore migration data producer (userspace), which may be
+ * waiting to add more data when the ring is full.
+ *
+ * Return: Pointer to &struct xe_sriov_migration_data on success,
+ * NULL if ring is empty.
+ */
+struct xe_sriov_migration_data *
+xe_gt_sriov_pf_migration_restore_consume(struct xe_gt *gt, unsigned int vfid)
+{
+ struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, vfid);
+ struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
+ struct xe_sriov_migration_data *data;
+
+ data = ptr_ring_consume(&migration->ring);
+ if (data)
+ wake_up_all(wq);
+
+ return data;
+}
+
+/**
+ * xe_gt_sriov_pf_migration_restore_produce() - Add VF restore data packet to migration ring.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ * @data: &xe_sriov_migration_data packet
+ *
+ * Called by the restore migration data producer (userspace) when processing
+ * migration data.
+ * If the ring is full, waits until there is space.
+ * Queues the restore migration data consumer (PF SR-IOV Control worker),
+ * which may be waiting for data when the ring is empty.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_migration_restore_produce(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_migration_data *data)
+{
+ struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
+ struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, vfid);
+ int ret;
+
+ xe_gt_assert(gt, data->tile == gt->tile->id);
+ xe_gt_assert(gt, data->gt == gt->info.id);
+
+ while (1) {
+ ret = ptr_ring_produce(&migration->ring, data);
+ if (!ret)
+ break;
+
+ ret = wait_event_interruptible(*wq, !ptr_ring_full(&migration->ring));
+ if (ret)
+ return ret;
+ }
+
+ xe_gt_sriov_pf_control_process_restore_data(gt, vfid);
+
+ return 0;
+}
+
+/**
+ * xe_gt_sriov_pf_migration_save_consume() - Get VF save data packet from migration ring.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * Called by the save migration data consumer (userspace) when
+ * processing migration data.
+ * Queues the save migration data producer (PF SR-IOV Control worker),
+ * which may be waiting to add more data when the ring is full.
+ *
+ * Return: Pointer to &struct xe_sriov_migration_data on success,
+ * NULL if ring is empty and there's no more data available,
+ * ERR_PTR(-EAGAIN) if the ring is empty but more data is still being produced.
+ */
+struct xe_sriov_migration_data *
+xe_gt_sriov_pf_migration_save_consume(struct xe_gt *gt, unsigned int vfid)
+{
+ struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, vfid);
+ struct xe_sriov_migration_data *data;
+
+ data = ptr_ring_consume(&migration->ring);
+ if (data) {
+ xe_gt_sriov_pf_control_process_save_data(gt, vfid);
+ return data;
+ }
+
+ if (xe_gt_sriov_pf_control_check_save_data_done(gt, vfid))
+ return NULL;
+
+ return ERR_PTR(-EAGAIN);
+}
+
+static void action_ring_cleanup(struct drm_device *dev, void *arg)
+{
+ struct ptr_ring *r = arg;
+
+ ptr_ring_cleanup(r, NULL);
+}
+
/**
* xe_gt_sriov_pf_migration_init() - Initialize support for VF migration.
* @gt: the &xe_gt
@@ -393,6 +561,7 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
{
struct xe_device *xe = gt_to_xe(gt);
+ unsigned int n, totalvfs;
int err;
xe_gt_assert(gt, IS_SRIOV_PF(xe));
@@ -404,5 +573,19 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
if (err)
return err;
+ totalvfs = xe_sriov_pf_get_totalvfs(xe);
+ for (n = 1; n <= totalvfs; n++) {
+ struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, n);
+
+ err = ptr_ring_init(&migration->ring,
+ XE_GT_SRIOV_PF_MIGRATION_RING_SIZE, GFP_KERNEL);
+ if (err)
+ return err;
+
+ err = drmm_add_action_or_reset(&xe->drm, action_ring_cleanup, &migration->ring);
+ if (err)
+ return err;
+ }
+
return 0;
}
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
index 09faeae00ddbb..9e67f18ded205 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
@@ -9,11 +9,25 @@
#include <linux/types.h>
struct xe_gt;
+struct xe_sriov_migration_data;
int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
+bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid);
+bool xe_gt_sriov_pf_migration_ring_full(struct xe_gt *gt, unsigned int vfid);
+
+int xe_gt_sriov_pf_migration_save_produce(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_migration_data *data);
+struct xe_sriov_migration_data *
+xe_gt_sriov_pf_migration_restore_consume(struct xe_gt *gt, unsigned int vfid);
+
+int xe_gt_sriov_pf_migration_restore_produce(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_migration_data *data);
+struct xe_sriov_migration_data *
+xe_gt_sriov_pf_migration_save_consume(struct xe_gt *gt, unsigned int vfid);
+
#ifdef CONFIG_DEBUG_FS
ssize_t xe_gt_sriov_pf_migration_read_guc_state(struct xe_gt *gt, unsigned int vfid,
char __user *buf, size_t count, loff_t *pos);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
index 9d672feac5f04..84be6fac16c8b 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
@@ -7,6 +7,7 @@
#define _XE_GT_SRIOV_PF_MIGRATION_TYPES_H_
#include <linux/mutex.h>
+#include <linux/ptr_ring.h>
#include <linux/types.h>
/**
@@ -24,6 +25,16 @@ struct xe_gt_sriov_state_snapshot {
} guc;
};
+/**
+ * struct xe_gt_sriov_migration_data - GT-level per-VF migration data.
+ *
+ * Used by the PF driver to maintain per-VF migration data.
+ */
+struct xe_gt_sriov_migration_data {
+ /** @ring: queue containing VF save / restore migration data */
+ struct ptr_ring ring;
+};
+
/**
* struct xe_gt_sriov_pf_migration - GT-level data.
*
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
index a64a6835ad656..812e74d3f8f80 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
@@ -33,6 +33,9 @@ struct xe_gt_sriov_metadata {
/** @snapshot: snapshot of the VF state data */
struct xe_gt_sriov_state_snapshot snapshot;
+
+ /** @migration: per-VF migration data. */
+ struct xe_gt_sriov_migration_data migration;
};
/**
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
index 8c523c392f98b..eaf581317bdef 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
@@ -3,8 +3,36 @@
* Copyright © 2025 Intel Corporation
*/
+#include <drm/drm_managed.h>
+
+#include "xe_device.h"
+#include "xe_gt_sriov_pf_control.h"
+#include "xe_gt_sriov_pf_migration.h"
+#include "xe_pm.h"
#include "xe_sriov.h"
+#include "xe_sriov_pf_helpers.h"
#include "xe_sriov_pf_migration.h"
+#include "xe_sriov_printk.h"
+
+static struct xe_sriov_pf_migration *pf_pick_migration(struct xe_device *xe, unsigned int vfid)
+{
+ xe_assert(xe, IS_SRIOV_PF(xe));
+ xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
+
+ return &xe->sriov.pf.vfs[vfid].migration;
+}
+
+/**
+ * xe_sriov_pf_migration_waitqueue - Get waitqueue for migration.
+ * @xe: the &xe_device
+ * @vfid: the VF identifier
+ *
+ * Return: pointer to the migration waitqueue.
+ */
+wait_queue_head_t *xe_sriov_pf_migration_waitqueue(struct xe_device *xe, unsigned int vfid)
+{
+ return &pf_pick_migration(xe, vfid)->wq;
+}
/**
* xe_sriov_pf_migration_supported() - Check if SR-IOV VF migration is supported by the device
@@ -33,9 +61,124 @@ static bool pf_check_migration_support(struct xe_device *xe)
*/
int xe_sriov_pf_migration_init(struct xe_device *xe)
{
+ unsigned int n, totalvfs;
+
xe_assert(xe, IS_SRIOV_PF(xe));
xe->sriov.pf.migration.supported = pf_check_migration_support(xe);
+ if (!xe_sriov_pf_migration_supported(xe))
+ return 0;
+
+ totalvfs = xe_sriov_pf_get_totalvfs(xe);
+ for (n = 1; n <= totalvfs; n++) {
+ struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, n);
+
+ init_waitqueue_head(&migration->wq);
+ }
return 0;
}
+
+static bool pf_migration_data_ready(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_gt *gt;
+ u8 gt_id;
+
+ for_each_gt(gt, xe, gt_id) {
+ if (!xe_gt_sriov_pf_migration_ring_empty(gt, vfid) ||
+ xe_gt_sriov_pf_control_check_save_data_done(gt, vfid))
+ return true;
+ }
+
+ return false;
+}
+
+static struct xe_sriov_migration_data *
+pf_migration_consume(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_sriov_migration_data *data;
+ struct xe_gt *gt;
+ u8 gt_id;
+ bool more_data = false;
+
+ for_each_gt(gt, xe, gt_id) {
+ data = xe_gt_sriov_pf_migration_save_consume(gt, vfid);
+ if (data && PTR_ERR(data) != -EAGAIN)
+ return data;
+ if (PTR_ERR(data) == -EAGAIN)
+ more_data = true;
+ }
+
+ if (!more_data)
+ return NULL;
+
+ return ERR_PTR(-EAGAIN);
+}
+
+/**
+ * xe_sriov_pf_migration_save_consume() - Consume a VF migration data packet from the device.
+ * @xe: the &xe_device
+ * @vfid: the VF identifier
+ *
+ * Called by the save migration data consumer (userspace) when
+ * processing migration data.
+ * If there is no migration data to process, wait until more data is available.
+ *
+ * Return: Pointer to &xe_sriov_migration_data on success,
+ * NULL if ring is empty and no more migration data is expected,
+ * ERR_PTR value in case of error.
+ */
+struct xe_sriov_migration_data *
+xe_sriov_pf_migration_save_consume(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, vfid);
+ struct xe_sriov_migration_data *data;
+ int ret;
+
+ xe_assert(xe, IS_SRIOV_PF(xe));
+
+ while (1) {
+ data = pf_migration_consume(xe, vfid);
+ if (PTR_ERR(data) != -EAGAIN)
+ goto out;
+
+ ret = wait_event_interruptible(migration->wq,
+ pf_migration_data_ready(xe, vfid));
+ if (ret)
+ return ERR_PTR(ret);
+ }
+
+out:
+ return data;
+}
+
+/**
+ * xe_sriov_pf_migration_restore_produce() - Produce a VF migration data packet to the device.
+ * @xe: the &xe_device
+ * @vfid: the VF identifier
+ * @data: Pointer to &xe_sriov_migration_data
+ *
+ * Called by the restore migration data producer (userspace) when processing
+ * migration data.
+ * If the underlying data structure is full, wait until there is space.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_pf_migration_restore_produce(struct xe_device *xe, unsigned int vfid,
+ struct xe_sriov_migration_data *data)
+{
+ struct xe_gt *gt;
+
+ xe_assert(xe, IS_SRIOV_PF(xe));
+
+ gt = xe_device_get_gt(xe, data->gt);
+ if (!gt || data->tile != gt->tile->id) {
+ xe_sriov_err_ratelimited(xe, "VF%u invalid GT - tile:%u, GT:%u\n",
+ vfid, data->tile, data->gt);
+ return -EINVAL;
+ }
+
+ return xe_gt_sriov_pf_migration_restore_produce(gt, vfid, data);
+}
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
index d2b4a24165438..df81a540c246a 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
@@ -7,10 +7,17 @@
#define _XE_SRIOV_PF_MIGRATION_H_
#include <linux/types.h>
+#include <linux/wait.h>
struct xe_device;
+struct xe_sriov_migration_data;
int xe_sriov_pf_migration_init(struct xe_device *xe);
bool xe_sriov_pf_migration_supported(struct xe_device *xe);
+int xe_sriov_pf_migration_restore_produce(struct xe_device *xe, unsigned int vfid,
+ struct xe_sriov_migration_data *data);
+struct xe_sriov_migration_data *
+xe_sriov_pf_migration_save_consume(struct xe_device *xe, unsigned int vfid);
+wait_queue_head_t *xe_sriov_pf_migration_waitqueue(struct xe_device *xe, unsigned int vfid);
#endif
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
index e69de29bb2d1d..2a45ee4e3ece8 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_SRIOV_PF_MIGRATION_TYPES_H_
+#define _XE_SRIOV_PF_MIGRATION_TYPES_H_
+
+#include <linux/types.h>
+#include <linux/wait.h>
+
+/**
+ * struct xe_sriov_migration_data - Xe SR-IOV VF migration data packet
+ */
+struct xe_sriov_migration_data {
+ /** @xe: Xe device */
+ struct xe_device *xe;
+ /** @vaddr: CPU pointer to payload data */
+ void *vaddr;
+ /** @remaining: payload data remaining */
+ size_t remaining;
+ /** @hdr_remaining: header data remaining */
+ size_t hdr_remaining;
+ union {
+ /** @bo: Buffer object with migration data */
+ struct xe_bo *bo;
+ /** @buff: Buffer with migration data */
+ void *buff;
+ };
+ __struct_group(xe_sriov_pf_migration_hdr, hdr, __packed,
+ /** @hdr.version: migration data protocol version */
+ u8 version;
+ /** @hdr.type: migration data type */
+ u8 type;
+ /** @hdr.tile: migration data tile id */
+ u8 tile;
+ /** @hdr.gt: migration data gt id */
+ u8 gt;
+ /** @hdr.flags: migration data flags */
+ u32 flags;
+ /** @hdr.offset: offset into the resource;
+ * used when multiple packets of a given type are used for migration
+ */
+ u64 offset;
+ /** @hdr.size: migration data size */
+ u64 size;
+ );
+};
+
+/**
+ * struct xe_sriov_pf_migration - Per VF device-level migration related data
+ */
+struct xe_sriov_pf_migration {
+ /** @wq: waitqueue used to avoid busy-waiting for snapshot production/consumption */
+ wait_queue_head_t wq;
+};
+
+#endif
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
index 24d22afeececa..c92baaa1694ca 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_types.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
@@ -9,6 +9,7 @@
#include <linux/mutex.h>
#include <linux/types.h>
+#include "xe_sriov_pf_migration_types.h"
#include "xe_sriov_pf_provision_types.h"
#include "xe_sriov_pf_service_types.h"
@@ -18,6 +19,8 @@
struct xe_sriov_metadata {
/** @version: negotiated VF/PF ABI version */
struct xe_sriov_pf_service_version version;
+ /** @migration: migration data */
+ struct xe_sriov_pf_migration migration;
};
/**
--
2.50.1
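[Editorial note: pf_migration_consume() above drains the per-GT rings and distinguishes three outcomes: a packet, NULL when no more data is expected on any GT, or -EAGAIN when a GT producer is still generating packets. A minimal userspace sketch of that contract — hypothetical names, not the in-tree code:]

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* One queue per GT: a popped packet (or NULL when empty), plus a flag
 * marking a GT whose save worker is still generating packets. */
struct gt_queue {
	void *item;
	int producing;
};

/* Tri-state consume across all GT queues:
 *   1       -> *out holds a packet
 *   0       -> all queues drained, no more data expected
 *   -EAGAIN -> queues empty right now, but a producer will add more
 */
static int consume(struct gt_queue *gts, int ngt, void **out)
{
	int more_data = 0;

	for (int i = 0; i < ngt; i++) {
		if (gts[i].item) {
			*out = gts[i].item;
			gts[i].item = NULL;
			return 1;
		}
		if (gts[i].producing)
			more_data = 1;
	}

	return more_data ? -EAGAIN : 0;
}
```

The -EAGAIN case is what makes the consumer in xe_sriov_pf_migration_save_consume() go to sleep on the waitqueue instead of returning early.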
* [PATCH v2 05/26] drm/xe/pf: Add helpers for migration data allocation / free
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (3 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 04/26] drm/xe/pf: Add data structures and handlers for migration rings Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-22 22:18 ` Michal Wajdeczko
2025-10-21 22:41 ` [PATCH v2 06/26] drm/xe/pf: Add support for encap/decap of bitstream to/from packet Michał Winiarski
` (20 subsequent siblings)
25 siblings, 1 reply; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
Now that it's possible to free the packets, connect the restore
handling logic with the ring.
The helpers will also be used in upcoming changes that will start producing
migration data packets.
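The lifecycle of a packet (alloc, init header + backing storage, free)
can be modeled in plain userspace C. This is an illustrative sketch with
hypothetical names mirroring xe_sriov_migration_data_{alloc,init,free}(),
not the in-tree code - notably, the kernel backs VRAM packets with a
pinned BO rather than a heap allocation:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct mig_data {
	void *vaddr;          /* CPU pointer to payload data */
	size_t remaining;     /* payload bytes left to stream */
	size_t hdr_remaining; /* header bytes left to stream */
	struct {
		uint8_t version, type, tile, gt;
		uint32_t flags;
		uint64_t offset, size;
	} hdr;
};

static struct mig_data *mig_data_alloc(void)
{
	struct mig_data *data = calloc(1, sizeof(*data));

	/* the header is always streamed before the payload */
	if (data)
		data->hdr_remaining = sizeof(data->hdr);
	return data;
}

static int mig_data_init(struct mig_data *data, uint8_t type,
			 uint64_t offset, uint64_t size)
{
	data->hdr.version = 1;
	data->hdr.type = type;
	data->hdr.offset = offset;
	data->hdr.size = size;
	data->remaining = size;
	if (size) {
		data->vaddr = calloc(1, size);
		if (!data->vaddr)
			return -1;
	}
	return 0;
}

static void mig_data_free(struct mig_data *data)
{
	if (!data)
		return;
	free(data->vaddr);
	free(data);
}
```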
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/Makefile | 1 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 7 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 29 +++-
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 1 +
drivers/gpu/drm/xe/xe_sriov_migration_data.c | 127 ++++++++++++++++++
drivers/gpu/drm/xe/xe_sriov_migration_data.h | 31 +++++
6 files changed, 195 insertions(+), 1 deletion(-)
create mode 100644 drivers/gpu/drm/xe/xe_sriov_migration_data.c
create mode 100644 drivers/gpu/drm/xe/xe_sriov_migration_data.h
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 89e5b26c27975..3d72db9e528e4 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -173,6 +173,7 @@ xe-$(CONFIG_PCI_IOV) += \
xe_lmtt_2l.o \
xe_lmtt_ml.o \
xe_pci_sriov.o \
+ xe_sriov_migration_data.o \
xe_sriov_pf.o \
xe_sriov_pf_control.o \
xe_sriov_pf_debugfs.o \
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
index cad73fdaee93c..dd9bc9c99f78c 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
@@ -18,6 +18,7 @@
#include "xe_gt_sriov_printk.h"
#include "xe_guc_ct.h"
#include "xe_sriov.h"
+#include "xe_sriov_migration_data.h"
#include "xe_sriov_pf_control.h"
#include "xe_sriov_pf_migration.h"
#include "xe_sriov_pf_service.h"
@@ -851,6 +852,8 @@ int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid)
static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
{
if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
+ xe_gt_sriov_pf_migration_ring_free(gt, vfid);
+
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
@@ -1045,6 +1048,8 @@ int xe_gt_sriov_pf_control_finish_save_vf(struct xe_gt *gt, unsigned int vfid)
static void pf_exit_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
{
if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
+ xe_gt_sriov_pf_migration_ring_free(gt, vfid);
+
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_DATA_DONE);
@@ -1078,6 +1083,8 @@ pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid)
xe_gt_sriov_notice(gt, "Skipping VF%u unknown data type: %d\n", vfid, data->type);
+ xe_sriov_migration_data_free(data);
+
return 0;
}
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index b6ffd982d6007..8ba72165759b3 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -14,6 +14,7 @@
#include "xe_guc.h"
#include "xe_guc_ct.h"
#include "xe_sriov.h"
+#include "xe_sriov_migration_data.h"
#include "xe_sriov_pf_migration.h"
#define XE_GT_SRIOV_PF_MIGRATION_RING_SIZE 5
@@ -418,6 +419,25 @@ bool xe_gt_sriov_pf_migration_ring_full(struct xe_gt *gt, unsigned int vfid)
return ptr_ring_full(&pf_pick_gt_migration(gt, vfid)->ring);
}
+/**
+ * xe_gt_sriov_pf_migration_ring_free() - Consume and free all data in migration ring
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ */
+void xe_gt_sriov_pf_migration_ring_free(struct xe_gt *gt, unsigned int vfid)
+{
+ struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, vfid);
+ struct xe_sriov_migration_data *data;
+
+ if (ptr_ring_empty(&migration->ring))
+ return;
+
+ xe_gt_sriov_notice(gt, "VF%u unprocessed migration data left in the ring!\n", vfid);
+
+ while ((data = ptr_ring_consume(&migration->ring)))
+ xe_sriov_migration_data_free(data);
+}
+
/**
* xe_gt_sriov_pf_migration_save_produce() - Add VF save data packet to migration ring.
* @gt: the &xe_gt
@@ -543,11 +563,18 @@ xe_gt_sriov_pf_migration_save_consume(struct xe_gt *gt, unsigned int vfid)
return ERR_PTR(-EAGAIN);
}
+static void pf_mig_data_destroy(void *ptr)
+{
+ struct xe_sriov_migration_data *data = ptr;
+
+ xe_sriov_migration_data_free(data);
+}
+
static void action_ring_cleanup(struct drm_device *dev, void *arg)
{
struct ptr_ring *r = arg;
- ptr_ring_cleanup(r, NULL);
+ ptr_ring_cleanup(r, pf_mig_data_destroy);
}
/**
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
index 9e67f18ded205..1ed2248f0a17e 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
@@ -17,6 +17,7 @@ int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vf
bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid);
bool xe_gt_sriov_pf_migration_ring_full(struct xe_gt *gt, unsigned int vfid);
+void xe_gt_sriov_pf_migration_ring_free(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_save_produce(struct xe_gt *gt, unsigned int vfid,
struct xe_sriov_migration_data *data);
diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
new file mode 100644
index 0000000000000..b04f9be3b7fed
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
@@ -0,0 +1,127 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include "xe_bo.h"
+#include "xe_device.h"
+#include "xe_sriov_migration_data.h"
+
+static bool data_needs_bo(struct xe_sriov_migration_data *data)
+{
+ return data->type == XE_SRIOV_MIGRATION_DATA_TYPE_VRAM;
+}
+
+/**
+ * xe_sriov_migration_data_alloc() - Allocate migration data packet
+ * @xe: the &xe_device
+ *
+ * Only allocates the "outer" structure, without initializing the migration
+ * data backing storage.
+ *
+ * Return: Pointer to &xe_sriov_migration_data on success,
+ * NULL in case of error.
+ */
+struct xe_sriov_migration_data *
+xe_sriov_migration_data_alloc(struct xe_device *xe)
+{
+ struct xe_sriov_migration_data *data;
+
+ data = kzalloc(sizeof(*data), GFP_KERNEL);
+ if (!data)
+ return NULL;
+
+ data->xe = xe;
+ data->hdr_remaining = sizeof(data->hdr);
+
+ return data;
+}
+
+/**
+ * xe_sriov_migration_data_free() - Free migration data packet.
+ * @data: the &xe_sriov_migration_data packet
+ */
+void xe_sriov_migration_data_free(struct xe_sriov_migration_data *data)
+{
+ if (data_needs_bo(data))
+ xe_bo_unpin_map_no_vm(data->bo);
+ else
+ kvfree(data->buff);
+
+ kfree(data);
+}
+
+static int mig_data_init(struct xe_sriov_migration_data *data)
+{
+ struct xe_gt *gt = xe_device_get_gt(data->xe, data->gt);
+
+ if (data->size == 0)
+ return 0;
+
+ if (data_needs_bo(data)) {
+ struct xe_bo *bo = xe_bo_create_pin_map_novm(data->xe, gt->tile,
+ PAGE_ALIGN(data->size),
+ ttm_bo_type_kernel,
+ XE_BO_FLAG_SYSTEM | XE_BO_FLAG_PINNED,
+ false);
+ if (IS_ERR(bo))
+ return PTR_ERR(bo);
+
+ data->bo = bo;
+ data->vaddr = bo->vmap.vaddr;
+ } else {
+ void *buff = kvzalloc(data->size, GFP_KERNEL);
+
+ if (!buff)
+ return -ENOMEM;
+
+ data->buff = buff;
+ data->vaddr = buff;
+ }
+
+ return 0;
+}
+
+#define XE_SRIOV_MIGRATION_DATA_SUPPORTED_VERSION 1
+/**
+ * xe_sriov_migration_data_init() - Initialize the migration data header and backing storage.
+ * @data: the &xe_sriov_migration_data packet
+ * @tile_id: tile identifier
+ * @gt_id: GT identifier
+ * @type: &xe_sriov_migration_data_type
+ * @offset: offset of data packet payload (within wider resource)
+ * @size: size of data packet payload
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_migration_data_init(struct xe_sriov_migration_data *data, u8 tile_id, u8 gt_id,
+ enum xe_sriov_migration_data_type type, loff_t offset, size_t size)
+{
+ data->version = XE_SRIOV_MIGRATION_DATA_SUPPORTED_VERSION;
+ data->type = type;
+ data->tile = tile_id;
+ data->gt = gt_id;
+ data->offset = offset;
+ data->size = size;
+ data->remaining = size;
+
+ return mig_data_init(data);
+}
+
+/**
+ * xe_sriov_migration_data_init_from_hdr() - Initialize the migration data backing storage based on header.
+ * @data: the &xe_sriov_migration_data packet
+ *
+ * Header data is expected to be filled prior to calling this function.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_migration_data_init_from_hdr(struct xe_sriov_migration_data *data)
+{
+ if (data->version != XE_SRIOV_MIGRATION_DATA_SUPPORTED_VERSION)
+ return -EINVAL;
+
+ data->remaining = data->size;
+
+ return mig_data_init(data);
+}
diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
new file mode 100644
index 0000000000000..ef65dccddc035
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_SRIOV_MIGRATION_DATA_H_
+#define _XE_SRIOV_MIGRATION_DATA_H_
+
+#include <linux/types.h>
+
+struct xe_device;
+
+enum xe_sriov_migration_data_type {
+ /* Skipping 0 to catch uninitialized data */
+ XE_SRIOV_MIGRATION_DATA_TYPE_DESCRIPTOR = 1,
+ XE_SRIOV_MIGRATION_DATA_TYPE_TRAILER,
+ XE_SRIOV_MIGRATION_DATA_TYPE_GGTT,
+ XE_SRIOV_MIGRATION_DATA_TYPE_MMIO,
+ XE_SRIOV_MIGRATION_DATA_TYPE_GUC,
+ XE_SRIOV_MIGRATION_DATA_TYPE_VRAM,
+};
+
+struct xe_sriov_migration_data *
+xe_sriov_migration_data_alloc(struct xe_device *xe);
+void xe_sriov_migration_data_free(struct xe_sriov_migration_data *snapshot);
+
+int xe_sriov_migration_data_init(struct xe_sriov_migration_data *data, u8 tile_id, u8 gt_id,
+ enum xe_sriov_migration_data_type type, loff_t offset, size_t size);
+int xe_sriov_migration_data_init_from_hdr(struct xe_sriov_migration_data *snapshot);
+
+#endif
--
2.50.1
* [PATCH v2 06/26] drm/xe/pf: Add support for encap/decap of bitstream to/from packet
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (4 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 05/26] drm/xe/pf: Add helpers for migration data allocation / free Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-22 22:34 ` Michal Wajdeczko
2025-10-21 22:41 ` [PATCH v2 07/26] drm/xe/pf: Add minimalistic migration descriptor Michał Winiarski
` (19 subsequent siblings)
25 siblings, 1 reply; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
Add debugfs handlers for migration state and implement bitstream
.read()/.write() to convert the bitstream to/from migration data
packets.
As the descriptor/trailer are handled at this layer, add handling for
both the save and restore sides.
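The streaming scheme used by the .read() path - the packed header is
copied out first, then the payload, with separate "remaining" counters -
can be sketched in plain userspace C. Hypothetical names, mirroring
vf_mig_data_hdr_read()/vf_mig_data_read() but not the in-tree code:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct pkt {
	const uint8_t *payload;
	size_t size;          /* total payload size */
	size_t remaining;     /* payload bytes left to copy out */
	size_t hdr_remaining; /* header bytes left to copy out */
	uint8_t hdr[8];       /* stand-in for the packed packet header */
};

/* Copy up to len bytes into buf: header bytes first, then payload.
 * A call that would straddle the header/payload boundary returns
 * short; the caller simply loops until 0 is returned. */
static size_t pkt_read(struct pkt *p, uint8_t *buf, size_t len)
{
	if (p->hdr_remaining) {
		size_t off = sizeof(p->hdr) - p->hdr_remaining;

		if (len > p->hdr_remaining)
			len = p->hdr_remaining;
		memcpy(buf, p->hdr + off, len);
		p->hdr_remaining -= len;
	} else {
		if (len > p->remaining)
			len = p->remaining;
		memcpy(buf, p->payload + (p->size - p->remaining), len);
		p->remaining -= len;
	}
	return len;
}
```

The write side is symmetric: once the header is fully received, the
backing storage is sized from hdr.size and the payload is filled in.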
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_sriov_migration_data.c | 336 ++++++++++++++++++
drivers/gpu/drm/xe/xe_sriov_migration_data.h | 5 +
drivers/gpu/drm/xe/xe_sriov_pf_control.c | 5 +
drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 35 ++
drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 54 +++
.../gpu/drm/xe/xe_sriov_pf_migration_types.h | 9 +
6 files changed, 444 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
index b04f9be3b7fed..4cd6c6fc9ba18 100644
--- a/drivers/gpu/drm/xe/xe_sriov_migration_data.c
+++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
@@ -6,6 +6,44 @@
#include "xe_bo.h"
#include "xe_device.h"
#include "xe_sriov_migration_data.h"
+#include "xe_sriov_pf_helpers.h"
+#include "xe_sriov_pf_migration.h"
+#include "xe_sriov_printk.h"
+
+static struct mutex *pf_migration_mutex(struct xe_device *xe, unsigned int vfid)
+{
+ xe_assert(xe, IS_SRIOV_PF(xe));
+ xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
+ return &xe->sriov.pf.vfs[vfid].migration.lock;
+}
+
+static struct xe_sriov_migration_data **pf_pick_pending(struct xe_device *xe, unsigned int vfid)
+{
+ xe_assert(xe, IS_SRIOV_PF(xe));
+ xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
+ lockdep_assert_held(pf_migration_mutex(xe, vfid));
+
+ return &xe->sriov.pf.vfs[vfid].migration.pending;
+}
+
+static struct xe_sriov_migration_data **
+pf_pick_descriptor(struct xe_device *xe, unsigned int vfid)
+{
+ xe_assert(xe, IS_SRIOV_PF(xe));
+ xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
+ lockdep_assert_held(pf_migration_mutex(xe, vfid));
+
+ return &xe->sriov.pf.vfs[vfid].migration.descriptor;
+}
+
+static struct xe_sriov_migration_data **pf_pick_trailer(struct xe_device *xe, unsigned int vfid)
+{
+ xe_assert(xe, IS_SRIOV_PF(xe));
+ xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
+ lockdep_assert_held(pf_migration_mutex(xe, vfid));
+
+ return &xe->sriov.pf.vfs[vfid].migration.trailer;
+}
static bool data_needs_bo(struct xe_sriov_migration_data *data)
{
@@ -43,6 +81,9 @@ xe_sriov_migration_data_alloc(struct xe_device *xe)
*/
void xe_sriov_migration_data_free(struct xe_sriov_migration_data *data)
{
+ if (IS_ERR_OR_NULL(data))
+ return;
+
if (data_needs_bo(data))
xe_bo_unpin_map_no_vm(data->bo);
else
@@ -125,3 +166,298 @@ int xe_sriov_migration_data_init_from_hdr(struct xe_sriov_migration_data *data)
return mig_data_init(data);
}
+
+static ssize_t vf_mig_data_hdr_read(struct xe_sriov_migration_data *data,
+ char __user *buf, size_t len)
+{
+ loff_t offset = sizeof(data->hdr) - data->hdr_remaining;
+
+ if (!data->hdr_remaining)
+ return -EINVAL;
+
+ if (len > data->hdr_remaining)
+ len = data->hdr_remaining;
+
+ if (copy_to_user(buf, (void *)&data->hdr + offset, len))
+ return -EFAULT;
+
+ data->hdr_remaining -= len;
+
+ return len;
+}
+
+static ssize_t vf_mig_data_read(struct xe_sriov_migration_data *data,
+ char __user *buf, size_t len)
+{
+ if (len > data->remaining)
+ len = data->remaining;
+
+ if (copy_to_user(buf, data->vaddr + (data->size - data->remaining), len))
+ return -EFAULT;
+
+ data->remaining -= len;
+
+ return len;
+}
+
+static ssize_t __vf_mig_data_read_single(struct xe_sriov_migration_data **data,
+ unsigned int vfid, char __user *buf, size_t len)
+{
+ ssize_t copied = 0;
+
+ if ((*data)->hdr_remaining)
+ copied = vf_mig_data_hdr_read(*data, buf, len);
+ else
+ copied = vf_mig_data_read(*data, buf, len);
+
+ if ((*data)->remaining == 0 && (*data)->hdr_remaining == 0) {
+ xe_sriov_migration_data_free(*data);
+ *data = NULL;
+ }
+
+ return copied;
+}
+
+static struct xe_sriov_migration_data **vf_mig_pick_data(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_sriov_migration_data **data;
+
+ data = pf_pick_descriptor(xe, vfid);
+ if (*data)
+ return data;
+
+ data = pf_pick_pending(xe, vfid);
+ if (!*data)
+ *data = xe_sriov_pf_migration_save_consume(xe, vfid);
+ if (*data)
+ return data;
+
+ data = pf_pick_trailer(xe, vfid);
+ if (*data)
+ return data;
+
+ return ERR_PTR(-ENODATA);
+}
+
+static ssize_t vf_mig_data_read_single(struct xe_device *xe, unsigned int vfid,
+ char __user *buf, size_t len)
+{
+ struct xe_sriov_migration_data **data = vf_mig_pick_data(xe, vfid);
+
+ if (IS_ERR(data))
+ return PTR_ERR(data);
+
+ /* save_consume() may have left an ERR_PTR in the pending slot */
+ if (IS_ERR(*data)) {
+ ssize_t ret = PTR_ERR(*data);
+
+ *data = NULL;
+ return ret;
+ }
+
+ return __vf_mig_data_read_single(data, vfid, buf, len);
+}
+
+/**
+ * xe_sriov_migration_data_read() - Read migration data from the device.
+ * @xe: the &xe_device
+ * @vfid: the VF identifier
+ * @buf: start address of userspace buffer
+ * @len: requested read size from userspace
+ *
+ * Return: number of bytes that have been successfully read,
+ * 0 if no more migration data is available,
+ * -errno on failure.
+ */
+ssize_t xe_sriov_migration_data_read(struct xe_device *xe, unsigned int vfid,
+ char __user *buf, size_t len)
+{
+ ssize_t ret, consumed = 0;
+
+ xe_assert(xe, IS_SRIOV_PF(xe));
+
+ scoped_cond_guard(mutex_intr, return -EINTR, pf_migration_mutex(xe, vfid)) {
+ while (consumed < len) {
+ ret = vf_mig_data_read_single(xe, vfid, buf, len - consumed);
+ if (ret == -ENODATA)
+ break;
+ if (ret < 0)
+ return ret;
+
+ consumed += ret;
+ buf += ret;
+ }
+ }
+
+ return consumed;
+}
+
+static ssize_t vf_mig_hdr_write(struct xe_sriov_migration_data *data,
+ const char __user *buf, size_t len)
+{
+ loff_t offset = sizeof(data->hdr) - data->hdr_remaining;
+ int ret;
+
+ if (len > data->hdr_remaining)
+ len = data->hdr_remaining;
+
+ if (copy_from_user((void *)&data->hdr + offset, buf, len))
+ return -EFAULT;
+
+ data->hdr_remaining -= len;
+
+ if (!data->hdr_remaining) {
+ ret = xe_sriov_migration_data_init_from_hdr(data);
+ if (ret)
+ return ret;
+ }
+
+ return len;
+}
+
+static ssize_t vf_mig_data_write(struct xe_sriov_migration_data *data,
+ const char __user *buf, size_t len)
+{
+ if (len > data->remaining)
+ len = data->remaining;
+
+ if (copy_from_user(data->vaddr + (data->size - data->remaining), buf, len))
+ return -EFAULT;
+
+ data->remaining -= len;
+
+ return len;
+}
+
+static ssize_t vf_mig_data_write_single(struct xe_device *xe, unsigned int vfid,
+ const char __user *buf, size_t len)
+{
+ struct xe_sriov_migration_data **data = pf_pick_pending(xe, vfid);
+ int ret;
+ ssize_t copied;
+
+ if (IS_ERR_OR_NULL(*data)) {
+ *data = xe_sriov_migration_data_alloc(xe);
+ if (!*data)
+ return -ENOMEM;
+ }
+
+ if ((*data)->hdr_remaining)
+ copied = vf_mig_hdr_write(*data, buf, len);
+ else
+ copied = vf_mig_data_write(*data, buf, len);
+
+ if ((*data)->hdr_remaining == 0 && (*data)->remaining == 0) {
+ ret = xe_sriov_pf_migration_restore_produce(xe, vfid, *data);
+ if (ret) {
+ xe_sriov_migration_data_free(*data);
+ return ret;
+ }
+
+ *data = NULL;
+ }
+
+ return copied;
+}
+
+/**
+ * xe_sriov_migration_data_write() - Write migration data to the device.
+ * @xe: the &xe_device
+ * @vfid: the VF identifier
+ * @buf: start address of userspace buffer
+ * @len: requested write size from userspace
+ *
+ * Return: number of bytes that have been successfully written,
+ * -errno on failure.
+ */
+ssize_t xe_sriov_migration_data_write(struct xe_device *xe, unsigned int vfid,
+ const char __user *buf, size_t len)
+{
+ ssize_t ret, produced = 0;
+
+ xe_assert(xe, IS_SRIOV_PF(xe));
+
+ scoped_cond_guard(mutex_intr, return -EINTR, pf_migration_mutex(xe, vfid)) {
+ while (produced < len) {
+ ret = vf_mig_data_write_single(xe, vfid, buf, len - produced);
+ if (ret < 0)
+ return ret;
+
+ produced += ret;
+ buf += ret;
+ }
+ }
+
+ return produced;
+}
+
+#define MIGRATION_DESCRIPTOR_DWORDS 0
+static int pf_descriptor_init(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_sriov_migration_data **desc = pf_pick_descriptor(xe, vfid);
+ struct xe_sriov_migration_data *data;
+ int ret;
+
+ data = xe_sriov_migration_data_alloc(xe);
+ if (!data)
+ return -ENOMEM;
+
+ ret = xe_sriov_migration_data_init(data, 0, 0, XE_SRIOV_MIGRATION_DATA_TYPE_DESCRIPTOR,
+ 0, MIGRATION_DESCRIPTOR_DWORDS * sizeof(u32));
+ if (ret) {
+ xe_sriov_migration_data_free(data);
+ return ret;
+ }
+
+ *desc = data;
+
+ return 0;
+}
+
+static void pf_pending_init(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_sriov_migration_data **data = pf_pick_pending(xe, vfid);
+
+ *data = NULL;
+}
+
+#define MIGRATION_TRAILER_SIZE 0
+static int pf_trailer_init(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_sriov_migration_data **trailer = pf_pick_trailer(xe, vfid);
+ struct xe_sriov_migration_data *data;
+ int ret;
+
+ data = xe_sriov_migration_data_alloc(xe);
+ if (!data)
+ return -ENOMEM;
+
+ ret = xe_sriov_migration_data_init(data, 0, 0, XE_SRIOV_MIGRATION_DATA_TYPE_TRAILER,
+ 0, MIGRATION_TRAILER_SIZE);
+ if (ret) {
+ xe_sriov_migration_data_free(data);
+ return ret;
+ }
+
+ *trailer = data;
+
+ return 0;
+}
+
+/**
+ * xe_sriov_migration_data_save_init() - Initialize the pending save migration data.
+ * @xe: the &xe_device
+ * @vfid: the VF identifier
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+int xe_sriov_migration_data_save_init(struct xe_device *xe, unsigned int vfid)
+{
+ int ret;
+
+ scoped_cond_guard(mutex_intr, return -EINTR, pf_migration_mutex(xe, vfid)) {
+ ret = pf_descriptor_init(xe, vfid);
+ if (ret)
+ return ret;
+
+ ret = pf_trailer_init(xe, vfid);
+ if (ret)
+ return ret;
+
+ pf_pending_init(xe, vfid);
+ }
+
+ return 0;
+}
diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
index ef65dccddc035..5cde6e9439677 100644
--- a/drivers/gpu/drm/xe/xe_sriov_migration_data.h
+++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
@@ -27,5 +27,10 @@ void xe_sriov_migration_data_free(struct xe_sriov_migration_data *snapshot);
int xe_sriov_migration_data_init(struct xe_sriov_migration_data *data, u8 tile_id, u8 gt_id,
enum xe_sriov_migration_data_type type, loff_t offset, size_t size);
int xe_sriov_migration_data_init_from_hdr(struct xe_sriov_migration_data *snapshot);
+ssize_t xe_sriov_migration_data_read(struct xe_device *xe, unsigned int vfid,
+ char __user *buf, size_t len);
+ssize_t xe_sriov_migration_data_write(struct xe_device *xe, unsigned int vfid,
+ const char __user *buf, size_t len);
+int xe_sriov_migration_data_save_init(struct xe_device *xe, unsigned int vfid);
#endif
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
index 8d8a01faf5291..c2768848daba1 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
@@ -5,6 +5,7 @@
#include "xe_device.h"
#include "xe_gt_sriov_pf_control.h"
+#include "xe_sriov_migration_data.h"
#include "xe_sriov_pf_control.h"
#include "xe_sriov_printk.h"
@@ -165,6 +166,10 @@ int xe_sriov_pf_control_trigger_save_vf(struct xe_device *xe, unsigned int vfid)
unsigned int id;
int ret;
+ ret = xe_sriov_migration_data_save_init(xe, vfid);
+ if (ret)
+ return ret;
+
for_each_gt(gt, xe, id) {
ret = xe_gt_sriov_pf_control_trigger_save_vf(gt, vfid);
if (ret)
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
index e0e6340c49106..a9a28aec22421 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
@@ -9,6 +9,7 @@
#include "xe_device.h"
#include "xe_device_types.h"
#include "xe_pm.h"
+#include "xe_sriov_migration_data.h"
#include "xe_sriov_pf.h"
#include "xe_sriov_pf_control.h"
#include "xe_sriov_pf_debugfs.h"
@@ -132,6 +133,7 @@ static void pf_populate_pf(struct xe_device *xe, struct dentry *pfdent)
* /sys/kernel/debug/dri/BDF/
* ├── sriov
* │ ├── vf1
+ * │ │ ├── migration_data
* │ │ ├── pause
* │ │ ├── reset
* │ │ ├── resume
@@ -220,6 +222,38 @@ DEFINE_VF_CONTROL_ATTRIBUTE(reset_vf);
DEFINE_VF_CONTROL_ATTRIBUTE_RW(save_vf);
DEFINE_VF_CONTROL_ATTRIBUTE_RW(restore_vf);
+static ssize_t data_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
+{
+ struct dentry *dent = file_dentry(file)->d_parent;
+ struct xe_device *xe = extract_xe(dent);
+ unsigned int vfid = extract_vfid(dent);
+
+ if (*pos)
+ return -ESPIPE;
+
+ return xe_sriov_migration_data_write(xe, vfid, buf, count);
+}
+
+static ssize_t data_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
+{
+ struct dentry *dent = file_dentry(file)->d_parent;
+ struct xe_device *xe = extract_xe(dent);
+ unsigned int vfid = extract_vfid(dent);
+
+ if (*ppos)
+ return -ESPIPE;
+
+ return xe_sriov_migration_data_read(xe, vfid, buf, count);
+}
+
+static const struct file_operations data_vf_fops = {
+ .owner = THIS_MODULE,
+ .open = simple_open,
+ .write = data_write,
+ .read = data_read,
+ .llseek = default_llseek,
+};
+
static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
{
debugfs_create_file("pause", 0200, vfdent, xe, &pause_vf_fops);
@@ -228,6 +262,7 @@ static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
debugfs_create_file("reset", 0200, vfdent, xe, &reset_vf_fops);
debugfs_create_file("save", 0600, vfdent, xe, &save_vf_fops);
debugfs_create_file("restore", 0600, vfdent, xe, &restore_vf_fops);
+ debugfs_create_file("migration_data", 0600, vfdent, xe, &data_vf_fops);
}
static void pf_populate_with_tiles(struct xe_device *xe, struct dentry *dent, unsigned int vfid)
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
index eaf581317bdef..029e14f1ffa74 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
@@ -10,6 +10,7 @@
#include "xe_gt_sriov_pf_migration.h"
#include "xe_pm.h"
#include "xe_sriov.h"
+#include "xe_sriov_migration_data.h"
#include "xe_sriov_pf_helpers.h"
#include "xe_sriov_pf_migration.h"
#include "xe_sriov_printk.h"
@@ -53,6 +54,15 @@ static bool pf_check_migration_support(struct xe_device *xe)
return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
}
+static void pf_migration_cleanup(struct drm_device *dev, void *arg)
+{
+ struct xe_sriov_pf_migration *migration = arg;
+
+ xe_sriov_migration_data_free(migration->pending);
+ xe_sriov_migration_data_free(migration->trailer);
+ xe_sriov_migration_data_free(migration->descriptor);
+}
+
/**
* xe_sriov_pf_migration_init() - Initialize support for SR-IOV VF migration.
* @xe: the &xe_device
@@ -62,6 +72,7 @@ static bool pf_check_migration_support(struct xe_device *xe)
int xe_sriov_pf_migration_init(struct xe_device *xe)
{
unsigned int n, totalvfs;
+ int err;
xe_assert(xe, IS_SRIOV_PF(xe));
@@ -73,7 +84,15 @@ int xe_sriov_pf_migration_init(struct xe_device *xe)
for (n = 1; n <= totalvfs; n++) {
struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, n);
+ err = drmm_mutex_init(&xe->drm, &migration->lock);
+ if (err)
+ return err;
+
init_waitqueue_head(&migration->wq);
+
+ err = drmm_add_action_or_reset(&xe->drm, pf_migration_cleanup, migration);
+ if (err)
+ return err;
}
return 0;
@@ -154,6 +173,36 @@ xe_sriov_pf_migration_save_consume(struct xe_device *xe, unsigned int vfid)
return data;
}
+static int pf_handle_descriptor(struct xe_device *xe, unsigned int vfid,
+ struct xe_sriov_migration_data *data)
+{
+ if (data->tile != 0 || data->gt != 0)
+ return -EINVAL;
+
+ xe_sriov_migration_data_free(data);
+
+ return 0;
+}
+
+static int pf_handle_trailer(struct xe_device *xe, unsigned int vfid,
+ struct xe_sriov_migration_data *data)
+{
+ struct xe_gt *gt;
+ u8 gt_id;
+
+ if (data->tile != 0 || data->gt != 0)
+ return -EINVAL;
+ if (data->offset != 0 || data->size != 0 || data->buff || data->bo)
+ return -EINVAL;
+
+ xe_sriov_migration_data_free(data);
+
+ for_each_gt(gt, xe, gt_id)
+ xe_gt_sriov_pf_control_restore_data_done(gt, vfid);
+
+ return 0;
+}
+
/**
* xe_sriov_pf_migration_restore_produce() - Produce a VF migration data packet to the device.
* @xe: the &xe_device
@@ -173,6 +222,11 @@ int xe_sriov_pf_migration_restore_produce(struct xe_device *xe, unsigned int vfi
xe_assert(xe, IS_SRIOV_PF(xe));
+ if (data->type == XE_SRIOV_MIGRATION_DATA_TYPE_DESCRIPTOR)
+ return pf_handle_descriptor(xe, vfid, data);
+ else if (data->type == XE_SRIOV_MIGRATION_DATA_TYPE_TRAILER)
+ return pf_handle_trailer(xe, vfid, data);
+
gt = xe_device_get_gt(xe, data->gt);
if (!gt || data->tile != gt->tile->id) {
xe_sriov_err_ratelimited(xe, "VF%d Invalid GT - tile:%u, GT:%u\n",
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
index 2a45ee4e3ece8..8468e5eeb6d66 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
@@ -7,6 +7,7 @@
#define _XE_SRIOV_PF_MIGRATION_TYPES_H_
#include <linux/types.h>
+#include <linux/mutex_types.h>
#include <linux/wait.h>
/**
@@ -53,6 +54,14 @@ struct xe_sriov_migration_data {
struct xe_sriov_pf_migration {
/** @wq: waitqueue used to avoid busy-waiting for snapshot production/consumption */
wait_queue_head_t wq;
+ /** @lock: Mutex protecting the migration data */
+ struct mutex lock;
+ /** @pending: currently processed data packet of VF resource */
+ struct xe_sriov_migration_data *pending;
+ /** @trailer: data packet used to indicate the end of stream */
+ struct xe_sriov_migration_data *trailer;
+ /** @descriptor: data packet containing the metadata describing the device */
+ struct xe_sriov_migration_data *descriptor;
};
#endif
--
2.50.1
^ permalink raw reply related [flat|nested] 72+ messages in thread
* [PATCH v2 07/26] drm/xe/pf: Add minimalistic migration descriptor
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (5 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 06/26] drm/xe/pf: Add support for encap/decap of bitstream to/from packet Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-22 22:49 ` Michal Wajdeczko
2025-10-21 22:41 ` [PATCH v2 08/26] drm/xe/pf: Expose VF migration data size over debugfs Michał Winiarski
` (18 subsequent siblings)
25 siblings, 1 reply; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
The descriptor reuses the KLV format used by GuC and contains metadata
that can be used to quickly fail migration when the source is
incompatible with the destination.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_sriov_migration_data.c | 79 +++++++++++++++++++-
drivers/gpu/drm/xe/xe_sriov_migration_data.h | 2 +
drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 6 ++
3 files changed, 86 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
index 4cd6c6fc9ba18..b58508c0c30f1 100644
--- a/drivers/gpu/drm/xe/xe_sriov_migration_data.c
+++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
@@ -5,6 +5,7 @@
#include "xe_bo.h"
#include "xe_device.h"
+#include "xe_guc_klv_helpers.h"
#include "xe_sriov_migration_data.h"
#include "xe_sriov_pf_helpers.h"
#include "xe_sriov_pf_migration.h"
@@ -383,11 +384,18 @@ ssize_t xe_sriov_migration_data_write(struct xe_device *xe, unsigned int vfid,
return produced;
}
-#define MIGRATION_DESCRIPTOR_DWORDS 0
+#define MIGRATION_KLV_DEVICE_DEVID_KEY 0xf001u
+#define MIGRATION_KLV_DEVICE_DEVID_LEN 1u
+#define MIGRATION_KLV_DEVICE_REVID_KEY 0xf002u
+#define MIGRATION_KLV_DEVICE_REVID_LEN 1u
+
+#define MIGRATION_DESCRIPTOR_DWORDS (GUC_KLV_LEN_MIN + MIGRATION_KLV_DEVICE_DEVID_LEN + \
+ GUC_KLV_LEN_MIN + MIGRATION_KLV_DEVICE_REVID_LEN)
static size_t pf_descriptor_init(struct xe_device *xe, unsigned int vfid)
{
struct xe_sriov_migration_data **desc = pf_pick_descriptor(xe, vfid);
struct xe_sriov_migration_data *data;
+ u32 *klvs;
int ret;
data = xe_sriov_migration_data_alloc(xe);
@@ -401,11 +409,80 @@ static size_t pf_descriptor_init(struct xe_device *xe, unsigned int vfid)
return ret;
}
+ klvs = data->vaddr;
+ *klvs++ = PREP_GUC_KLV_CONST(MIGRATION_KLV_DEVICE_DEVID_KEY,
+ MIGRATION_KLV_DEVICE_DEVID_LEN);
+ *klvs++ = xe->info.devid;
+ *klvs++ = PREP_GUC_KLV_CONST(MIGRATION_KLV_DEVICE_REVID_KEY,
+ MIGRATION_KLV_DEVICE_REVID_LEN);
+ *klvs++ = xe->info.revid;
+
*desc = data;
return 0;
}
+/**
+ * xe_sriov_migration_data_process_descriptor() - Process migration data descriptor.
+ * @xe: the &xe_device
+ * @vfid: the VF identifier
+ * @data: the &struct xe_sriov_migration_data containing the descriptor
+ *
+ * The descriptor uses the same KLV format as GuC, and contains metadata used for
+ * checking migration data compatibility.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+int xe_sriov_migration_data_process_descriptor(struct xe_device *xe, unsigned int vfid,
+ struct xe_sriov_migration_data *data)
+{
+ u32 num_dwords = data->size / sizeof(u32);
+ u32 *klvs = data->vaddr;
+
+ xe_assert(xe, data->type == XE_SRIOV_MIGRATION_DATA_TYPE_DESCRIPTOR);
+ if (data->size % sizeof(u32) != 0)
+ return -EINVAL;
+
+ while (num_dwords >= GUC_KLV_LEN_MIN) {
+ u32 key = FIELD_GET(GUC_KLV_0_KEY, klvs[0]);
+ u32 len = FIELD_GET(GUC_KLV_0_LEN, klvs[0]);
+
+ klvs += GUC_KLV_LEN_MIN;
+ num_dwords -= GUC_KLV_LEN_MIN;
+
+ if (len > num_dwords)
+ return -EINVAL;
+
+ switch (key) {
+ case MIGRATION_KLV_DEVICE_DEVID_KEY:
+ if (*klvs != xe->info.devid) {
+ xe_sriov_warn(xe,
+ "Aborting migration, devid mismatch %#04x!=%#04x\n",
+ *klvs, xe->info.devid);
+ return -ENODEV;
+ }
+ break;
+ case MIGRATION_KLV_DEVICE_REVID_KEY:
+ if (*klvs != xe->info.revid) {
+ xe_sriov_warn(xe,
+ "Aborting migration, revid mismatch %#04x!=%#04x\n",
+ *klvs, xe->info.revid);
+ return -ENODEV;
+ }
+ break;
+ default:
+ xe_sriov_dbg(xe,
+ "Unknown migration descriptor key %#06x - skipping\n", key);
+ break;
+ }
+
+ klvs += len;
+ num_dwords -= len;
+ }
+
+ return 0;
+}
+
static void pf_pending_init(struct xe_device *xe, unsigned int vfid)
{
struct xe_sriov_migration_data **data = pf_pick_pending(xe, vfid);
diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
index 5cde6e9439677..e7f3b332124bc 100644
--- a/drivers/gpu/drm/xe/xe_sriov_migration_data.h
+++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
@@ -31,6 +31,8 @@ ssize_t xe_sriov_migration_data_read(struct xe_device *xe, unsigned int vfid,
char __user *buf, size_t len);
ssize_t xe_sriov_migration_data_write(struct xe_device *xe, unsigned int vfid,
const char __user *buf, size_t len);
+int xe_sriov_migration_data_process_descriptor(struct xe_device *xe, unsigned int vfid,
+ struct xe_sriov_migration_data *data);
int xe_sriov_migration_data_save_init(struct xe_device *xe, unsigned int vfid);
#endif
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
index 029e14f1ffa74..0b4b237780102 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
@@ -176,9 +176,15 @@ xe_sriov_pf_migration_save_consume(struct xe_device *xe, unsigned int vfid)
static int pf_handle_descriptor(struct xe_device *xe, unsigned int vfid,
struct xe_sriov_migration_data *data)
{
+ int ret;
+
if (data->tile != 0 || data->gt != 0)
return -EINVAL;
+ ret = xe_sriov_migration_data_process_descriptor(xe, vfid, data);
+ if (ret)
+ return ret;
+
xe_sriov_migration_data_free(data);
return 0;
--
2.50.1
* [PATCH v2 08/26] drm/xe/pf: Expose VF migration data size over debugfs
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (6 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 07/26] drm/xe/pf: Add minimalistic migration descriptor Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-22 23:02 ` Michal Wajdeczko
2025-10-21 22:41 ` [PATCH v2 09/26] drm/xe: Add sa/guc_buf_cache sync interface Michał Winiarski
` (17 subsequent siblings)
25 siblings, 1 reply; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
The size is normally used to decide when to stop the device (mainly
while it is in the PRE_COPY state).
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 19 ++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 2 ++
drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 29 ++++++++++++++++++
drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 30 +++++++++++++++++++
drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 1 +
5 files changed, 81 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index 8ba72165759b3..4e26feb9c267f 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -395,6 +395,25 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
}
#endif /* CONFIG_DEBUG_FS */
+/**
+ * xe_gt_sriov_pf_migration_size() - Total size of migration data from all components within a GT.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: total migration data size in bytes or a negative error code on failure.
+ */
+ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
+{
+ ssize_t total = 0;
+
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+
+ /* Nothing to query yet - will be updated once per-GT migration data types are added */
+ return total;
+}
+
/**
* xe_gt_sriov_pf_migration_ring_empty() - Check if a migration ring is empty.
* @gt: the &xe_gt
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
index 1ed2248f0a17e..e2d41750f863c 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
@@ -15,6 +15,8 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
+ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
+
bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid);
bool xe_gt_sriov_pf_migration_ring_full(struct xe_gt *gt, unsigned int vfid);
void xe_gt_sriov_pf_migration_ring_free(struct xe_gt *gt, unsigned int vfid);
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
index a9a28aec22421..bc2d0b0342f22 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
@@ -14,6 +14,7 @@
#include "xe_sriov_pf_control.h"
#include "xe_sriov_pf_debugfs.h"
#include "xe_sriov_pf_helpers.h"
+#include "xe_sriov_pf_migration.h"
#include "xe_sriov_pf_provision.h"
#include "xe_sriov_pf_service.h"
#include "xe_sriov_printk.h"
@@ -254,6 +255,33 @@ static const struct file_operations data_vf_fops = {
.llseek = default_llseek,
};
+static ssize_t size_read(struct file *file, char __user *ubuf, size_t count, loff_t *ppos)
+{
+ struct dentry *dent = file_dentry(file)->d_parent;
+ struct xe_device *xe = extract_xe(dent);
+ unsigned int vfid = extract_vfid(dent);
+ char buf[21];
+ ssize_t ret;
+ int len;
+
+ xe_pm_runtime_get(xe);
+ ret = xe_sriov_pf_migration_size(xe, vfid);
+ xe_pm_runtime_put(xe);
+ if (ret < 0)
+ return ret;
+
+ len = scnprintf(buf, sizeof(buf), "%zd\n", ret);
+
+ return simple_read_from_buffer(ubuf, count, ppos, buf, len);
+}
+
+static const struct file_operations size_vf_fops = {
+ .owner = THIS_MODULE,
+ .open = simple_open,
+ .read = size_read,
+ .llseek = default_llseek,
+};
+
static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
{
debugfs_create_file("pause", 0200, vfdent, xe, &pause_vf_fops);
@@ -263,6 +291,7 @@ static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
debugfs_create_file("save", 0600, vfdent, xe, &save_vf_fops);
debugfs_create_file("restore", 0600, vfdent, xe, &restore_vf_fops);
debugfs_create_file("migration_data", 0600, vfdent, xe, &data_vf_fops);
+ debugfs_create_file("migration_size", 0400, vfdent, xe, &size_vf_fops);
}
static void pf_populate_with_tiles(struct xe_device *xe, struct dentry *dent, unsigned int vfid)
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
index 0b4b237780102..88babec9c893e 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
@@ -242,3 +242,33 @@ int xe_sriov_pf_migration_restore_produce(struct xe_device *xe, unsigned int vfi
return xe_gt_sriov_pf_migration_restore_produce(gt, vfid, data);
}
+
+/**
+ * xe_sriov_pf_migration_size() - Total size of migration data from all components within a device
+ * @xe: the &xe_device
+ * @vfid: the VF identifier (can't be 0)
+ *
+ * This function is for PF only.
+ *
+ * Return: total migration data size in bytes or a negative error code on failure.
+ */
+ssize_t xe_sriov_pf_migration_size(struct xe_device *xe, unsigned int vfid)
+{
+ size_t size = 0;
+ struct xe_gt *gt;
+ ssize_t ret;
+ u8 gt_id;
+
+ xe_assert(xe, IS_SRIOV_PF(xe));
+ xe_assert(xe, vfid);
+
+ for_each_gt(gt, xe, gt_id) {
+ ret = xe_gt_sriov_pf_migration_size(gt, vfid);
+ if (ret < 0)
+ return ret;
+
+ size += ret;
+ }
+
+ return size;
+}
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
index df81a540c246a..16cb444c36aa6 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
@@ -18,6 +18,7 @@ int xe_sriov_pf_migration_restore_produce(struct xe_device *xe, unsigned int vfi
struct xe_sriov_migration_data *data);
struct xe_sriov_migration_data *
xe_sriov_pf_migration_save_consume(struct xe_device *xe, unsigned int vfid);
+ssize_t xe_sriov_pf_migration_size(struct xe_device *xe, unsigned int vfid);
wait_queue_head_t *xe_sriov_pf_migration_waitqueue(struct xe_device *xe, unsigned int vfid);
#endif
--
2.50.1
* [PATCH v2 09/26] drm/xe: Add sa/guc_buf_cache sync interface
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (7 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 08/26] drm/xe/pf: Expose VF migration data size over debugfs Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-22 23:05 ` Michal Wajdeczko
2025-10-21 22:41 ` [PATCH v2 10/26] drm/xe: Allow the caller to pass guc_buf_cache size Michał Winiarski
` (16 subsequent siblings)
25 siblings, 1 reply; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
In upcoming changes, the cached buffers will be used to read data
produced by the GuC. Add a counterpart to flush that synchronizes the
CPU side of the suballocation with the GPU data, and propagate the
interface to the GuC Buffer Cache.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_guc_buf.c | 13 +++++++++++++
drivers/gpu/drm/xe/xe_guc_buf.h | 1 +
drivers/gpu/drm/xe/xe_sa.c | 21 +++++++++++++++++++++
drivers/gpu/drm/xe/xe_sa.h | 1 +
4 files changed, 36 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_guc_buf.c b/drivers/gpu/drm/xe/xe_guc_buf.c
index 502ca3a4ee606..4d8a4712309f4 100644
--- a/drivers/gpu/drm/xe/xe_guc_buf.c
+++ b/drivers/gpu/drm/xe/xe_guc_buf.c
@@ -115,6 +115,19 @@ void xe_guc_buf_release(const struct xe_guc_buf buf)
xe_sa_bo_free(buf.sa, NULL);
}
+/**
+ * xe_guc_buf_sync_read() - Copy the data from the GPU memory to the sub-allocation.
+ * @buf: the &xe_guc_buf to sync
+ *
+ * Return: a CPU pointer of the sub-allocation.
+ */
+void *xe_guc_buf_sync_read(const struct xe_guc_buf buf)
+{
+ xe_sa_bo_sync_read(buf.sa);
+
+ return xe_sa_bo_cpu_addr(buf.sa);
+}
+
/**
* xe_guc_buf_flush() - Copy the data from the sub-allocation to the GPU memory.
* @buf: the &xe_guc_buf to flush
diff --git a/drivers/gpu/drm/xe/xe_guc_buf.h b/drivers/gpu/drm/xe/xe_guc_buf.h
index 0d67604d96bdd..c5e0f1fd24d74 100644
--- a/drivers/gpu/drm/xe/xe_guc_buf.h
+++ b/drivers/gpu/drm/xe/xe_guc_buf.h
@@ -30,6 +30,7 @@ static inline bool xe_guc_buf_is_valid(const struct xe_guc_buf buf)
}
void *xe_guc_buf_cpu_ptr(const struct xe_guc_buf buf);
+void *xe_guc_buf_sync_read(const struct xe_guc_buf buf);
u64 xe_guc_buf_flush(const struct xe_guc_buf buf);
u64 xe_guc_buf_gpu_addr(const struct xe_guc_buf buf);
u64 xe_guc_cache_gpu_addr_from_ptr(struct xe_guc_buf_cache *cache, const void *ptr, u32 size);
diff --git a/drivers/gpu/drm/xe/xe_sa.c b/drivers/gpu/drm/xe/xe_sa.c
index fedd017d6dd36..63a5263dcf1b1 100644
--- a/drivers/gpu/drm/xe/xe_sa.c
+++ b/drivers/gpu/drm/xe/xe_sa.c
@@ -110,6 +110,10 @@ struct drm_suballoc *__xe_sa_bo_new(struct xe_sa_manager *sa_manager, u32 size,
return drm_suballoc_new(&sa_manager->base, size, gfp, true, 0);
}
+/**
+ * xe_sa_bo_flush_write() - Copy the data from the sub-allocation to the GPU memory.
+ * @sa_bo: the &drm_suballoc to flush
+ */
void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo)
{
struct xe_sa_manager *sa_manager = to_xe_sa_manager(sa_bo->manager);
@@ -123,6 +127,23 @@ void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo)
drm_suballoc_size(sa_bo));
}
+/**
+ * xe_sa_bo_sync_read() - Copy the data from GPU memory to the sub-allocation.
+ * @sa_bo: the &drm_suballoc to sync
+ */
+void xe_sa_bo_sync_read(struct drm_suballoc *sa_bo)
+{
+ struct xe_sa_manager *sa_manager = to_xe_sa_manager(sa_bo->manager);
+ struct xe_device *xe = tile_to_xe(sa_manager->bo->tile);
+
+ if (!sa_manager->bo->vmap.is_iomem)
+ return;
+
+ xe_map_memcpy_from(xe, xe_sa_bo_cpu_addr(sa_bo), &sa_manager->bo->vmap,
+ drm_suballoc_soffset(sa_bo),
+ drm_suballoc_size(sa_bo));
+}
+
void xe_sa_bo_free(struct drm_suballoc *sa_bo,
struct dma_fence *fence)
{
diff --git a/drivers/gpu/drm/xe/xe_sa.h b/drivers/gpu/drm/xe/xe_sa.h
index 99dbf0eea5402..1be7443508361 100644
--- a/drivers/gpu/drm/xe/xe_sa.h
+++ b/drivers/gpu/drm/xe/xe_sa.h
@@ -37,6 +37,7 @@ static inline struct drm_suballoc *xe_sa_bo_new(struct xe_sa_manager *sa_manager
}
void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo);
+void xe_sa_bo_sync_read(struct drm_suballoc *sa_bo);
void xe_sa_bo_free(struct drm_suballoc *sa_bo, struct dma_fence *fence);
static inline struct xe_sa_manager *
--
2.50.1
* [PATCH v2 10/26] drm/xe: Allow the caller to pass guc_buf_cache size
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (8 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 09/26] drm/xe: Add sa/guc_buf_cache sync interface Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-22 23:13 ` Michal Wajdeczko
2025-10-21 22:41 ` [PATCH v2 11/26] drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF migration Michał Winiarski
` (15 subsequent siblings)
25 siblings, 1 reply; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
An upcoming change will use the GuC buffer cache to store GuC migration
data, which requires more memory than indirect H2G data.
Allow the caller to pass a size based on the intended use case.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c | 2 +-
drivers/gpu/drm/xe/xe_guc.c | 4 ++--
drivers/gpu/drm/xe/xe_guc_buf.c | 6 +++---
drivers/gpu/drm/xe/xe_guc_buf.h | 4 +++-
4 files changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c b/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c
index d266882adc0e0..485e7a70e6bb7 100644
--- a/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c
+++ b/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c
@@ -72,7 +72,7 @@ static int guc_buf_test_init(struct kunit *test)
kunit_activate_static_stub(test, xe_managed_bo_create_pin_map,
replacement_xe_managed_bo_create_pin_map);
- KUNIT_ASSERT_EQ(test, 0, xe_guc_buf_cache_init(&guc->buf));
+ KUNIT_ASSERT_EQ(test, 0, xe_guc_buf_cache_init(&guc->buf, XE_GUC_BUF_CACHE_DEFAULT_SIZE));
test->priv = &guc->buf;
return 0;
diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
index ecc3e091b89e6..7c65528859ecb 100644
--- a/drivers/gpu/drm/xe/xe_guc.c
+++ b/drivers/gpu/drm/xe/xe_guc.c
@@ -812,7 +812,7 @@ static int vf_guc_init_post_hwconfig(struct xe_guc *guc)
if (err)
return err;
- err = xe_guc_buf_cache_init(&guc->buf);
+ err = xe_guc_buf_cache_init(&guc->buf, XE_GUC_BUF_CACHE_DEFAULT_SIZE);
if (err)
return err;
@@ -860,7 +860,7 @@ int xe_guc_init_post_hwconfig(struct xe_guc *guc)
if (ret)
return ret;
- ret = xe_guc_buf_cache_init(&guc->buf);
+ ret = xe_guc_buf_cache_init(&guc->buf, XE_GUC_BUF_CACHE_DEFAULT_SIZE);
if (ret)
return ret;
diff --git a/drivers/gpu/drm/xe/xe_guc_buf.c b/drivers/gpu/drm/xe/xe_guc_buf.c
index 4d8a4712309f4..ed096a0331244 100644
--- a/drivers/gpu/drm/xe/xe_guc_buf.c
+++ b/drivers/gpu/drm/xe/xe_guc_buf.c
@@ -28,16 +28,16 @@ static struct xe_gt *cache_to_gt(struct xe_guc_buf_cache *cache)
* @cache: the &xe_guc_buf_cache to initialize
*
* The Buffer Cache allows to obtain a reusable buffer that can be used to pass
- * indirect H2G data to GuC without a need to create a ad-hoc allocation.
+ * data to GuC or read data from GuC without a need to create an ad-hoc allocation.
*
* Return: 0 on success or a negative error code on failure.
*/
-int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache)
+int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache, u32 size)
{
struct xe_gt *gt = cache_to_gt(cache);
struct xe_sa_manager *sam;
- sam = __xe_sa_bo_manager_init(gt_to_tile(gt), SZ_8K, 0, sizeof(u32));
+ sam = __xe_sa_bo_manager_init(gt_to_tile(gt), size, 0, sizeof(u32));
if (IS_ERR(sam))
return PTR_ERR(sam);
cache->sam = sam;
diff --git a/drivers/gpu/drm/xe/xe_guc_buf.h b/drivers/gpu/drm/xe/xe_guc_buf.h
index c5e0f1fd24d74..5210703309e81 100644
--- a/drivers/gpu/drm/xe/xe_guc_buf.h
+++ b/drivers/gpu/drm/xe/xe_guc_buf.h
@@ -11,7 +11,9 @@
#include "xe_guc_buf_types.h"
-int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache);
+#define XE_GUC_BUF_CACHE_DEFAULT_SIZE SZ_8K
+
+int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache, u32 size);
u32 xe_guc_buf_cache_dwords(struct xe_guc_buf_cache *cache);
struct xe_guc_buf xe_guc_buf_reserve(struct xe_guc_buf_cache *cache, u32 dwords);
struct xe_guc_buf xe_guc_buf_from_data(struct xe_guc_buf_cache *cache,
--
2.50.1
* [PATCH v2 11/26] drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF migration
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (9 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 10/26] drm/xe: Allow the caller to pass guc_buf_cache size Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-23 17:37 ` Michal Wajdeczko
2025-10-21 22:41 ` [PATCH v2 12/26] drm/xe/pf: Remove GuC migration data save/restore from GT debugfs Michał Winiarski
` (14 subsequent siblings)
25 siblings, 1 reply; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
Contiguous PF GGTT VMAs can be scarce after creating VFs.
Increase the GuC buffer cache size to 4M for PF so that we can fit GuC
migration data (which currently maxes out at just under 4M) and use the
cache instead of allocating fresh BOs.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 46 ++++++-------------
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 3 ++
drivers/gpu/drm/xe/xe_guc.c | 12 ++++-
3 files changed, 28 insertions(+), 33 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index 4e26feb9c267f..04fad3126865c 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -11,7 +11,7 @@
#include "xe_gt_sriov_pf_helpers.h"
#include "xe_gt_sriov_pf_migration.h"
#include "xe_gt_sriov_printk.h"
-#include "xe_guc.h"
+#include "xe_guc_buf.h"
#include "xe_guc_ct.h"
#include "xe_sriov.h"
#include "xe_sriov_migration_data.h"
@@ -57,73 +57,55 @@ static int pf_send_guc_query_vf_state_size(struct xe_gt *gt, unsigned int vfid)
/* Return: number of state dwords saved or a negative error code on failure */
static int pf_send_guc_save_vf_state(struct xe_gt *gt, unsigned int vfid,
- void *buff, size_t size)
+ void *dst, size_t size)
{
const int ndwords = size / sizeof(u32);
- struct xe_tile *tile = gt_to_tile(gt);
- struct xe_device *xe = tile_to_xe(tile);
struct xe_guc *guc = >->uc.guc;
- struct xe_bo *bo;
+ CLASS(xe_guc_buf, buf)(&guc->buf, ndwords);
int ret;
xe_gt_assert(gt, size % sizeof(u32) == 0);
xe_gt_assert(gt, size == ndwords * sizeof(u32));
- bo = xe_bo_create_pin_map_novm(xe, tile,
- ALIGN(size, PAGE_SIZE),
- ttm_bo_type_kernel,
- XE_BO_FLAG_SYSTEM |
- XE_BO_FLAG_GGTT |
- XE_BO_FLAG_GGTT_INVALIDATE, false);
- if (IS_ERR(bo))
- return PTR_ERR(bo);
+ if (!xe_guc_buf_is_valid(buf))
+ return -ENOBUFS;
+
+ memset(xe_guc_buf_cpu_ptr(buf), 0, size);
ret = guc_action_vf_save_restore(guc, vfid, GUC_PF_OPCODE_VF_SAVE,
- xe_bo_ggtt_addr(bo), ndwords);
+ xe_guc_buf_flush(buf), ndwords);
if (!ret)
ret = -ENODATA;
else if (ret > ndwords)
ret = -EPROTO;
else if (ret > 0)
- xe_map_memcpy_from(xe, buff, &bo->vmap, 0, ret * sizeof(u32));
+ memcpy(dst, xe_guc_buf_sync_read(buf), ret * sizeof(u32));
- xe_bo_unpin_map_no_vm(bo);
return ret;
}
/* Return: number of state dwords restored or a negative error code on failure */
static int pf_send_guc_restore_vf_state(struct xe_gt *gt, unsigned int vfid,
- const void *buff, size_t size)
+ const void *src, size_t size)
{
const int ndwords = size / sizeof(u32);
- struct xe_tile *tile = gt_to_tile(gt);
- struct xe_device *xe = tile_to_xe(tile);
struct xe_guc *guc = >->uc.guc;
- struct xe_bo *bo;
+ CLASS(xe_guc_buf_from_data, buf)(&guc->buf, src, size);
int ret;
xe_gt_assert(gt, size % sizeof(u32) == 0);
xe_gt_assert(gt, size == ndwords * sizeof(u32));
- bo = xe_bo_create_pin_map_novm(xe, tile,
- ALIGN(size, PAGE_SIZE),
- ttm_bo_type_kernel,
- XE_BO_FLAG_SYSTEM |
- XE_BO_FLAG_GGTT |
- XE_BO_FLAG_GGTT_INVALIDATE, false);
- if (IS_ERR(bo))
- return PTR_ERR(bo);
-
- xe_map_memcpy_to(xe, &bo->vmap, 0, buff, size);
+ if (!xe_guc_buf_is_valid(buf))
+ return -ENOBUFS;
ret = guc_action_vf_save_restore(guc, vfid, GUC_PF_OPCODE_VF_RESTORE,
- xe_bo_ggtt_addr(bo), ndwords);
+ xe_guc_buf_flush(buf), ndwords);
if (!ret)
ret = -ENODATA;
else if (ret > ndwords)
ret = -EPROTO;
- xe_bo_unpin_map_no_vm(bo);
return ret;
}
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
index e2d41750f863c..4f2f2783339c3 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
@@ -11,6 +11,9 @@
struct xe_gt;
struct xe_sriov_migration_data;
+/* TODO: get this information by querying GuC in the future */
+#define XE_GT_SRIOV_PF_MIGRATION_GUC_DATA_MAX_SIZE SZ_8M
+
int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
index 7c65528859ecb..cd6ab277a7876 100644
--- a/drivers/gpu/drm/xe/xe_guc.c
+++ b/drivers/gpu/drm/xe/xe_guc.c
@@ -24,6 +24,7 @@
#include "xe_gt_printk.h"
#include "xe_gt_sriov_vf.h"
#include "xe_gt_throttle.h"
+#include "xe_gt_sriov_pf_migration.h"
#include "xe_guc_ads.h"
#include "xe_guc_buf.h"
#include "xe_guc_capture.h"
@@ -40,6 +41,7 @@
#include "xe_mmio.h"
#include "xe_platform_types.h"
#include "xe_sriov.h"
+#include "xe_sriov_pf_migration.h"
#include "xe_uc.h"
#include "xe_uc_fw.h"
#include "xe_wa.h"
@@ -821,6 +823,14 @@ static int vf_guc_init_post_hwconfig(struct xe_guc *guc)
return 0;
}
+static u32 guc_buf_cache_size(struct xe_guc *guc)
+{
+ if (IS_SRIOV_PF(guc_to_xe(guc)) && xe_sriov_pf_migration_supported(guc_to_xe(guc)))
+ return XE_GT_SRIOV_PF_MIGRATION_GUC_DATA_MAX_SIZE;
+ else
+ return XE_GUC_BUF_CACHE_DEFAULT_SIZE;
+}
+
/**
* xe_guc_init_post_hwconfig - initialize GuC post hwconfig load
* @guc: The GuC object
@@ -860,7 +870,7 @@ int xe_guc_init_post_hwconfig(struct xe_guc *guc)
if (ret)
return ret;
- ret = xe_guc_buf_cache_init(&guc->buf, XE_GUC_BUF_CACHE_DEFAULT_SIZE);
+ ret = xe_guc_buf_cache_init(&guc->buf, guc_buf_cache_size(guc));
if (ret)
return ret;
--
2.50.1
^ permalink raw reply related [flat|nested] 72+ messages in thread
* [PATCH v2 12/26] drm/xe/pf: Remove GuC migration data save/restore from GT debugfs
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (10 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 11/26] drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF migration Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-21 22:41 ` [PATCH v2 13/26] drm/xe/pf: Don't save GuC VF migration data on pause Michał Winiarski
` (13 subsequent siblings)
25 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
In upcoming changes, SR-IOV VF migration data will be extended beyond
GuC data and exported to userspace through the VFIO interface (with a
vendor-specific variant driver) and a device-level debugfs interface.
Remove the GT-level debugfs interface.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 47 ---------------------
1 file changed, 47 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
index 838beb7f6327f..5278ea4fd6552 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
@@ -327,9 +327,6 @@ static const struct {
{ "stop", xe_gt_sriov_pf_control_stop_vf },
{ "pause", xe_gt_sriov_pf_control_pause_vf },
{ "resume", xe_gt_sriov_pf_control_resume_vf },
-#ifdef CONFIG_DRM_XE_DEBUG_SRIOV
- { "restore!", xe_gt_sriov_pf_migration_restore_guc_state },
-#endif
};
static ssize_t control_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
@@ -393,47 +390,6 @@ static const struct file_operations control_ops = {
.llseek = default_llseek,
};
-/*
- * /sys/kernel/debug/dri/BDF/
- * ├── sriov
- * : ├── vf1
- * : ├── tile0
- * : ├── gt0
- * : ├── guc_state
- */
-
-static ssize_t guc_state_read(struct file *file, char __user *buf,
- size_t count, loff_t *pos)
-{
- struct dentry *dent = file_dentry(file);
- struct dentry *parent = dent->d_parent;
- struct xe_gt *gt = extract_gt(parent);
- unsigned int vfid = extract_vfid(parent);
-
- return xe_gt_sriov_pf_migration_read_guc_state(gt, vfid, buf, count, pos);
-}
-
-static ssize_t guc_state_write(struct file *file, const char __user *buf,
- size_t count, loff_t *pos)
-{
- struct dentry *dent = file_dentry(file);
- struct dentry *parent = dent->d_parent;
- struct xe_gt *gt = extract_gt(parent);
- unsigned int vfid = extract_vfid(parent);
-
- if (*pos)
- return -EINVAL;
-
- return xe_gt_sriov_pf_migration_write_guc_state(gt, vfid, buf, count);
-}
-
-static const struct file_operations guc_state_ops = {
- .owner = THIS_MODULE,
- .read = guc_state_read,
- .write = guc_state_write,
- .llseek = default_llseek,
-};
-
/*
* /sys/kernel/debug/dri/BDF/
* ├── sriov
@@ -568,9 +524,6 @@ static void pf_populate_gt(struct xe_gt *gt, struct dentry *dent, unsigned int v
/* for testing/debugging purposes only! */
if (IS_ENABLED(CONFIG_DRM_XE_DEBUG)) {
- debugfs_create_file("guc_state",
- IS_ENABLED(CONFIG_DRM_XE_DEBUG_SRIOV) ? 0600 : 0400,
- dent, NULL, &guc_state_ops);
debugfs_create_file("config_blob",
IS_ENABLED(CONFIG_DRM_XE_DEBUG_SRIOV) ? 0600 : 0400,
dent, NULL, &config_blob_ops);
--
2.50.1
* [PATCH v2 13/26] drm/xe/pf: Don't save GuC VF migration data on pause
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (11 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 12/26] drm/xe/pf: Remove GuC migration data save/restore from GT debugfs Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-21 22:41 ` [PATCH v2 14/26] drm/xe/pf: Switch VF migration GuC save/restore to struct migration data Michał Winiarski
` (12 subsequent siblings)
25 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
In upcoming changes, the GuC VF migration data will be handled as part
of separate SAVE/RESTORE states in the VF control state machine.
Remove it from the PAUSE state.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 39 +------------------
.../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 2 -
2 files changed, 2 insertions(+), 39 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
index dd9bc9c99f78c..c159f35adcbe7 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
@@ -183,7 +183,6 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
CASE2STR(PAUSE_SEND_PAUSE);
CASE2STR(PAUSE_WAIT_GUC);
CASE2STR(PAUSE_GUC_DONE);
- CASE2STR(PAUSE_SAVE_GUC);
CASE2STR(PAUSE_FAILED);
CASE2STR(PAUSED);
CASE2STR(SAVE_WIP);
@@ -453,8 +452,7 @@ static void pf_enter_vf_ready(struct xe_gt *gt, unsigned int vfid)
* : PAUSE_GUC_DONE o-----restart
* : | :
* : | o---<--busy :
- * : v / / :
- * : PAUSE_SAVE_GUC :
+ * : / :
* : / :
* : / :
* :....o..............o...............o...........:
@@ -474,7 +472,6 @@ static void pf_exit_vf_pause_wip(struct xe_gt *gt, unsigned int vfid)
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_SEND_PAUSE);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_WAIT_GUC);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_GUC_DONE);
- pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC);
}
}
@@ -505,41 +502,12 @@ static void pf_enter_vf_pause_rejected(struct xe_gt *gt, unsigned int vfid)
pf_enter_vf_pause_failed(gt, vfid);
}
-static void pf_enter_vf_pause_save_guc(struct xe_gt *gt, unsigned int vfid)
-{
- if (!pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC))
- pf_enter_vf_state_machine_bug(gt, vfid);
-}
-
-static bool pf_exit_vf_pause_save_guc(struct xe_gt *gt, unsigned int vfid)
-{
- int err;
-
- if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC))
- return false;
-
- err = xe_gt_sriov_pf_migration_save_guc_state(gt, vfid);
- if (err) {
- /* retry if busy */
- if (err == -EBUSY) {
- pf_enter_vf_pause_save_guc(gt, vfid);
- return true;
- }
- /* give up on error */
- if (err == -EIO)
- pf_enter_vf_mismatch(gt, vfid);
- }
-
- pf_enter_vf_pause_completed(gt, vfid);
- return true;
-}
-
static bool pf_exit_vf_pause_guc_done(struct xe_gt *gt, unsigned int vfid)
{
if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_GUC_DONE))
return false;
- pf_enter_vf_pause_save_guc(gt, vfid);
+ pf_enter_vf_pause_completed(gt, vfid);
return true;
}
@@ -1928,9 +1896,6 @@ static bool pf_process_vf_state_machine(struct xe_gt *gt, unsigned int vfid)
if (pf_exit_vf_pause_guc_done(gt, vfid))
return true;
- if (pf_exit_vf_pause_save_guc(gt, vfid))
- return true;
-
if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA)) {
xe_gt_sriov_dbg_verbose(gt, "VF%u in %s\n", vfid,
control_bit_to_string(XE_GT_SRIOV_STATE_SAVE_WAIT_DATA));
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
index 6e19a8ea88f0b..35ceb2ff62110 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
@@ -28,7 +28,6 @@
* @XE_GT_SRIOV_STATE_PAUSE_SEND_PAUSE: indicates that the PF is about to send a PAUSE command.
* @XE_GT_SRIOV_STATE_PAUSE_WAIT_GUC: indicates that the PF awaits for a response from the GuC.
* @XE_GT_SRIOV_STATE_PAUSE_GUC_DONE: indicates that the PF has received a response from the GuC.
- * @XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC: indicates that the PF needs to save the VF GuC state.
* @XE_GT_SRIOV_STATE_PAUSE_FAILED: indicates that a VF pause operation has failed.
* @XE_GT_SRIOV_STATE_PAUSED: indicates that the VF is paused.
* @XE_GT_SRIOV_STATE_SAVE_WIP: indicates that VF save operation is in progress.
@@ -71,7 +70,6 @@ enum xe_gt_sriov_control_bits {
XE_GT_SRIOV_STATE_PAUSE_SEND_PAUSE,
XE_GT_SRIOV_STATE_PAUSE_WAIT_GUC,
XE_GT_SRIOV_STATE_PAUSE_GUC_DONE,
- XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC,
XE_GT_SRIOV_STATE_PAUSE_FAILED,
XE_GT_SRIOV_STATE_PAUSED,
--
2.50.1
* [PATCH v2 14/26] drm/xe/pf: Switch VF migration GuC save/restore to struct migration data
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (12 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 13/26] drm/xe/pf: Don't save GuC VF migration data on pause Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-21 22:41 ` [PATCH v2 15/26] drm/xe/pf: Handle GuC migration data as part of PF control Michał Winiarski
` (11 subsequent siblings)
25 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
In upcoming changes, the GuC VF migration data will be handled as part
of separate SAVE/RESTORE states in the VF control state machine.
Now that the data is decoupled from both the guc_state debugfs and the
PAUSE state, we can safely remove struct xe_gt_sriov_state_snapshot and
modify the GuC save/restore functions to operate on struct
xe_sriov_migration_data.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 266 +++++-------------
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 13 +-
.../drm/xe/xe_gt_sriov_pf_migration_types.h | 27 --
drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h | 4 -
4 files changed, 80 insertions(+), 230 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index 04fad3126865c..127162e8c66e8 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -28,6 +28,15 @@ static struct xe_gt_sriov_migration_data *pf_pick_gt_migration(struct xe_gt *gt,
return &gt->sriov.pf.vfs[vfid].migration;
}
+static void pf_dump_mig_data(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_migration_data *data)
+{
+ if (IS_ENABLED(CONFIG_DRM_XE_DEBUG_SRIOV)) {
+ print_hex_dump_bytes("mig_data: ", DUMP_PREFIX_OFFSET,
+ data->vaddr, min(SZ_64, data->size));
+ }
+}
+
/* Return: number of dwords saved/restored/required or a negative error code on failure */
static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
u64 addr, u32 ndwords)
@@ -47,7 +56,7 @@ static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
}
/* Return: size of the state in dwords or a negative error code on failure */
-static int pf_send_guc_query_vf_state_size(struct xe_gt *gt, unsigned int vfid)
+static int pf_send_guc_query_vf_mig_data_size(struct xe_gt *gt, unsigned int vfid)
{
int ret;
@@ -56,8 +65,8 @@ static int pf_send_guc_query_vf_state_size(struct xe_gt *gt, unsigned int vfid)
}
/* Return: number of state dwords saved or a negative error code on failure */
-static int pf_send_guc_save_vf_state(struct xe_gt *gt, unsigned int vfid,
- void *dst, size_t size)
+static int pf_send_guc_save_vf_mig_data(struct xe_gt *gt, unsigned int vfid,
+ void *dst, size_t size)
{
const int ndwords = size / sizeof(u32);
struct xe_guc *guc = &gt->uc.guc;
@@ -85,8 +94,8 @@ static int pf_send_guc_save_vf_state(struct xe_gt *gt, unsigned int vfid,
}
/* Return: number of state dwords restored or a negative error code on failure */
-static int pf_send_guc_restore_vf_state(struct xe_gt *gt, unsigned int vfid,
- const void *src, size_t size)
+static int pf_send_guc_restore_vf_mig_data(struct xe_gt *gt, unsigned int vfid,
+ const void *src, size_t size)
{
const int ndwords = size / sizeof(u32);
struct xe_guc *guc = &gt->uc.guc;
@@ -114,120 +123,68 @@ static bool pf_migration_supported(struct xe_gt *gt)
return xe_sriov_pf_migration_supported(gt_to_xe(gt));
}
-static struct mutex *pf_migration_mutex(struct xe_gt *gt)
+static int pf_save_vf_guc_mig_data(struct xe_gt *gt, unsigned int vfid)
{
- xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
- return &gt->sriov.pf.migration.snapshot_lock;
-}
-
-static struct xe_gt_sriov_state_snapshot *pf_pick_vf_snapshot(struct xe_gt *gt,
- unsigned int vfid)
-{
- xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
- xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
- lockdep_assert_held(pf_migration_mutex(gt));
-
- return &gt->sriov.pf.vfs[vfid].snapshot;
-}
-
-static unsigned int pf_snapshot_index(struct xe_gt *gt, struct xe_gt_sriov_state_snapshot *snapshot)
-{
- return container_of(snapshot, struct xe_gt_sriov_metadata, snapshot) - gt->sriov.pf.vfs;
-}
-
-static void pf_free_guc_state(struct xe_gt *gt, struct xe_gt_sriov_state_snapshot *snapshot)
-{
- struct xe_device *xe = gt_to_xe(gt);
-
- drmm_kfree(&xe->drm, snapshot->guc.buff);
- snapshot->guc.buff = NULL;
- snapshot->guc.size = 0;
-}
-
-static int pf_alloc_guc_state(struct xe_gt *gt,
- struct xe_gt_sriov_state_snapshot *snapshot,
- size_t size)
-{
- struct xe_device *xe = gt_to_xe(gt);
- void *p;
-
- pf_free_guc_state(gt, snapshot);
-
- if (!size)
- return -ENODATA;
-
- if (size % sizeof(u32))
- return -EINVAL;
-
- if (size > SZ_2M)
- return -EFBIG;
-
- p = drmm_kzalloc(&xe->drm, size, GFP_KERNEL);
- if (!p)
- return -ENOMEM;
-
- snapshot->guc.buff = p;
- snapshot->guc.size = size;
- return 0;
-}
-
-static void pf_dump_guc_state(struct xe_gt *gt, struct xe_gt_sriov_state_snapshot *snapshot)
-{
- if (IS_ENABLED(CONFIG_DRM_XE_DEBUG_SRIOV)) {
- unsigned int vfid __maybe_unused = pf_snapshot_index(gt, snapshot);
-
- xe_gt_sriov_dbg_verbose(gt, "VF%u GuC state is %zu dwords:\n",
- vfid, snapshot->guc.size / sizeof(u32));
- print_hex_dump_bytes("state: ", DUMP_PREFIX_OFFSET,
- snapshot->guc.buff, min(SZ_64, snapshot->guc.size));
- }
-}
-
-static int pf_save_vf_guc_state(struct xe_gt *gt, unsigned int vfid)
-{
- struct xe_gt_sriov_state_snapshot *snapshot = pf_pick_vf_snapshot(gt, vfid);
+ struct xe_sriov_migration_data *data;
size_t size;
int ret;
- ret = pf_send_guc_query_vf_state_size(gt, vfid);
+ ret = pf_send_guc_query_vf_mig_data_size(gt, vfid);
if (ret < 0)
goto fail;
+
size = ret * sizeof(u32);
- xe_gt_sriov_dbg_verbose(gt, "VF%u state size is %d dwords (%zu bytes)\n", vfid, ret, size);
+ xe_gt_sriov_dbg_verbose(gt, "VF%u GuC data size is %d dwords (%zu bytes)\n",
+ vfid, ret, size);
- ret = pf_alloc_guc_state(gt, snapshot, size);
- if (ret < 0)
+ data = xe_sriov_migration_data_alloc(gt_to_xe(gt));
+ if (!data) {
+ ret = -ENOMEM;
goto fail;
+ }
+
+ ret = xe_sriov_migration_data_init(data, gt->tile->id, gt->info.id,
+ XE_SRIOV_MIGRATION_DATA_TYPE_GUC, 0, size);
+ if (ret)
+ goto fail_free;
- ret = pf_send_guc_save_vf_state(gt, vfid, snapshot->guc.buff, size);
+ ret = pf_send_guc_save_vf_mig_data(gt, vfid, data->vaddr, size);
if (ret < 0)
- goto fail;
+ goto fail_free;
size = ret * sizeof(u32);
xe_gt_assert(gt, size);
- xe_gt_assert(gt, size <= snapshot->guc.size);
- snapshot->guc.size = size;
+ xe_gt_assert(gt, size <= data->size);
+ data->size = size;
+ data->remaining = size;
+
+ pf_dump_mig_data(gt, vfid, data);
+
+ ret = xe_gt_sriov_pf_migration_save_produce(gt, vfid, data);
+ if (ret)
+ goto fail_free;
- pf_dump_guc_state(gt, snapshot);
return 0;
+fail_free:
+ xe_sriov_migration_data_free(data);
fail:
- xe_gt_sriov_dbg(gt, "Unable to save VF%u state (%pe)\n", vfid, ERR_PTR(ret));
- pf_free_guc_state(gt, snapshot);
+ xe_gt_sriov_err(gt, "Failed to save VF%u GuC data (%pe)\n",
+ vfid, ERR_PTR(ret));
return ret;
}
/**
- * xe_gt_sriov_pf_migration_save_guc_state() - Take a GuC VF state snapshot.
+ * xe_gt_sriov_pf_migration_guc_size() - Get the size of VF GuC migration data.
* @gt: the &xe_gt
* @vfid: the VF identifier
*
* This function is for PF only.
*
- * Return: 0 on success or a negative error code on failure.
+ * Return: size in bytes or a negative error code on failure.
*/
-int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid)
+ssize_t xe_gt_sriov_pf_migration_guc_size(struct xe_gt *gt, unsigned int vfid)
{
- int err;
+ ssize_t size;
xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
xe_gt_assert(gt, vfid != PFID);
@@ -236,37 +193,15 @@ int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid)
if (!pf_migration_supported(gt))
return -ENOPKG;
- mutex_lock(pf_migration_mutex(gt));
- err = pf_save_vf_guc_state(gt, vfid);
- mutex_unlock(pf_migration_mutex(gt));
-
- return err;
-}
-
-static int pf_restore_vf_guc_state(struct xe_gt *gt, unsigned int vfid)
-{
- struct xe_gt_sriov_state_snapshot *snapshot = pf_pick_vf_snapshot(gt, vfid);
- int ret;
-
- if (!snapshot->guc.size)
- return -ENODATA;
-
- xe_gt_sriov_dbg_verbose(gt, "restoring %zu dwords of VF%u GuC state\n",
- snapshot->guc.size / sizeof(u32), vfid);
- ret = pf_send_guc_restore_vf_state(gt, vfid, snapshot->guc.buff, snapshot->guc.size);
- if (ret < 0)
- goto fail;
-
- xe_gt_sriov_dbg_verbose(gt, "restored %d dwords of VF%u GuC state\n", ret, vfid);
- return 0;
+ size = pf_send_guc_query_vf_mig_data_size(gt, vfid);
+ if (size >= 0)
+ size *= sizeof(u32);
-fail:
- xe_gt_sriov_dbg(gt, "Failed to restore VF%u GuC state (%pe)\n", vfid, ERR_PTR(ret));
- return ret;
+ return size;
}
/**
- * xe_gt_sriov_pf_migration_restore_guc_state() - Restore a GuC VF state.
+ * xe_gt_sriov_pf_migration_guc_save() - Save VF GuC migration data.
* @gt: the &xe_gt
* @vfid: the VF identifier
*
@@ -274,10 +209,8 @@ static int pf_restore_vf_guc_state(struct xe_gt *gt, unsigned int vfid)
*
* Return: 0 on success or a negative error code on failure.
*/
-int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid)
+int xe_gt_sriov_pf_migration_guc_save(struct xe_gt *gt, unsigned int vfid)
{
- int ret;
-
xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
xe_gt_assert(gt, vfid != PFID);
xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
@@ -285,75 +218,45 @@ int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vf
if (!pf_migration_supported(gt))
return -ENOPKG;
- mutex_lock(pf_migration_mutex(gt));
- ret = pf_restore_vf_guc_state(gt, vfid);
- mutex_unlock(pf_migration_mutex(gt));
-
- return ret;
+ return pf_save_vf_guc_mig_data(gt, vfid);
}
-#ifdef CONFIG_DEBUG_FS
-/**
- * xe_gt_sriov_pf_migration_read_guc_state() - Read a GuC VF state.
- * @gt: the &xe_gt
- * @vfid: the VF identifier
- * @buf: the user space buffer to read to
- * @count: the maximum number of bytes to read
- * @pos: the current position in the buffer
- *
- * This function is for PF only.
- *
- * This function reads up to @count bytes from the saved VF GuC state buffer
- * at offset @pos into the user space address starting at @buf.
- *
- * Return: the number of bytes read or a negative error code on failure.
- */
-ssize_t xe_gt_sriov_pf_migration_read_guc_state(struct xe_gt *gt, unsigned int vfid,
- char __user *buf, size_t count, loff_t *pos)
+static int pf_restore_vf_guc_state(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_migration_data *data)
{
- struct xe_gt_sriov_state_snapshot *snapshot;
- ssize_t ret;
+ int ret;
- xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
- xe_gt_assert(gt, vfid != PFID);
- xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
+ xe_gt_assert(gt, data->size);
- if (!pf_migration_supported(gt))
- return -ENOPKG;
+ xe_gt_sriov_dbg_verbose(gt, "restoring %lld dwords of VF%u GuC data\n",
+ data->size / sizeof(u32), vfid);
+ pf_dump_mig_data(gt, vfid, data);
- mutex_lock(pf_migration_mutex(gt));
- snapshot = pf_pick_vf_snapshot(gt, vfid);
- if (snapshot->guc.size)
- ret = simple_read_from_buffer(buf, count, pos, snapshot->guc.buff,
- snapshot->guc.size);
- else
- ret = -ENODATA;
- mutex_unlock(pf_migration_mutex(gt));
+ ret = pf_send_guc_restore_vf_mig_data(gt, vfid, data->vaddr, data->size);
+ if (ret < 0)
+ goto fail;
+
+ xe_gt_sriov_dbg_verbose(gt, "restored %d dwords of VF%u GuC data\n", ret, vfid);
+ return 0;
+fail:
+ xe_gt_sriov_err(gt, "Failed to restore VF%u GuC data (%pe)\n",
+ vfid, ERR_PTR(ret));
return ret;
}
/**
- * xe_gt_sriov_pf_migration_write_guc_state() - Write a GuC VF state.
+ * xe_gt_sriov_pf_migration_guc_restore() - Restore VF GuC migration data.
* @gt: the &xe_gt
* @vfid: the VF identifier
- * @buf: the user space buffer with GuC VF state
- * @size: the size of GuC VF state (in bytes)
*
* This function is for PF only.
*
- * This function reads @size bytes of the VF GuC state stored at user space
- * address @buf and writes it into a internal VF state buffer.
- *
- * Return: the number of bytes used or a negative error code on failure.
+ * Return: 0 on success or a negative error code on failure.
*/
-ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int vfid,
- const char __user *buf, size_t size)
+int xe_gt_sriov_pf_migration_guc_restore(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_migration_data *data)
{
- struct xe_gt_sriov_state_snapshot *snapshot;
- loff_t pos = 0;
- ssize_t ret;
-
xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
xe_gt_assert(gt, vfid != PFID);
xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
@@ -361,21 +264,8 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
if (!pf_migration_supported(gt))
return -ENOPKG;
- mutex_lock(pf_migration_mutex(gt));
- snapshot = pf_pick_vf_snapshot(gt, vfid);
- ret = pf_alloc_guc_state(gt, snapshot, size);
- if (!ret) {
- ret = simple_write_to_buffer(snapshot->guc.buff, size, &pos, buf, size);
- if (ret < 0)
- pf_free_guc_state(gt, snapshot);
- else
- pf_dump_guc_state(gt, snapshot);
- }
- mutex_unlock(pf_migration_mutex(gt));
-
- return ret;
+ return pf_restore_vf_guc_state(gt, vfid, data);
}
-#endif /* CONFIG_DEBUG_FS */
/**
* xe_gt_sriov_pf_migration_size() - Total size of migration data from all components within a GT.
@@ -597,10 +487,6 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
if (!pf_migration_supported(gt))
return 0;
- err = drmm_mutex_init(&xe->drm, >->sriov.pf.migration.snapshot_lock);
- if (err)
- return err;
-
totalvfs = xe_sriov_pf_get_totalvfs(xe);
for (n = 1; n <= totalvfs; n++) {
struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, n);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
index 4f2f2783339c3..b3c18e369df79 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
@@ -15,8 +15,10 @@ struct xe_sriov_migration_data;
#define XE_GT_SRIOV_PF_MIGRATION_GUC_DATA_MAX_SIZE SZ_8M
int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
-int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
-int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
+ssize_t xe_gt_sriov_pf_migration_guc_size(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_migration_guc_save(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_migration_guc_restore(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_migration_data *data);
ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
@@ -34,11 +36,4 @@ int xe_gt_sriov_pf_migration_restore_produce(struct xe_gt *gt, unsigned int vfid
struct xe_sriov_migration_data *
xe_gt_sriov_pf_migration_save_consume(struct xe_gt *gt, unsigned int vfid);
-#ifdef CONFIG_DEBUG_FS
-ssize_t xe_gt_sriov_pf_migration_read_guc_state(struct xe_gt *gt, unsigned int vfid,
- char __user *buf, size_t count, loff_t *pos);
-ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int vfid,
- const char __user *buf, size_t count);
-#endif
-
#endif
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
index 84be6fac16c8b..75d8b94cbbefb 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
@@ -6,24 +6,7 @@
#ifndef _XE_GT_SRIOV_PF_MIGRATION_TYPES_H_
#define _XE_GT_SRIOV_PF_MIGRATION_TYPES_H_
-#include <linux/mutex.h>
#include <linux/ptr_ring.h>
-#include <linux/types.h>
-
-/**
- * struct xe_gt_sriov_state_snapshot - GT-level per-VF state snapshot data.
- *
- * Used by the PF driver to maintain per-VF migration data.
- */
-struct xe_gt_sriov_state_snapshot {
- /** @guc: GuC VF state snapshot */
- struct {
- /** @guc.buff: buffer with the VF state */
- u32 *buff;
- /** @guc.size: size of the buffer (must be dwords aligned) */
- u32 size;
- } guc;
-};
/**
* struct xe_gt_sriov_migration_data - GT-level per-VF migration data.
@@ -35,14 +18,4 @@ struct xe_gt_sriov_migration_data {
struct ptr_ring ring;
};
-/**
- * struct xe_gt_sriov_pf_migration - GT-level data.
- *
- * Used by the PF driver to maintain non-VF specific per-GT data.
- */
-struct xe_gt_sriov_pf_migration {
- /** @snapshot_lock: protects all VFs snapshots */
- struct mutex snapshot_lock;
-};
-
#endif
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
index 812e74d3f8f80..667b8310478d4 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
@@ -31,9 +31,6 @@ struct xe_gt_sriov_metadata {
/** @version: negotiated VF/PF ABI version */
struct xe_gt_sriov_pf_service_version version;
- /** @snapshot: snapshot of the VF state data */
- struct xe_gt_sriov_state_snapshot snapshot;
-
/** @migration: per-VF migration data. */
struct xe_gt_sriov_migration_data migration;
};
@@ -61,7 +58,6 @@ struct xe_gt_sriov_pf {
struct xe_gt_sriov_pf_service service;
struct xe_gt_sriov_pf_control control;
struct xe_gt_sriov_pf_policy policy;
- struct xe_gt_sriov_pf_migration migration;
struct xe_gt_sriov_spare_config spare;
struct xe_gt_sriov_metadata *vfs;
};
--
2.50.1
* [PATCH v2 15/26] drm/xe/pf: Handle GuC migration data as part of PF control
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (13 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 14/26] drm/xe/pf: Switch VF migration GuC save/restore to struct migration data Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-23 20:39 ` Michal Wajdeczko
2025-10-21 22:41 ` [PATCH v2 16/26] drm/xe/pf: Add helpers for VF GGTT migration data handling Michał Winiarski
` (10 subsequent siblings)
25 siblings, 1 reply; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
Connect the helpers to allow saving and restoring GuC migration data as
part of the stop_copy / resume device states.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 26 +++++++++++++++++--
.../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 2 ++
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 9 ++++++-
3 files changed, 34 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
index c159f35adcbe7..18f6e3028d4f0 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
@@ -188,6 +188,7 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
CASE2STR(SAVE_WIP);
CASE2STR(SAVE_PROCESS_DATA);
CASE2STR(SAVE_WAIT_DATA);
+ CASE2STR(SAVE_DATA_GUC);
CASE2STR(SAVE_DATA_DONE);
CASE2STR(SAVE_FAILED);
CASE2STR(SAVED);
@@ -343,6 +344,7 @@ static void pf_exit_vf_mismatch(struct xe_gt *gt, unsigned int vfid)
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOP_FAILED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_FAILED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUME_FAILED);
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_FLR_FAILED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED);
@@ -824,6 +826,7 @@ static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA);
+ pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
}
}
@@ -848,6 +851,16 @@ static void pf_enter_vf_save_failed(struct xe_gt *gt, unsigned int vfid)
static int pf_handle_vf_save_data(struct xe_gt *gt, unsigned int vfid)
{
+ int ret;
+
+ if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC)) {
+ xe_gt_assert(gt, xe_gt_sriov_pf_migration_guc_size(gt, vfid) > 0);
+
+ ret = xe_gt_sriov_pf_migration_guc_save(gt, vfid);
+ if (ret)
+ return ret;
+ }
+
return 0;
}
@@ -881,6 +894,7 @@ static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
{
if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
pf_enter_vf_wip(gt, vfid);
pf_queue_vf(gt, vfid);
return true;
@@ -1046,14 +1060,22 @@ static int
pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid)
{
struct xe_sriov_migration_data *data = xe_gt_sriov_pf_migration_restore_consume(gt, vfid);
+ int ret = 0;
xe_gt_assert(gt, data);
- xe_gt_sriov_notice(gt, "Skipping VF%u unknown data type: %d\n", vfid, data->type);
+ switch (data->type) {
+ case XE_SRIOV_MIGRATION_DATA_TYPE_GUC:
+ ret = xe_gt_sriov_pf_migration_guc_restore(gt, vfid, data);
+ break;
+ default:
+ xe_gt_sriov_notice(gt, "Skipping VF%u unknown data type: %d\n", vfid, data->type);
+ break;
+ }
xe_sriov_migration_data_free(data);
- return 0;
+ return ret;
}
static bool pf_handle_vf_restore(struct xe_gt *gt, unsigned int vfid)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
index 35ceb2ff62110..8b951ee8a24fe 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
@@ -33,6 +33,7 @@
* @XE_GT_SRIOV_STATE_SAVE_WIP: indicates that VF save operation is in progress.
* @XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA: indicates that VF migration data is being produced.
* @XE_GT_SRIOV_STATE_SAVE_WAIT_DATA: indicates that PF waits for space in the migration data ring.
+ * @XE_GT_SRIOV_STATE_SAVE_DATA_GUC: indicates PF needs to save VF GuC migration data.
* @XE_GT_SRIOV_STATE_SAVE_DATA_DONE: indicates that all migration data was produced by Xe.
* @XE_GT_SRIOV_STATE_SAVE_FAILED: indicates that VF save operation has failed.
* @XE_GT_SRIOV_STATE_SAVED: indicates that VF data is saved.
@@ -76,6 +77,7 @@ enum xe_gt_sriov_control_bits {
XE_GT_SRIOV_STATE_SAVE_WIP,
XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA,
XE_GT_SRIOV_STATE_SAVE_WAIT_DATA,
+ XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
XE_GT_SRIOV_STATE_SAVE_DATA_DONE,
XE_GT_SRIOV_STATE_SAVE_FAILED,
XE_GT_SRIOV_STATE_SAVED,
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index 127162e8c66e8..594178fbe36d0 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -279,10 +279,17 @@ int xe_gt_sriov_pf_migration_guc_restore(struct xe_gt *gt, unsigned int vfid,
ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
{
ssize_t total = 0;
+ ssize_t size;
xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
- /* Nothing to query yet - will be updated once per-GT migration data types are added */
+ size = xe_gt_sriov_pf_migration_guc_size(gt, vfid);
+ if (size < 0)
+ return size;
+ else if (size > 0)
+ size += sizeof(struct xe_sriov_pf_migration_hdr);
+ total += size;
+
return total;
}
--
2.50.1
^ permalink raw reply related [flat|nested] 72+ messages in thread
* [PATCH v2 16/26] drm/xe/pf: Add helpers for VF GGTT migration data handling
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (14 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 15/26] drm/xe/pf: Handle GuC migration data as part of PF control Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-23 21:50 ` Michal Wajdeczko
2025-10-28 3:22 ` Tian, Kevin
2025-10-21 22:41 ` [PATCH v2 17/26] drm/xe/pf: Handle GGTT migration data as part of PF control Michał Winiarski
` (9 subsequent siblings)
25 siblings, 2 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
In an upcoming change, the VF GGTT migration data will be handled as
part of the VF control state machine. Add the necessary helpers to allow
migration data transfer to/from the HW GGTT resource.
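The size conversion performed by the new xe_ggtt_pte_size() helper can be sketched in isolation: a GGTT VMA of `size` bytes is backed by one 64-bit PTE per 4 KiB page. The constants and function name below are simplified stand-ins for illustration, not the driver's actual code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constant: Xe GGTT uses 4 KiB pages. */
#define XE_PAGE_SIZE 4096u

/* Sketch of xe_ggtt_pte_size(): a VMA of `size` bytes is described by
 * size / XE_PAGE_SIZE page table entries, each a u64. The size must be
 * page-aligned, mirroring the xe_assert() in the real helper. */
static size_t ggtt_pte_size(size_t size)
{
	assert(size % XE_PAGE_SIZE == 0);
	return size / XE_PAGE_SIZE * sizeof(uint64_t);
}
```

So a 1 MiB VF GGTT allocation would serialize into a 2 KiB PTE buffer.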
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_ggtt.c | 100 +++++++++++++++++++++
drivers/gpu/drm/xe/xe_ggtt.h | 3 +
drivers/gpu/drm/xe/xe_ggtt_types.h | 2 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 44 +++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h | 5 ++
5 files changed, 154 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
index 40680f0c49a17..99fe891c7939e 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.c
+++ b/drivers/gpu/drm/xe/xe_ggtt.c
@@ -151,6 +151,14 @@ static void xe_ggtt_set_pte_and_flush(struct xe_ggtt *ggtt, u64 addr, u64 pte)
ggtt_update_access_counter(ggtt);
}
+static u64 xe_ggtt_get_pte(struct xe_ggtt *ggtt, u64 addr)
+{
+ xe_tile_assert(ggtt->tile, !(addr & XE_PTE_MASK));
+ xe_tile_assert(ggtt->tile, addr < ggtt->size);
+
+ return readq(&ggtt->gsm[addr >> XE_PTE_SHIFT]);
+}
+
static void xe_ggtt_clear(struct xe_ggtt *ggtt, u64 start, u64 size)
{
u16 pat_index = tile_to_xe(ggtt->tile)->pat.idx[XE_CACHE_WB];
@@ -233,16 +241,19 @@ void xe_ggtt_might_lock(struct xe_ggtt *ggtt)
static const struct xe_ggtt_pt_ops xelp_pt_ops = {
.pte_encode_flags = xelp_ggtt_pte_flags,
.ggtt_set_pte = xe_ggtt_set_pte,
+ .ggtt_get_pte = xe_ggtt_get_pte,
};
static const struct xe_ggtt_pt_ops xelpg_pt_ops = {
.pte_encode_flags = xelpg_ggtt_pte_flags,
.ggtt_set_pte = xe_ggtt_set_pte,
+ .ggtt_get_pte = xe_ggtt_get_pte,
};
static const struct xe_ggtt_pt_ops xelpg_pt_wa_ops = {
.pte_encode_flags = xelpg_ggtt_pte_flags,
.ggtt_set_pte = xe_ggtt_set_pte_and_flush,
+ .ggtt_get_pte = xe_ggtt_get_pte,
};
static void __xe_ggtt_init_early(struct xe_ggtt *ggtt, u32 reserved)
@@ -912,6 +923,22 @@ static void xe_ggtt_assign_locked(struct xe_ggtt *ggtt, const struct drm_mm_node
xe_ggtt_invalidate(ggtt);
}
+/**
+ * xe_ggtt_pte_size() - Convert GGTT VMA size to page table entries size.
+ * @ggtt: the &xe_ggtt
+ * @size: GGTT VMA size in bytes
+ *
+ * Return: GGTT page table entries size in bytes.
+ */
+size_t xe_ggtt_pte_size(struct xe_ggtt *ggtt, size_t size)
+{
+ struct xe_device __maybe_unused *xe = tile_to_xe(ggtt->tile);
+
+ xe_assert(xe, size % XE_PAGE_SIZE == 0);
+
+ return size / XE_PAGE_SIZE * sizeof(u64);
+}
+
/**
* xe_ggtt_assign - assign a GGTT region to the VF
* @node: the &xe_ggtt_node to update
@@ -927,6 +954,79 @@ void xe_ggtt_assign(const struct xe_ggtt_node *node, u16 vfid)
xe_ggtt_assign_locked(node->ggtt, &node->base, vfid);
mutex_unlock(&node->ggtt->lock);
}
+
+/**
+ * xe_ggtt_node_save() - Save a &xe_ggtt_node to a buffer.
+ * @node: the &xe_ggtt_node to be saved
+ * @dst: destination buffer
+ * @size: destination buffer size in bytes
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_ggtt_node_save(struct xe_ggtt_node *node, void *dst, size_t size)
+{
+ struct xe_ggtt *ggtt;
+ u64 start, end;
+ u64 *buf = dst;
+
+ if (!node)
+ return -ENOENT;
+
+ guard(mutex)(&node->ggtt->lock);
+
+ ggtt = node->ggtt;
+ start = node->base.start;
+ end = start + node->base.size - 1;
+
+ if (xe_ggtt_pte_size(ggtt, node->base.size) > size)
+ return -EINVAL;
+
+ while (start < end) {
+ *buf++ = ggtt->pt_ops->ggtt_get_pte(ggtt, start) & ~GGTT_PTE_VFID;
+ start += XE_PAGE_SIZE;
+ }
+
+ return 0;
+}
+
+/**
+ * xe_ggtt_node_load() - Load a &xe_ggtt_node from a buffer.
+ * @node: the &xe_ggtt_node to be loaded
+ * @src: source buffer
+ * @size: source buffer size in bytes
+ * @vfid: VF identifier
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_ggtt_node_load(struct xe_ggtt_node *node, const void *src, size_t size, u16 vfid)
+{
+ u64 vfid_pte = xe_encode_vfid_pte(vfid);
+ const u64 *buf = src;
+ struct xe_ggtt *ggtt;
+ u64 start, end;
+
+ if (!node)
+ return -ENOENT;
+
+ guard(mutex)(&node->ggtt->lock);
+
+ ggtt = node->ggtt;
+ start = node->base.start;
+ end = start + size - 1;
+
+ if (xe_ggtt_pte_size(ggtt, node->base.size) != size)
+ return -EINVAL;
+
+ while (start < end) {
+ ggtt->pt_ops->ggtt_set_pte(ggtt, start, (*buf & ~GGTT_PTE_VFID) | vfid_pte);
+ start += XE_PAGE_SIZE;
+ buf++;
+ }
+ xe_ggtt_invalidate(ggtt);
+
+ return 0;
+}
+
#endif
/**
diff --git a/drivers/gpu/drm/xe/xe_ggtt.h b/drivers/gpu/drm/xe/xe_ggtt.h
index 75fc7a1efea76..5f55f80fe3adc 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.h
+++ b/drivers/gpu/drm/xe/xe_ggtt.h
@@ -42,7 +42,10 @@ int xe_ggtt_dump(struct xe_ggtt *ggtt, struct drm_printer *p);
u64 xe_ggtt_print_holes(struct xe_ggtt *ggtt, u64 alignment, struct drm_printer *p);
#ifdef CONFIG_PCI_IOV
+size_t xe_ggtt_pte_size(struct xe_ggtt *ggtt, size_t size);
void xe_ggtt_assign(const struct xe_ggtt_node *node, u16 vfid);
+int xe_ggtt_node_save(struct xe_ggtt_node *node, void *dst, size_t size);
+int xe_ggtt_node_load(struct xe_ggtt_node *node, const void *src, size_t size, u16 vfid);
#endif
#ifndef CONFIG_LOCKDEP
diff --git a/drivers/gpu/drm/xe/xe_ggtt_types.h b/drivers/gpu/drm/xe/xe_ggtt_types.h
index c5e999d58ff2a..dacd796f81844 100644
--- a/drivers/gpu/drm/xe/xe_ggtt_types.h
+++ b/drivers/gpu/drm/xe/xe_ggtt_types.h
@@ -78,6 +78,8 @@ struct xe_ggtt_pt_ops {
u64 (*pte_encode_flags)(struct xe_bo *bo, u16 pat_index);
/** @ggtt_set_pte: Directly write into GGTT's PTE */
void (*ggtt_set_pte)(struct xe_ggtt *ggtt, u64 addr, u64 pte);
+ /** @ggtt_get_pte: Directly read from GGTT's PTE */
+ u64 (*ggtt_get_pte)(struct xe_ggtt *ggtt, u64 addr);
};
#endif
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
index c0c0215c07036..c857879e28fe5 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
@@ -726,6 +726,50 @@ int xe_gt_sriov_pf_config_set_fair_ggtt(struct xe_gt *gt, unsigned int vfid,
return xe_gt_sriov_pf_config_bulk_set_ggtt(gt, vfid, num_vfs, fair);
}
+/**
+ * xe_gt_sriov_pf_config_ggtt_save() - Save a VF provisioned GGTT data into a buffer.
+ * @gt: the &xe_gt
+ * @vfid: VF identifier (can't be 0)
+ * @buf: the GGTT data destination buffer
+ * @size: the size of the buffer
+ *
+ * This function can only be called on PF.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_config_ggtt_save(struct xe_gt *gt, unsigned int vfid,
+ void *buf, size_t size)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid);
+
+ guard(mutex)(xe_gt_sriov_pf_master_mutex(gt));
+
+ return xe_ggtt_node_save(pf_pick_vf_config(gt, vfid)->ggtt_region, buf, size);
+}
+
+/**
+ * xe_gt_sriov_pf_config_ggtt_restore() - Restore a VF provisioned GGTT data from a buffer.
+ * @gt: the &xe_gt
+ * @vfid: VF identifier (can't be 0)
+ * @buf: the GGTT data source buffer
+ * @size: the size of the buffer
+ *
+ * This function can only be called on PF.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_config_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
+ const void *buf, size_t size)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid);
+
+ guard(mutex)(xe_gt_sriov_pf_master_mutex(gt));
+
+ return xe_ggtt_node_load(pf_pick_vf_config(gt, vfid)->ggtt_region, buf, size, vfid);
+}
+
static u32 pf_get_min_spare_ctxs(struct xe_gt *gt)
{
/* XXX: preliminary */
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
index 513e6512a575b..6916b8f58ebf2 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
@@ -61,6 +61,11 @@ ssize_t xe_gt_sriov_pf_config_save(struct xe_gt *gt, unsigned int vfid, void *bu
int xe_gt_sriov_pf_config_restore(struct xe_gt *gt, unsigned int vfid,
const void *buf, size_t size);
+int xe_gt_sriov_pf_config_ggtt_save(struct xe_gt *gt, unsigned int vfid,
+ void *buf, size_t size);
+int xe_gt_sriov_pf_config_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
+ const void *buf, size_t size);
+
bool xe_gt_sriov_pf_config_is_empty(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_config_init(struct xe_gt *gt);
--
2.50.1
^ permalink raw reply related [flat|nested] 72+ messages in thread
* [PATCH v2 17/26] drm/xe/pf: Handle GGTT migration data as part of PF control
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (15 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 16/26] drm/xe/pf: Add helpers for VF GGTT migration data handling Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-21 22:41 ` [PATCH v2 18/26] drm/xe/pf: Add helpers for VF MMIO migration data handling Michał Winiarski
` (8 subsequent siblings)
25 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
Connect the helpers to allow save and restore of GGTT migration data
during the stop_copy / resume device states.
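The way xe_gt_sriov_pf_migration_size() folds per-component sizes into a total (a negative size is propagated as an error, an empty component contributes nothing, and a non-empty one is counted together with its migration data header) can be sketched outside the driver. HDR_SIZE and the function name below are illustrative placeholders:

```c
#include <assert.h>
#include <sys/types.h>	/* ssize_t */

/* Placeholder for sizeof(struct xe_sriov_pf_migration_hdr). */
#define HDR_SIZE 32

/* Sketch of the accumulation pattern: each component reports its payload
 * size; errors short-circuit, empty components are skipped, non-empty
 * ones are counted with their header. */
static ssize_t total_migration_size(const ssize_t *component_sizes, int n)
{
	ssize_t total = 0;
	int i;

	for (i = 0; i < n; i++) {
		ssize_t size = component_sizes[i];

		if (size < 0)
			return size;	/* propagate the error code */
		if (size > 0)
			size += HDR_SIZE;
		total += size;
	}
	return total;
}
```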
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 16 +++
.../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 2 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 118 ++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 4 +
4 files changed, 140 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
index 18f6e3028d4f0..f5c215fb93c5a 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
@@ -189,6 +189,7 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
CASE2STR(SAVE_PROCESS_DATA);
CASE2STR(SAVE_WAIT_DATA);
CASE2STR(SAVE_DATA_GUC);
+ CASE2STR(SAVE_DATA_GGTT);
CASE2STR(SAVE_DATA_DONE);
CASE2STR(SAVE_FAILED);
CASE2STR(SAVED);
@@ -827,6 +828,7 @@ static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
+ pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
}
}
@@ -859,6 +861,17 @@ static int pf_handle_vf_save_data(struct xe_gt *gt, unsigned int vfid)
ret = xe_gt_sriov_pf_migration_guc_save(gt, vfid);
if (ret)
return ret;
+
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT);
+ return -EAGAIN;
+ }
+
+ if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT)) {
+ if (xe_gt_sriov_pf_migration_ggtt_size(gt, vfid) > 0) {
+ ret = xe_gt_sriov_pf_migration_ggtt_save(gt, vfid);
+ if (ret)
+ return ret;
+ }
}
return 0;
@@ -1065,6 +1078,9 @@ pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid)
xe_gt_assert(gt, data);
switch (data->type) {
+ case XE_SRIOV_MIGRATION_DATA_TYPE_GGTT:
+ ret = xe_gt_sriov_pf_migration_ggtt_restore(gt, vfid, data);
+ break;
case XE_SRIOV_MIGRATION_DATA_TYPE_GUC:
ret = xe_gt_sriov_pf_migration_guc_restore(gt, vfid, data);
break;
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
index 8b951ee8a24fe..1e8fa3f8f9be8 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
@@ -34,6 +34,7 @@
* @XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA: indicates that VF migration data is being produced.
* @XE_GT_SRIOV_STATE_SAVE_WAIT_DATA: indicates that PF waits for space in the migration data ring.
* @XE_GT_SRIOV_STATE_SAVE_DATA_GUC: indicates PF needs to save VF GuC migration data.
+ * @XE_GT_SRIOV_STATE_SAVE_DATA_GGTT: indicates PF needs to save VF GGTT migration data.
* @XE_GT_SRIOV_STATE_SAVE_DATA_DONE: indicates that all migration data was produced by Xe.
* @XE_GT_SRIOV_STATE_SAVE_FAILED: indicates that VF save operation has failed.
* @XE_GT_SRIOV_STATE_SAVED: indicates that VF data is saved.
@@ -78,6 +79,7 @@ enum xe_gt_sriov_control_bits {
XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA,
XE_GT_SRIOV_STATE_SAVE_WAIT_DATA,
XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
+ XE_GT_SRIOV_STATE_SAVE_DATA_GGTT,
XE_GT_SRIOV_STATE_SAVE_DATA_DONE,
XE_GT_SRIOV_STATE_SAVE_FAILED,
XE_GT_SRIOV_STATE_SAVED,
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index 594178fbe36d0..75e965f75f6a7 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -7,6 +7,9 @@
#include "abi/guc_actions_sriov_abi.h"
#include "xe_bo.h"
+#include "xe_ggtt.h"
+#include "xe_gt.h"
+#include "xe_gt_sriov_pf_config.h"
#include "xe_gt_sriov_pf_control.h"
#include "xe_gt_sriov_pf_helpers.h"
#include "xe_gt_sriov_pf_migration.h"
@@ -37,6 +40,114 @@ static void pf_dump_mig_data(struct xe_gt *gt, unsigned int vfid,
}
}
+/**
+ * xe_gt_sriov_pf_migration_ggtt_size() - Get the size of VF GGTT migration data.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: size in bytes or a negative error code on failure.
+ */
+ssize_t xe_gt_sriov_pf_migration_ggtt_size(struct xe_gt *gt, unsigned int vfid)
+{
+ if (!xe_gt_is_main_type(gt))
+ return 0;
+
+ return xe_ggtt_pte_size(gt->tile->mem.ggtt, xe_gt_sriov_pf_config_get_ggtt(gt, vfid));
+}
+
+static int pf_save_vf_ggtt_mig_data(struct xe_gt *gt, unsigned int vfid)
+{
+ struct xe_sriov_migration_data *data;
+ size_t size;
+ int ret;
+
+ size = xe_gt_sriov_pf_migration_ggtt_size(gt, vfid);
+ if (size == 0)
+ return 0;
+
+ data = xe_sriov_migration_data_alloc(gt_to_xe(gt));
+ if (!data)
+ return -ENOMEM;
+
+ ret = xe_sriov_migration_data_init(data, gt->tile->id, gt->info.id,
+ XE_SRIOV_MIGRATION_DATA_TYPE_GGTT, 0, size);
+ if (ret)
+ goto fail;
+
+ ret = xe_gt_sriov_pf_config_ggtt_save(gt, vfid, data->vaddr, size);
+ if (ret)
+ goto fail;
+
+ pf_dump_mig_data(gt, vfid, data);
+
+ ret = xe_gt_sriov_pf_migration_save_produce(gt, vfid, data);
+ if (ret)
+ goto fail;
+
+ return 0;
+
+fail:
+ xe_sriov_migration_data_free(data);
+ xe_gt_sriov_err(gt, "Failed to save VF%u GGTT data (%pe)\n", vfid, ERR_PTR(ret));
+ return ret;
+}
+
+static int pf_restore_vf_ggtt_mig_data(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_migration_data *data)
+{
+ int ret;
+
+ pf_dump_mig_data(gt, vfid, data);
+
+ ret = xe_gt_sriov_pf_config_ggtt_restore(gt, vfid, data->vaddr, data->size);
+ if (ret) {
+ xe_gt_sriov_err(gt, "Failed to restore VF%u GGTT data (%pe)\n",
+ vfid, ERR_PTR(ret));
+ return ret;
+ }
+
+ return 0;
+}
+
+/**
+ * xe_gt_sriov_pf_migration_ggtt_save() - Save VF GGTT migration data.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier (can't be 0)
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_migration_ggtt_save(struct xe_gt *gt, unsigned int vfid)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid != PFID);
+ xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
+
+ return pf_save_vf_ggtt_mig_data(gt, vfid);
+}
+
+/**
+ * xe_gt_sriov_pf_migration_ggtt_restore() - Restore VF GGTT migration data.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier (can't be 0)
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_migration_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_migration_data *data)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid != PFID);
+ xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
+
+ return pf_restore_vf_ggtt_mig_data(gt, vfid, data);
+}
+
/* Return: number of dwords saved/restored/required or a negative error code on failure */
static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
u64 addr, u32 ndwords)
@@ -290,6 +401,13 @@ ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
size += sizeof(struct xe_sriov_pf_migration_hdr);
total += size;
+ size = xe_gt_sriov_pf_migration_ggtt_size(gt, vfid);
+ if (size < 0)
+ return size;
+ else if (size > 0)
+ size += sizeof(struct xe_sriov_pf_migration_hdr);
+ total += size;
+
return total;
}
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
index b3c18e369df79..09abdd9e82e10 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
@@ -19,6 +19,10 @@ ssize_t xe_gt_sriov_pf_migration_guc_size(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_guc_save(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_guc_restore(struct xe_gt *gt, unsigned int vfid,
struct xe_sriov_migration_data *data);
+ssize_t xe_gt_sriov_pf_migration_ggtt_size(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_migration_ggtt_save(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_migration_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_migration_data *data);
ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
--
2.50.1
^ permalink raw reply related [flat|nested] 72+ messages in thread
* [PATCH v2 18/26] drm/xe/pf: Add helpers for VF MMIO migration data handling
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (16 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 17/26] drm/xe/pf: Handle GGTT migration data as part of PF control Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-23 22:10 ` Michal Wajdeczko
2025-10-21 22:41 ` [PATCH v2 19/26] drm/xe/pf: Handle MMIO migration data as part of PF control Michał Winiarski
` (7 subsequent siblings)
25 siblings, 1 reply; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
In an upcoming change, the VF MMIO migration data will be handled as
part of the VF control state machine. Add the necessary helpers to allow
migration data transfer to/from the VF MMIO registers.
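The save helper added here is essentially a size-checked copy of the VF scratch register file into a u32 buffer. A minimal userspace sketch of that shape, with a fake register array standing in for xe_mmio_read32() and hypothetical names throughout:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for the VF_SW_FLAG() scratch register count. */
#define VF_SW_FLAG_COUNT 4

/* Fake register file; the real code reads through xe_mmio_read32(). */
static uint32_t fake_mmio[VF_SW_FLAG_COUNT];

static uint32_t mmio_read32(unsigned int n)
{
	return fake_mmio[n];
}

/* Sketch of xe_gt_sriov_pf_mmio_vf_save(): the buffer must be exactly
 * VF_SW_FLAG_COUNT * sizeof(u32) bytes, otherwise -EINVAL (-22). */
static int mmio_vf_save(uint32_t *buf, size_t size)
{
	unsigned int n;

	if (size != VF_SW_FLAG_COUNT * sizeof(uint32_t))
		return -22;

	for (n = 0; n < VF_SW_FLAG_COUNT; n++)
		buf[n] = mmio_read32(n);

	return 0;
}
```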
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 88 +++++++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf.h | 6 ++
2 files changed, 94 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
index c4dda87b47cc8..31ee86166dfd0 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
@@ -194,6 +194,94 @@ static void pf_clear_vf_scratch_regs(struct xe_gt *gt, unsigned int vfid)
}
}
+/**
+ * xe_gt_sriov_pf_mmio_vf_size - Get the size of VF MMIO register data.
+ * @gt: the &xe_gt
+ * @vfid: VF identifier
+ *
+ * Return: size in bytes.
+ */
+size_t xe_gt_sriov_pf_mmio_vf_size(struct xe_gt *gt, unsigned int vfid)
+{
+ if (xe_gt_is_media_type(gt))
+ return MED_VF_SW_FLAG_COUNT * sizeof(u32);
+ else
+ return VF_SW_FLAG_COUNT * sizeof(u32);
+}
+
+/**
+ * xe_gt_sriov_pf_mmio_vf_save - Save VF MMIO register values to a buffer.
+ * @gt: the &xe_gt
+ * @vfid: VF identifier
+ * @buf: destination buffer
+ * @size: destination buffer size in bytes
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_mmio_vf_save(struct xe_gt *gt, unsigned int vfid, void *buf, size_t size)
+{
+ u32 stride = pf_get_vf_regs_stride(gt_to_xe(gt));
+ struct xe_reg scratch;
+ u32 *regs = buf;
+ int n, count;
+
+ if (size != xe_gt_sriov_pf_mmio_vf_size(gt, vfid))
+ return -EINVAL;
+
+ if (xe_gt_is_media_type(gt)) {
+ count = MED_VF_SW_FLAG_COUNT;
+ for (n = 0; n < count; n++) {
+ scratch = xe_reg_vf_to_pf(MED_VF_SW_FLAG(n), vfid, stride);
+ regs[n] = xe_mmio_read32(&gt->mmio, scratch);
+ }
+ } else {
+ count = VF_SW_FLAG_COUNT;
+ for (n = 0; n < count; n++) {
+ scratch = xe_reg_vf_to_pf(VF_SW_FLAG(n), vfid, stride);
+ regs[n] = xe_mmio_read32(&gt->mmio, scratch);
+ }
+ }
+
+ return 0;
+}
+
+/**
+ * xe_gt_sriov_pf_mmio_vf_restore - Restore VF MMIO register values from a buffer.
+ * @gt: the &xe_gt
+ * @vfid: VF identifier
+ * @buf: source buffer
+ * @size: source buffer size in bytes
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_mmio_vf_restore(struct xe_gt *gt, unsigned int vfid,
+ const void *buf, size_t size)
+{
+ u32 stride = pf_get_vf_regs_stride(gt_to_xe(gt));
+ const u32 *regs = buf;
+ struct xe_reg scratch;
+ int n, count;
+
+ if (size != xe_gt_sriov_pf_mmio_vf_size(gt, vfid))
+ return -EINVAL;
+
+ if (xe_gt_is_media_type(gt)) {
+ count = MED_VF_SW_FLAG_COUNT;
+ for (n = 0; n < count; n++) {
+ scratch = xe_reg_vf_to_pf(MED_VF_SW_FLAG(n), vfid, stride);
+ xe_mmio_write32(&gt->mmio, scratch, regs[n]);
+ }
+ } else {
+ count = VF_SW_FLAG_COUNT;
+ for (n = 0; n < count; n++) {
+ scratch = xe_reg_vf_to_pf(VF_SW_FLAG(n), vfid, stride);
+ xe_mmio_write32(&gt->mmio, scratch, regs[n]);
+ }
+ }
+
+ return 0;
+}
+
/**
* xe_gt_sriov_pf_sanitize_hw() - Reset hardware state related to a VF.
* @gt: the &xe_gt
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
index e7fde3f9937af..7f4f1fda5f77a 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
@@ -6,6 +6,8 @@
#ifndef _XE_GT_SRIOV_PF_H_
#define _XE_GT_SRIOV_PF_H_
+#include <linux/types.h>
+
struct xe_gt;
#ifdef CONFIG_PCI_IOV
@@ -16,6 +18,10 @@ void xe_gt_sriov_pf_init_hw(struct xe_gt *gt);
void xe_gt_sriov_pf_sanitize_hw(struct xe_gt *gt, unsigned int vfid);
void xe_gt_sriov_pf_stop_prepare(struct xe_gt *gt);
void xe_gt_sriov_pf_restart(struct xe_gt *gt);
+size_t xe_gt_sriov_pf_mmio_vf_size(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_mmio_vf_save(struct xe_gt *gt, unsigned int vfid, void *buf, size_t size);
+int xe_gt_sriov_pf_mmio_vf_restore(struct xe_gt *gt, unsigned int vfid,
+ const void *buf, size_t size);
#else
static inline int xe_gt_sriov_pf_init_early(struct xe_gt *gt)
{
--
2.50.1
^ permalink raw reply related [flat|nested] 72+ messages in thread
* [PATCH v2 19/26] drm/xe/pf: Handle MMIO migration data as part of PF control
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (17 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 18/26] drm/xe/pf: Add helpers for VF MMIO migration data handling Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-21 22:41 ` [PATCH v2 20/26] drm/xe/pf: Add helper to retrieve VF's LMEM object Michał Winiarski
` (6 subsequent siblings)
25 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
Connect the helpers to allow save and restore of MMIO migration data
during the stop_copy / resume device states.
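With this patch, pf_handle_vf_save_data() chains three save stages (GuC, GGTT, MMIO) by clearing the current state bit, arming the next one, and returning -EAGAIN so the worker gets re-queued. A simplified sketch of that pattern, using illustrative bit names rather than the real XE_GT_SRIOV_STATE_* enum:

```c
#include <assert.h>

/* Illustrative stand-ins for the chained save-stage state bits. */
enum save_stage {
	STAGE_GUC	= 1 << 0,
	STAGE_GGTT	= 1 << 1,
	STAGE_MMIO	= 1 << 2,
};

/* Test-and-clear, mirroring pf_exit_vf_state(): returns non-zero only
 * if the bit was set. */
static int exit_state(unsigned int *state, enum save_stage bit)
{
	if (*state & bit) {
		*state &= ~bit;
		return 1;
	}
	return 0;
}

/* One pass of the (simplified) save handler: process the current stage,
 * arm the next stage, and request a re-queue with -EAGAIN (-11). */
static int handle_save_data(unsigned int *state)
{
	if (exit_state(state, STAGE_GUC)) {
		*state |= STAGE_GGTT;
		return -11;	/* -EAGAIN: run again for the next stage */
	}
	if (exit_state(state, STAGE_GGTT)) {
		*state |= STAGE_MMIO;
		return -11;
	}
	if (exit_state(state, STAGE_MMIO))
		return 0;
	return 0;
}
```

Each worker invocation handles exactly one stage, which keeps individual steps short and lets the state machine be interrupted between stages.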
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 16 +++
.../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 2 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 114 ++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 4 +
4 files changed, 136 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
index f5c215fb93c5a..e7156ad3d1839 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
@@ -190,6 +190,7 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
CASE2STR(SAVE_WAIT_DATA);
CASE2STR(SAVE_DATA_GUC);
CASE2STR(SAVE_DATA_GGTT);
+ CASE2STR(SAVE_DATA_MMIO);
CASE2STR(SAVE_DATA_DONE);
CASE2STR(SAVE_FAILED);
CASE2STR(SAVED);
@@ -829,6 +830,7 @@ static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT);
+ pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_MMIO);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
}
}
@@ -872,6 +874,17 @@ static int pf_handle_vf_save_data(struct xe_gt *gt, unsigned int vfid)
if (ret)
return ret;
}
+
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_MMIO);
+ return -EAGAIN;
+ }
+
+ if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_MMIO)) {
+ xe_gt_assert(gt, xe_gt_sriov_pf_migration_mmio_size(gt, vfid) > 0);
+
+ ret = xe_gt_sriov_pf_migration_mmio_save(gt, vfid);
+ if (ret)
+ return ret;
}
return 0;
@@ -1081,6 +1094,9 @@ pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid)
case XE_SRIOV_MIGRATION_DATA_TYPE_GGTT:
ret = xe_gt_sriov_pf_migration_ggtt_restore(gt, vfid, data);
break;
+ case XE_SRIOV_MIGRATION_DATA_TYPE_MMIO:
+ ret = xe_gt_sriov_pf_migration_mmio_restore(gt, vfid, data);
+ break;
case XE_SRIOV_MIGRATION_DATA_TYPE_GUC:
ret = xe_gt_sriov_pf_migration_guc_restore(gt, vfid, data);
break;
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
index 1e8fa3f8f9be8..9dfcebd5078ac 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
@@ -35,6 +35,7 @@
* @XE_GT_SRIOV_STATE_SAVE_WAIT_DATA: indicates that PF waits for space in the migration data ring.
* @XE_GT_SRIOV_STATE_SAVE_DATA_GUC: indicates PF needs to save VF GuC migration data.
* @XE_GT_SRIOV_STATE_SAVE_DATA_GGTT: indicates PF needs to save VF GGTT migration data.
+ * @XE_GT_SRIOV_STATE_SAVE_DATA_MMIO: indicates PF needs to save VF MMIO migration data.
* @XE_GT_SRIOV_STATE_SAVE_DATA_DONE: indicates that all migration data was produced by Xe.
* @XE_GT_SRIOV_STATE_SAVE_FAILED: indicates that VF save operation has failed.
* @XE_GT_SRIOV_STATE_SAVED: indicates that VF data is saved.
@@ -80,6 +81,7 @@ enum xe_gt_sriov_control_bits {
XE_GT_SRIOV_STATE_SAVE_WAIT_DATA,
XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
XE_GT_SRIOV_STATE_SAVE_DATA_GGTT,
+ XE_GT_SRIOV_STATE_SAVE_DATA_MMIO,
XE_GT_SRIOV_STATE_SAVE_DATA_DONE,
XE_GT_SRIOV_STATE_SAVE_FAILED,
XE_GT_SRIOV_STATE_SAVED,
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index 75e965f75f6a7..41335b15ffdbe 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -9,6 +9,7 @@
#include "xe_bo.h"
#include "xe_ggtt.h"
#include "xe_gt.h"
+#include "xe_gt_sriov_pf.h"
#include "xe_gt_sriov_pf_config.h"
#include "xe_gt_sriov_pf_control.h"
#include "xe_gt_sriov_pf_helpers.h"
@@ -378,6 +379,112 @@ int xe_gt_sriov_pf_migration_guc_restore(struct xe_gt *gt, unsigned int vfid,
return pf_restore_vf_guc_state(gt, vfid, data);
}
+/**
+ * xe_gt_sriov_pf_migration_mmio_size() - Get the size of VF MMIO migration data.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: size in bytes or a negative error code on failure.
+ */
+ssize_t xe_gt_sriov_pf_migration_mmio_size(struct xe_gt *gt, unsigned int vfid)
+{
+ return xe_gt_sriov_pf_mmio_vf_size(gt, vfid);
+}
+
+static int pf_save_vf_mmio_mig_data(struct xe_gt *gt, unsigned int vfid)
+{
+ struct xe_sriov_migration_data *data;
+ size_t size;
+ int ret;
+
+ size = xe_gt_sriov_pf_migration_mmio_size(gt, vfid);
+ if (size == 0)
+ return 0;
+
+ data = xe_sriov_migration_data_alloc(gt_to_xe(gt));
+ if (!data)
+ return -ENOMEM;
+
+ ret = xe_sriov_migration_data_init(data, gt->tile->id, gt->info.id,
+ XE_SRIOV_MIGRATION_DATA_TYPE_MMIO, 0, size);
+ if (ret)
+ goto fail;
+
+ ret = xe_gt_sriov_pf_mmio_vf_save(gt, vfid, data->vaddr, size);
+ if (ret)
+ goto fail;
+
+ pf_dump_mig_data(gt, vfid, data);
+
+ ret = xe_gt_sriov_pf_migration_save_produce(gt, vfid, data);
+ if (ret)
+ goto fail;
+
+ return 0;
+
+fail:
+ xe_sriov_migration_data_free(data);
+ xe_gt_sriov_err(gt, "Failed to save VF%u MMIO data (%pe)\n", vfid, ERR_PTR(ret));
+ return ret;
+}
+
+static int pf_restore_vf_mmio_mig_data(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_migration_data *data)
+{
+ int ret;
+
+ pf_dump_mig_data(gt, vfid, data);
+
+ ret = xe_gt_sriov_pf_mmio_vf_restore(gt, vfid, data->vaddr, data->size);
+ if (ret) {
+ xe_gt_sriov_err(gt, "Failed to restore VF%u MMIO data (%pe)\n",
+ vfid, ERR_PTR(ret));
+
+ return ret;
+ }
+
+ return 0;
+}
+
+/**
+ * xe_gt_sriov_pf_migration_mmio_save() - Save VF MMIO migration data.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier (can't be 0)
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_migration_mmio_save(struct xe_gt *gt, unsigned int vfid)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid != PFID);
+ xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
+
+ return pf_save_vf_mmio_mig_data(gt, vfid);
+}
+
+/**
+ * xe_gt_sriov_pf_migration_mmio_restore() - Restore VF MMIO migration data.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier (can't be 0)
+ * @data: the VF MMIO migration data to restore
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_migration_mmio_restore(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_migration_data *data)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid != PFID);
+ xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
+
+ return pf_restore_vf_mmio_mig_data(gt, vfid, data);
+}
+
/**
* xe_gt_sriov_pf_migration_size() - Total size of migration data from all components within a GT.
* @gt: the &xe_gt
@@ -408,6 +515,13 @@ ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
size += sizeof(struct xe_sriov_pf_migration_hdr);
total += size;
+ size = xe_gt_sriov_pf_migration_mmio_size(gt, vfid);
+ if (size < 0)
+ return size;
+ else if (size > 0)
+ size += sizeof(struct xe_sriov_pf_migration_hdr);
+ total += size;
+
return total;
}
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
index 09abdd9e82e10..24a233c4cd0bb 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
@@ -23,6 +23,10 @@ ssize_t xe_gt_sriov_pf_migration_ggtt_size(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_ggtt_save(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
struct xe_sriov_migration_data *data);
+ssize_t xe_gt_sriov_pf_migration_mmio_size(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_migration_mmio_save(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_migration_mmio_restore(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_migration_data *data);
ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
--
2.50.1
^ permalink raw reply related [flat|nested] 72+ messages in thread
* [PATCH v2 20/26] drm/xe/pf: Add helper to retrieve VF's LMEM object
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (18 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 19/26] drm/xe/pf: Handle MMIO migration data as part of PF control Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-23 20:25 ` Michal Wajdeczko
2025-10-21 22:41 ` [PATCH v2 21/26] drm/xe/migrate: Add function to copy VRAM data in chunks Michał Winiarski
` (5 subsequent siblings)
25 siblings, 1 reply; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
From: Lukasz Laguna <lukasz.laguna@intel.com>
Instead of accessing the VF's lmem_obj directly, introduce a helper
function to make the access more convenient.
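The locking pattern behind the helper can be sketched in plain C as follows. The `demo_*` names and the simplified lock are illustrative stand-ins for the Xe structures, not the actual API; the point is that the reference is taken while the lock is still held:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace sketch of the helper's pattern: look the object up under
 * the master config lock, take a reference while still holding the
 * lock, then let the caller drop it with a put() once done. All names
 * are illustrative stand-ins, not the actual Xe API.
 */
struct demo_bo {
	int refcount;
};

struct demo_lock {
	int held;
};

struct demo_config {
	struct demo_lock lock;
	struct demo_bo *lmem_obj;	/* may be NULL if no LMEM is provisioned */
};

static void demo_lock_acquire(struct demo_lock *l) { l->held = 1; }
static void demo_lock_release(struct demo_lock *l) { l->held = 0; }

static struct demo_bo *demo_bo_get(struct demo_bo *bo)
{
	if (bo)
		bo->refcount++;
	return bo;
}

static void demo_bo_put(struct demo_bo *bo)
{
	if (bo)
		bo->refcount--;
}

/*
 * Mirrors the shape of xe_gt_sriov_pf_config_get_lmem_obj(): because
 * the reference is taken under the lock, the object cannot be released
 * between the lookup and the get.
 */
static struct demo_bo *demo_config_get_lmem_obj(struct demo_config *cfg)
{
	struct demo_bo *bo;

	demo_lock_acquire(&cfg->lock);
	bo = demo_bo_get(cfg->lmem_obj);
	demo_lock_release(&cfg->lock);

	return bo;
}
```

The caller is then responsible for the matching put, just as the kernel-doc for the real helper requires `xe_bo_put()` on the returned object.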
Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 31 ++++++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h | 1 +
2 files changed, 32 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
index c857879e28fe5..28d648c386487 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
@@ -1643,6 +1643,37 @@ int xe_gt_sriov_pf_config_bulk_set_lmem(struct xe_gt *gt, unsigned int vfid,
"LMEM", n, err);
}
+static struct xe_bo *pf_get_vf_config_lmem_obj(struct xe_gt *gt, unsigned int vfid)
+{
+ struct xe_gt_sriov_config *config = pf_pick_vf_config(gt, vfid);
+
+ return config->lmem_obj;
+}
+
+/**
+ * xe_gt_sriov_pf_config_get_lmem_obj() - Take a reference to the struct &xe_bo backing VF LMEM.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function can only be called on PF.
+ * The caller is responsible for calling xe_bo_put() on the returned object.
+ *
+ * Return: pointer to struct &xe_bo backing VF LMEM (if any).
+ */
+struct xe_bo *xe_gt_sriov_pf_config_get_lmem_obj(struct xe_gt *gt, unsigned int vfid)
+{
+ struct xe_bo *lmem_obj;
+
+ xe_gt_assert(gt, vfid);
+
+ mutex_lock(xe_gt_sriov_pf_master_mutex(gt));
+ lmem_obj = pf_get_vf_config_lmem_obj(gt, vfid);
+ xe_bo_get(lmem_obj);
+ mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
+
+ return lmem_obj;
+}
+
static u64 pf_query_free_lmem(struct xe_gt *gt)
{
struct xe_tile *tile = gt->tile;
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
index 6916b8f58ebf2..03c5dc0cd5fef 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
@@ -36,6 +36,7 @@ int xe_gt_sriov_pf_config_set_lmem(struct xe_gt *gt, unsigned int vfid, u64 size
int xe_gt_sriov_pf_config_set_fair_lmem(struct xe_gt *gt, unsigned int vfid, unsigned int num_vfs);
int xe_gt_sriov_pf_config_bulk_set_lmem(struct xe_gt *gt, unsigned int vfid, unsigned int num_vfs,
u64 size);
+struct xe_bo *xe_gt_sriov_pf_config_get_lmem_obj(struct xe_gt *gt, unsigned int vfid);
u32 xe_gt_sriov_pf_config_get_exec_quantum(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_config_set_exec_quantum(struct xe_gt *gt, unsigned int vfid, u32 exec_quantum);
--
2.50.1
* [PATCH v2 21/26] drm/xe/migrate: Add function to copy VRAM data in chunks
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (19 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 20/26] drm/xe/pf: Add helper to retrieve VF's LMEM object Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-23 19:29 ` Michal Wajdeczko
2025-10-21 22:41 ` [PATCH v2 22/26] drm/xe/pf: Handle VRAM migration data as part of PF control Michał Winiarski
` (4 subsequent siblings)
25 siblings, 1 reply; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
From: Lukasz Laguna <lukasz.laguna@intel.com>
Introduce a new function to copy data between VRAM and sysmem objects.
The existing xe_migrate_copy() is tailored for eviction and restore
operations, which involve additional logic and operate on entire
objects.
The new xe_migrate_vram_copy_chunk() allows copying chunks of data to
or from a dedicated buffer object, which is essential for VF migration.
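The chunked-copy loop at the heart of the new function can be sketched in userspace C. `chunked_copy` and `MAX_PER_PASS` are assumptions for illustration only; the real function emits a blitter batch per pass (bounded by max_mem_transfer_per_pass()) and returns a dma_fence, where a plain memcpy stands in here:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Userspace sketch of the chunked-copy loop inside
 * xe_migrate_vram_copy_chunk(): each pass moves at most a
 * platform-dependent number of bytes and advances both cursors until
 * the requested size is exhausted. The 8-byte pass limit is purely
 * illustrative; the driver uses MiB-scale limits.
 */
#define MAX_PER_PASS 8

static size_t chunked_copy(uint8_t *dst, const uint8_t *src, size_t size)
{
	size_t passes = 0;

	while (size) {
		size_t chunk = size < MAX_PER_PASS ? size : MAX_PER_PASS;

		memcpy(dst, src, chunk);	/* one copy-engine pass */
		dst += chunk;
		src += chunk;
		size -= chunk;
		passes++;
	}

	return passes;
}
```

Bounding each pass keeps the batch-buffer and page-table footprint of a single copy fixed regardless of the total size being migrated.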
Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_migrate.c | 134 ++++++++++++++++++++++++++++++--
drivers/gpu/drm/xe/xe_migrate.h | 8 ++
2 files changed, 136 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index 3112c966c67d7..d30675707162b 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -514,7 +514,7 @@ int xe_migrate_init(struct xe_migrate *m)
static u64 max_mem_transfer_per_pass(struct xe_device *xe)
{
- if (!IS_DGFX(xe) && xe_device_has_flat_ccs(xe))
+ if ((!IS_DGFX(xe) || IS_SRIOV_PF(xe)) && xe_device_has_flat_ccs(xe))
return MAX_CCS_LIMITED_TRANSFER;
return MAX_PREEMPTDISABLE_TRANSFER;
@@ -1155,6 +1155,133 @@ struct xe_exec_queue *xe_migrate_exec_queue(struct xe_migrate *migrate)
return migrate->q;
}
+/**
+ * xe_migrate_vram_copy_chunk() - Copy a chunk of a VRAM buffer object.
+ * @vram_bo: The VRAM buffer object.
+ * @vram_offset: The VRAM offset.
+ * @sysmem_bo: The sysmem buffer object.
+ * @sysmem_offset: The sysmem offset.
+ * @size: The size of VRAM chunk to copy.
+ * @dir: The direction of the copy operation.
+ *
+ * Copies a portion of a buffer object between VRAM and system memory.
+ * On Xe2 platforms that support flat CCS, VRAM data is decompressed when
+ * copying to system memory.
+ *
+ * Return: Pointer to a dma_fence representing the last copy batch, or
+ * an error pointer on failure. If there is a failure, any copy operation
+ * started by the function call has been synced.
+ */
+struct dma_fence *xe_migrate_vram_copy_chunk(struct xe_bo *vram_bo, u64 vram_offset,
+ struct xe_bo *sysmem_bo, u64 sysmem_offset,
+ u64 size, enum xe_migrate_copy_dir dir)
+{
+ struct xe_device *xe = xe_bo_device(vram_bo);
+ struct xe_tile *tile = vram_bo->tile;
+ struct xe_gt *gt = tile->primary_gt;
+ struct xe_migrate *m = tile->migrate;
+ struct dma_fence *fence = NULL;
+ struct ttm_resource *vram = vram_bo->ttm.resource;
+ struct ttm_resource *sysmem = sysmem_bo->ttm.resource;
+ struct xe_res_cursor vram_it, sysmem_it;
+ u64 vram_L0_ofs, sysmem_L0_ofs;
+ u32 vram_L0_pt, sysmem_L0_pt;
+ u64 vram_L0, sysmem_L0;
+ bool to_sysmem = (dir == XE_MIGRATE_COPY_TO_SRAM);
+ bool use_comp_pat = to_sysmem &&
+ GRAPHICS_VER(xe) >= 20 && xe_device_has_flat_ccs(xe);
+ int pass = 0;
+ int err;
+
+ xe_assert(xe, IS_ALIGNED(vram_offset | sysmem_offset | size, PAGE_SIZE));
+ xe_assert(xe, xe_bo_is_vram(vram_bo));
+ xe_assert(xe, !xe_bo_is_vram(sysmem_bo));
+ xe_assert(xe, !range_overflows(vram_offset, size, (u64)vram_bo->ttm.base.size));
+ xe_assert(xe, !range_overflows(sysmem_offset, size, (u64)sysmem_bo->ttm.base.size));
+
+ xe_res_first(vram, vram_offset, size, &vram_it);
+ xe_res_first_sg(xe_bo_sg(sysmem_bo), sysmem_offset, size, &sysmem_it);
+
+ while (size) {
+ u32 pte_flags = PTE_UPDATE_FLAG_IS_VRAM;
+ u32 batch_size = 2; /* arb_clear() + MI_BATCH_BUFFER_END */
+ struct xe_sched_job *job;
+ struct xe_bb *bb;
+ u32 update_idx;
+ bool usm = xe->info.has_usm;
+ u32 avail_pts = max_mem_transfer_per_pass(xe) / LEVEL0_PAGE_TABLE_ENCODE_SIZE;
+
+ sysmem_L0 = xe_migrate_res_sizes(m, &sysmem_it);
+ vram_L0 = min(xe_migrate_res_sizes(m, &vram_it), sysmem_L0);
+
+ drm_dbg(&xe->drm, "Pass %u, size: %llu\n", pass++, vram_L0);
+
+ pte_flags |= use_comp_pat ? PTE_UPDATE_FLAG_IS_COMP_PTE : 0;
+ batch_size += pte_update_size(m, pte_flags, vram, &vram_it, &vram_L0,
+ &vram_L0_ofs, &vram_L0_pt, 0, 0, avail_pts);
+
+ batch_size += pte_update_size(m, 0, sysmem, &sysmem_it, &vram_L0, &sysmem_L0_ofs,
+ &sysmem_L0_pt, 0, avail_pts, avail_pts);
+ batch_size += EMIT_COPY_DW;
+
+ bb = xe_bb_new(gt, batch_size, usm);
+ if (IS_ERR(bb)) {
+ err = PTR_ERR(bb);
+ return ERR_PTR(err);
+ }
+
+ if (xe_migrate_allow_identity(vram_L0, &vram_it))
+ xe_res_next(&vram_it, vram_L0);
+ else
+ emit_pte(m, bb, vram_L0_pt, true, use_comp_pat, &vram_it, vram_L0, vram);
+
+ emit_pte(m, bb, sysmem_L0_pt, false, false, &sysmem_it, vram_L0, sysmem);
+
+ bb->cs[bb->len++] = MI_BATCH_BUFFER_END;
+ update_idx = bb->len;
+
+ if (to_sysmem)
+ emit_copy(gt, bb, vram_L0_ofs, sysmem_L0_ofs, vram_L0, XE_PAGE_SIZE);
+ else
+ emit_copy(gt, bb, sysmem_L0_ofs, vram_L0_ofs, vram_L0, XE_PAGE_SIZE);
+
+ job = xe_bb_create_migration_job(m->q, bb, xe_migrate_batch_base(m, usm),
+ update_idx);
+ if (IS_ERR(job)) {
+ err = PTR_ERR(job);
+ goto err;
+ }
+
+ xe_sched_job_add_migrate_flush(job, MI_INVALIDATE_TLB);
+
+ WARN_ON_ONCE(!dma_resv_test_signaled(vram_bo->ttm.base.resv,
+ DMA_RESV_USAGE_BOOKKEEP));
+ WARN_ON_ONCE(!dma_resv_test_signaled(sysmem_bo->ttm.base.resv,
+ DMA_RESV_USAGE_BOOKKEEP));
+
+ mutex_lock(&m->job_mutex);
+ xe_sched_job_arm(job);
+ dma_fence_put(fence);
+ fence = dma_fence_get(&job->drm.s_fence->finished);
+ xe_sched_job_push(job);
+
+ dma_fence_put(m->fence);
+ m->fence = dma_fence_get(fence);
+ mutex_unlock(&m->job_mutex);
+
+ xe_bb_free(bb, fence);
+ size -= vram_L0;
+ continue;
+
+err:
+ xe_bb_free(bb, NULL);
+
+ return ERR_PTR(err);
+ }
+
+ return fence;
+}
+
static void emit_clear_link_copy(struct xe_gt *gt, struct xe_bb *bb, u64 src_ofs,
u32 size, u32 pitch)
{
@@ -1852,11 +1979,6 @@ static bool xe_migrate_vram_use_pde(struct drm_pagemap_addr *sram_addr,
return true;
}
-enum xe_migrate_copy_dir {
- XE_MIGRATE_COPY_TO_VRAM,
- XE_MIGRATE_COPY_TO_SRAM,
-};
-
#define XE_CACHELINE_BYTES 64ull
#define XE_CACHELINE_MASK (XE_CACHELINE_BYTES - 1)
diff --git a/drivers/gpu/drm/xe/xe_migrate.h b/drivers/gpu/drm/xe/xe_migrate.h
index 4fad324b62535..d7bcc6ad8464e 100644
--- a/drivers/gpu/drm/xe/xe_migrate.h
+++ b/drivers/gpu/drm/xe/xe_migrate.h
@@ -28,6 +28,11 @@ struct xe_vma;
enum xe_sriov_vf_ccs_rw_ctxs;
+enum xe_migrate_copy_dir {
+ XE_MIGRATE_COPY_TO_VRAM,
+ XE_MIGRATE_COPY_TO_SRAM,
+};
+
/**
* struct xe_migrate_pt_update_ops - Callbacks for the
* xe_migrate_update_pgtables() function.
@@ -131,6 +136,9 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q,
struct xe_lrc *xe_migrate_lrc(struct xe_migrate *migrate);
struct xe_exec_queue *xe_migrate_exec_queue(struct xe_migrate *migrate);
+struct dma_fence *xe_migrate_vram_copy_chunk(struct xe_bo *vram_bo, u64 vram_offset,
+ struct xe_bo *sysmem_bo, u64 sysmem_offset,
+ u64 size, enum xe_migrate_copy_dir dir);
int xe_migrate_access_memory(struct xe_migrate *m, struct xe_bo *bo,
unsigned long offset, void *buf, int len,
int write);
--
2.50.1
* [PATCH v2 22/26] drm/xe/pf: Handle VRAM migration data as part of PF control
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (20 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 21/26] drm/xe/migrate: Add function to copy VRAM data in chunks Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-23 11:44 ` kernel test robot
2025-10-23 19:54 ` Michal Wajdeczko
2025-10-21 22:41 ` [PATCH v2 23/26] drm/xe/pf: Add wait helper for VF FLR Michał Winiarski
` (3 subsequent siblings)
25 siblings, 2 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
Connect the helpers to allow saving and restoring VRAM migration data
in the stop_copy / resume device states.
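The chunked save flow wired into the PF control state machine can be sketched as follows. Names and the chunk size are illustrative (the driver caps chunks at VF_VRAM_STATE_CHUNK_MAX_SIZE, 512M), and a locally defined error value stands in for the kernel's -EAGAIN:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch of the chunked VRAM save step: each invocation saves one
 * chunk and reports "again" until the per-VF save offset reaches the
 * VRAM size, at which point the SAVE_DATA_VRAM state can be left.
 * EAGAIN_SKETCH and CHUNK_MAX are illustrative placeholders.
 */
#define EAGAIN_SKETCH 11
#define CHUNK_MAX 512

struct vf_vram_save {
	size_t vram_size;
	size_t vram_save_offset;	/* mirrors migration->vram_save_offset */
};

/* Return 0 when everything is saved, -EAGAIN_SKETCH when chunks remain. */
static int save_vram_step(struct vf_vram_save *s)
{
	size_t chunk = s->vram_size - s->vram_save_offset;

	if (chunk > CHUNK_MAX)
		chunk = CHUNK_MAX;

	/* ...produce one VRAM migration data packet of 'chunk' bytes... */
	s->vram_save_offset += chunk;

	return s->vram_save_offset < s->vram_size ? -EAGAIN_SKETCH : 0;
}
```

Returning -EAGAIN from the save handler is what lets the control worker re-enter the SAVE_DATA_VRAM state and yield between chunks instead of blocking on a single copy of the entire VRAM allocation.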
Co-developed-by: Lukasz Laguna <lukasz.laguna@intel.com>
Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 18 ++
.../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 2 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 222 ++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 6 +
.../drm/xe/xe_gt_sriov_pf_migration_types.h | 3 +
drivers/gpu/drm/xe/xe_sriov_pf_control.c | 3 +
6 files changed, 254 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
index e7156ad3d1839..680f2de44144b 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
@@ -191,6 +191,7 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
CASE2STR(SAVE_DATA_GUC);
CASE2STR(SAVE_DATA_GGTT);
CASE2STR(SAVE_DATA_MMIO);
+ CASE2STR(SAVE_DATA_VRAM);
CASE2STR(SAVE_DATA_DONE);
CASE2STR(SAVE_FAILED);
CASE2STR(SAVED);
@@ -832,6 +833,7 @@ static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_MMIO);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
+ pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_VRAM);
}
}
@@ -885,6 +887,19 @@ static int pf_handle_vf_save_data(struct xe_gt *gt, unsigned int vfid)
ret = xe_gt_sriov_pf_migration_mmio_save(gt, vfid);
if (ret)
return ret;
+
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_VRAM);
+ return -EAGAIN;
+ }
+
+ if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_VRAM)) {
+ if (xe_gt_sriov_pf_migration_vram_size(gt, vfid) > 0) {
+ ret = xe_gt_sriov_pf_migration_vram_save(gt, vfid);
+ if (ret == -EAGAIN)
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_VRAM);
+ if (ret)
+ return ret;
+ }
}
return 0;
@@ -1100,6 +1115,9 @@ pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid)
case XE_SRIOV_MIGRATION_DATA_TYPE_GUC:
ret = xe_gt_sriov_pf_migration_guc_restore(gt, vfid, data);
break;
+ case XE_SRIOV_MIGRATION_DATA_TYPE_VRAM:
+ ret = xe_gt_sriov_pf_migration_vram_restore(gt, vfid, data);
+ break;
default:
xe_gt_sriov_notice(gt, "Skipping VF%u unknown data type: %d\n", vfid, data->type);
break;
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
index 9dfcebd5078ac..fba10136f7cc7 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
@@ -36,6 +36,7 @@
* @XE_GT_SRIOV_STATE_SAVE_DATA_GUC: indicates PF needs to save VF GuC migration data.
* @XE_GT_SRIOV_STATE_SAVE_DATA_GGTT: indicates PF needs to save VF GGTT migration data.
* @XE_GT_SRIOV_STATE_SAVE_DATA_MMIO: indicates PF needs to save VF MMIO migration data.
+ * @XE_GT_SRIOV_STATE_SAVE_DATA_VRAM: indicates PF needs to save VF VRAM migration data.
* @XE_GT_SRIOV_STATE_SAVE_DATA_DONE: indicates that all migration data was produced by Xe.
* @XE_GT_SRIOV_STATE_SAVE_FAILED: indicates that VF save operation has failed.
* @XE_GT_SRIOV_STATE_SAVED: indicates that VF data is saved.
@@ -82,6 +83,7 @@ enum xe_gt_sriov_control_bits {
XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
XE_GT_SRIOV_STATE_SAVE_DATA_GGTT,
XE_GT_SRIOV_STATE_SAVE_DATA_MMIO,
+ XE_GT_SRIOV_STATE_SAVE_DATA_VRAM,
XE_GT_SRIOV_STATE_SAVE_DATA_DONE,
XE_GT_SRIOV_STATE_SAVE_FAILED,
XE_GT_SRIOV_STATE_SAVED,
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index 41335b15ffdbe..2c6a86d98ee31 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -17,6 +17,7 @@
#include "xe_gt_sriov_printk.h"
#include "xe_guc_buf.h"
#include "xe_guc_ct.h"
+#include "xe_migrate.h"
#include "xe_sriov.h"
#include "xe_sriov_migration_data.h"
#include "xe_sriov_pf_migration.h"
@@ -485,6 +486,220 @@ int xe_gt_sriov_pf_migration_mmio_restore(struct xe_gt *gt, unsigned int vfid,
return pf_restore_vf_mmio_mig_data(gt, vfid, data);
}
+/**
+ * xe_gt_sriov_pf_migration_vram_size() - Get the size of VF VRAM migration data.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: size in bytes or a negative error code on failure.
+ */
+ssize_t xe_gt_sriov_pf_migration_vram_size(struct xe_gt *gt, unsigned int vfid)
+{
+ if (gt != xe_root_mmio_gt(gt_to_xe(gt)))
+ return 0;
+
+ return xe_gt_sriov_pf_config_get_lmem(gt, vfid);
+}
+
+static struct dma_fence *__pf_save_restore_vram(struct xe_gt *gt, unsigned int vfid,
+ struct xe_bo *vram, u64 vram_offset,
+ struct xe_bo *sysmem, u64 sysmem_offset,
+ size_t size, bool save)
+{
+ struct dma_fence *ret = NULL;
+ struct drm_exec exec;
+ int err;
+
+ drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
+ drm_exec_until_all_locked(&exec) {
+ err = drm_exec_lock_obj(&exec, &vram->ttm.base);
+ drm_exec_retry_on_contention(&exec);
+ if (err) {
+ ret = ERR_PTR(err);
+ goto err;
+ }
+
+ err = drm_exec_lock_obj(&exec, &sysmem->ttm.base);
+ drm_exec_retry_on_contention(&exec);
+ if (err) {
+ ret = ERR_PTR(err);
+ goto err;
+ }
+ }
+
+ ret = xe_migrate_vram_copy_chunk(vram, vram_offset, sysmem, sysmem_offset, size,
+ save ? XE_MIGRATE_COPY_TO_SRAM : XE_MIGRATE_COPY_TO_VRAM);
+
+err:
+ drm_exec_fini(&exec);
+
+ return ret;
+}
+
+static int pf_save_vram_chunk(struct xe_gt *gt, unsigned int vfid,
+ struct xe_bo *src_vram, u64 src_vram_offset,
+ size_t size)
+{
+ struct xe_sriov_migration_data *data;
+ struct dma_fence *fence;
+ int ret;
+
+ data = xe_sriov_migration_data_alloc(gt_to_xe(gt));
+ if (!data)
+ return -ENOMEM;
+
+ ret = xe_sriov_migration_data_init(data, gt->tile->id, gt->info.id,
+ XE_SRIOV_MIGRATION_DATA_TYPE_VRAM,
+ src_vram_offset, size);
+ if (ret)
+ goto fail;
+
+ fence = __pf_save_restore_vram(gt, vfid,
+ src_vram, src_vram_offset,
+ data->bo, 0, size, true);
+
+ ret = dma_fence_wait_timeout(fence, false, 5 * HZ);
+ dma_fence_put(fence);
+ if (!ret) {
+ ret = -ETIME;
+ goto fail;
+ }
+
+ pf_dump_mig_data(gt, vfid, data);
+
+ ret = xe_gt_sriov_pf_migration_save_produce(gt, vfid, data);
+ if (ret)
+ goto fail;
+
+ return 0;
+
+fail:
+ xe_sriov_migration_data_free(data);
+ return ret;
+}
+
+#define VF_VRAM_STATE_CHUNK_MAX_SIZE SZ_512M
+static int pf_save_vf_vram_mig_data(struct xe_gt *gt, unsigned int vfid)
+{
+ struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, vfid);
+ loff_t *offset = &migration->vram_save_offset;
+ struct xe_bo *vram;
+ size_t vram_size, chunk_size;
+ int ret;
+
+ vram = xe_gt_sriov_pf_config_get_lmem_obj(gt, vfid);
+ if (!vram)
+ return -ENXIO;
+
+ vram_size = xe_bo_size(vram);
+ chunk_size = min(vram_size - *offset, VF_VRAM_STATE_CHUNK_MAX_SIZE);
+
+ ret = pf_save_vram_chunk(gt, vfid, vram, *offset, chunk_size);
+ if (ret)
+ goto fail;
+
+ *offset += chunk_size;
+
+ xe_bo_put(vram);
+
+ if (*offset < vram_size)
+ return -EAGAIN;
+
+ return 0;
+
+fail:
+ xe_bo_put(vram);
+ xe_gt_sriov_err(gt, "Failed to save VF%u VRAM data (%pe)\n", vfid, ERR_PTR(ret));
+ return ret;
+}
+
+static int pf_restore_vf_vram_mig_data(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_migration_data *data)
+{
+ u64 end = data->hdr.offset + data->hdr.size;
+ struct dma_fence *fence;
+ struct xe_bo *vram;
+ size_t size;
+ int ret = 0;
+
+ vram = xe_gt_sriov_pf_config_get_lmem_obj(gt, vfid);
+ if (!vram)
+ return -ENXIO;
+
+ size = xe_bo_size(vram);
+
+ if (end > size || end < data->hdr.size) {
+ ret = -EINVAL;
+ goto err;
+ }
+
+ pf_dump_mig_data(gt, vfid, data);
+
+ fence = __pf_save_restore_vram(gt, vfid, vram, data->hdr.offset,
+ data->bo, 0, data->hdr.size, false);
+ ret = dma_fence_wait_timeout(fence, false, 5 * HZ);
+ dma_fence_put(fence);
+ if (!ret) {
+ ret = -ETIME;
+ goto err;
+ }
+ xe_bo_put(vram);
+ return 0;
+err:
+ xe_bo_put(vram);
+ xe_gt_sriov_err(gt, "Failed to restore VF%u VRAM data (%pe)\n", vfid, ERR_PTR(ret));
+ return ret;
+}
+
+/**
+ * xe_gt_sriov_pf_migration_vram_save() - Save VF VRAM migration data.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier (can't be 0)
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_migration_vram_save(struct xe_gt *gt, unsigned int vfid)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid != PFID);
+ xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
+
+ return pf_save_vf_vram_mig_data(gt, vfid);
+}
+
+/**
+ * xe_gt_sriov_pf_migration_vram_restore() - Restore VF VRAM migration data.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier (can't be 0)
+ * @data: the VF VRAM migration data to restore
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_migration_vram_restore(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_migration_data *data)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid != PFID);
+ xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
+
+ return pf_restore_vf_vram_mig_data(gt, vfid, data);
+}
+
+/**
+ * xe_gt_sriov_pf_migration_save_init() - Initialize per-GT migration related data.
+ * @gt: the &xe_gt
+ * @vfid: the VF identifier (can't be 0)
+ */
+void xe_gt_sriov_pf_migration_save_init(struct xe_gt *gt, unsigned int vfid)
+{
+ pf_pick_gt_migration(gt, vfid)->vram_save_offset = 0;
+}
+
/**
* xe_gt_sriov_pf_migration_size() - Total size of migration data from all components within a GT.
* @gt: the &xe_gt
@@ -522,6 +737,13 @@ ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
size += sizeof(struct xe_sriov_pf_migration_hdr);
total += size;
+ size = xe_gt_sriov_pf_migration_vram_size(gt, vfid);
+ if (size < 0)
+ return size;
+ else if (size > 0)
+ size += sizeof(struct xe_sriov_pf_migration_hdr);
+ total += size;
+
return total;
}
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
index 24a233c4cd0bb..ca518eda5429f 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
@@ -27,6 +27,12 @@ ssize_t xe_gt_sriov_pf_migration_mmio_size(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_mmio_save(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_mmio_restore(struct xe_gt *gt, unsigned int vfid,
struct xe_sriov_migration_data *data);
+ssize_t xe_gt_sriov_pf_migration_vram_size(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_migration_vram_save(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_migration_vram_restore(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_migration_data *data);
+
+void xe_gt_sriov_pf_migration_save_init(struct xe_gt *gt, unsigned int vfid);
ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
index 75d8b94cbbefb..39a940c9b0a4b 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
@@ -16,6 +16,9 @@
struct xe_gt_sriov_migration_data {
/** @ring: queue containing VF save / restore migration data */
struct ptr_ring ring;
+
+ /** @vram_save_offset: offset within VRAM, used for chunked VRAM save */
+ loff_t vram_save_offset;
};
#endif
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
index c2768848daba1..aac8ecb861545 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
@@ -5,6 +5,7 @@
#include "xe_device.h"
#include "xe_gt_sriov_pf_control.h"
+#include "xe_gt_sriov_pf_migration.h"
#include "xe_sriov_migration_data.h"
#include "xe_sriov_pf_control.h"
#include "xe_sriov_printk.h"
@@ -171,6 +172,8 @@ int xe_sriov_pf_control_trigger_save_vf(struct xe_device *xe, unsigned int vfid)
return ret;
for_each_gt(gt, xe, id) {
+ xe_gt_sriov_pf_migration_save_init(gt, vfid);
+
ret = xe_gt_sriov_pf_control_trigger_save_vf(gt, vfid);
if (ret)
return ret;
--
2.50.1
* [PATCH v2 23/26] drm/xe/pf: Add wait helper for VF FLR
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (21 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 22/26] drm/xe/pf: Handle VRAM migration data as part of PF control Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-21 22:41 ` [PATCH v2 24/26] drm/xe/pf: Enable SR-IOV VF migration for PTL and BMG Michał Winiarski
` (2 subsequent siblings)
25 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
VF FLR requires additional processing done by the PF driver.
The processing is done after the FLR is already finished from the PCIe
perspective.
In order to avoid a scenario where the migration state transitions
while PF processing is still in progress, an additional
synchronization point is needed.
Add a helper that will be used as part of the VF driver's struct
pci_error_handlers .reset_done() callback.
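The per-GT error aggregation the helper uses can be sketched in a few lines. The error value is an illustrative placeholder for the kernel's -EUCLEAN:

```c
#include <assert.h>

/*
 * Sketch of the aggregation in xe_sriov_pf_control_wait_flr(): a
 * failure on the last GT processed keeps its own error code, but once
 * any failure has been recorded, processing a further GT collapses
 * the result into -EUCLEAN_SKETCH, signalling that the device as a
 * whole is in an inconsistent state. EUCLEAN_SKETCH is a placeholder.
 */
#define EUCLEAN_SKETCH 117

static int wait_flr_all(const int *gt_err, int num_gt)
{
	int result = 0;
	int i;

	for (i = 0; i < num_gt; i++)
		result = result ? -EUCLEAN_SKETCH : gt_err[i];

	return result;
}
```

The loop keeps iterating over every GT even after a failure, so all GTs get a chance to finish their FLR processing before the aggregated error is reported.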
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/xe/xe_sriov_pf_control.c | 24 ++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_sriov_pf_control.h | 1 +
2 files changed, 25 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
index aac8ecb861545..bed488476706d 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
@@ -123,6 +123,30 @@ int xe_sriov_pf_control_reset_vf(struct xe_device *xe, unsigned int vfid)
return result;
}
+/**
+ * xe_sriov_pf_control_wait_flr() - Wait for a VF reset (FLR) to complete.
+ * @xe: the &xe_device
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_pf_control_wait_flr(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_gt *gt;
+ unsigned int id;
+ int result = 0;
+ int err;
+
+ for_each_gt(gt, xe, id) {
+ err = xe_gt_sriov_pf_control_wait_flr(gt, vfid);
+ result = result ? -EUCLEAN : err;
+ }
+
+ return result;
+}
+
/**
* xe_sriov_pf_control_sync_flr() - Synchronize a VF FLR between all GTs.
* @xe: the &xe_device
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_sriov_pf_control.h
index 30318c1fba34e..ef9f219b21096 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_control.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.h
@@ -12,6 +12,7 @@ int xe_sriov_pf_control_pause_vf(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_resume_vf(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_stop_vf(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_reset_vf(struct xe_device *xe, unsigned int vfid);
+int xe_sriov_pf_control_wait_flr(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_sync_flr(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_trigger_save_vf(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_finish_save_vf(struct xe_device *xe, unsigned int vfid);
--
2.50.1
* [PATCH v2 24/26] drm/xe/pf: Enable SR-IOV VF migration for PTL and BMG
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (22 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 23/26] drm/xe/pf: Add wait helper for VF FLR Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-23 20:15 ` Michal Wajdeczko
2025-10-21 22:41 ` [PATCH v2 25/26] drm/xe/pf: Export helpers for VFIO Michał Winiarski
2025-10-21 22:41 ` [PATCH v2 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics Michał Winiarski
25 siblings, 1 reply; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
All of the necessary building blocks are now in place for PTL and BMG to
support SR-IOV VF migration.
Enable the feature on those platforms, removing the need to pass the
debug feature-enabling flags.
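With this change, migration support is reported whenever the new per-platform flag is set, with the debug Kconfig acting only as a fallback for platforms that have not been enabled yet. A minimal standalone model of that gating (the names are hypothetical; `debug_build` stands in for IS_ENABLED(CONFIG_DRM_XE_DEBUG)):

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant xe->info bit. */
struct device_info {
	bool has_sriov_vf_migration;
};

/*
 * Mirrors pf_check_migration_support() after this patch: the per-platform
 * flag grants support unconditionally; debug builds keep the old
 * opt-in-for-everything behavior.
 */
static bool check_migration_support(const struct device_info *info,
				    bool debug_build)
{
	if (info->has_sriov_vf_migration)
		return true;

	return debug_build;
}
```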
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_device.h | 5 +++++
drivers/gpu/drm/xe/xe_device_types.h | 2 ++
drivers/gpu/drm/xe/xe_pci.c | 8 ++++++--
drivers/gpu/drm/xe/xe_pci_types.h | 1 +
drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 4 +++-
5 files changed, 17 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
index 32cc6323b7f64..0c4404c78227c 100644
--- a/drivers/gpu/drm/xe/xe_device.h
+++ b/drivers/gpu/drm/xe/xe_device.h
@@ -152,6 +152,11 @@ static inline bool xe_device_has_sriov(struct xe_device *xe)
return xe->info.has_sriov;
}
+static inline bool xe_device_has_sriov_vf_migration(struct xe_device *xe)
+{
+ return xe->info.has_sriov_vf_migration;
+}
+
static inline bool xe_device_has_msix(struct xe_device *xe)
{
return xe->irq.msix.nvec > 0;
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 02c04ad7296e4..8973e17b9a359 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -311,6 +311,8 @@ struct xe_device {
u8 has_range_tlb_inval:1;
/** @info.has_sriov: Supports SR-IOV */
u8 has_sriov:1;
+ /** @info.has_sriov_vf_migration: Supports SR-IOV VF migration */
+ u8 has_sriov_vf_migration:1;
/** @info.has_usm: Device has unified shared memory support */
u8 has_usm:1;
/** @info.has_64bit_timestamp: Device supports 64-bit timestamps */
diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index c3136141a9536..d4f9ee9d020b2 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -362,6 +362,7 @@ static const struct xe_device_desc bmg_desc = {
.has_heci_cscfi = 1,
.has_late_bind = true,
.has_sriov = true,
+ .has_sriov_vf_migration = true,
.max_gt_per_tile = 2,
.needs_scratch = true,
.subplatforms = (const struct xe_subplatform_desc[]) {
@@ -378,6 +379,7 @@ static const struct xe_device_desc ptl_desc = {
.has_display = true,
.has_flat_ccs = 1,
.has_sriov = true,
+ .has_sriov_vf_migration = true,
.max_gt_per_tile = 2,
.needs_scratch = true,
.needs_shared_vf_gt_wq = true,
@@ -657,6 +659,7 @@ static int xe_info_init_early(struct xe_device *xe,
xe->info.has_pxp = desc->has_pxp;
xe->info.has_sriov = xe_configfs_primary_gt_allowed(to_pci_dev(xe->drm.dev)) &&
desc->has_sriov;
+ xe->info.has_sriov_vf_migration = desc->has_sriov_vf_migration;
xe->info.skip_guc_pc = desc->skip_guc_pc;
xe->info.skip_mtcfg = desc->skip_mtcfg;
xe->info.skip_pcode = desc->skip_pcode;
@@ -1020,9 +1023,10 @@ static int xe_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
xe_step_name(xe->info.step.media),
xe_step_name(xe->info.step.basedie));
- drm_dbg(&xe->drm, "SR-IOV support: %s (mode: %s)\n",
+ drm_dbg(&xe->drm, "SR-IOV support: %s (mode: %s) (VF migration: %s)\n",
str_yes_no(xe_device_has_sriov(xe)),
- xe_sriov_mode_to_string(xe_device_sriov_mode(xe)));
+ xe_sriov_mode_to_string(xe_device_sriov_mode(xe)),
+ str_yes_no(xe_device_has_sriov_vf_migration(xe)));
err = xe_pm_init_early(xe);
if (err)
diff --git a/drivers/gpu/drm/xe/xe_pci_types.h b/drivers/gpu/drm/xe/xe_pci_types.h
index a4451bdc79fb3..40f158b3ac890 100644
--- a/drivers/gpu/drm/xe/xe_pci_types.h
+++ b/drivers/gpu/drm/xe/xe_pci_types.h
@@ -48,6 +48,7 @@ struct xe_device_desc {
u8 has_mbx_power_limits:1;
u8 has_pxp:1;
u8 has_sriov:1;
+ u8 has_sriov_vf_migration:1;
u8 needs_scratch:1;
u8 skip_guc_pc:1;
u8 skip_mtcfg:1;
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
index 88babec9c893e..a6cf3b57edba1 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
@@ -50,7 +50,9 @@ bool xe_sriov_pf_migration_supported(struct xe_device *xe)
static bool pf_check_migration_support(struct xe_device *xe)
{
- /* XXX: for now this is for feature enabling only */
+ if (xe_device_has_sriov_vf_migration(xe))
+ return true;
+
return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
}
--
2.50.1
* [PATCH v2 25/26] drm/xe/pf: Export helpers for VFIO
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (23 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 24/26] drm/xe/pf: Enable SR-IOV VF migration for PTL and BMG Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-28 3:28 ` Tian, Kevin
2025-10-21 22:41 ` [PATCH v2 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics Michał Winiarski
25 siblings, 1 reply; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
A vendor-specific VFIO driver variant for Xe will implement VF migration.
Export everything that the variant driver needs for its migration ops.
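Every exported helper below follows the same contract: it rejects calls on non-PF devices and validates that the 1-based vfid addresses an actual VF (vfid 0 is the PF itself). The range check can be sketched in isolation as (hypothetical helper name; `totalvfs` stands in for xe_sriov_pf_get_totalvfs()):

```c
#include <stdbool.h>

#define PFID 0U /* xe uses function 0 for the PF; VFs are 1-based */

/*
 * Mirrors the vfid range check performed by each exported helper:
 * vfid must be in [1, totalvfs].
 */
static bool vfid_is_valid(unsigned int vfid, unsigned int totalvfs)
{
	return vfid != PFID && vfid <= totalvfs;
}
```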
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/Makefile | 2 +
drivers/gpu/drm/xe/xe_sriov_vfio.c | 296 +++++++++++++++++++++++++++++
include/drm/intel/xe_sriov_vfio.h | 28 +++
3 files changed, 326 insertions(+)
create mode 100644 drivers/gpu/drm/xe/xe_sriov_vfio.c
create mode 100644 include/drm/intel/xe_sriov_vfio.h
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 3d72db9e528e4..de3778f51ce7e 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -182,6 +182,8 @@ xe-$(CONFIG_PCI_IOV) += \
xe_sriov_pf_service.o \
xe_tile_sriov_pf_debugfs.o
+xe-$(CONFIG_XE_VFIO_PCI) += xe_sriov_vfio.o
+
# include helpers for tests even when XE is built-in
ifdef CONFIG_DRM_XE_KUNIT_TEST
xe-y += tests/xe_kunit_helpers.o
diff --git a/drivers/gpu/drm/xe/xe_sriov_vfio.c b/drivers/gpu/drm/xe/xe_sriov_vfio.c
new file mode 100644
index 0000000000000..4f2a7c2b5d61c
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_sriov_vfio.c
@@ -0,0 +1,296 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include <drm/intel/xe_sriov_vfio.h>
+
+#include "xe_pm.h"
+#include "xe_sriov.h"
+#include "xe_sriov_migration_data.h"
+#include "xe_sriov_pf_control.h"
+#include "xe_sriov_pf_helpers.h"
+#include "xe_sriov_pf_migration.h"
+#include "xe_sriov_types.h"
+
+/**
+ * xe_sriov_vfio_migration_supported() - Check if migration is supported.
+ * @pdev: the PF &pci_dev device
+ *
+ * Return: true if migration is supported, false otherwise.
+ */
+bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+
+ if (!IS_SRIOV_PF(xe))
+ return false;
+
+ return xe_sriov_pf_migration_supported(xe);
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_migration_supported, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_wait_flr_done() - Wait for VF FLR completion.
+ * @pdev: the PF &pci_dev device
+ * @vfid: the VF identifier (can't be 0)
+ *
+ * This function will wait until VF FLR is processed by PF on all tiles (or
+ * until timeout occurs).
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ if (vfid == PFID || vfid > xe_sriov_pf_get_totalvfs(xe))
+ return -EINVAL;
+
+ xe_assert(xe, !xe_pm_runtime_suspended(xe));
+
+ return xe_sriov_pf_control_wait_flr(xe, vfid);
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_wait_flr_done, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_stop() - Stop VF.
+ * @pdev: the PF &pci_dev device
+ * @vfid: the VF identifier (can't be 0)
+ *
+ * This function will pause VF on all tiles/GTs.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ if (vfid == PFID || vfid > xe_sriov_pf_get_totalvfs(xe))
+ return -EINVAL;
+
+ xe_assert(xe, !xe_pm_runtime_suspended(xe));
+
+ return xe_sriov_pf_control_pause_vf(xe, vfid);
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_run() - Run VF.
+ * @pdev: the PF &pci_dev device
+ * @vfid: the VF identifier (can't be 0)
+ *
+ * This function will resume VF on all tiles.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ if (vfid == PFID || vfid > xe_sriov_pf_get_totalvfs(xe))
+ return -EINVAL;
+
+ xe_assert(xe, !xe_pm_runtime_suspended(xe));
+
+ return xe_sriov_pf_control_resume_vf(xe, vfid);
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_run, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_stop_copy_enter() - Initiate a VF device migration data save.
+ * @pdev: the PF &pci_dev device
+ * @vfid: the VF identifier (can't be 0)
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ if (vfid == PFID || vfid > xe_sriov_pf_get_totalvfs(xe))
+ return -EINVAL;
+
+ xe_assert(xe, !xe_pm_runtime_suspended(xe));
+
+ return xe_sriov_pf_control_trigger_save_vf(xe, vfid);
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_enter, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_stop_copy_exit() - Finish a VF device migration data save.
+ * @pdev: the PF &pci_dev device
+ * @vfid: the VF identifier (can't be 0)
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_vfio_stop_copy_exit(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ if (vfid == PFID || vfid > xe_sriov_pf_get_totalvfs(xe))
+ return -EINVAL;
+
+ xe_assert(xe, !xe_pm_runtime_suspended(xe));
+
+ return xe_sriov_pf_control_finish_save_vf(xe, vfid);
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_exit, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_resume_enter() - Initiate a VF device migration data restore.
+ * @pdev: the PF &pci_dev device
+ * @vfid: the VF identifier (can't be 0)
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_vfio_resume_enter(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ if (vfid == PFID || vfid > xe_sriov_pf_get_totalvfs(xe))
+ return -EINVAL;
+
+ xe_assert(xe, !xe_pm_runtime_suspended(xe));
+
+ return xe_sriov_pf_control_trigger_restore_vf(xe, vfid);
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_resume_enter, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_resume_exit() - Finish a VF device migration data restore.
+ * @pdev: the PF &pci_dev device
+ * @vfid: the VF identifier (can't be 0)
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_vfio_resume_exit(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ if (vfid == PFID || vfid > xe_sriov_pf_get_totalvfs(xe))
+ return -EINVAL;
+
+ xe_assert(xe, !xe_pm_runtime_suspended(xe));
+
+ return xe_sriov_pf_control_finish_restore_vf(xe, vfid);
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_resume_exit, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_error() - Move VF device to error state.
+ * @pdev: the PF &pci_dev device
+ * @vfid: the VF identifier (can't be 0)
+ *
+ * Reset is needed to move it out of error state.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_vfio_error(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ if (vfid == PFID || vfid > xe_sriov_pf_get_totalvfs(xe))
+ return -EINVAL;
+
+ xe_assert(xe, !xe_pm_runtime_suspended(xe));
+
+ return xe_sriov_pf_control_stop_vf(xe, vfid);
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_error, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_data_read() - Read migration data from the VF device.
+ * @pdev: the PF &pci_dev device
+ * @vfid: the VF identifier (can't be 0)
+ * @buf: start address of userspace buffer
+ * @len: requested read size from userspace
+ *
+ * Return: number of bytes that have been successfully read,
+ * 0 if no more migration data is available, -errno on failure.
+ */
+ssize_t xe_sriov_vfio_data_read(struct pci_dev *pdev, unsigned int vfid,
+ char __user *buf, size_t len)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ if (vfid == PFID || vfid > xe_sriov_pf_get_totalvfs(xe))
+ return -EINVAL;
+
+ return xe_sriov_migration_data_read(xe, vfid, buf, len);
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_data_read, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_data_write() - Write migration data to the VF device.
+ * @pdev: the PF &pci_dev device
+ * @vfid: the VF identifier (can't be 0)
+ * @buf: start address of userspace buffer
+ * @len: requested write size from userspace
+ *
+ * Return: number of bytes that have been successfully written, -errno on failure.
+ */
+ssize_t xe_sriov_vfio_data_write(struct pci_dev *pdev, unsigned int vfid,
+ const char __user *buf, size_t len)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ if (vfid == PFID || vfid > xe_sriov_pf_get_totalvfs(xe))
+ return -EINVAL;
+
+ return xe_sriov_migration_data_write(xe, vfid, buf, len);
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_data_write, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_stop_copy_size() - Get a size estimate of VF device migration data.
+ * @pdev: the PF &pci_dev device
+ * @vfid: the VF identifier (can't be 0)
+ *
+ * Return: migration data size in bytes or a negative error code on failure.
+ */
+ssize_t xe_sriov_vfio_stop_copy_size(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ if (vfid == PFID || vfid > xe_sriov_pf_get_totalvfs(xe))
+ return -EINVAL;
+
+ xe_assert(xe, !xe_pm_runtime_suspended(xe));
+
+ return xe_sriov_pf_migration_size(xe, vfid);
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_size, "xe-vfio-pci");
diff --git a/include/drm/intel/xe_sriov_vfio.h b/include/drm/intel/xe_sriov_vfio.h
new file mode 100644
index 0000000000000..cf4ef7a1cfbbe
--- /dev/null
+++ b/include/drm/intel/xe_sriov_vfio.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_SRIOV_VFIO_H_
+#define _XE_SRIOV_VFIO_H_
+
+#include <linux/types.h>
+
+struct pci_dev;
+
+bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev);
+int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid);
+int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid);
+int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid);
+int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid);
+int xe_sriov_vfio_stop_copy_exit(struct pci_dev *pdev, unsigned int vfid);
+int xe_sriov_vfio_resume_enter(struct pci_dev *pdev, unsigned int vfid);
+int xe_sriov_vfio_resume_exit(struct pci_dev *pdev, unsigned int vfid);
+int xe_sriov_vfio_error(struct pci_dev *pdev, unsigned int vfid);
+ssize_t xe_sriov_vfio_data_read(struct pci_dev *pdev, unsigned int vfid,
+ char __user *buf, size_t len);
+ssize_t xe_sriov_vfio_data_write(struct pci_dev *pdev, unsigned int vfid,
+ const char __user *buf, size_t len);
+ssize_t xe_sriov_vfio_stop_copy_size(struct pci_dev *pdev, unsigned int vfid);
+
+#endif
--
2.50.1
* [PATCH v2 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-21 22:41 [PATCH v2 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (24 preceding siblings ...)
2025-10-21 22:41 ` [PATCH v2 25/26] drm/xe/pf: Export helpers for VFIO Michał Winiarski
@ 2025-10-21 22:41 ` Michał Winiarski
2025-10-22 7:12 ` Christoph Hellwig
` (2 more replies)
25 siblings, 3 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-21 22:41 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna, Michał Winiarski
In addition to generic VFIO PCI functionality, the driver implements
VFIO migration uAPI, allowing userspace to enable migration for Intel
Graphics SR-IOV Virtual Functions.
The driver binds to the VF device and uses the API exposed by the Xe
driver bound to the PF device to control the VF device state and to
transfer the migration data.
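The device-state handling in the driver below implements only the direct arcs required by the VFIO migration uAPI, treating STOP and RUNNING_P2P as equivalent since the hardware cannot selectively block p2p DMA; the VFIO core composes longer transitions out of these arcs via vfio_mig_get_next_state(). A minimal standalone model of the arc table (the enum values are local stand-ins, not the uAPI encoding):

```c
#include <stdbool.h>

/* Local stand-ins for the subset of VFIO device states used here. */
enum mig_state { RUNNING, RUNNING_P2P, STOP, STOP_COPY, RESUMING };

/*
 * Direct arcs handled by xe_vfio_set_state(); anything else must be
 * decomposed into these steps by the VFIO core.
 */
static bool arc_supported(enum mig_state cur, enum mig_state next)
{
	if ((cur == RUNNING && (next == STOP || next == RUNNING_P2P)) ||
	    (cur == RUNNING_P2P && next == STOP) ||
	    (cur == STOP && next == RUNNING_P2P) ||
	    ((cur == STOP || cur == RUNNING_P2P) && next == RUNNING) ||
	    (cur == STOP && (next == STOP_COPY || next == RESUMING)) ||
	    (cur == STOP_COPY && next == STOP) ||
	    (cur == RESUMING && next == STOP))
		return true;

	return false;
}
```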
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
MAINTAINERS | 7 +
drivers/vfio/pci/Kconfig | 2 +
drivers/vfio/pci/Makefile | 2 +
drivers/vfio/pci/xe/Kconfig | 12 +
drivers/vfio/pci/xe/Makefile | 3 +
drivers/vfio/pci/xe/main.c | 470 +++++++++++++++++++++++++++++++++++
6 files changed, 496 insertions(+)
create mode 100644 drivers/vfio/pci/xe/Kconfig
create mode 100644 drivers/vfio/pci/xe/Makefile
create mode 100644 drivers/vfio/pci/xe/main.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 096fcca26dc76..255fcb01c98e1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -26976,6 +26976,13 @@ L: virtualization@lists.linux.dev
S: Maintained
F: drivers/vfio/pci/virtio
+VFIO XE PCI DRIVER
+M: Michał Winiarski <michal.winiarski@intel.com>
+L: kvm@vger.kernel.org
+L: intel-xe@lists.freedesktop.org
+S: Supported
+F: drivers/vfio/pci/xe
+
VGA_SWITCHEROO
R: Lukas Wunner <lukas@wunner.de>
S: Maintained
diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index 2b0172f546652..c100f0ab87f2d 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -67,4 +67,6 @@ source "drivers/vfio/pci/nvgrace-gpu/Kconfig"
source "drivers/vfio/pci/qat/Kconfig"
+source "drivers/vfio/pci/xe/Kconfig"
+
endmenu
diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index cf00c0a7e55c8..f5d46aa9347b9 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -19,3 +19,5 @@ obj-$(CONFIG_VIRTIO_VFIO_PCI) += virtio/
obj-$(CONFIG_NVGRACE_GPU_VFIO_PCI) += nvgrace-gpu/
obj-$(CONFIG_QAT_VFIO_PCI) += qat/
+
+obj-$(CONFIG_XE_VFIO_PCI) += xe/
diff --git a/drivers/vfio/pci/xe/Kconfig b/drivers/vfio/pci/xe/Kconfig
new file mode 100644
index 0000000000000..787be88268685
--- /dev/null
+++ b/drivers/vfio/pci/xe/Kconfig
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config XE_VFIO_PCI
+ tristate "VFIO support for Intel Graphics"
+ depends on DRM_XE
+ select VFIO_PCI_CORE
+ help
+ This option enables vendor-specific VFIO driver for Intel Graphics.
+ In addition to generic VFIO PCI functionality, it implements VFIO
+ migration uAPI allowing userspace to enable migration for
+ Intel Graphics SR-IOV Virtual Functions supported by the Xe driver.
+
+ If you don't know what to do here, say N.
diff --git a/drivers/vfio/pci/xe/Makefile b/drivers/vfio/pci/xe/Makefile
new file mode 100644
index 0000000000000..13aa0fd192cd4
--- /dev/null
+++ b/drivers/vfio/pci/xe/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0-only
+obj-$(CONFIG_XE_VFIO_PCI) += xe-vfio-pci.o
+xe-vfio-pci-y := main.o
diff --git a/drivers/vfio/pci/xe/main.c b/drivers/vfio/pci/xe/main.c
new file mode 100644
index 0000000000000..bea992cdee6b0
--- /dev/null
+++ b/drivers/vfio/pci/xe/main.c
@@ -0,0 +1,470 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include <linux/anon_inodes.h>
+#include <linux/delay.h>
+#include <linux/file.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/sizes.h>
+#include <linux/types.h>
+#include <linux/vfio.h>
+#include <linux/vfio_pci_core.h>
+
+#include <drm/intel/xe_sriov_vfio.h>
+
+/**
+ * struct xe_vfio_pci_migration_file - file used for reading / writing migration data
+ */
+struct xe_vfio_pci_migration_file {
+ /** @filp: pointer to underlying &struct file */
+ struct file *filp;
+ /** @lock: serializes accesses to migration data */
+ struct mutex lock;
+ /** @xe_vdev: backpointer to &struct xe_vfio_pci_core_device */
+ struct xe_vfio_pci_core_device *xe_vdev;
+};
+
+/**
+ * struct xe_vfio_pci_core_device - xe-specific vfio_pci_core_device
+ *
+ * Top level structure of xe_vfio_pci.
+ */
+struct xe_vfio_pci_core_device {
+ /** @core_device: vendor-agnostic VFIO device */
+ struct vfio_pci_core_device core_device;
+
+ /** @mig_state: current device migration state */
+ enum vfio_device_mig_state mig_state;
+
+ /** @vfid: VF number used by PF, xe uses 1-based indexing for vfid */
+ unsigned int vfid;
+
+ /** @pf: pointer to driver_private of physical function */
+ struct pci_dev *pf;
+
+ /** @fd: &struct xe_vfio_pci_migration_file for userspace to read/write migration data */
+ struct xe_vfio_pci_migration_file *fd;
+};
+
+#define xe_vdev_to_dev(xe_vdev) (&(xe_vdev)->core_device.pdev->dev)
+#define xe_vdev_to_pdev(xe_vdev) ((xe_vdev)->core_device.pdev)
+
+static void xe_vfio_pci_disable_file(struct xe_vfio_pci_migration_file *migf)
+{
+ struct xe_vfio_pci_core_device *xe_vdev = migf->xe_vdev;
+
+ mutex_lock(&migf->lock);
+ xe_vdev->fd = NULL;
+ mutex_unlock(&migf->lock);
+}
+
+static void xe_vfio_pci_reset(struct xe_vfio_pci_core_device *xe_vdev)
+{
+ if (xe_vdev->fd)
+ xe_vfio_pci_disable_file(xe_vdev->fd);
+
+ xe_vdev->mig_state = VFIO_DEVICE_STATE_RUNNING;
+}
+
+static void xe_vfio_pci_reset_done(struct pci_dev *pdev)
+{
+ struct xe_vfio_pci_core_device *xe_vdev = pci_get_drvdata(pdev);
+ int ret;
+
+ ret = xe_sriov_vfio_wait_flr_done(xe_vdev->pf, xe_vdev->vfid);
+ if (ret)
+ dev_err(&pdev->dev, "Failed to wait for FLR: %d\n", ret);
+
+ xe_vfio_pci_reset(xe_vdev);
+}
+
+static const struct pci_error_handlers xe_vfio_pci_err_handlers = {
+ .reset_done = xe_vfio_pci_reset_done,
+};
+
+static int xe_vfio_pci_open_device(struct vfio_device *core_vdev)
+{
+ struct xe_vfio_pci_core_device *xe_vdev =
+ container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
+ struct vfio_pci_core_device *vdev = &xe_vdev->core_device;
+ int ret;
+
+ ret = vfio_pci_core_enable(vdev);
+ if (ret)
+ return ret;
+
+ vfio_pci_core_finish_enable(vdev);
+
+ return 0;
+}
+
+static int xe_vfio_pci_release_file(struct inode *inode, struct file *filp)
+{
+ struct xe_vfio_pci_migration_file *migf = filp->private_data;
+
+ xe_vfio_pci_disable_file(migf);
+ mutex_destroy(&migf->lock);
+ kfree(migf);
+
+ return 0;
+}
+
+static ssize_t xe_vfio_pci_save_read(struct file *filp, char __user *buf, size_t len, loff_t *pos)
+{
+ struct xe_vfio_pci_migration_file *migf = filp->private_data;
+ ssize_t ret;
+
+ if (pos)
+ return -ESPIPE;
+
+ mutex_lock(&migf->lock);
+ ret = xe_sriov_vfio_data_read(migf->xe_vdev->pf, migf->xe_vdev->vfid, buf, len);
+ mutex_unlock(&migf->lock);
+
+ return ret;
+}
+
+static const struct file_operations xe_vfio_pci_save_fops = {
+ .owner = THIS_MODULE,
+ .read = xe_vfio_pci_save_read,
+ .release = xe_vfio_pci_release_file,
+ .llseek = noop_llseek,
+};
+
+static ssize_t xe_vfio_pci_resume_write(struct file *filp, const char __user *buf,
+ size_t len, loff_t *pos)
+{
+ struct xe_vfio_pci_migration_file *migf = filp->private_data;
+ ssize_t ret;
+
+ if (pos)
+ return -ESPIPE;
+
+ mutex_lock(&migf->lock);
+ ret = xe_sriov_vfio_data_write(migf->xe_vdev->pf, migf->xe_vdev->vfid, buf, len);
+ mutex_unlock(&migf->lock);
+
+ return ret;
+}
+
+static const struct file_operations xe_vfio_pci_resume_fops = {
+ .owner = THIS_MODULE,
+ .write = xe_vfio_pci_resume_write,
+ .release = xe_vfio_pci_release_file,
+ .llseek = noop_llseek,
+};
+
+static const char *vfio_dev_state_str(u32 state)
+{
+ switch (state) {
+ case VFIO_DEVICE_STATE_RUNNING: return "running";
+ case VFIO_DEVICE_STATE_RUNNING_P2P: return "running_p2p";
+ case VFIO_DEVICE_STATE_STOP_COPY: return "stopcopy";
+ case VFIO_DEVICE_STATE_STOP: return "stop";
+ case VFIO_DEVICE_STATE_RESUMING: return "resuming";
+ case VFIO_DEVICE_STATE_ERROR: return "error";
+ default: return "";
+ }
+}
+
+enum xe_vfio_pci_file_type {
+ XE_VFIO_FILE_SAVE = 0,
+ XE_VFIO_FILE_RESUME,
+};
+
+static struct xe_vfio_pci_migration_file *
+xe_vfio_pci_alloc_file(struct xe_vfio_pci_core_device *xe_vdev,
+ enum xe_vfio_pci_file_type type)
+{
+ struct xe_vfio_pci_migration_file *migf;
+ const struct file_operations *fops;
+ int flags;
+
+ migf = kzalloc(sizeof(*migf), GFP_KERNEL);
+ if (!migf)
+ return ERR_PTR(-ENOMEM);
+
+ fops = type == XE_VFIO_FILE_SAVE ? &xe_vfio_pci_save_fops : &xe_vfio_pci_resume_fops;
+ flags = type == XE_VFIO_FILE_SAVE ? O_RDONLY : O_WRONLY;
+ migf->filp = anon_inode_getfile("xe_vfio_mig", fops, migf, flags);
+ if (IS_ERR(migf->filp)) {
+ struct file *filp = migf->filp;
+
+ kfree(migf);
+ return ERR_CAST(filp);
+ }
+
+ mutex_init(&migf->lock);
+ migf->xe_vdev = xe_vdev;
+ xe_vdev->fd = migf;
+
+ stream_open(migf->filp->f_inode, migf->filp);
+
+ return migf;
+}
+
+static struct file *
+xe_vfio_set_state(struct xe_vfio_pci_core_device *xe_vdev, u32 new)
+{
+ u32 cur = xe_vdev->mig_state;
+ int ret;
+
+ dev_dbg(xe_vdev_to_dev(xe_vdev),
+ "state: %s->%s\n", vfio_dev_state_str(cur), vfio_dev_state_str(new));
+
+ /*
+ * "STOP" handling is reused for "RUNNING_P2P", as the device doesn't have the capability to
+ * selectively block p2p DMA transfers.
+ * The device is not processing new workload requests when the VF is stopped, and both
+ * memory and MMIO communication channels are transferred to destination (where processing
+ * will be resumed).
+ */
+ if ((cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_STOP) ||
+ (cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_RUNNING_P2P)) {
+ ret = xe_sriov_vfio_stop(xe_vdev->pf, xe_vdev->vfid);
+ if (ret)
+ goto err;
+
+ return NULL;
+ }
+
+ if ((cur == VFIO_DEVICE_STATE_RUNNING_P2P && new == VFIO_DEVICE_STATE_STOP) ||
+ (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RUNNING_P2P))
+ return NULL;
+
+ if ((cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RUNNING) ||
+ (cur == VFIO_DEVICE_STATE_RUNNING_P2P && new == VFIO_DEVICE_STATE_RUNNING)) {
+ ret = xe_sriov_vfio_run(xe_vdev->pf, xe_vdev->vfid);
+ if (ret)
+ goto err;
+
+ return NULL;
+ }
+
+ if (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_STOP_COPY) {
+ struct xe_vfio_pci_migration_file *migf;
+
+ migf = xe_vfio_pci_alloc_file(xe_vdev, XE_VFIO_FILE_SAVE);
+ if (IS_ERR(migf)) {
+ ret = PTR_ERR(migf);
+ goto err;
+ }
+
+ ret = xe_sriov_vfio_stop_copy_enter(xe_vdev->pf, xe_vdev->vfid);
+ if (ret) {
+ fput(migf->filp);
+ goto err;
+ }
+
+ return migf->filp;
+ }
+
+ if ((cur == VFIO_DEVICE_STATE_STOP_COPY && new == VFIO_DEVICE_STATE_STOP)) {
+ if (xe_vdev->fd)
+ xe_vfio_pci_disable_file(xe_vdev->fd);
+
+ xe_sriov_vfio_stop_copy_exit(xe_vdev->pf, xe_vdev->vfid);
+
+ return NULL;
+ }
+
+ if (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RESUMING) {
+ struct xe_vfio_pci_migration_file *migf;
+
+ migf = xe_vfio_pci_alloc_file(xe_vdev, XE_VFIO_FILE_RESUME);
+ if (IS_ERR(migf)) {
+ ret = PTR_ERR(migf);
+ goto err;
+ }
+
+ ret = xe_sriov_vfio_resume_enter(xe_vdev->pf, xe_vdev->vfid);
+ if (ret) {
+ fput(migf->filp);
+ goto err;
+ }
+
+ return migf->filp;
+ }
+
+ if (cur == VFIO_DEVICE_STATE_RESUMING && new == VFIO_DEVICE_STATE_STOP) {
+ if (xe_vdev->fd)
+ xe_vfio_pci_disable_file(xe_vdev->fd);
+
+ xe_sriov_vfio_resume_exit(xe_vdev->pf, xe_vdev->vfid);
+
+ return NULL;
+ }
+
+ if (new == VFIO_DEVICE_STATE_ERROR)
+ xe_sriov_vfio_error(xe_vdev->pf, xe_vdev->vfid);
+
+ WARN(true, "Unknown state transition %d->%d", cur, new);
+ return ERR_PTR(-EINVAL);
+
+err:
+ dev_dbg(xe_vdev_to_dev(xe_vdev),
+ "Failed to transition state: %s->%s err=%d\n",
+ vfio_dev_state_str(cur), vfio_dev_state_str(new), ret);
+ return ERR_PTR(ret);
+}
+
+static struct file *
+xe_vfio_pci_set_device_state(struct vfio_device *core_vdev,
+ enum vfio_device_mig_state new_state)
+{
+ struct xe_vfio_pci_core_device *xe_vdev =
+ container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
+ enum vfio_device_mig_state next_state;
+ struct file *f = NULL;
+ int ret;
+
+ while (new_state != xe_vdev->mig_state) {
+ ret = vfio_mig_get_next_state(core_vdev, xe_vdev->mig_state,
+ new_state, &next_state);
+ if (ret) {
+ f = ERR_PTR(ret);
+ break;
+ }
+ f = xe_vfio_set_state(xe_vdev, next_state);
+ if (IS_ERR(f))
+ break;
+
+ xe_vdev->mig_state = next_state;
+
+ /* Multiple state transitions with non-NULL file in the middle */
+ if (f && new_state != xe_vdev->mig_state) {
+ fput(f);
+ f = ERR_PTR(-EINVAL);
+ break;
+ }
+ }
+
+ return f;
+}
+
+static int xe_vfio_pci_get_device_state(struct vfio_device *core_vdev,
+ enum vfio_device_mig_state *curr_state)
+{
+ struct xe_vfio_pci_core_device *xe_vdev =
+ container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
+
+ *curr_state = xe_vdev->mig_state;
+
+ return 0;
+}
+
+static int xe_vfio_pci_get_data_size(struct vfio_device *vdev,
+ unsigned long *stop_copy_length)
+{
+ struct xe_vfio_pci_core_device *xe_vdev =
+ container_of(vdev, struct xe_vfio_pci_core_device, core_device.vdev);
+ ssize_t size;
+
+ size = xe_sriov_vfio_stop_copy_size(xe_vdev->pf, xe_vdev->vfid);
+ if (size < 0)
+ return size;
+
+ *stop_copy_length = size;
+
+ return 0;
+}
+
+static const struct vfio_migration_ops xe_vfio_pci_migration_ops = {
+ .migration_set_state = xe_vfio_pci_set_device_state,
+ .migration_get_state = xe_vfio_pci_get_device_state,
+ .migration_get_data_size = xe_vfio_pci_get_data_size,
+};
+
+static void xe_vfio_pci_migration_init(struct vfio_device *core_vdev)
+{
+ struct xe_vfio_pci_core_device *xe_vdev =
+ container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
+ struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
+
+ if (!xe_sriov_vfio_migration_supported(pdev->physfn))
+ return;
+
+ /* vfid starts from 1 for xe */
+ xe_vdev->vfid = pci_iov_vf_id(pdev) + 1;
+ xe_vdev->pf = pdev->physfn;
+
+ core_vdev->migration_flags = VFIO_MIGRATION_STOP_COPY | VFIO_MIGRATION_P2P;
+ core_vdev->mig_ops = &xe_vfio_pci_migration_ops;
+}
+
+static int xe_vfio_pci_init_dev(struct vfio_device *core_vdev)
+{
+ struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
+
+ if (pdev->is_virtfn && strcmp(pdev->physfn->dev.driver->name, "xe") == 0)
+ xe_vfio_pci_migration_init(core_vdev);
+
+ return vfio_pci_core_init_dev(core_vdev);
+}
+
+static const struct vfio_device_ops xe_vfio_pci_ops = {
+ .name = "xe-vfio-pci",
+ .init = xe_vfio_pci_init_dev,
+ .release = vfio_pci_core_release_dev,
+ .open_device = xe_vfio_pci_open_device,
+ .close_device = vfio_pci_core_close_device,
+ .ioctl = vfio_pci_core_ioctl,
+ .device_feature = vfio_pci_core_ioctl_feature,
+ .read = vfio_pci_core_read,
+ .write = vfio_pci_core_write,
+ .mmap = vfio_pci_core_mmap,
+ .request = vfio_pci_core_request,
+ .match = vfio_pci_core_match,
+ .match_token_uuid = vfio_pci_core_match_token_uuid,
+ .bind_iommufd = vfio_iommufd_physical_bind,
+ .unbind_iommufd = vfio_iommufd_physical_unbind,
+ .attach_ioas = vfio_iommufd_physical_attach_ioas,
+ .detach_ioas = vfio_iommufd_physical_detach_ioas,
+};
+
+static int xe_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct xe_vfio_pci_core_device *xe_vdev;
+ int ret;
+
+ xe_vdev = vfio_alloc_device(xe_vfio_pci_core_device, core_device.vdev, &pdev->dev,
+ &xe_vfio_pci_ops);
+ if (IS_ERR(xe_vdev))
+ return PTR_ERR(xe_vdev);
+
+ dev_set_drvdata(&pdev->dev, &xe_vdev->core_device);
+
+ ret = vfio_pci_core_register_device(&xe_vdev->core_device);
+ if (ret) {
+ vfio_put_device(&xe_vdev->core_device.vdev);
+ return ret;
+ }
+
+ return 0;
+}
+
+static void xe_vfio_pci_remove(struct pci_dev *pdev)
+{
+ struct xe_vfio_pci_core_device *xe_vdev = pci_get_drvdata(pdev);
+
+ vfio_pci_core_unregister_device(&xe_vdev->core_device);
+ vfio_put_device(&xe_vdev->core_device.vdev);
+}
+
+static const struct pci_device_id xe_vfio_pci_table[] = {
+ { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_ANY_ID),
+ .class = PCI_BASE_CLASS_DISPLAY << 16, .class_mask = 0xff << 16,
+ .override_only = PCI_ID_F_VFIO_DRIVER_OVERRIDE },
+ {}
+};
+MODULE_DEVICE_TABLE(pci, xe_vfio_pci_table);
+
+static struct pci_driver xe_vfio_pci_driver = {
+ .name = "xe-vfio-pci",
+ .id_table = xe_vfio_pci_table,
+ .probe = xe_vfio_pci_probe,
+ .remove = xe_vfio_pci_remove,
+ .err_handler = &xe_vfio_pci_err_handlers,
+ .driver_managed_dma = true,
+};
+module_pci_driver(xe_vfio_pci_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("VFIO PCI driver with migration support for Intel Graphics");
--
2.50.1
^ permalink raw reply related [flat|nested] 72+ messages in thread
* Re: [PATCH v2 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-21 22:41 ` [PATCH v2 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics Michał Winiarski
@ 2025-10-22 7:12 ` Christoph Hellwig
2025-10-22 8:52 ` Michał Winiarski
2025-10-27 7:24 ` Tian, Kevin
2025-10-27 7:26 ` Tian, Kevin
2 siblings, 1 reply; 72+ messages in thread
From: Christoph Hellwig @ 2025-10-22 7:12 UTC (permalink / raw)
To: Michał Winiarski
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko, dri-devel,
Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin, David Airlie,
Simona Vetter, Lukasz Laguna
There is absolutely nothing vendor-specific here, it is a device variant
driver. In fact in Linux basically nothing is ever vendor specific,
because vendor is not a concept that does matter in any practical sense
except for tiny details like the vendor ID as one of the IDs to match
on in device probing.
I have no idea why people keep trying to inject this term again and
again.
* Re: [PATCH v2 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-22 7:12 ` Christoph Hellwig
@ 2025-10-22 8:52 ` Michał Winiarski
2025-10-22 8:54 ` Christoph Hellwig
0 siblings, 1 reply; 72+ messages in thread
From: Michał Winiarski @ 2025-10-22 8:52 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko, dri-devel,
Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin, David Airlie,
Simona Vetter, Lukasz Laguna
On Wed, Oct 22, 2025 at 12:12:01AM -0700, Christoph Hellwig wrote:
> There is absolutely nothing vendor-specific here, it is a device variant
> driver. In fact in Linux basically nothing is ever vendor specific,
> because vendor is not a concept that does matter in any practical sense
> except for tiny details like the vendor ID as one of the IDs to match
> on in device probing.
>
> I have no idea why people keep trying to inject this term again and
> again.
Hi,
The reasoning was that in this case we're matching vendor ID + class
combination to match all Intel GPUs, and not just selected device ID,
but I get your point.
Let me replace it with "device specific" to follow the VFIO documentation.
Thanks,
-Michał
* Re: [PATCH v2 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-22 8:52 ` Michał Winiarski
@ 2025-10-22 8:54 ` Christoph Hellwig
2025-10-22 9:12 ` Michał Winiarski
0 siblings, 1 reply; 72+ messages in thread
From: Christoph Hellwig @ 2025-10-22 8:54 UTC (permalink / raw)
To: Michał Winiarski
Cc: Christoph Hellwig, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost, Michal Wajdeczko, dri-devel, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna
On Wed, Oct 22, 2025 at 10:52:34AM +0200, Michał Winiarski wrote:
> On Wed, Oct 22, 2025 at 12:12:01AM -0700, Christoph Hellwig wrote:
> > There is absolutely nothing vendor-specific here, it is a device variant
> > driver. In fact in Linux basically nothing is ever vendor specific,
> > because vendor is not a concept that does matter in any practical sense
> > except for tiny details like the vendor ID as one of the IDs to match
> > on in device probing.
> >
> > I have no idea why people keep trying to inject this term again and
> > again.
>
> Hi,
>
> The reasoning was that in this case we're matching vendor ID + class
> combination to match all Intel GPUs, and not just selected device ID,
> but I get your point.
Which sounds like a really bad idea. Is this going to work on i810
devices? Or the odd PowerVR-based parts?
* Re: [PATCH v2 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-22 8:54 ` Christoph Hellwig
@ 2025-10-22 9:12 ` Michał Winiarski
2025-10-22 11:33 ` Jason Gunthorpe
0 siblings, 1 reply; 72+ messages in thread
From: Michał Winiarski @ 2025-10-22 9:12 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, Michal Wajdeczko, dri-devel,
Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin, David Airlie,
Simona Vetter, Lukasz Laguna
On Wed, Oct 22, 2025 at 01:54:42AM -0700, Christoph Hellwig wrote:
> On Wed, Oct 22, 2025 at 10:52:34AM +0200, Michał Winiarski wrote:
> > On Wed, Oct 22, 2025 at 12:12:01AM -0700, Christoph Hellwig wrote:
> > > There is absolutely nothing vendor-specific here, it is a device variant
> > > driver. In fact in Linux basically nothing is ever vendor specific,
> > > because vendor is not a concept that does matter in any practical sense
> > > except for tiny details like the vendor ID as one of the IDs to match
> > > on in device probing.
> > >
> > > I have no idea why people keep trying to inject this term again and
> > > again.
> >
> > Hi,
> >
> > The reasoning was that in this case we're matching vendor ID + class
> > combination to match all Intel GPUs, and not just selected device ID,
> > but I get your point.
>
> Which sounds like a really bad idea. Is this going to work on i810
> devices? Or the odd PowerVR-based parts?
It's using .override_only = PCI_ID_F_VFIO_DRIVER_OVERRIDE, so it only
matters if the user was already planning to override the regular driver
with VFIO one (using driver_override sysfs).
So if it worked on i810 or other odd parts using regular vfio-pci, it
would work with xe-vfio-pci, as both are using the same underlying
functions provided by vfio-pci-core.
-Michał
* Re: [PATCH v2 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-22 9:12 ` Michał Winiarski
@ 2025-10-22 11:33 ` Jason Gunthorpe
2025-10-22 13:27 ` Michał Winiarski
0 siblings, 1 reply; 72+ messages in thread
From: Jason Gunthorpe @ 2025-10-22 11:33 UTC (permalink / raw)
To: Michał Winiarski
Cc: Christoph Hellwig, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Yishai Hadas, Kevin Tian,
intel-xe, linux-kernel, kvm, Matthew Brost, Michal Wajdeczko,
dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Wed, Oct 22, 2025 at 11:12:05AM +0200, Michał Winiarski wrote:
> On Wed, Oct 22, 2025 at 01:54:42AM -0700, Christoph Hellwig wrote:
> > On Wed, Oct 22, 2025 at 10:52:34AM +0200, Michał Winiarski wrote:
> > > On Wed, Oct 22, 2025 at 12:12:01AM -0700, Christoph Hellwig wrote:
> > > > There is absolutely nothing vendor-specific here, it is a device variant
> > > > driver. In fact in Linux basically nothing is ever vendor specific,
> > > > because vendor is not a concept that does matter in any practical sense
> > > > except for tiny details like the vendor ID as one of the IDs to match
> > > > on in device probing.
> > > >
> > > > I have no idea why people keep trying to inject this term again and
> > > > again.
> > >
> > > Hi,
> > >
> > > The reasoning was that in this case we're matching vendor ID + class
> > > combination to match all Intel GPUs, and not just selected device ID,
> > > but I get your point.
> >
> > Which sounds like a really bad idea. Is this going to work on i810
> > devices? Or the odd PowerVR-based parts?
>
> It's using .override_only = PCI_ID_F_VFIO_DRIVER_OVERRIDE, so it only
> matters if the user was already planning to override the regular driver
> with VFIO one (using driver_override sysfs).
> So if it worked on i810 or other odd parts using regular vfio-pci, it
> would work with xe-vfio-pci, as both are using the same underlying
> functions provided by vfio-pci-core.
I also would rather see you list the actual working PCI IDs :|
Claiming all class devices for a vendor_id is something only DRM
does..
Jason
* Re: [PATCH v2 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-22 11:33 ` Jason Gunthorpe
@ 2025-10-22 13:27 ` Michał Winiarski
0 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-22 13:27 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Christoph Hellwig, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Yishai Hadas, Kevin Tian,
intel-xe, linux-kernel, kvm, Matthew Brost, Michal Wajdeczko,
dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Wed, Oct 22, 2025 at 08:33:55AM -0300, Jason Gunthorpe wrote:
> On Wed, Oct 22, 2025 at 11:12:05AM +0200, Michał Winiarski wrote:
> > On Wed, Oct 22, 2025 at 01:54:42AM -0700, Christoph Hellwig wrote:
> > > On Wed, Oct 22, 2025 at 10:52:34AM +0200, Michał Winiarski wrote:
> > > > On Wed, Oct 22, 2025 at 12:12:01AM -0700, Christoph Hellwig wrote:
> > > > > There is absolutely nothing vendor-specific here, it is a device variant
> > > > > driver. In fact in Linux basically nothing is ever vendor specific,
> > > > > because vendor is not a concept that does matter in any practical sense
> > > > > except for tiny details like the vendor ID as one of the IDs to match
> > > > > on in device probing.
> > > > >
> > > > > I have no idea why people keep trying to inject this term again and
> > > > > again.
> > > >
> > > > Hi,
> > > >
> > > > The reasoning was that in this case we're matching vendor ID + class
> > > > combination to match all Intel GPUs, and not just selected device ID,
> > > > but I get your point.
> > >
> > > Which sounds like a really bad idea. Is this going to work on i810
> > > devices? Or the odd PowerVR-based parts?
> >
> > It's using .override_only = PCI_ID_F_VFIO_DRIVER_OVERRIDE, so it only
> > matters if the user was already planning to override the regular driver
> > with VFIO one (using driver_override sysfs).
> > So if it worked on i810 or other odd parts using regular vfio-pci, it
> > would work with xe-vfio-pci, as both are using the same underlying
> > functions provided by vfio-pci-core.
>
> I also would rather see you list the actual working PCI IDs :|
>
> Claiming all class devices for a vendor_id is something only DRM
> does..
We already have all of the device IDs in include/drm/intel/pciids.h
So it's just a matter of adding a helper that sets an override,
including that header, and using a subset of the IDs.
I'll do that instead of matching on class.
Thanks,
-Michał
>
> Jason
* Re: [PATCH v2 04/26] drm/xe/pf: Add data structures and handlers for migration rings
2025-10-21 22:41 ` [PATCH v2 04/26] drm/xe/pf: Add data structures and handlers for migration rings Michał Winiarski
@ 2025-10-22 22:06 ` Michal Wajdeczko
2025-10-27 12:33 ` Michał Winiarski
0 siblings, 1 reply; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-22 22:06 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> Migration data is queued in a per-GT ptr_ring to decouple the worker
> responsible for handling the data transfer from the .read() and .write()
> syscalls.
> Add the data structures and handlers that will be used in future
> commits.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 259 +++++++++++++++++-
> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h | 6 +-
> .../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 12 +
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 183 +++++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 14 +
> .../drm/xe/xe_gt_sriov_pf_migration_types.h | 11 +
> drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h | 3 +
> drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 143 ++++++++++
> drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 7 +
> .../gpu/drm/xe/xe_sriov_pf_migration_types.h | 58 ++++
> drivers/gpu/drm/xe/xe_sriov_pf_types.h | 3 +
> 11 files changed, 684 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> index b770916e88e53..cad73fdaee93c 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> @@ -19,6 +19,7 @@
> #include "xe_guc_ct.h"
> #include "xe_sriov.h"
> #include "xe_sriov_pf_control.h"
> +#include "xe_sriov_pf_migration.h"
> #include "xe_sriov_pf_service.h"
> #include "xe_tile.h"
>
> @@ -185,9 +186,15 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
> CASE2STR(PAUSE_FAILED);
> CASE2STR(PAUSED);
> CASE2STR(SAVE_WIP);
> + CASE2STR(SAVE_PROCESS_DATA);
> + CASE2STR(SAVE_WAIT_DATA);
> + CASE2STR(SAVE_DATA_DONE);
> CASE2STR(SAVE_FAILED);
> CASE2STR(SAVED);
> CASE2STR(RESTORE_WIP);
> + CASE2STR(RESTORE_PROCESS_DATA);
> + CASE2STR(RESTORE_WAIT_DATA);
> + CASE2STR(RESTORE_DATA_DONE);
> CASE2STR(RESTORE_FAILED);
> CASE2STR(RESTORED);
> CASE2STR(RESUME_WIP);
> @@ -804,9 +811,50 @@ int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid)
> return -ECANCELED;
> }
>
> +/**
> + * DOC: The VF SAVE state machine
> + *
> + * SAVE extends the PAUSED state.
> + *
> + * The VF SAVE state machine looks like::
> + *
> + * ....PAUSED....................................................
> + * : :
> + * : (O)<---------o :
> + * : | \ :
> + * : save (SAVED) (SAVE_FAILED) :
> + * : | ^ ^ :
> + * : | | | :
> + * : ....V...............o...........o......SAVE_WIP......... :
> + * : : | | | : :
> + * : : | empty | : :
> + * : : | | | : :
> + * : : | | | : :
> + * : : | DATA_DONE | : :
> + * : : | ^ | : :
> + * : : | | error : :
> + * : : | no_data / : :
> + * : : | / / : :
> + * : : | / / : :
> + * : : | / / : :
> + * : : o---------->PROCESS_DATA<----consume : :
> + * : : \ \ : :
> + * : : \ \ : :
> + * : : \ \ : :
> + * : : ring_full----->WAIT_DATA : :
> + * : : : :
> + * : :......................................................: :
> + * :............................................................:
this will not render correctly (missing extra indent, RESTORE_WIP below is fine)
> + *
> + * For the full state machine view, see `The VF state machine`_.
> + */
> static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> {
> - pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
> + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
> + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
> + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA);
> + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
> + }
> }
>
> static void pf_enter_vf_saved(struct xe_gt *gt, unsigned int vfid)
> @@ -821,12 +869,39 @@ static void pf_enter_vf_saved(struct xe_gt *gt, unsigned int vfid)
> pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> }
>
> +static void pf_enter_vf_save_failed(struct xe_gt *gt, unsigned int vfid)
> +{
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
> + pf_exit_vf_wip(gt, vfid);
> +}
> +
> +static int pf_handle_vf_save_data(struct xe_gt *gt, unsigned int vfid)
> +{
> + return 0;
> +}
> +
> static bool pf_handle_vf_save(struct xe_gt *gt, unsigned int vfid)
> {
> - if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP))
> + int ret;
> +
> + if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA))
> return false;
>
> - pf_enter_vf_saved(gt, vfid);
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA);
this seems to be done too early
> + if (xe_gt_sriov_pf_migration_ring_full(gt, vfid)) {
you should enter(WAIT_DATA) here
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
and don't re-enter(PROCESS_DATA) as we shouldn't be in both sub-states at the same time
transition from WAIT to PROCESS shall be done in
pf_exit_vf_wait(gt, vf)
{
if (exit(WAIT))
enter(PROCESS_DATA)
queue
}
called from xe_gt_sriov_pf_control_process_save_data()
> +
> + return true;
> + }
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA);
> +
> + ret = pf_handle_vf_save_data(gt, vfid);
> + if (ret == -EAGAIN)
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
> + else if (ret)
> + pf_enter_vf_save_failed(gt, vfid);
> + else
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
>
> return true;
> }
> @@ -834,6 +909,7 @@ static bool pf_handle_vf_save(struct xe_gt *gt, unsigned int vfid)
> static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> {
> if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
> pf_enter_vf_wip(gt, vfid);
> pf_queue_vf(gt, vfid);
> return true;
> @@ -842,6 +918,36 @@ static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> return false;
> }
>
> +/**
> + * xe_gt_sriov_pf_control_check_save_data_done() - Check if all save migration data was produced.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: true if all save migration data was produced, false otherwise.
> + */
> +bool xe_gt_sriov_pf_control_check_save_data_done(struct xe_gt *gt, unsigned int vfid)
> +{
> + return pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
> +}
> +
> +/**
> + * xe_gt_sriov_pf_control_process_save_data() - Queue VF save migration data processing.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + */
> +void xe_gt_sriov_pf_control_process_save_data(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (xe_gt_sriov_pf_control_check_save_data_done(gt, vfid))
> + return;
> +
> + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA))
> + pf_queue_vf(gt, vfid);
this should be wrapped into:
exit_vf_wait_data()
where actual transition to PROCESS will happen
> +}
> +
> /**
> * xe_gt_sriov_pf_control_trigger_save_vf() - Start an SR-IOV VF migration data save sequence.
> * @gt: the &xe_gt
> @@ -887,19 +993,62 @@ int xe_gt_sriov_pf_control_trigger_save_vf(struct xe_gt *gt, unsigned int vfid)
> */
> int xe_gt_sriov_pf_control_finish_save_vf(struct xe_gt *gt, unsigned int vfid)
> {
> - if (!pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED)) {
> - pf_enter_vf_mismatch(gt, vfid);
> + if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE)) {
> + xe_gt_sriov_err(gt, "VF%u save is still in progress!\n", vfid);
> return -EIO;
> }
>
> pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
> + pf_enter_vf_saved(gt, vfid);
>
> return 0;
> }
>
> +/**
> + * DOC: The VF RESTORE state machine
> + *
> + * RESTORE extends the PAUSED state.
> + *
> + * The VF RESTORE state machine looks like::
> + *
> + * ....PAUSED....................................................
> + * : :
> + * : (O)<---------o :
> + * : | \ :
> + * : restore (RESTORED) (RESTORE_FAILED) :
> + * : | ^ ^ :
> + * : | | | :
> + * : ....V...............o...........o......RESTORE_WIP...... :
> + * : : | | | : :
> + * : : | empty | : :
> + * : : | | | : :
> + * : : | | | : :
> + * : : | DATA_DONE | : :
> + * : : | ^ | : :
> + * : : | | error : :
> + * : : | trailer / : :
> + * : : | / / : :
> + * : : | / / : :
> + * : : | / / : :
> + * : : o---------->PROCESS_DATA<----produce : :
> + * : : \ \ : :
> + * : : \ \ : :
> + * : : \ \ : :
> + * : : ring_empty---->WAIT_DATA : :
> + * : : : :
> + * : :......................................................: :
> + * :............................................................:
> + *
> + * For the full state machine view, see `The VF state machine`_.
> + */
> static void pf_exit_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> {
> - pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP);
> + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA);
> + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA);
> + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_DATA_DONE);
> + }
> }
>
> static void pf_enter_vf_restored(struct xe_gt *gt, unsigned int vfid)
> @@ -914,12 +1063,50 @@ static void pf_enter_vf_restored(struct xe_gt *gt, unsigned int vfid)
> pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> }
>
> +static void pf_enter_vf_restore_failed(struct xe_gt *gt, unsigned int vfid)
> +{
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED);
> + pf_exit_vf_wip(gt, vfid);
> +}
> +
> +static int
no need to split the line
> +pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid)
> +{
> + struct xe_sriov_migration_data *data = xe_gt_sriov_pf_migration_restore_consume(gt, vfid);
> +
> + xe_gt_assert(gt, data);
> +
> + xe_gt_sriov_notice(gt, "Skipping VF%u unknown data type: %d\n", vfid, data->type);
> +
> + return 0;
> +}
> +
> static bool pf_handle_vf_restore(struct xe_gt *gt, unsigned int vfid)
> {
> - if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
> + int ret;
> +
> + if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA))
> return false;
>
> - pf_enter_vf_restored(gt, vfid);
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA);
maybe you shouldn't enter(WAIT_DATA) here
> + if (xe_gt_sriov_pf_migration_ring_empty(gt, vfid)) {
but here
> + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_DATA_DONE)) {
hmm, there should be no direct transition from WAIT_DATA to DONE
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA);
> + pf_enter_vf_restored(gt, vfid);
> +
> + return true;
> + }
or just here
> +
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA);
and transition back to PROCESS only on exit(WAIT) called below
> + return true;
> + }
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA);
> +
> + ret = pf_handle_vf_restore_data(gt, vfid);
> + if (ret)
> + pf_enter_vf_restore_failed(gt, vfid);
> + else
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA);
>
> return true;
> }
> @@ -927,6 +1114,7 @@ static bool pf_handle_vf_restore(struct xe_gt *gt, unsigned int vfid)
> static bool pf_enter_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> {
> if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA);
> pf_enter_vf_wip(gt, vfid);
> pf_queue_vf(gt, vfid);
> return true;
> @@ -935,6 +1123,41 @@ static bool pf_enter_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> return false;
> }
>
> +/**
> + * xe_gt_sriov_pf_control_restore_data_done() - Indicate the end of VF migration data stream.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_control_restore_data_done(struct xe_gt *gt, unsigned int vfid)
> +{
shouldn't we have additional state checks here?
expect(RESTORE_WIP)
expect(RESTORE_PROCESS_DATA) ?
this one below just looks for a one-time entry, but can we really enter at any time?
> + if (!pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_DATA_DONE)) {
> + pf_enter_vf_state_machine_bug(gt, vfid);
> + return -EIO;
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_control_process_restore_data() - Queue VF restore migration data processing.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + */
> +void xe_gt_sriov_pf_control_process_restore_data(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (!pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
> + pf_enter_vf_state_machine_bug(gt, vfid);
> +
> + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA))
> + pf_queue_vf(gt, vfid);
IMO the transition to PROCESS shall be also done as part of exit(WAIT_DATA)
> +}
> +
> /**
> * xe_gt_sriov_pf_control_trigger_restore_vf() - Start an SR-IOV VF migration data restore sequence.
> * @gt: the &xe_gt
> @@ -1000,11 +1223,9 @@ int xe_gt_sriov_pf_control_finish_restore_vf(struct xe_gt *gt, unsigned int vfid
> {
> int ret;
>
> - if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> - ret = pf_wait_vf_restore_done(gt, vfid);
> - if (ret)
> - return ret;
> - }
> + ret = pf_wait_vf_restore_done(gt, vfid);
> + if (ret)
> + return ret;
>
> if (!pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED)) {
> pf_enter_vf_mismatch(gt, vfid);
> @@ -1703,9 +1924,21 @@ static bool pf_process_vf_state_machine(struct xe_gt *gt, unsigned int vfid)
> if (pf_exit_vf_pause_save_guc(gt, vfid))
> return true;
>
> + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA)) {
> + xe_gt_sriov_dbg_verbose(gt, "VF%u in %s\n", vfid,
> + control_bit_to_string(XE_GT_SRIOV_STATE_SAVE_WAIT_DATA));
> + return false;
> + }
> +
> if (pf_handle_vf_save(gt, vfid))
> return true;
>
> + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA)) {
> + xe_gt_sriov_dbg_verbose(gt, "VF%u in %s\n", vfid,
> + control_bit_to_string(XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA));
> + return false;
> + }
> +
> if (pf_handle_vf_restore(gt, vfid))
> return true;
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> index abc233f6302ed..6b1ab339e3b73 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> @@ -14,12 +14,14 @@ struct xe_gt;
> int xe_gt_sriov_pf_control_init(struct xe_gt *gt);
> void xe_gt_sriov_pf_control_restart(struct xe_gt *gt);
>
> -bool xe_gt_sriov_pf_control_check_vf_data_wip(struct xe_gt *gt, unsigned int vfid);
> -
> int xe_gt_sriov_pf_control_pause_vf(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid);
> +bool xe_gt_sriov_pf_control_check_save_data_done(struct xe_gt *gt, unsigned int vfid);
> +void xe_gt_sriov_pf_control_process_save_data(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_control_trigger_save_vf(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_control_finish_save_vf(struct xe_gt *gt, unsigned int vfid);
> +int xe_gt_sriov_pf_control_restore_data_done(struct xe_gt *gt, unsigned int vfid);
> +void xe_gt_sriov_pf_control_process_restore_data(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_control_trigger_restore_vf(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_control_finish_restore_vf(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_control_stop_vf(struct xe_gt *gt, unsigned int vfid);
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> index e113dc98b33ce..6e19a8ea88f0b 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> @@ -32,9 +32,15 @@
> * @XE_GT_SRIOV_STATE_PAUSE_FAILED: indicates that a VF pause operation has failed.
> * @XE_GT_SRIOV_STATE_PAUSED: indicates that the VF is paused.
> * @XE_GT_SRIOV_STATE_SAVE_WIP: indicates that VF save operation is in progress.
> + * @XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA: indicates that VF migration data is being produced.
> + * @XE_GT_SRIOV_STATE_SAVE_WAIT_DATA: indicates that the PF is waiting for space in the migration data ring.
> + * @XE_GT_SRIOV_STATE_SAVE_DATA_DONE: indicates that all migration data was produced by Xe.
> * @XE_GT_SRIOV_STATE_SAVE_FAILED: indicates that VF save operation has failed.
> * @XE_GT_SRIOV_STATE_SAVED: indicates that VF data is saved.
> * @XE_GT_SRIOV_STATE_RESTORE_WIP: indicates that VF restore operation is in progress.
> + * @XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA: indicates that VF migration data is being consumed.
> + * @XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA: indicates that the PF is waiting for data in the migration data ring.
> + * @XE_GT_SRIOV_STATE_RESTORE_DATA_DONE: indicates that all migration data was produced by the user.
> * @XE_GT_SRIOV_STATE_RESTORE_FAILED: indicates that VF restore operation has failed.
> * @XE_GT_SRIOV_STATE_RESTORED: indicates that VF data is restored.
> * @XE_GT_SRIOV_STATE_RESUME_WIP: indicates that a VF resume operation is in progress.
> @@ -70,10 +76,16 @@ enum xe_gt_sriov_control_bits {
> XE_GT_SRIOV_STATE_PAUSED,
>
> XE_GT_SRIOV_STATE_SAVE_WIP,
> + XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA,
> + XE_GT_SRIOV_STATE_SAVE_WAIT_DATA,
> + XE_GT_SRIOV_STATE_SAVE_DATA_DONE,
> XE_GT_SRIOV_STATE_SAVE_FAILED,
> XE_GT_SRIOV_STATE_SAVED,
>
> XE_GT_SRIOV_STATE_RESTORE_WIP,
> + XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA,
> + XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA,
> + XE_GT_SRIOV_STATE_RESTORE_DATA_DONE,
> XE_GT_SRIOV_STATE_RESTORE_FAILED,
> XE_GT_SRIOV_STATE_RESTORED,
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> index ca28f45aaf481..b6ffd982d6007 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> @@ -7,6 +7,7 @@
>
> #include "abi/guc_actions_sriov_abi.h"
> #include "xe_bo.h"
> +#include "xe_gt_sriov_pf_control.h"
> #include "xe_gt_sriov_pf_helpers.h"
> #include "xe_gt_sriov_pf_migration.h"
> #include "xe_gt_sriov_printk.h"
> @@ -15,6 +16,17 @@
> #include "xe_sriov.h"
> #include "xe_sriov_pf_migration.h"
>
> +#define XE_GT_SRIOV_PF_MIGRATION_RING_SIZE 5
> +
> +static struct xe_gt_sriov_migration_data *pf_pick_gt_migration(struct xe_gt *gt, unsigned int vfid)
> +{
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> + xe_gt_assert(gt, vfid != PFID);
> + xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
> +
> + return &gt->sriov.pf.vfs[vfid].migration;
> +}
> +
> /* Return: number of dwords saved/restored/required or a negative error code on failure */
> static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
> u64 addr, u32 ndwords)
> @@ -382,6 +394,162 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
> }
> #endif /* CONFIG_DEBUG_FS */
>
> +/**
> + * xe_gt_sriov_pf_migration_ring_empty() - Check if a migration ring is empty.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
> + *
> + * Return: true if the ring is empty, otherwise false.
> + */
> +bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid)
> +{
> + return ptr_ring_empty(&pf_pick_gt_migration(gt, vfid)->ring);
> +}
> +
> +/**
> + * xe_gt_sriov_pf_migration_ring_full() - Check if a migration ring is full.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
> + *
> + * Return: true if the ring is full, otherwise false.
> + */
> +bool xe_gt_sriov_pf_migration_ring_full(struct xe_gt *gt, unsigned int vfid)
> +{
> + return ptr_ring_full(&pf_pick_gt_migration(gt, vfid)->ring);
> +}
> +
> +/**
> + * xe_gt_sriov_pf_migration_save_produce() - Add VF save data packet to migration ring.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
> + * @data: &xe_sriov_migration_data packet
> + *
> + * Called by the save migration data producer (PF SR-IOV Control worker) when
> + * processing migration data.
> + * Wakes up the save migration data consumer (userspace), which may be
> + * waiting for data when the ring is empty.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_migration_save_produce(struct xe_gt *gt, unsigned int vfid,
> + struct xe_sriov_migration_data *data)
> +{
> + int ret;
> +
> + ret = ptr_ring_produce(&pf_pick_gt_migration(gt, vfid)->ring, data);
> + if (ret)
> + return ret;
> +
> + wake_up_all(xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid));
> +
> + return 0;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_migration_restore_consume() - Get VF restore data packet from migration ring.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
> + *
> + * Called by the restore migration data consumer (PF SR-IOV Control worker) when
> + * processing migration data.
> + * Wakes up the restore migration data producer (userspace), which may be
> + * waiting to add more data when the ring is full.
> + *
> + * Return: Pointer to &struct xe_sriov_migration_data on success,
> + * NULL if ring is empty.
> + */
> +struct xe_sriov_migration_data *
> +xe_gt_sriov_pf_migration_restore_consume(struct xe_gt *gt, unsigned int vfid)
> +{
> + struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, vfid);
> + struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
> + struct xe_sriov_migration_data *data;
> +
> + data = ptr_ring_consume(&migration->ring);
> + if (data)
> + wake_up_all(wq);
> +
> + return data;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_migration_restore_produce() - Add VF restore data packet to migration ring.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
> + * @data: &xe_sriov_migration_data packet
> + *
> + * Called by the restore migration data producer (userspace) when processing
> + * migration data.
> + * If the ring is full, waits until there is space.
> + * Queues the restore migration data consumer (PF SR-IOV Control worker),
> + * which may be waiting for data when the ring is empty.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_migration_restore_produce(struct xe_gt *gt, unsigned int vfid,
> + struct xe_sriov_migration_data *data)
> +{
> + struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
> + struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, vfid);
> + int ret;
> +
> + xe_gt_assert(gt, data->tile == gt->tile->id);
> + xe_gt_assert(gt, data->gt == gt->info.id);
> +
> + while (1) {
or for (;;)
> + ret = ptr_ring_produce(&migration->ring, data);
> + if (!ret)
> + break;
> +
> + ret = wait_event_interruptible(*wq, !ptr_ring_full(&migration->ring));
> + if (ret)
> + return ret;
> + }
> +
> + xe_gt_sriov_pf_control_process_restore_data(gt, vfid);
> +
> + return 0;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_migration_save_consume() - Get VF save data packet from migration ring.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
> + *
> + * Called by the save migration data consumer (userspace) when
> + * processing migration data.
> + * Queues the save migration data producer (PF SR-IOV Control worker),
> + * which may be waiting to add more data when the ring is full.
> + *
> + * Return: Pointer to &struct xe_sriov_migration_data on success,
> + * NULL if ring is empty and there's no more data available,
> + *         ERR_PTR(-EAGAIN) if the ring is empty but more data is still being produced.
> + */
> +struct xe_sriov_migration_data *
> +xe_gt_sriov_pf_migration_save_consume(struct xe_gt *gt, unsigned int vfid)
> +{
> + struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, vfid);
> + struct xe_sriov_migration_data *data;
> +
> + data = ptr_ring_consume(&migration->ring);
> + if (data) {
> + xe_gt_sriov_pf_control_process_save_data(gt, vfid);
> + return data;
> + }
> +
> + if (xe_gt_sriov_pf_control_check_save_data_done(gt, vfid))
> + return NULL;
> +
> + return ERR_PTR(-EAGAIN);
> +}
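(An aside for readers following the state machine: the tri-state return of the
save-consume path above — a packet, NULL once the save is done, or
ERR_PTR(-EAGAIN) while the producer is still running — can be modeled in plain
userspace C. All names below are illustrative, not part of the patch.)

```c
#include <assert.h>
#include <stddef.h>
#include <errno.h>

/* Userspace stand-in for the kernel's ERR_PTR(-EAGAIN). */
#define MODEL_EAGAIN_PTR ((void *)(long)-EAGAIN)

/*
 * Model of the GT-level save-consume decision:
 * - a queued packet wins,
 * - otherwise NULL once the producer reported "save data done",
 * - otherwise -EAGAIN: ring momentarily empty, more data coming.
 */
void *model_save_consume(void *queued_packet, int save_data_done)
{
	if (queued_packet)
		return queued_packet;
	if (save_data_done)
		return NULL;
	return MODEL_EAGAIN_PTR;
}
```

The same three outcomes are what userspace later maps onto the VFIO
migration-data read loop.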
> +
> +static void action_ring_cleanup(struct drm_device *dev, void *arg)
> +{
> + struct ptr_ring *r = arg;
> +
> + ptr_ring_cleanup(r, NULL);
> +}
> +
> /**
> * xe_gt_sriov_pf_migration_init() - Initialize support for VF migration.
> * @gt: the &xe_gt
> @@ -393,6 +561,7 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
> int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
> {
> struct xe_device *xe = gt_to_xe(gt);
> + unsigned int n, totalvfs;
> int err;
>
> xe_gt_assert(gt, IS_SRIOV_PF(xe));
> @@ -404,5 +573,19 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
> if (err)
> return err;
>
> + totalvfs = xe_sriov_pf_get_totalvfs(xe);
> + for (n = 1; n <= totalvfs; n++) {
> + struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, n);
> +
> + err = ptr_ring_init(&migration->ring,
> + XE_GT_SRIOV_PF_MIGRATION_RING_SIZE, GFP_KERNEL);
> + if (err)
> + return err;
> +
> + err = drmm_add_action_or_reset(&xe->drm, action_ring_cleanup, &migration->ring);
should we wait until drmm cleanup or devm cleanup ?
> + if (err)
> + return err;
> + }
> +
> return 0;
> }
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> index 09faeae00ddbb..9e67f18ded205 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> @@ -9,11 +9,25 @@
> #include <linux/types.h>
>
> struct xe_gt;
> +struct xe_sriov_migration_data;
>
> int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
> int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
>
> +bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid);
> +bool xe_gt_sriov_pf_migration_ring_full(struct xe_gt *gt, unsigned int vfid);
> +
> +int xe_gt_sriov_pf_migration_save_produce(struct xe_gt *gt, unsigned int vfid,
> + struct xe_sriov_migration_data *data);
> +struct xe_sriov_migration_data *
> +xe_gt_sriov_pf_migration_restore_consume(struct xe_gt *gt, unsigned int vfid);
> +
> +int xe_gt_sriov_pf_migration_restore_produce(struct xe_gt *gt, unsigned int vfid,
> + struct xe_sriov_migration_data *data);
> +struct xe_sriov_migration_data *
> +xe_gt_sriov_pf_migration_save_consume(struct xe_gt *gt, unsigned int vfid);
> +
> #ifdef CONFIG_DEBUG_FS
> ssize_t xe_gt_sriov_pf_migration_read_guc_state(struct xe_gt *gt, unsigned int vfid,
> char __user *buf, size_t count, loff_t *pos);
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> index 9d672feac5f04..84be6fac16c8b 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> @@ -7,6 +7,7 @@
> #define _XE_GT_SRIOV_PF_MIGRATION_TYPES_H_
>
> #include <linux/mutex.h>
> +#include <linux/ptr_ring.h>
> #include <linux/types.h>
>
> /**
> @@ -24,6 +25,16 @@ struct xe_gt_sriov_state_snapshot {
> } guc;
> };
>
> +/**
> + * struct xe_gt_sriov_migration_data - GT-level per-VF migration data.
> + *
> + * Used by the PF driver to maintain per-VF migration data.
> + */
> +struct xe_gt_sriov_migration_data {
> + /** @ring: queue containing VF save / restore migration data */
> + struct ptr_ring ring;
> +};
> +
> /**
> * struct xe_gt_sriov_pf_migration - GT-level data.
> *
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> index a64a6835ad656..812e74d3f8f80 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> @@ -33,6 +33,9 @@ struct xe_gt_sriov_metadata {
>
> /** @snapshot: snapshot of the VF state data */
> struct xe_gt_sriov_state_snapshot snapshot;
> +
> + /** @migration: per-VF migration data. */
> + struct xe_gt_sriov_migration_data migration;
> };
>
> /**
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> index 8c523c392f98b..eaf581317bdef 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> @@ -3,8 +3,36 @@
> * Copyright © 2025 Intel Corporation
> */
>
> +#include <drm/drm_managed.h>
> +
> +#include "xe_device.h"
> +#include "xe_gt_sriov_pf_control.h"
> +#include "xe_gt_sriov_pf_migration.h"
> +#include "xe_pm.h"
> #include "xe_sriov.h"
> +#include "xe_sriov_pf_helpers.h"
> #include "xe_sriov_pf_migration.h"
> +#include "xe_sriov_printk.h"
> +
> +static struct xe_sriov_pf_migration *pf_pick_migration(struct xe_device *xe, unsigned int vfid)
> +{
> + xe_assert(xe, IS_SRIOV_PF(xe));
> + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> +
> + return &xe->sriov.pf.vfs[vfid].migration;
> +}
> +
> +/**
> + * xe_sriov_pf_migration_waitqueue() - Get waitqueue for migration.
> + * @xe: the &xe_device
> + * @vfid: the VF identifier
> + *
> + * Return: pointer to the migration waitqueue.
> + */
> +wait_queue_head_t *xe_sriov_pf_migration_waitqueue(struct xe_device *xe, unsigned int vfid)
> +{
> + return &pf_pick_migration(xe, vfid)->wq;
> +}
>
> /**
> * xe_sriov_pf_migration_supported() - Check if SR-IOV VF migration is supported by the device
> @@ -33,9 +61,124 @@ static bool pf_check_migration_support(struct xe_device *xe)
> */
> int xe_sriov_pf_migration_init(struct xe_device *xe)
> {
> + unsigned int n, totalvfs;
> +
> xe_assert(xe, IS_SRIOV_PF(xe));
>
> xe->sriov.pf.migration.supported = pf_check_migration_support(xe);
> + if (!xe_sriov_pf_migration_supported(xe))
> + return 0;
> +
> + totalvfs = xe_sriov_pf_get_totalvfs(xe);
> + for (n = 1; n <= totalvfs; n++) {
> + struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, n);
> +
> + init_waitqueue_head(&migration->wq);
> + }
>
> return 0;
> }
> +
> +static bool pf_migration_data_ready(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_gt *gt;
> + u8 gt_id;
> +
> + for_each_gt(gt, xe, gt_id) {
> + if (!xe_gt_sriov_pf_migration_ring_empty(gt, vfid) ||
> + xe_gt_sriov_pf_control_check_save_data_done(gt, vfid))
> + return true;
> + }
> +
> + return false;
> +}
> +
> +static struct xe_sriov_migration_data *
> +pf_migration_consume(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_sriov_migration_data *data;
> + struct xe_gt *gt;
> + u8 gt_id;
> + bool more_data = false;
> +
> + for_each_gt(gt, xe, gt_id) {
> + data = xe_gt_sriov_pf_migration_save_consume(gt, vfid);
> +		if (data && PTR_ERR(data) != -EAGAIN)
> + return data;
> + if (PTR_ERR(data) == -EAGAIN)
> + more_data = true;
> + }
> +
> + if (!more_data)
> + return NULL;
> +
> + return ERR_PTR(-EAGAIN);
> +}
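(Side note: the device-level fold over all GTs above — first real packet wins,
any GT still producing keeps the stream alive with -EAGAIN, and only all-done
ends it with NULL — is easy to model outside the kernel. `gt_results[]` below
stands in for the per-GT consume results gathered by `for_each_gt()`; the names
are illustrative.)

```c
#include <assert.h>
#include <stddef.h>
#include <errno.h>

#define MODEL_EAGAIN_PTR ((void *)(long)-EAGAIN)

/*
 * Fold per-GT consume results, mirroring pf_migration_consume():
 * return the first packet found; if none, -EAGAIN when any GT is
 * still producing; NULL only when every GT is done.
 */
void *model_device_consume(void **gt_results, int num_gt)
{
	int more_data = 0;
	int i;

	for (i = 0; i < num_gt; i++) {
		if (gt_results[i] && gt_results[i] != MODEL_EAGAIN_PTR)
			return gt_results[i];
		if (gt_results[i] == MODEL_EAGAIN_PTR)
			more_data = 1;
	}

	return more_data ? MODEL_EAGAIN_PTR : NULL;
}
```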
> +
> +/**
> + * xe_sriov_pf_migration_save_consume() - Consume a VF migration data packet from the device.
> + * @xe: the &xe_device
> + * @vfid: the VF identifier
> + *
> + * Called by the save migration data consumer (userspace) when
> + * processing migration data.
> + * If there is no migration data to process, wait until more data is available.
> + *
> + * Return: Pointer to &xe_sriov_migration_data on success,
> + * NULL if ring is empty and no more migration data is expected,
> + * ERR_PTR value in case of error.
> + */
> +struct xe_sriov_migration_data *
> +xe_sriov_pf_migration_save_consume(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, vfid);
> + struct xe_sriov_migration_data *data;
> + int ret;
> +
> + xe_assert(xe, IS_SRIOV_PF(xe));
> +
> + while (1) {
> + data = pf_migration_consume(xe, vfid);
> + if (PTR_ERR(data) != -EAGAIN)
> + goto out;
just
break; ?
> +
> + ret = wait_event_interruptible(migration->wq,
> + pf_migration_data_ready(xe, vfid));
> + if (ret)
> + return ERR_PTR(ret);
> + }
> +
> +out:
> + return data;
> +}
> +
> +/**
> + * xe_sriov_pf_migration_restore_produce() - Produce a VF migration data packet to the device.
> + * @xe: the &xe_device
> + * @vfid: the VF identifier
> + * @data: Pointer to &xe_sriov_migration_data
> + *
> + * Called by the restore migration data producer (userspace) when processing
> + * migration data.
> + * If the underlying data structure is full, wait until there is space.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_pf_migration_restore_produce(struct xe_device *xe, unsigned int vfid,
> + struct xe_sriov_migration_data *data)
> +{
> + struct xe_gt *gt;
> +
> + xe_assert(xe, IS_SRIOV_PF(xe));
> +
> + gt = xe_device_get_gt(xe, data->gt);
> + if (!gt || data->tile != gt->tile->id) {
> + xe_sriov_err_ratelimited(xe, "VF%d Invalid GT - tile:%u, GT:%u\n",
> + vfid, data->tile, data->gt);
> + return -EINVAL;
> + }
> +
> + return xe_gt_sriov_pf_migration_restore_produce(gt, vfid, data);
> +}
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> index d2b4a24165438..df81a540c246a 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> @@ -7,10 +7,17 @@
> #define _XE_SRIOV_PF_MIGRATION_H_
>
> #include <linux/types.h>
> +#include <linux/wait.h>
>
> struct xe_device;
> +struct xe_sriov_migration_data;
>
> int xe_sriov_pf_migration_init(struct xe_device *xe);
> bool xe_sriov_pf_migration_supported(struct xe_device *xe);
> +int xe_sriov_pf_migration_restore_produce(struct xe_device *xe, unsigned int vfid,
> + struct xe_sriov_migration_data *data);
> +struct xe_sriov_migration_data *
> +xe_sriov_pf_migration_save_consume(struct xe_device *xe, unsigned int vfid);
> +wait_queue_head_t *xe_sriov_pf_migration_waitqueue(struct xe_device *xe, unsigned int vfid);
>
> #endif
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> index e69de29bb2d1d..2a45ee4e3ece8 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> @@ -0,0 +1,58 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_SRIOV_PF_MIGRATION_TYPES_H_
> +#define _XE_SRIOV_PF_MIGRATION_TYPES_H_
> +
> +#include <linux/types.h>
> +#include <linux/wait.h>
> +
> +/**
> + * struct xe_sriov_migration_data - Xe SR-IOV VF migration data packet
> + */
> +struct xe_sriov_migration_data {
> + /** @xe: Xe device */
> + struct xe_device *xe;
> + /** @vaddr: CPU pointer to payload data */
> + void *vaddr;
> + /** @remaining: payload data remaining */
> + size_t remaining;
> + /** @hdr_remaining: header data remaining */
> + size_t hdr_remaining;
> + union {
> + /** @bo: Buffer object with migration data */
> + struct xe_bo *bo;
> + /** @buff: Buffer with migration data */
> + void *buff;
> + };
> + __struct_group(xe_sriov_pf_migration_hdr, hdr, __packed,
> + /** @hdr.version: migration data protocol version */
> + u8 version;
> + /** @hdr.type: migration data type */
> + u8 type;
> + /** @hdr.tile: migration data tile id */
> + u8 tile;
> + /** @hdr.gt: migration data gt id */
> + u8 gt;
> + /** @hdr.flags: migration data flags */
> + u32 flags;
> + /** @hdr.offset: offset into the resource;
> + * used when multiple packets of given type are used for migration
> + */
> + u64 offset;
> + /** @hdr.size: migration data size */
> + u64 size;
> + );
> +};
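(For readers unfamiliar with `__struct_group()`: it declares the same fields
both inline and as a named sub-struct, so the wire header can be copied as one
unit. Assuming the field order in the hunk above, the packed header occupies
24 bytes. A plain-C equivalent, with illustrative names:)

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/*
 * Plain-C equivalent of the packed migration data header declared
 * via __struct_group() in the patch (field order as in the diff).
 */
struct model_migration_hdr {
	uint8_t version;	/* migration data protocol version */
	uint8_t type;		/* migration data type */
	uint8_t tile;		/* tile id */
	uint8_t gt;		/* gt id */
	uint32_t flags;		/* migration data flags */
	uint64_t offset;	/* offset into the resource */
	uint64_t size;		/* payload size */
} __attribute__((packed));
```

With this layout `flags` lands at offset 4, `offset` at 8, `size` at 16, for a
24-byte header — which is what `hdr_remaining = sizeof(data->hdr)` counts down
on the save side.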
> +
> +/**
> + * struct xe_sriov_pf_migration - Per VF device-level migration related data
> + */
> +struct xe_sriov_pf_migration {
> + /** @wq: waitqueue used to avoid busy-waiting for snapshot production/consumption */
> + wait_queue_head_t wq;
> +};
> +
> +#endif
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> index 24d22afeececa..c92baaa1694ca 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> @@ -9,6 +9,7 @@
> #include <linux/mutex.h>
> #include <linux/types.h>
>
> +#include "xe_sriov_pf_migration_types.h"
> #include "xe_sriov_pf_provision_types.h"
> #include "xe_sriov_pf_service_types.h"
>
> @@ -18,6 +19,8 @@
> struct xe_sriov_metadata {
> /** @version: negotiated VF/PF ABI version */
> struct xe_sriov_pf_service_version version;
> + /** @migration: migration data */
> + struct xe_sriov_pf_migration migration;
> };
>
> /**
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [PATCH v2 05/26] drm/xe/pf: Add helpers for migration data allocation / free
2025-10-21 22:41 ` [PATCH v2 05/26] drm/xe/pf: Add helpers for migration data allocation / free Michał Winiarski
@ 2025-10-22 22:18 ` Michal Wajdeczko
2025-10-27 12:47 ` Michał Winiarski
0 siblings, 1 reply; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-22 22:18 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> Now that it's possible to free the packets - connect the restore
> handling logic with the ring.
> The helpers will also be used in upcoming changes that will start producing
> migration data packets.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/Makefile | 1 +
> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 7 +
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 29 +++-
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 1 +
> drivers/gpu/drm/xe/xe_sriov_migration_data.c | 127 ++++++++++++++++++
> drivers/gpu/drm/xe/xe_sriov_migration_data.h | 31 +++++
> 6 files changed, 195 insertions(+), 1 deletion(-)
> create mode 100644 drivers/gpu/drm/xe/xe_sriov_migration_data.c
> create mode 100644 drivers/gpu/drm/xe/xe_sriov_migration_data.h
>
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index 89e5b26c27975..3d72db9e528e4 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -173,6 +173,7 @@ xe-$(CONFIG_PCI_IOV) += \
> xe_lmtt_2l.o \
> xe_lmtt_ml.o \
> xe_pci_sriov.o \
> + xe_sriov_migration_data.o \
> xe_sriov_pf.o \
> xe_sriov_pf_control.o \
> xe_sriov_pf_debugfs.o \
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> index cad73fdaee93c..dd9bc9c99f78c 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> @@ -18,6 +18,7 @@
> #include "xe_gt_sriov_printk.h"
> #include "xe_guc_ct.h"
> #include "xe_sriov.h"
> +#include "xe_sriov_migration_data.h"
> #include "xe_sriov_pf_control.h"
> #include "xe_sriov_pf_migration.h"
> #include "xe_sriov_pf_service.h"
> @@ -851,6 +852,8 @@ int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid)
> static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> {
> if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
> + xe_gt_sriov_pf_migration_ring_free(gt, vfid);
> +
> pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
> pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA);
> pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
> @@ -1045,6 +1048,8 @@ int xe_gt_sriov_pf_control_finish_save_vf(struct xe_gt *gt, unsigned int vfid)
> static void pf_exit_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> {
> if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> + xe_gt_sriov_pf_migration_ring_free(gt, vfid);
> +
> pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA);
> pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA);
> pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_DATA_DONE);
> @@ -1078,6 +1083,8 @@ pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid)
>
> xe_gt_sriov_notice(gt, "Skipping VF%u unknown data type: %d\n", vfid, data->type);
>
> + xe_sriov_migration_data_free(data);
> +
> return 0;
> }
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> index b6ffd982d6007..8ba72165759b3 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> @@ -14,6 +14,7 @@
> #include "xe_guc.h"
> #include "xe_guc_ct.h"
> #include "xe_sriov.h"
> +#include "xe_sriov_migration_data.h"
> #include "xe_sriov_pf_migration.h"
>
> #define XE_GT_SRIOV_PF_MIGRATION_RING_SIZE 5
> @@ -418,6 +419,25 @@ bool xe_gt_sriov_pf_migration_ring_full(struct xe_gt *gt, unsigned int vfid)
> return ptr_ring_full(&pf_pick_gt_migration(gt, vfid)->ring);
> }
>
> +/**
> + * xe_gt_sriov_pf_migration_ring_free() - Consume and free all data in migration ring
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
> + */
> +void xe_gt_sriov_pf_migration_ring_free(struct xe_gt *gt, unsigned int vfid)
> +{
> + struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, vfid);
> + struct xe_sriov_migration_data *data;
> +
> + if (ptr_ring_empty(&migration->ring))
> + return;
> +
> + xe_gt_sriov_notice(gt, "VF%u unprocessed migration data left in the ring!\n", vfid);
> +
> + while ((data = ptr_ring_consume(&migration->ring)))
> + xe_sriov_migration_data_free(data);
> +}
> +
> /**
> * xe_gt_sriov_pf_migration_save_produce() - Add VF save data packet to migration ring.
> * @gt: the &xe_gt
> @@ -543,11 +563,18 @@ xe_gt_sriov_pf_migration_save_consume(struct xe_gt *gt, unsigned int vfid)
> return ERR_PTR(-EAGAIN);
> }
>
> +static void pf_mig_data_destroy(void *ptr)
> +{
> + struct xe_sriov_migration_data *data = ptr;
> +
> + xe_sriov_migration_data_free(data);
> +}
> +
> static void action_ring_cleanup(struct drm_device *dev, void *arg)
> {
> struct ptr_ring *r = arg;
>
> - ptr_ring_cleanup(r, NULL);
> + ptr_ring_cleanup(r, pf_mig_data_destroy);
> }
>
> /**
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> index 9e67f18ded205..1ed2248f0a17e 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> @@ -17,6 +17,7 @@ int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vf
>
> bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid);
> bool xe_gt_sriov_pf_migration_ring_full(struct xe_gt *gt, unsigned int vfid);
> +void xe_gt_sriov_pf_migration_ring_free(struct xe_gt *gt, unsigned int vfid);
>
> int xe_gt_sriov_pf_migration_save_produce(struct xe_gt *gt, unsigned int vfid,
> struct xe_sriov_migration_data *data);
> diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
> new file mode 100644
> index 0000000000000..b04f9be3b7fed
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
> @@ -0,0 +1,127 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#include "xe_bo.h"
> +#include "xe_device.h"
> +#include "xe_sriov_migration_data.h"
> +
> +static bool data_needs_bo(struct xe_sriov_migration_data *data)
> +{
> + return data->type == XE_SRIOV_MIGRATION_DATA_TYPE_VRAM;
> +}
> +
> +/**
> + * xe_sriov_migration_data_alloc() - Allocate migration data packet
> + * @xe: the &xe_device
> + *
> + * Only allocates the "outer" structure, without initializing the migration
> + * data backing storage.
> + *
> + * Return: Pointer to &xe_sriov_migration_data on success,
> + * NULL in case of error.
> + */
> +struct xe_sriov_migration_data *
no line split
> +xe_sriov_migration_data_alloc(struct xe_device *xe)
> +{
> + struct xe_sriov_migration_data *data;
> +
> + data = kzalloc(sizeof(*data), GFP_KERNEL);
> + if (!data)
> + return NULL;
> +
> + data->xe = xe;
> + data->hdr_remaining = sizeof(data->hdr);
> +
> + return data;
> +}
> +
> +/**
> + * xe_sriov_migration_data_free() - Free migration data packet.
> + * @data: the &xe_sriov_migration_data packet
> + */
> +void xe_sriov_migration_data_free(struct xe_sriov_migration_data *data)
> +{
> + if (data_needs_bo(data))
> + xe_bo_unpin_map_no_vm(data->bo);
> + else
> + kvfree(data->buff);
> +
> + kfree(data);
> +}
> +
> +static int mig_data_init(struct xe_sriov_migration_data *data)
> +{
> + struct xe_gt *gt = xe_device_get_gt(data->xe, data->gt);
> +
> + if (data->size == 0)
> + return 0;
> +
> + if (data_needs_bo(data)) {
struct xe_bo *bo;
then
bo = ...
so will not have that long line
> + struct xe_bo *bo = xe_bo_create_pin_map_novm(data->xe, gt->tile,
> + PAGE_ALIGN(data->size),
> + ttm_bo_type_kernel,
> + XE_BO_FLAG_SYSTEM | XE_BO_FLAG_PINNED,
> + false);
> + if (IS_ERR(bo))
> + return PTR_ERR(bo);
> +
> + data->bo = bo;
> + data->vaddr = bo->vmap.vaddr;
> + } else {
> + void *buff = kvzalloc(data->size, GFP_KERNEL);
> +
> + if (!buff)
> + return -ENOMEM;
> +
> + data->buff = buff;
> + data->vaddr = buff;
> + }
> +
> + return 0;
> +}
> +
> +#define XE_SRIOV_MIGRATION_DATA_SUPPORTED_VERSION 1
> +/**
> + * xe_sriov_migration_data_init() - Initialize the migration data header and backing storage.
> + * @data: the &xe_sriov_migration_data packet
> + * @tile_id: tile identifier
> + * @gt_id: GT identifier
> + * @type: &xe_sriov_migration_data_type
> + * @offset: offset of data packet payload (within wider resource)
> + * @size: size of data packet payload
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_migration_data_init(struct xe_sriov_migration_data *data, u8 tile_id, u8 gt_id,
> + enum xe_sriov_migration_data_type type, loff_t offset, size_t size)
> +{
> + data->version = XE_SRIOV_MIGRATION_DATA_SUPPORTED_VERSION;
> + data->type = type;
> + data->tile = tile_id;
> + data->gt = gt_id;
> + data->offset = offset;
> + data->size = size;
> + data->remaining = size;
> +
> + return mig_data_init(data);
> +}
> +
> +/**
> + * xe_sriov_migration_data_init_from_hdr() - Initialize the migration data backing storage based on header.
> + * @data: the &xe_sriov_migration_data packet
> + *
> + * Header data is expected to be filled prior to calling this function.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_migration_data_init_from_hdr(struct xe_sriov_migration_data *data)
> +{
> + if (data->version != XE_SRIOV_MIGRATION_DATA_SUPPORTED_VERSION)
> + return -EINVAL;
> +
> + data->remaining = data->size;
> +
> + return mig_data_init(data);
> +}
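(A small model of the restore-side contract above: the header is validated —
version gate first — before any backing storage is sized from it. Names below
are illustrative, not the patch's API.)

```c
#include <assert.h>
#include <stddef.h>
#include <errno.h>

#define MODEL_SUPPORTED_VERSION 1

struct model_packet {
	unsigned char version;	/* from the received header */
	size_t size;		/* payload size from the header */
	size_t remaining;	/* payload bytes still to transfer */
};

/*
 * Mirror xe_sriov_migration_data_init_from_hdr(): reject unknown
 * header versions before committing to allocate backing storage.
 */
int model_init_from_hdr(struct model_packet *p)
{
	if (p->version != MODEL_SUPPORTED_VERSION)
		return -EINVAL;

	p->remaining = p->size;
	return 0;
}
```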
> diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
> new file mode 100644
> index 0000000000000..ef65dccddc035
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
> @@ -0,0 +1,31 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_SRIOV_MIGRATION_DATA_H_
> +#define _XE_SRIOV_MIGRATION_DATA_H_
> +
> +#include <linux/types.h>
> +
> +struct xe_device;
> +
> +enum xe_sriov_migration_data_type {
> + /* Skipping 0 to catch uninitialized data */
> + XE_SRIOV_MIGRATION_DATA_TYPE_DESCRIPTOR = 1,
> + XE_SRIOV_MIGRATION_DATA_TYPE_TRAILER,
> + XE_SRIOV_MIGRATION_DATA_TYPE_GGTT,
> + XE_SRIOV_MIGRATION_DATA_TYPE_MMIO,
> + XE_SRIOV_MIGRATION_DATA_TYPE_GUC,
> + XE_SRIOV_MIGRATION_DATA_TYPE_VRAM,
> +};
> +
> +struct xe_sriov_migration_data *
no need for line split here
> +xe_sriov_migration_data_alloc(struct xe_device *xe);
> +void xe_sriov_migration_data_free(struct xe_sriov_migration_data *data);
> +
> +int xe_sriov_migration_data_init(struct xe_sriov_migration_data *data, u8 tile_id, u8 gt_id,
> +				 enum xe_sriov_migration_data_type type, loff_t offset, size_t size);
> +int xe_sriov_migration_data_init_from_hdr(struct xe_sriov_migration_data *data);
> +
> +#endif
just few nits, otherwise LGTM
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [PATCH v2 03/26] drm/xe/pf: Add save/restore control state stubs and connect to debugfs
2025-10-21 22:41 ` [PATCH v2 03/26] drm/xe/pf: Add save/restore control state stubs and connect to debugfs Michał Winiarski
@ 2025-10-22 22:31 ` Michal Wajdeczko
2025-10-27 12:02 ` Michał Winiarski
2025-10-28 3:06 ` Tian, Kevin
1 sibling, 1 reply; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-22 22:31 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> The states will be used by upcoming changes to produce (in case of save)
> or consume (in case of resume) the VF migration data.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 248 ++++++++++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h | 6 +
> .../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 14 +
> drivers/gpu/drm/xe/xe_sriov_pf_control.c | 96 +++++++
> drivers/gpu/drm/xe/xe_sriov_pf_control.h | 4 +
> drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 38 +++
> 6 files changed, 406 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> index 2e6bd3d1fe1da..b770916e88e53 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> @@ -184,6 +184,12 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
> CASE2STR(PAUSE_SAVE_GUC);
> CASE2STR(PAUSE_FAILED);
> CASE2STR(PAUSED);
> + CASE2STR(SAVE_WIP);
> + CASE2STR(SAVE_FAILED);
> + CASE2STR(SAVED);
> + CASE2STR(RESTORE_WIP);
> + CASE2STR(RESTORE_FAILED);
> + CASE2STR(RESTORED);
> CASE2STR(RESUME_WIP);
> CASE2STR(RESUME_SEND_RESUME);
> CASE2STR(RESUME_FAILED);
> @@ -208,6 +214,8 @@ static unsigned long pf_get_default_timeout(enum xe_gt_sriov_control_bits bit)
> case XE_GT_SRIOV_STATE_FLR_WIP:
> case XE_GT_SRIOV_STATE_FLR_RESET_CONFIG:
> return 5 * HZ;
> + case XE_GT_SRIOV_STATE_RESTORE_WIP:
> + return 20 * HZ;
> default:
> return HZ;
> }
> @@ -329,6 +337,8 @@ static void pf_exit_vf_mismatch(struct xe_gt *gt, unsigned int vfid)
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_FAILED);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUME_FAILED);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_FLR_FAILED);
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED);
> }
>
> #define pf_enter_vf_state_machine_bug(gt, vfid) ({ \
> @@ -359,6 +369,8 @@ static void pf_queue_vf(struct xe_gt *gt, unsigned int vfid)
>
> static void pf_exit_vf_flr_wip(struct xe_gt *gt, unsigned int vfid);
> static void pf_exit_vf_stop_wip(struct xe_gt *gt, unsigned int vfid);
> +static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid);
> +static void pf_exit_vf_restore_wip(struct xe_gt *gt, unsigned int vfid);
> static void pf_exit_vf_pause_wip(struct xe_gt *gt, unsigned int vfid);
> static void pf_exit_vf_resume_wip(struct xe_gt *gt, unsigned int vfid);
>
> @@ -380,6 +392,8 @@ static void pf_exit_vf_wip(struct xe_gt *gt, unsigned int vfid)
>
> pf_exit_vf_flr_wip(gt, vfid);
> pf_exit_vf_stop_wip(gt, vfid);
> + pf_exit_vf_save_wip(gt, vfid);
> + pf_exit_vf_restore_wip(gt, vfid);
> pf_exit_vf_pause_wip(gt, vfid);
> pf_exit_vf_resume_wip(gt, vfid);
>
> @@ -399,6 +413,8 @@ static void pf_enter_vf_ready(struct xe_gt *gt, unsigned int vfid)
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOPPED);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUMED);
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED);
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED);
> pf_exit_vf_mismatch(gt, vfid);
> pf_exit_vf_wip(gt, vfid);
> }
> @@ -675,6 +691,8 @@ static void pf_enter_vf_resumed(struct xe_gt *gt, unsigned int vfid)
> {
> pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUMED);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED);
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED);
> pf_exit_vf_mismatch(gt, vfid);
> pf_exit_vf_wip(gt, vfid);
> }
> @@ -753,6 +771,16 @@ int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid)
> return -EPERM;
> }
>
> + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
> + xe_gt_sriov_dbg(gt, "VF%u save is in progress!\n", vfid);
> + return -EBUSY;
> + }
> +
> + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> + xe_gt_sriov_dbg(gt, "VF%u restore is in progress!\n", vfid);
> + return -EBUSY;
> + }
> +
> if (!pf_enter_vf_resume_wip(gt, vfid)) {
> xe_gt_sriov_dbg(gt, "VF%u resume already in progress!\n", vfid);
> return -EALREADY;
> @@ -776,6 +804,218 @@ int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid)
> return -ECANCELED;
> }
>
> +static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> +{
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
> +}
> +
> +static void pf_enter_vf_saved(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (!pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED))
> + pf_enter_vf_state_machine_bug(gt, vfid);
> +
> + xe_gt_sriov_dbg(gt, "VF%u saved!\n", vfid);
nit: you could move the expect(PAUSED) check up here
> +
> + pf_exit_vf_mismatch(gt, vfid);
> + pf_exit_vf_wip(gt, vfid);
> + pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> +}
> +
> +static bool pf_handle_vf_save(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP))
> + return false;
> +
> + pf_enter_vf_saved(gt, vfid);
> +
> + return true;
> +}
> +
> +static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
> + pf_enter_vf_wip(gt, vfid);
> + pf_queue_vf(gt, vfid);
> + return true;
> + }
> +
> + return false;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_control_trigger_save_vf() - Start an SR-IOV VF migration data save sequence.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_control_trigger_save_vf(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOPPED)) {
> + xe_gt_sriov_dbg(gt, "VF%u is stopped!\n", vfid);
> + return -EPERM;
> + }
> +
> + if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED)) {
> + xe_gt_sriov_dbg(gt, "VF%u is not paused!\n", vfid);
> + return -EPERM;
> + }
> +
> + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> + xe_gt_sriov_dbg(gt, "VF%u restore is in progress!\n", vfid);
> + return -EBUSY;
> + }
> +
> + if (!pf_enter_vf_save_wip(gt, vfid)) {
> + xe_gt_sriov_dbg(gt, "VF%u save already in progress!\n", vfid);
> + return -EALREADY;
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_control_finish_save_vf() - Complete a VF migration data save sequence.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_control_finish_save_vf(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (!pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED)) {
> + pf_enter_vf_mismatch(gt, vfid);
> + return -EIO;
> + }
> +
> + pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> +
> + return 0;
> +}
> +
> +static void pf_exit_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> +{
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP);
> +}
> +
> +static void pf_enter_vf_restored(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (!pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED))
> + pf_enter_vf_state_machine_bug(gt, vfid);
> +
> + xe_gt_sriov_dbg(gt, "VF%u restored!\n", vfid);
> +
> + pf_exit_vf_mismatch(gt, vfid);
> + pf_exit_vf_wip(gt, vfid);
> + pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> +}
> +
> +static bool pf_handle_vf_restore(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
> + return false;
> +
> + pf_enter_vf_restored(gt, vfid);
> +
> + return true;
> +}
> +
> +static bool pf_enter_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> + pf_enter_vf_wip(gt, vfid);
> + pf_queue_vf(gt, vfid);
> + return true;
> + }
> +
> + return false;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_control_trigger_restore_vf() - Start an SR-IOV VF migration data restore sequence.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_control_trigger_restore_vf(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOPPED)) {
> + xe_gt_sriov_dbg(gt, "VF%u is stopped!\n", vfid);
> + return -EPERM;
> + }
> +
> + if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED)) {
> + xe_gt_sriov_dbg(gt, "VF%u is not paused!\n", vfid);
> + return -EPERM;
> + }
> +
> + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
> + xe_gt_sriov_dbg(gt, "VF%u save is in progress!\n", vfid);
> + return -EBUSY;
> + }
> +
> + if (!pf_enter_vf_restore_wip(gt, vfid)) {
> + xe_gt_sriov_dbg(gt, "VF%u restore already in progress!\n", vfid);
> + return -EALREADY;
> + }
> +
> + return 0;
> +}
> +
> +static int pf_wait_vf_restore_done(struct xe_gt *gt, unsigned int vfid)
> +{
> + unsigned long timeout = pf_get_default_timeout(XE_GT_SRIOV_STATE_RESTORE_WIP);
> + int err;
> +
> + err = pf_wait_vf_wip_done(gt, vfid, timeout);
> + if (err) {
> + xe_gt_sriov_notice(gt, "VF%u RESTORE didn't finish in %u ms (%pe)\n",
> + vfid, jiffies_to_msecs(timeout), ERR_PTR(err));
> + return err;
> + }
> +
> + if (!pf_expect_vf_not_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED))
> + return -EIO;
> +
> + return 0;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_control_finish_restore_vf() - Complete a VF migration data restore sequence.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_control_finish_restore_vf(struct xe_gt *gt, unsigned int vfid)
> +{
> + int ret;
> +
> + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> + ret = pf_wait_vf_restore_done(gt, vfid);
> + if (ret)
> + return ret;
> + }
> +
> + if (!pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED)) {
> + pf_enter_vf_mismatch(gt, vfid);
> + return -EIO;
> + }
> +
> + pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> +
> + return 0;
> +}
> +
> /**
> * DOC: The VF STOP state machine
> *
> @@ -817,6 +1057,8 @@ static void pf_enter_vf_stopped(struct xe_gt *gt, unsigned int vfid)
>
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUMED);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED);
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED);
> pf_exit_vf_mismatch(gt, vfid);
> pf_exit_vf_wip(gt, vfid);
> }
> @@ -1461,6 +1703,12 @@ static bool pf_process_vf_state_machine(struct xe_gt *gt, unsigned int vfid)
> if (pf_exit_vf_pause_save_guc(gt, vfid))
> return true;
>
> + if (pf_handle_vf_save(gt, vfid))
> + return true;
> +
> + if (pf_handle_vf_restore(gt, vfid))
> + return true;
> +
> if (pf_exit_vf_resume_send_resume(gt, vfid))
> return true;
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> index 8a72ef3778d47..abc233f6302ed 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> @@ -14,8 +14,14 @@ struct xe_gt;
> int xe_gt_sriov_pf_control_init(struct xe_gt *gt);
> void xe_gt_sriov_pf_control_restart(struct xe_gt *gt);
>
> +bool xe_gt_sriov_pf_control_check_vf_data_wip(struct xe_gt *gt, unsigned int vfid);
> +
> int xe_gt_sriov_pf_control_pause_vf(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid);
> +int xe_gt_sriov_pf_control_trigger_save_vf(struct xe_gt *gt, unsigned int vfid);
> +int xe_gt_sriov_pf_control_finish_save_vf(struct xe_gt *gt, unsigned int vfid);
> +int xe_gt_sriov_pf_control_trigger_restore_vf(struct xe_gt *gt, unsigned int vfid);
> +int xe_gt_sriov_pf_control_finish_restore_vf(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_control_stop_vf(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_control_trigger_flr(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_control_sync_flr(struct xe_gt *gt, unsigned int vfid, bool sync);
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> index c80b7e77f1ad2..e113dc98b33ce 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> @@ -31,6 +31,12 @@
> * @XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC: indicates that the PF needs to save the VF GuC state.
> * @XE_GT_SRIOV_STATE_PAUSE_FAILED: indicates that a VF pause operation has failed.
> * @XE_GT_SRIOV_STATE_PAUSED: indicates that the VF is paused.
> + * @XE_GT_SRIOV_STATE_SAVE_WIP: indicates that a VF save operation is in progress.
> + * @XE_GT_SRIOV_STATE_SAVE_FAILED: indicates that a VF save operation has failed.
> + * @XE_GT_SRIOV_STATE_SAVED: indicates that the VF data is saved.
> + * @XE_GT_SRIOV_STATE_RESTORE_WIP: indicates that a VF restore operation is in progress.
> + * @XE_GT_SRIOV_STATE_RESTORE_FAILED: indicates that a VF restore operation has failed.
> + * @XE_GT_SRIOV_STATE_RESTORED: indicates that the VF data is restored.
> + * @XE_GT_SRIOV_STATE_RESUME_WIP: indicates that a VF resume operation is in progress.
> * @XE_GT_SRIOV_STATE_RESUME_SEND_RESUME: indicates that the PF is about to send RESUME command.
> * @XE_GT_SRIOV_STATE_RESUME_FAILED: indicates that a VF resume operation has failed.
> @@ -63,6 +69,14 @@ enum xe_gt_sriov_control_bits {
> XE_GT_SRIOV_STATE_PAUSE_FAILED,
> XE_GT_SRIOV_STATE_PAUSED,
>
> + XE_GT_SRIOV_STATE_SAVE_WIP,
> + XE_GT_SRIOV_STATE_SAVE_FAILED,
> + XE_GT_SRIOV_STATE_SAVED,
> +
> + XE_GT_SRIOV_STATE_RESTORE_WIP,
> + XE_GT_SRIOV_STATE_RESTORE_FAILED,
> + XE_GT_SRIOV_STATE_RESTORED,
> +
> XE_GT_SRIOV_STATE_RESUME_WIP,
> XE_GT_SRIOV_STATE_RESUME_SEND_RESUME,
> XE_GT_SRIOV_STATE_RESUME_FAILED,
these states are easier to understand after patch 04/26, which adds diagrams,
and while there are small and hard-to-avoid overlaps between 03/26 and 04/26,
the patch itself LGTM, so
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> index 416d00a03fbb7..8d8a01faf5291 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> @@ -149,3 +149,99 @@ int xe_sriov_pf_control_sync_flr(struct xe_device *xe, unsigned int vfid)
>
> return 0;
> }
> +
> +/**
> + * xe_sriov_pf_control_trigger_save_vf - Start a VF migration data SAVE sequence on all GTs.
> + * @xe: the &xe_device
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_pf_control_trigger_save_vf(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_gt *gt;
> + unsigned int id;
> + int ret;
> +
> + for_each_gt(gt, xe, id) {
> + ret = xe_gt_sriov_pf_control_trigger_save_vf(gt, vfid);
> + if (ret)
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * xe_sriov_pf_control_finish_save_vf - Complete a VF migration data SAVE sequence on all GTs.
> + * @xe: the &xe_device
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_pf_control_finish_save_vf(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_gt *gt;
> + unsigned int id;
> + int ret;
> +
> + for_each_gt(gt, xe, id) {
> + ret = xe_gt_sriov_pf_control_finish_save_vf(gt, vfid);
> + if (ret)
> + break;
> + }
> +
> + return ret;
> +}
> +
> +/**
> + * xe_sriov_pf_control_trigger_restore_vf - Start a VF migration data RESTORE sequence on all GTs.
> + * @xe: the &xe_device
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_pf_control_trigger_restore_vf(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_gt *gt;
> + unsigned int id;
> + int ret;
> +
> + for_each_gt(gt, xe, id) {
> + ret = xe_gt_sriov_pf_control_trigger_restore_vf(gt, vfid);
> + if (ret)
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * xe_sriov_pf_control_finish_restore_vf - Complete a VF migration data RESTORE sequence on all GTs.
> + * @xe: the &xe_device
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_pf_control_finish_restore_vf(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_gt *gt;
> + unsigned int id;
> + int ret;
> +
> + for_each_gt(gt, xe, id) {
> + ret = xe_gt_sriov_pf_control_finish_restore_vf(gt, vfid);
> + if (ret)
> + break;
> + }
> +
> + return ret;
> +}
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_sriov_pf_control.h
> index 2d52d0ac1b28f..30318c1fba34e 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_control.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.h
> @@ -13,5 +13,9 @@ int xe_sriov_pf_control_resume_vf(struct xe_device *xe, unsigned int vfid);
> int xe_sriov_pf_control_stop_vf(struct xe_device *xe, unsigned int vfid);
> int xe_sriov_pf_control_reset_vf(struct xe_device *xe, unsigned int vfid);
> int xe_sriov_pf_control_sync_flr(struct xe_device *xe, unsigned int vfid);
> +int xe_sriov_pf_control_trigger_save_vf(struct xe_device *xe, unsigned int vfid);
> +int xe_sriov_pf_control_finish_save_vf(struct xe_device *xe, unsigned int vfid);
> +int xe_sriov_pf_control_trigger_restore_vf(struct xe_device *xe, unsigned int vfid);
> +int xe_sriov_pf_control_finish_restore_vf(struct xe_device *xe, unsigned int vfid);
>
> #endif
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> index a81aa05c55326..e0e6340c49106 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> @@ -136,11 +136,31 @@ static void pf_populate_pf(struct xe_device *xe, struct dentry *pfdent)
> * │ │ ├── reset
> * │ │ ├── resume
> * │ │ ├── stop
> + * │ │ ├── save
> + * │ │ ├── restore
> * │ │ :
> * │ ├── vf2
> * │ │ ├── ...
> */
>
> +static int from_file_read_to_vf_call(struct seq_file *s,
> + int (*call)(struct xe_device *, unsigned int))
> +{
> + struct dentry *dent = file_dentry(s->file)->d_parent;
> + struct xe_device *xe = extract_xe(dent);
> + unsigned int vfid = extract_vfid(dent);
> + int ret;
> +
> + xe_pm_runtime_get(xe);
> + ret = call(xe, vfid);
> + xe_pm_runtime_put(xe);
> +
> + if (ret < 0)
> + return ret;
> +
> + return 0;
> +}
> +
> static ssize_t from_file_write_to_vf_call(struct file *file, const char __user *userbuf,
> size_t count, loff_t *ppos,
> int (*call)(struct xe_device *, unsigned int))
> @@ -179,10 +199,26 @@ static ssize_t OP##_write(struct file *file, const char __user *userbuf, \
> } \
> DEFINE_SHOW_STORE_ATTRIBUTE(OP)
>
> +#define DEFINE_VF_CONTROL_ATTRIBUTE_RW(OP) \
> +static int OP##_show(struct seq_file *s, void *unused) \
> +{ \
> + return from_file_read_to_vf_call(s, \
> + xe_sriov_pf_control_finish_##OP); \
> +} \
> +static ssize_t OP##_write(struct file *file, const char __user *userbuf, \
> + size_t count, loff_t *ppos) \
> +{ \
> + return from_file_write_to_vf_call(file, userbuf, count, ppos, \
> + xe_sriov_pf_control_trigger_##OP); \
> +} \
> +DEFINE_SHOW_STORE_ATTRIBUTE(OP)
> +
> DEFINE_VF_CONTROL_ATTRIBUTE(pause_vf);
> DEFINE_VF_CONTROL_ATTRIBUTE(resume_vf);
> DEFINE_VF_CONTROL_ATTRIBUTE(stop_vf);
> DEFINE_VF_CONTROL_ATTRIBUTE(reset_vf);
> +DEFINE_VF_CONTROL_ATTRIBUTE_RW(save_vf);
> +DEFINE_VF_CONTROL_ATTRIBUTE_RW(restore_vf);
>
> static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> {
> @@ -190,6 +226,8 @@ static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> debugfs_create_file("resume", 0200, vfdent, xe, &resume_vf_fops);
> debugfs_create_file("stop", 0200, vfdent, xe, &stop_vf_fops);
> debugfs_create_file("reset", 0200, vfdent, xe, &reset_vf_fops);
> + debugfs_create_file("save", 0600, vfdent, xe, &save_vf_fops);
> + debugfs_create_file("restore", 0600, vfdent, xe, &restore_vf_fops);
> }
>
> static void pf_populate_with_tiles(struct xe_device *xe, struct dentry *dent, unsigned int vfid)
* Re: [PATCH v2 06/26] drm/xe/pf: Add support for encap/decap of bitstream to/from packet
2025-10-21 22:41 ` [PATCH v2 06/26] drm/xe/pf: Add support for encap/decap of bitstream to/from packet Michał Winiarski
@ 2025-10-22 22:34 ` Michal Wajdeczko
2025-10-27 13:27 ` Michał Winiarski
0 siblings, 1 reply; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-22 22:34 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> Add debugfs handlers for migration state and handle bitstream
> .read()/.write() to convert from bitstream to/from migration data
> packets.
> As descriptor/trailer are handled at this layer - add handling for both
> save and restore side.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_sriov_migration_data.c | 336 ++++++++++++++++++
> drivers/gpu/drm/xe/xe_sriov_migration_data.h | 5 +
> drivers/gpu/drm/xe/xe_sriov_pf_control.c | 5 +
> drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 35 ++
> drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 54 +++
> .../gpu/drm/xe/xe_sriov_pf_migration_types.h | 9 +
> 6 files changed, 444 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
> index b04f9be3b7fed..4cd6c6fc9ba18 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_migration_data.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
> @@ -6,6 +6,44 @@
> #include "xe_bo.h"
> #include "xe_device.h"
> #include "xe_sriov_migration_data.h"
> +#include "xe_sriov_pf_helpers.h"
> +#include "xe_sriov_pf_migration.h"
> +#include "xe_sriov_printk.h"
> +
> +static struct mutex *pf_migration_mutex(struct xe_device *xe, unsigned int vfid)
> +{
> + xe_assert(xe, IS_SRIOV_PF(xe));
> + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
other helpers have a separating blank line here
> + return &xe->sriov.pf.vfs[vfid].migration.lock;
> +}
> +
> +static struct xe_sriov_migration_data **pf_pick_pending(struct xe_device *xe, unsigned int vfid)
> +{
> + xe_assert(xe, IS_SRIOV_PF(xe));
> + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> + lockdep_assert_held(pf_migration_mutex(xe, vfid));
> +
> + return &xe->sriov.pf.vfs[vfid].migration.pending;
> +}
> +
> +static struct xe_sriov_migration_data **
> +pf_pick_descriptor(struct xe_device *xe, unsigned int vfid)
> +{
> + xe_assert(xe, IS_SRIOV_PF(xe));
> + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> + lockdep_assert_held(pf_migration_mutex(xe, vfid));
> +
> + return &xe->sriov.pf.vfs[vfid].migration.descriptor;
> +}
> +
> +static struct xe_sriov_migration_data **pf_pick_trailer(struct xe_device *xe, unsigned int vfid)
> +{
> + xe_assert(xe, IS_SRIOV_PF(xe));
> + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> + lockdep_assert_held(pf_migration_mutex(xe, vfid));
> +
> + return &xe->sriov.pf.vfs[vfid].migration.trailer;
> +}
>
> static bool data_needs_bo(struct xe_sriov_migration_data *data)
> {
> @@ -43,6 +81,9 @@ xe_sriov_migration_data_alloc(struct xe_device *xe)
> */
> void xe_sriov_migration_data_free(struct xe_sriov_migration_data *data)
> {
> + if (IS_ERR_OR_NULL(data))
> + return;
> +
> if (data_needs_bo(data))
> xe_bo_unpin_map_no_vm(data->bo);
> else
> @@ -125,3 +166,298 @@ int xe_sriov_migration_data_init_from_hdr(struct xe_sriov_migration_data *data)
>
> return mig_data_init(data);
> }
> +
> +static ssize_t vf_mig_data_hdr_read(struct xe_sriov_migration_data *data,
> + char __user *buf, size_t len)
> +{
> + loff_t offset = sizeof(data->hdr) - data->hdr_remaining;
> +
> + if (!data->hdr_remaining)
> + return -EINVAL;
> +
> + if (len > data->hdr_remaining)
> + len = data->hdr_remaining;
> +
> + if (copy_to_user(buf, (void *)&data->hdr + offset, len))
> + return -EFAULT;
> +
> + data->hdr_remaining -= len;
> +
> + return len;
> +}
> +
> +static ssize_t vf_mig_data_read(struct xe_sriov_migration_data *data,
> + char __user *buf, size_t len)
> +{
> + if (len > data->remaining)
> + len = data->remaining;
> +
> + if (copy_to_user(buf, data->vaddr + (data->size - data->remaining), len))
> + return -EFAULT;
> +
> + data->remaining -= len;
> +
> + return len;
> +}
> +
> +static ssize_t __vf_mig_data_read_single(struct xe_sriov_migration_data **data,
> + unsigned int vfid, char __user *buf, size_t len)
> +{
> + ssize_t copied = 0;
> +
> + if ((*data)->hdr_remaining)
> + copied = vf_mig_data_hdr_read(*data, buf, len);
> + else
> + copied = vf_mig_data_read(*data, buf, len);
> +
> + if ((*data)->remaining == 0 && (*data)->hdr_remaining == 0) {
> + xe_sriov_migration_data_free(*data);
> + *data = NULL;
> + }
> +
> + return copied;
> +}
> +
> +static struct xe_sriov_migration_data **vf_mig_pick_data(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_sriov_migration_data **data;
> +
> + data = pf_pick_descriptor(xe, vfid);
> + if (*data)
> + return data;
> +
> + data = pf_pick_pending(xe, vfid);
> + if (!*data)
> + *data = xe_sriov_pf_migration_save_consume(xe, vfid);
> + if (*data)
> + return data;
> +
> + data = pf_pick_trailer(xe, vfid);
> + if (*data)
> + return data;
> +
> + return ERR_PTR(-ENODATA);
> +}
> +
> +static ssize_t vf_mig_data_read_single(struct xe_device *xe, unsigned int vfid,
> + char __user *buf, size_t len)
> +{
> + struct xe_sriov_migration_data **data = vf_mig_pick_data(xe, vfid);
> +
> + if (IS_ERR_OR_NULL(data))
vf_mig_pick_data() seems to never return NULL, so maybe just IS_ERR() ?
> + return PTR_ERR(data);
> +
> + return __vf_mig_data_read_single(data, vfid, buf, len);
> +}
> +
> +/**
> + * xe_sriov_migration_data_read() - Read migration data from the device.
> + * @xe: the &xe_device
> + * @vfid: the VF identifier
> + * @buf: start address of userspace buffer
> + * @len: requested read size from userspace
> + *
> + * Return: number of bytes that have been successfully read,
> + * 0 if no more migration data is available,
> + * -errno on failure.
> + */
> +ssize_t xe_sriov_migration_data_read(struct xe_device *xe, unsigned int vfid,
> + char __user *buf, size_t len)
> +{
> + ssize_t ret, consumed = 0;
> +
> + xe_assert(xe, IS_SRIOV_PF(xe));
> +
> + scoped_cond_guard(mutex_intr, return -EINTR, pf_migration_mutex(xe, vfid)) {
> + while (consumed < len) {
> + ret = vf_mig_data_read_single(xe, vfid, buf, len - consumed);
> + if (ret == -ENODATA)
> + break;
> + if (ret < 0)
> + return ret;
> +
> + consumed += ret;
> + buf += ret;
> + }
> + }
> +
> + return consumed;
> +}
> +
> +static ssize_t vf_mig_hdr_write(struct xe_sriov_migration_data *data,
> + const char __user *buf, size_t len)
> +{
> + loff_t offset = sizeof(data->hdr) - data->hdr_remaining;
> + int ret;
> +
> + if (len > data->hdr_remaining)
> + len = data->hdr_remaining;
> +
> + if (copy_from_user((void *)&data->hdr + offset, buf, len))
> + return -EFAULT;
> +
> + data->hdr_remaining -= len;
> +
> + if (!data->hdr_remaining) {
> + ret = xe_sriov_migration_data_init_from_hdr(data);
> + if (ret)
> + return ret;
> + }
> +
> + return len;
> +}
> +
> +static ssize_t vf_mig_data_write(struct xe_sriov_migration_data *data,
> + const char __user *buf, size_t len)
> +{
> + if (len > data->remaining)
> + len = data->remaining;
> +
> + if (copy_from_user(data->vaddr + (data->size - data->remaining), buf, len))
> + return -EFAULT;
> +
> + data->remaining -= len;
> +
> + return len;
> +}
> +
> +static ssize_t vf_mig_data_write_single(struct xe_device *xe, unsigned int vfid,
> + const char __user *buf, size_t len)
> +{
> + struct xe_sriov_migration_data **data = pf_pick_pending(xe, vfid);
> + int ret;
> + ssize_t copied;
> +
> + if (IS_ERR_OR_NULL(*data)) {
> + *data = xe_sriov_migration_data_alloc(xe);
> + if (!*data)
> + return -ENOMEM;
> + }
> +
> + if ((*data)->hdr_remaining)
> + copied = vf_mig_hdr_write(*data, buf, len);
> + else
> + copied = vf_mig_data_write(*data, buf, len);
> +
> + if ((*data)->hdr_remaining == 0 && (*data)->remaining == 0) {
> + ret = xe_sriov_pf_migration_restore_produce(xe, vfid, *data);
> + if (ret) {
> + xe_sriov_migration_data_free(*data);
> + return ret;
> + }
> +
> + *data = NULL;
> + }
> +
> + return copied;
> +}
> +
> +/**
> + * xe_sriov_migration_data_write() - Write migration data to the device.
> + * @xe: the &xe_device
> + * @vfid: the VF identifier
> + * @buf: start address of userspace buffer
> + * @len: requested write size from userspace
> + *
> + * Return: number of bytes that have been successfully written,
> + * -errno on failure.
> + */
> +ssize_t xe_sriov_migration_data_write(struct xe_device *xe, unsigned int vfid,
> + const char __user *buf, size_t len)
> +{
> + ssize_t ret, produced = 0;
> +
> + xe_assert(xe, IS_SRIOV_PF(xe));
> +
> + scoped_cond_guard(mutex_intr, return -EINTR, pf_migration_mutex(xe, vfid)) {
> + while (produced < len) {
> + ret = vf_mig_data_write_single(xe, vfid, buf, len - produced);
> + if (ret < 0)
> + return ret;
> +
> + produced += ret;
> + buf += ret;
> + }
> + }
> +
> + return produced;
> +}
> +
> +#define MIGRATION_DESCRIPTOR_DWORDS 0
> +static int pf_descriptor_init(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_sriov_migration_data **desc = pf_pick_descriptor(xe, vfid);
> + struct xe_sriov_migration_data *data;
> + int ret;
> +
> + data = xe_sriov_migration_data_alloc(xe);
> + if (!data)
> + return -ENOMEM;
> +
> + ret = xe_sriov_migration_data_init(data, 0, 0, XE_SRIOV_MIGRATION_DATA_TYPE_DESCRIPTOR,
> + 0, MIGRATION_DESCRIPTOR_DWORDS * sizeof(u32));
> + if (ret) {
> + xe_sriov_migration_data_free(data);
> + return ret;
> + }
> +
> + *desc = data;
> +
> + return 0;
> +}
> +
> +static void pf_pending_init(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_sriov_migration_data **data = pf_pick_pending(xe, vfid);
> +
> + *data = NULL;
> +}
> +
> +#define MIGRATION_TRAILER_SIZE 0
> +static int pf_trailer_init(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_sriov_migration_data **trailer = pf_pick_trailer(xe, vfid);
> + struct xe_sriov_migration_data *data;
> + int ret;
> +
> + data = xe_sriov_migration_data_alloc(xe);
> + if (!data)
> + return -ENOMEM;
> +
> + ret = xe_sriov_migration_data_init(data, 0, 0, XE_SRIOV_MIGRATION_DATA_TYPE_TRAILER,
> + 0, MIGRATION_TRAILER_SIZE);
> + if (ret) {
> + xe_sriov_migration_data_free(data);
> + return ret;
> + }
> +
> + *trailer = data;
> +
> + return 0;
> +}
> +
> +/**
> + * xe_sriov_migration_data_save_init() - Initialize the pending save migration data.
> + * @xe: the &xe_device
> + * @vfid: the VF identifier
> + *
> + * Return: 0 on success, -errno on failure.
> + */
> +int xe_sriov_migration_data_save_init(struct xe_device *xe, unsigned int vfid)
> +{
> + int ret;
> +
> + scoped_cond_guard(mutex_intr, return -EINTR, pf_migration_mutex(xe, vfid)) {
> + ret = pf_descriptor_init(xe, vfid);
> + if (ret)
> + return ret;
> +
> + ret = pf_trailer_init(xe, vfid);
> + if (ret)
> + return ret;
> +
> + pf_pending_init(xe, vfid);
> + }
> +
> + return 0;
> +}
> diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
> index ef65dccddc035..5cde6e9439677 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_migration_data.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
> @@ -27,5 +27,10 @@ void xe_sriov_migration_data_free(struct xe_sriov_migration_data *snapshot);
> int xe_sriov_migration_data_init(struct xe_sriov_migration_data *data, u8 tile_id, u8 gt_id,
> enum xe_sriov_migration_data_type, loff_t offset, size_t size);
> int xe_sriov_migration_data_init_from_hdr(struct xe_sriov_migration_data *snapshot);
> +ssize_t xe_sriov_migration_data_read(struct xe_device *xe, unsigned int vfid,
> + char __user *buf, size_t len);
> +ssize_t xe_sriov_migration_data_write(struct xe_device *xe, unsigned int vfid,
> + const char __user *buf, size_t len);
> +int xe_sriov_migration_data_save_init(struct xe_device *xe, unsigned int vfid);
>
> #endif
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> index 8d8a01faf5291..c2768848daba1 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> @@ -5,6 +5,7 @@
>
> #include "xe_device.h"
> #include "xe_gt_sriov_pf_control.h"
> +#include "xe_sriov_migration_data.h"
> #include "xe_sriov_pf_control.h"
> #include "xe_sriov_printk.h"
>
> @@ -165,6 +166,10 @@ int xe_sriov_pf_control_trigger_save_vf(struct xe_device *xe, unsigned int vfid)
> unsigned int id;
> int ret;
>
> + ret = xe_sriov_migration_data_save_init(xe, vfid);
> + if (ret)
> + return ret;
> +
> for_each_gt(gt, xe, id) {
> ret = xe_gt_sriov_pf_control_trigger_save_vf(gt, vfid);
> if (ret)
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> index e0e6340c49106..a9a28aec22421 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> @@ -9,6 +9,7 @@
> #include "xe_device.h"
> #include "xe_device_types.h"
> #include "xe_pm.h"
> +#include "xe_sriov_migration_data.h"
> #include "xe_sriov_pf.h"
> #include "xe_sriov_pf_control.h"
> #include "xe_sriov_pf_debugfs.h"
> @@ -132,6 +133,7 @@ static void pf_populate_pf(struct xe_device *xe, struct dentry *pfdent)
> * /sys/kernel/debug/dri/BDF/
> * ├── sriov
> * │ ├── vf1
> + * │ │ ├── migration_data
> * │ │ ├── pause
> * │ │ ├── reset
> * │ │ ├── resume
> @@ -220,6 +222,38 @@ DEFINE_VF_CONTROL_ATTRIBUTE(reset_vf);
> DEFINE_VF_CONTROL_ATTRIBUTE_RW(save_vf);
> DEFINE_VF_CONTROL_ATTRIBUTE_RW(restore_vf);
>
> +static ssize_t data_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
> +{
> + struct dentry *dent = file_dentry(file)->d_parent;
> + struct xe_device *xe = extract_xe(dent);
> + unsigned int vfid = extract_vfid(dent);
> +
> + if (*pos)
> + return -ESPIPE;
> +
> + return xe_sriov_migration_data_write(xe, vfid, buf, count);
> +}
> +
> +static ssize_t data_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
> +{
> + struct dentry *dent = file_dentry(file)->d_parent;
> + struct xe_device *xe = extract_xe(dent);
> + unsigned int vfid = extract_vfid(dent);
> +
> + if (*ppos)
> + return -ESPIPE;
> +
> + return xe_sriov_migration_data_read(xe, vfid, buf, count);
> +}
> +
> +static const struct file_operations data_vf_fops = {
> + .owner = THIS_MODULE,
> + .open = simple_open,
> + .write = data_write,
> + .read = data_read,
> + .llseek = default_llseek,
> +};
> +
> static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> {
> debugfs_create_file("pause", 0200, vfdent, xe, &pause_vf_fops);
> @@ -228,6 +262,7 @@ static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> debugfs_create_file("reset", 0200, vfdent, xe, &reset_vf_fops);
> debugfs_create_file("save", 0600, vfdent, xe, &save_vf_fops);
> debugfs_create_file("restore", 0600, vfdent, xe, &restore_vf_fops);
> + debugfs_create_file("migration_data", 0600, vfdent, xe, &data_vf_fops);
> }
>
> static void pf_populate_with_tiles(struct xe_device *xe, struct dentry *dent, unsigned int vfid)
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> index eaf581317bdef..029e14f1ffa74 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> @@ -10,6 +10,7 @@
> #include "xe_gt_sriov_pf_migration.h"
> #include "xe_pm.h"
> #include "xe_sriov.h"
> +#include "xe_sriov_migration_data.h"
> #include "xe_sriov_pf_helpers.h"
> #include "xe_sriov_pf_migration.h"
> #include "xe_sriov_printk.h"
> @@ -53,6 +54,15 @@ static bool pf_check_migration_support(struct xe_device *xe)
> return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
> }
>
> +static void pf_migration_cleanup(struct drm_device *dev, void *arg)
> +{
> + struct xe_sriov_pf_migration *migration = arg;
> +
> + xe_sriov_migration_data_free(migration->pending);
> + xe_sriov_migration_data_free(migration->trailer);
> + xe_sriov_migration_data_free(migration->descriptor);
> +}
> +
> /**
> * xe_sriov_pf_migration_init() - Initialize support for SR-IOV VF migration.
> * @xe: the &xe_device
> @@ -62,6 +72,7 @@ static bool pf_check_migration_support(struct xe_device *xe)
> int xe_sriov_pf_migration_init(struct xe_device *xe)
> {
> unsigned int n, totalvfs;
> + int err;
>
> xe_assert(xe, IS_SRIOV_PF(xe));
>
> @@ -73,7 +84,15 @@ int xe_sriov_pf_migration_init(struct xe_device *xe)
> for (n = 1; n <= totalvfs; n++) {
> struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, n);
>
> + err = drmm_mutex_init(&xe->drm, &migration->lock);
> + if (err)
> + return err;
> +
> init_waitqueue_head(&migration->wq);
> +
> + err = drmm_add_action_or_reset(&xe->drm, pf_migration_cleanup, migration);
Shouldn't we use devm instead here?
> + if (err)
> + return err;
> }
>
> return 0;
> @@ -154,6 +173,36 @@ xe_sriov_pf_migration_save_consume(struct xe_device *xe, unsigned int vfid)
> return data;
> }
>
> +static int pf_handle_descriptor(struct xe_device *xe, unsigned int vfid,
> + struct xe_sriov_migration_data *data)
> +{
> + if (data->tile != 0 || data->gt != 0)
> + return -EINVAL;
> +
> + xe_sriov_migration_data_free(data);
> +
> + return 0;
> +}
> +
> +static int pf_handle_trailer(struct xe_device *xe, unsigned int vfid,
> + struct xe_sriov_migration_data *data)
> +{
> + struct xe_gt *gt;
> + u8 gt_id;
> +
> + if (data->tile != 0 || data->gt != 0)
> + return -EINVAL;
> + if (data->offset != 0 || data->size != 0 || data->buff || data->bo)
> + return -EINVAL;
> +
> + xe_sriov_migration_data_free(data);
> +
> + for_each_gt(gt, xe, gt_id)
> + xe_gt_sriov_pf_control_restore_data_done(gt, vfid);
> +
> + return 0;
> +}
> +
> /**
> * xe_sriov_pf_migration_restore_produce() - Produce a VF migration data packet to the device.
> * @xe: the &xe_device
> @@ -173,6 +222,11 @@ int xe_sriov_pf_migration_restore_produce(struct xe_device *xe, unsigned int vfi
>
> xe_assert(xe, IS_SRIOV_PF(xe));
>
> + if (data->type == XE_SRIOV_MIGRATION_DATA_TYPE_DESCRIPTOR)
> + return pf_handle_descriptor(xe, vfid, data);
> + else if (data->type == XE_SRIOV_MIGRATION_DATA_TYPE_TRAILER)
no need for "else" here
> + return pf_handle_trailer(xe, vfid, data);
> +
> gt = xe_device_get_gt(xe, data->gt);
> if (!gt || data->tile != gt->tile->id) {
> xe_sriov_err_ratelimited(xe, "VF%d Invalid GT - tile:%u, GT:%u\n",
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> index 2a45ee4e3ece8..8468e5eeb6d66 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> @@ -7,6 +7,7 @@
> #define _XE_SRIOV_PF_MIGRATION_TYPES_H_
>
> #include <linux/types.h>
> +#include <linux/mutex_types.h>
> #include <linux/wait.h>
>
> /**
> @@ -53,6 +54,14 @@ struct xe_sriov_migration_data {
> struct xe_sriov_pf_migration {
> /** @wq: waitqueue used to avoid busy-waiting for snapshot production/consumption */
> wait_queue_head_t wq;
> + /** @lock: Mutex protecting the migration data */
> + struct mutex lock;
> + /** @pending: currently processed data packet of VF resource */
> + struct xe_sriov_migration_data *pending;
> + /** @trailer: data packet used to indicate the end of stream */
> + struct xe_sriov_migration_data *trailer;
> + /** @descriptor: data packet containing the metadata describing the device */
> + struct xe_sriov_migration_data *descriptor;
> };
>
> #endif
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [PATCH v2 07/26] drm/xe/pf: Add minimalistic migration descriptor
2025-10-21 22:41 ` [PATCH v2 07/26] drm/xe/pf: Add minimalistic migration descriptor Michał Winiarski
@ 2025-10-22 22:49 ` Michal Wajdeczko
2025-10-27 14:52 ` Michał Winiarski
0 siblings, 1 reply; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-22 22:49 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> The descriptor reuses the KLV format used by GuC and contains metadata
> that can be used to quickly fail migration when source is incompatible
> with destination.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_sriov_migration_data.c | 79 +++++++++++++++++++-
> drivers/gpu/drm/xe/xe_sriov_migration_data.h | 2 +
> drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 6 ++
> 3 files changed, 86 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
> index 4cd6c6fc9ba18..b58508c0c30f1 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_migration_data.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
> @@ -5,6 +5,7 @@
>
> #include "xe_bo.h"
> #include "xe_device.h"
> +#include "xe_guc_klv_helpers.h"
> #include "xe_sriov_migration_data.h"
> #include "xe_sriov_pf_helpers.h"
> #include "xe_sriov_pf_migration.h"
> @@ -383,11 +384,18 @@ ssize_t xe_sriov_migration_data_write(struct xe_device *xe, unsigned int vfid,
> return produced;
> }
>
> -#define MIGRATION_DESCRIPTOR_DWORDS 0
> +#define MIGRATION_KLV_DEVICE_DEVID_KEY 0xf001u
> +#define MIGRATION_KLV_DEVICE_DEVID_LEN 1u
> +#define MIGRATION_KLV_DEVICE_REVID_KEY 0xf002u
> +#define MIGRATION_KLV_DEVICE_REVID_LEN 1u
> +
> +#define MIGRATION_DESCRIPTOR_DWORDS (GUC_KLV_LEN_MIN + MIGRATION_KLV_DEVICE_DEVID_LEN + \
> + GUC_KLV_LEN_MIN + MIGRATION_KLV_DEVICE_REVID_LEN)
> static size_t pf_descriptor_init(struct xe_device *xe, unsigned int vfid)
> {
> struct xe_sriov_migration_data **desc = pf_pick_descriptor(xe, vfid);
> struct xe_sriov_migration_data *data;
> + u32 *klvs;
> int ret;
>
> data = xe_sriov_migration_data_alloc(xe);
> @@ -401,11 +409,80 @@ static size_t pf_descriptor_init(struct xe_device *xe, unsigned int vfid)
> return ret;
> }
>
> + klvs = data->vaddr;
> + *klvs++ = PREP_GUC_KLV_CONST(MIGRATION_KLV_DEVICE_DEVID_KEY,
> + MIGRATION_KLV_DEVICE_DEVID_LEN);
> + *klvs++ = xe->info.devid;
> + *klvs++ = PREP_GUC_KLV_CONST(MIGRATION_KLV_DEVICE_REVID_KEY,
> + MIGRATION_KLV_DEVICE_REVID_LEN);
> + *klvs++ = xe->info.revid;
> +
maybe add assert that written KLVs match descriptor size?
> *desc = data;
>
> return 0;
> }
>
> +/**
> + * xe_sriov_migration_data_process_descriptor() - Process migration data descriptor.
> + * @xe: the &xe_device
> + * @vfid: the VF identifier
> + * @data: the &struct xe_sriov_pf_migration_data containing the descriptor
> + *
> + * The descriptor uses the same KLV format as GuC, and contains metadata used for
> + * checking migration data compatibility.
> + *
> + * Return: 0 on success, -errno on failure.
> + */
> +int xe_sriov_migration_data_process_descriptor(struct xe_device *xe, unsigned int vfid,
> + struct xe_sriov_migration_data *data)
> +{
> + u32 num_dwords = data->size / sizeof(u32);
> + u32 *klvs = data->vaddr;
> +
> + xe_assert(xe, data->type == XE_SRIOV_MIGRATION_DATA_TYPE_DESCRIPTOR);
> + if (data->size % sizeof(u32) != 0)
no need to compare against 0
if (data->size % sizeof(u32))
> + return -EINVAL;
for other errors we warn(), ok to be silent here?
> +
> + while (num_dwords >= GUC_KLV_LEN_MIN) {
> + u32 key = FIELD_GET(GUC_KLV_0_KEY, klvs[0]);
> + u32 len = FIELD_GET(GUC_KLV_0_LEN, klvs[0]);
> +
> + klvs += GUC_KLV_LEN_MIN;
> + num_dwords -= GUC_KLV_LEN_MIN;
> +
you should check len vs num_dwords here
> + switch (key) {
> + case MIGRATION_KLV_DEVICE_DEVID_KEY:
> + if (*klvs != xe->info.devid) {
> + xe_sriov_warn(xe,
> + "Aborting migration, devid mismatch %#04x!=%#04x\n",
likely %#06x, as you also need to count the "0x" prefix
> + *klvs, xe->info.devid);
> + return -ENODEV;
> + }
> + break;
> + case MIGRATION_KLV_DEVICE_REVID_KEY:
> + if (*klvs != xe->info.revid) {
> + xe_sriov_warn(xe,
> + "Aborting migration, revid mismatch %#04x!=%#04x\n",
> + *klvs, xe->info.revid);
> + return -ENODEV;
> + }
> + break;
> + default:
> + xe_sriov_dbg(xe,
> + "Unknown migration descriptor key %#06x - skipping\n", key);
also print len? and some initial hexdump to help with debug?
> + break;
> + }
> +
> + if (len > num_dwords)
> + return -EINVAL;
this check should be earlier
> +
> + klvs += len;
> + num_dwords -= len;
> + }
> +
> + return 0;
> +}
> +
> static void pf_pending_init(struct xe_device *xe, unsigned int vfid)
> {
> struct xe_sriov_migration_data **data = pf_pick_pending(xe, vfid);
> diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
> index 5cde6e9439677..e7f3b332124bc 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_migration_data.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
> @@ -31,6 +31,8 @@ ssize_t xe_sriov_migration_data_read(struct xe_device *xe, unsigned int vfid,
> char __user *buf, size_t len);
> ssize_t xe_sriov_migration_data_write(struct xe_device *xe, unsigned int vfid,
> const char __user *buf, size_t len);
> +int xe_sriov_migration_data_process_descriptor(struct xe_device *xe, unsigned int vfid,
> + struct xe_sriov_migration_data *data);
> int xe_sriov_migration_data_save_init(struct xe_device *xe, unsigned int vfid);
>
> #endif
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> index 029e14f1ffa74..0b4b237780102 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> @@ -176,9 +176,15 @@ xe_sriov_pf_migration_save_consume(struct xe_device *xe, unsigned int vfid)
> static int pf_handle_descriptor(struct xe_device *xe, unsigned int vfid,
> struct xe_sriov_migration_data *data)
> {
> + int ret;
> +
> if (data->tile != 0 || data->gt != 0)
> return -EINVAL;
>
> + ret = xe_sriov_migration_data_process_descriptor(xe, vfid, data);
> + if (ret)
> + return ret;
> +
> xe_sriov_migration_data_free(data);
>
> return 0;
* Re: [PATCH v2 08/26] drm/xe/pf: Expose VF migration data size over debugfs
2025-10-21 22:41 ` [PATCH v2 08/26] drm/xe/pf: Expose VF migration data size over debugfs Michał Winiarski
@ 2025-10-22 23:02 ` Michal Wajdeczko
0 siblings, 0 replies; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-22 23:02 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> The size is normally used to make a decision on when to stop the device
> (mainly when it's in a pre_copy state).
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 19 ++++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 2 ++
> drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 29 ++++++++++++++++++
> drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 30 +++++++++++++++++++
> drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 1 +
> 5 files changed, 81 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> index 8ba72165759b3..4e26feb9c267f 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> @@ -395,6 +395,25 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
> }
> #endif /* CONFIG_DEBUG_FS */
>
> +/**
> + * xe_gt_sriov_pf_migration_size() - Total size of migration data from all components within a GT.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: total migration data size in bytes or a negative error code on failure.
> + */
> +ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
> +{
> + ssize_t total = 0;
> +
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> +
> + /* Nothing to query yet - will be updated once per-GT migration data types are added */
> + return total;
> +}
> +
> /**
> * xe_gt_sriov_pf_migration_ring_empty() - Check if a migration ring is empty.
> * @gt: the &xe_gt
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> index 1ed2248f0a17e..e2d41750f863c 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> @@ -15,6 +15,8 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
> int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
>
> +ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
> +
> bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid);
> bool xe_gt_sriov_pf_migration_ring_full(struct xe_gt *gt, unsigned int vfid);
> void xe_gt_sriov_pf_migration_ring_free(struct xe_gt *gt, unsigned int vfid);
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> index a9a28aec22421..bc2d0b0342f22 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> @@ -14,6 +14,7 @@
> #include "xe_sriov_pf_control.h"
> #include "xe_sriov_pf_debugfs.h"
> #include "xe_sriov_pf_helpers.h"
> +#include "xe_sriov_pf_migration.h"
> #include "xe_sriov_pf_provision.h"
> #include "xe_sriov_pf_service.h"
> #include "xe_sriov_printk.h"
> @@ -254,6 +255,33 @@ static const struct file_operations data_vf_fops = {
> .llseek = default_llseek,
> };
>
> +static ssize_t size_read(struct file *file, char __user *ubuf, size_t count, loff_t *ppos)
> +{
> + struct dentry *dent = file_dentry(file)->d_parent;
> + struct xe_device *xe = extract_xe(dent);
> + unsigned int vfid = extract_vfid(dent);
> + char buf[21];
> + ssize_t ret;
> + int len;
> +
> + xe_pm_runtime_get(xe);
> + ret = xe_sriov_pf_migration_size(xe, vfid);
> + xe_pm_runtime_put(xe);
IIRC during a simple "cat migration_size" we might be called twice.
To avoid that, we can calculate the size in .open instead, see config_blob.
But not a blocker, so
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> + if (ret < 0)
> + return ret;
> +
> + len = scnprintf(buf, sizeof(buf), "%zd\n", ret);
> +
> + return simple_read_from_buffer(ubuf, count, ppos, buf, len);
> +}
> +
> +static const struct file_operations size_vf_fops = {
> + .owner = THIS_MODULE,
> + .open = simple_open,
> + .read = size_read,
> + .llseek = default_llseek,
> +};
> +
> static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> {
> debugfs_create_file("pause", 0200, vfdent, xe, &pause_vf_fops);
> @@ -263,6 +291,7 @@ static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> debugfs_create_file("save", 0600, vfdent, xe, &save_vf_fops);
> debugfs_create_file("restore", 0600, vfdent, xe, &restore_vf_fops);
> debugfs_create_file("migration_data", 0600, vfdent, xe, &data_vf_fops);
> + debugfs_create_file("migration_size", 0400, vfdent, xe, &size_vf_fops);
> }
>
> static void pf_populate_with_tiles(struct xe_device *xe, struct dentry *dent, unsigned int vfid)
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> index 0b4b237780102..88babec9c893e 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> @@ -242,3 +242,33 @@ int xe_sriov_pf_migration_restore_produce(struct xe_device *xe, unsigned int vfi
>
> return xe_gt_sriov_pf_migration_restore_produce(gt, vfid, data);
> }
> +
> +/**
> + * xe_sriov_pf_migration_size() - Total size of migration data from all components within a device
> + * @xe: the &xe_device
> + * @vfid: the VF identifier (can't be 0)
> + *
> + * This function is for PF only.
> + *
> + * Return: total migration data size in bytes or a negative error code on failure.
> + */
> +ssize_t xe_sriov_pf_migration_size(struct xe_device *xe, unsigned int vfid)
> +{
> + size_t size = 0;
> + struct xe_gt *gt;
> + ssize_t ret;
> + u8 gt_id;
> +
> + xe_assert(xe, IS_SRIOV_PF(xe));
> + xe_assert(xe, vfid);
> +
> + for_each_gt(gt, xe, gt_id) {
> + ret = xe_gt_sriov_pf_migration_size(gt, vfid);
> + if (ret < 0)
> + return ret;
> +
> + size += ret;
> + }
> +
> + return size;
> +}
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> index df81a540c246a..16cb444c36aa6 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> @@ -18,6 +18,7 @@ int xe_sriov_pf_migration_restore_produce(struct xe_device *xe, unsigned int vfi
> struct xe_sriov_migration_data *data);
> struct xe_sriov_migration_data *
> xe_sriov_pf_migration_save_consume(struct xe_device *xe, unsigned int vfid);
> +ssize_t xe_sriov_pf_migration_size(struct xe_device *xe, unsigned int vfid);
> wait_queue_head_t *xe_sriov_pf_migration_waitqueue(struct xe_device *xe, unsigned int vfid);
>
> #endif
* Re: [PATCH v2 09/26] drm/xe: Add sa/guc_buf_cache sync interface
2025-10-21 22:41 ` [PATCH v2 09/26] drm/xe: Add sa/guc_buf_cache sync interface Michał Winiarski
@ 2025-10-22 23:05 ` Michal Wajdeczko
0 siblings, 0 replies; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-22 23:05 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> In upcoming changes the cached buffers are going to be used to read data
> produced by the GuC. Add a counterpart to flush, which synchronizes the
> CPU-side of suballocation with the GPU data and propagate the interface
> to GuC Buffer Cache.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
* Re: [PATCH v2 10/26] drm/xe: Allow the caller to pass guc_buf_cache size
2025-10-21 22:41 ` [PATCH v2 10/26] drm/xe: Allow the caller to pass guc_buf_cache size Michał Winiarski
@ 2025-10-22 23:13 ` Michal Wajdeczko
0 siblings, 0 replies; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-22 23:13 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> An upcoming change will use GuC buffer cache as a place where GuC
> migration data will be stored, and the memory requirement for that is
> larger than indirect data.
> Allow the caller to pass the size based on the intended usecase.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c | 2 +-
> drivers/gpu/drm/xe/xe_guc.c | 4 ++--
> drivers/gpu/drm/xe/xe_guc_buf.c | 6 +++---
> drivers/gpu/drm/xe/xe_guc_buf.h | 4 +++-
> 4 files changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c b/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c
> index d266882adc0e0..485e7a70e6bb7 100644
> --- a/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c
> +++ b/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c
> @@ -72,7 +72,7 @@ static int guc_buf_test_init(struct kunit *test)
> kunit_activate_static_stub(test, xe_managed_bo_create_pin_map,
> replacement_xe_managed_bo_create_pin_map);
>
> - KUNIT_ASSERT_EQ(test, 0, xe_guc_buf_cache_init(&guc->buf));
> + KUNIT_ASSERT_EQ(test, 0, xe_guc_buf_cache_init(&guc->buf, XE_GUC_BUF_CACHE_DEFAULT_SIZE));
>
> test->priv = &guc->buf;
> return 0;
> diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
> index ecc3e091b89e6..7c65528859ecb 100644
> --- a/drivers/gpu/drm/xe/xe_guc.c
> +++ b/drivers/gpu/drm/xe/xe_guc.c
> @@ -812,7 +812,7 @@ static int vf_guc_init_post_hwconfig(struct xe_guc *guc)
> if (err)
> return err;
>
> - err = xe_guc_buf_cache_init(&guc->buf);
> + err = xe_guc_buf_cache_init(&guc->buf, XE_GUC_BUF_CACHE_DEFAULT_SIZE);
> if (err)
> return err;
>
> @@ -860,7 +860,7 @@ int xe_guc_init_post_hwconfig(struct xe_guc *guc)
> if (ret)
> return ret;
>
> - ret = xe_guc_buf_cache_init(&guc->buf);
> + ret = xe_guc_buf_cache_init(&guc->buf, XE_GUC_BUF_CACHE_DEFAULT_SIZE);
> if (ret)
> return ret;
>
> diff --git a/drivers/gpu/drm/xe/xe_guc_buf.c b/drivers/gpu/drm/xe/xe_guc_buf.c
> index 4d8a4712309f4..ed096a0331244 100644
> --- a/drivers/gpu/drm/xe/xe_guc_buf.c
> +++ b/drivers/gpu/drm/xe/xe_guc_buf.c
> @@ -28,16 +28,16 @@ static struct xe_gt *cache_to_gt(struct xe_guc_buf_cache *cache)
> * @cache: the &xe_guc_buf_cache to initialize
> *
> * The Buffer Cache allows to obtain a reusable buffer that can be used to pass
> - * indirect H2G data to GuC without a need to create a ad-hoc allocation.
> + * data to GuC or read data from GuC without a need to create a ad-hoc allocation.
> *
> * Return: 0 on success or a negative error code on failure.
> */
> -int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache)
> +int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache, u32 size)
> {
> struct xe_gt *gt = cache_to_gt(cache);
> struct xe_sa_manager *sam;
>
> - sam = __xe_sa_bo_manager_init(gt_to_tile(gt), SZ_8K, 0, sizeof(u32));
> + sam = __xe_sa_bo_manager_init(gt_to_tile(gt), size, 0, sizeof(u32));
> if (IS_ERR(sam))
> return PTR_ERR(sam);
> cache->sam = sam;
> diff --git a/drivers/gpu/drm/xe/xe_guc_buf.h b/drivers/gpu/drm/xe/xe_guc_buf.h
> index c5e0f1fd24d74..5210703309e81 100644
> --- a/drivers/gpu/drm/xe/xe_guc_buf.h
> +++ b/drivers/gpu/drm/xe/xe_guc_buf.h
> @@ -11,7 +11,9 @@
>
> #include "xe_guc_buf_types.h"
>
> -int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache);
> +#define XE_GUC_BUF_CACHE_DEFAULT_SIZE SZ_8K
> +
> +int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache, u32 size);
alternatively, to minimize code changes, we can have:
int xe_guc_buf_cache_init_size(struct xe_guc_buf_cache *cache, u32 size);
static inline int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache)
{
return xe_guc_buf_cache_init_size(cache, XE_GUC_BUF_CACHE_DEFAULT_SIZE);
}
but up to you,
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> u32 xe_guc_buf_cache_dwords(struct xe_guc_buf_cache *cache);
> struct xe_guc_buf xe_guc_buf_reserve(struct xe_guc_buf_cache *cache, u32 dwords);
> struct xe_guc_buf xe_guc_buf_from_data(struct xe_guc_buf_cache *cache,
* Re: [PATCH v2 22/26] drm/xe/pf: Handle VRAM migration data as part of PF control
2025-10-21 22:41 ` [PATCH v2 22/26] drm/xe/pf: Handle VRAM migration data as part of PF control Michał Winiarski
@ 2025-10-23 11:44 ` kernel test robot
2025-10-23 19:54 ` Michal Wajdeczko
1 sibling, 0 replies; 72+ messages in thread
From: kernel test robot @ 2025-10-23 11:44 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost, Michal Wajdeczko
Cc: oe-kbuild-all, dri-devel, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna,
Michał Winiarski
Hi Michał,
kernel test robot noticed the following build errors:
[auto build test ERROR on drm-xe/drm-xe-next]
[also build test ERROR on next-20251023]
[cannot apply to awilliam-vfio/next awilliam-vfio/for-linus drm-i915/for-linux-next drm-i915/for-linux-next-fixes linus/master v6.18-rc2]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Micha-Winiarski/drm-xe-pf-Remove-GuC-version-check-for-migration-support/20251022-064617
base: https://gitlab.freedesktop.org/drm/xe/kernel.git drm-xe-next
patch link: https://lore.kernel.org/r/20251021224133.577765-23-michal.winiarski%40intel.com
patch subject: [PATCH v2 22/26] drm/xe/pf: Handle VRAM migration data as part of PF control
config: arm-randconfig-r072-20251023 (https://download.01.org/0day-ci/archive/20251023/202510231918.XlOqymLC-lkp@intel.com/config)
compiler: clang version 16.0.6 (https://github.com/llvm/llvm-project 7cbf1a2591520c2491aa35339f227775f4d3adf6)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251023/202510231918.XlOqymLC-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510231918.XlOqymLC-lkp@intel.com/
All errors (new ones prefixed by >>):
>> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c:212:2: error: duplicate case value: 'XE_GT_SRIOV_STATE_RESTORE_DATA_DONE' and 'XE_GT_SRIOV_STATE_MISMATCH' both equal '31'
CASE2STR(MISMATCH);
^
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c:170:7: note: expanded from macro 'CASE2STR'
case XE_GT_SRIOV_STATE_##_X: return #_X
^
<scratch space>:58:1: note: expanded from here
XE_GT_SRIOV_STATE_MISMATCH
^
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c:201:2: note: previous case defined here
CASE2STR(RESTORE_DATA_DONE);
^
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c:170:7: note: expanded from macro 'CASE2STR'
case XE_GT_SRIOV_STATE_##_X: return #_X
^
<scratch space>:36:1: note: expanded from here
XE_GT_SRIOV_STATE_RESTORE_DATA_DONE
^
1 error generated.
vim +212 drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
aed2c1d70aa008 Michal Wajdeczko 2024-03-26 99
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 100 /**
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 101 * DOC: The VF state machine
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 102 *
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 103 * The simplified VF state machine could be presented as::
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 104 *
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 105 * pause--------------------------o
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 106 * / |
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 107 * / v
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 108 * (READY)<------------------resume-----(PAUSED)
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 109 * ^ \ / /
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 110 * | \ / /
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 111 * | stop---->(STOPPED)<----stop /
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 112 * | / /
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 113 * | / /
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 114 * o--------<-----flr /
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 115 * \ /
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 116 * o------<--------------------flr
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 117 *
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 118 * Where:
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 119 *
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 120 * * READY - represents a state in which VF is fully operable
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 121 * * PAUSED - represents a state in which VF activity is temporarily suspended
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 122 * * STOPPED - represents a state in which VF activity is definitely halted
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 123 * * pause - represents a request to temporarily suspend VF activity
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 124 * * resume - represents a request to resume VF activity
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 125 * * stop - represents a request to definitely halt VF activity
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 126 * * flr - represents a request to perform VF FLR to restore VF activity
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 127 *
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 128 * However, each state transition requires additional steps that involves
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 129 * communication with GuC that might fail or be interrupted by other requests::
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 130 *
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 131 * .................................WIP....
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 132 * : :
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 133 * pause--------------------->PAUSE_WIP----------------------------o
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 134 * / : / \ : |
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 135 * / : o----<---stop flr--o : |
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 136 * / : | \ / | : V
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 137 * (READY,RESUMED)<--------+------------RESUME_WIP<----+--<-----resume--(PAUSED)
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 138 * ^ \ \ : | | : / /
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 139 * | \ \ : | | : / /
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 140 * | \ \ : | | : / /
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 141 * | \ \ : o----<----------------------+--<-------stop /
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 142 * | \ \ : | | : /
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 143 * | \ \ : V | : /
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 144 * | \ stop----->STOP_WIP---------flr--->-----o : /
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 145 * | \ : | | : /
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 146 * | \ : | V : /
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 147 * | flr--------+----->----------------->FLR_WIP<-----flr
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 148 * | : | / ^ :
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 149 * | : | / | :
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 150 * o--------<-------:----+-----<----------------o | :
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 151 * : | | :
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 152 * :....|...........................|.....:
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 153 * | |
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 154 * V |
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 155 * (STOPPED)--------------------flr
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 156 *
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 157 * For details about each internal WIP state machine see:
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 158 *
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 159 * * `The VF PAUSE state machine`_
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 160 * * `The VF RESUME state machine`_
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 161 * * `The VF STOP state machine`_
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 162 * * `The VF FLR state machine`_
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 163 */
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 164
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 165 #ifdef CONFIG_DRM_XE_DEBUG_SRIOV
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 166 static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 167 {
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 168 switch (bit) {
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 169 #define CASE2STR(_X) \
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 170 case XE_GT_SRIOV_STATE_##_X: return #_X
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 171 CASE2STR(WIP);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 172 CASE2STR(FLR_WIP);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 173 CASE2STR(FLR_SEND_START);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 174 CASE2STR(FLR_WAIT_GUC);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 175 CASE2STR(FLR_GUC_DONE);
2a8fcf7cc950e6 Michal Wajdeczko 2025-10-01 176 CASE2STR(FLR_SYNC);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 177 CASE2STR(FLR_RESET_CONFIG);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 178 CASE2STR(FLR_RESET_DATA);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 179 CASE2STR(FLR_RESET_MMIO);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 180 CASE2STR(FLR_SEND_FINISH);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 181 CASE2STR(FLR_FAILED);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 182 CASE2STR(PAUSE_WIP);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 183 CASE2STR(PAUSE_SEND_PAUSE);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 184 CASE2STR(PAUSE_WAIT_GUC);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 185 CASE2STR(PAUSE_GUC_DONE);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 186 CASE2STR(PAUSE_FAILED);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 187 CASE2STR(PAUSED);
ed3b410584ff58 Michał Winiarski 2025-10-22 188 CASE2STR(SAVE_WIP);
008ba8d0525f68 Michał Winiarski 2025-10-22 189 CASE2STR(SAVE_PROCESS_DATA);
008ba8d0525f68 Michał Winiarski 2025-10-22 190 CASE2STR(SAVE_WAIT_DATA);
33cfbd2b4f240a Michał Winiarski 2025-10-22 191 CASE2STR(SAVE_DATA_GUC);
994e46306a1791 Michał Winiarski 2025-10-22 192 CASE2STR(SAVE_DATA_GGTT);
bdbad7e79b97c4 Michał Winiarski 2025-10-22 193 CASE2STR(SAVE_DATA_MMIO);
afa80586c0896a Michał Winiarski 2025-10-22 194 CASE2STR(SAVE_DATA_VRAM);
008ba8d0525f68 Michał Winiarski 2025-10-22 195 CASE2STR(SAVE_DATA_DONE);
ed3b410584ff58 Michał Winiarski 2025-10-22 196 CASE2STR(SAVE_FAILED);
ed3b410584ff58 Michał Winiarski 2025-10-22 197 CASE2STR(SAVED);
ed3b410584ff58 Michał Winiarski 2025-10-22 198 CASE2STR(RESTORE_WIP);
008ba8d0525f68 Michał Winiarski 2025-10-22 199 CASE2STR(RESTORE_PROCESS_DATA);
008ba8d0525f68 Michał Winiarski 2025-10-22 200 CASE2STR(RESTORE_WAIT_DATA);
008ba8d0525f68 Michał Winiarski 2025-10-22 201 CASE2STR(RESTORE_DATA_DONE);
ed3b410584ff58 Michał Winiarski 2025-10-22 202 CASE2STR(RESTORE_FAILED);
ed3b410584ff58 Michał Winiarski 2025-10-22 203 CASE2STR(RESTORED);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 204 CASE2STR(RESUME_WIP);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 205 CASE2STR(RESUME_SEND_RESUME);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 206 CASE2STR(RESUME_FAILED);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 207 CASE2STR(RESUMED);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 208 CASE2STR(STOP_WIP);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 209 CASE2STR(STOP_SEND_STOP);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 210 CASE2STR(STOP_FAILED);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 211 CASE2STR(STOPPED);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 @212 CASE2STR(MISMATCH);
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 213 #undef CASE2STR
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 214 default: return "?";
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 215 }
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 216 }
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 217 #endif
2bd87f0fc24ae2 Michal Wajdeczko 2024-08-28 218
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [PATCH v2 11/26] drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF migration
2025-10-21 22:41 ` [PATCH v2 11/26] drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF migration Michał Winiarski
@ 2025-10-23 17:37 ` Michal Wajdeczko
2025-10-28 10:46 ` Michał Winiarski
0 siblings, 1 reply; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-23 17:37 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> Contiguous PF GGTT VMAs can be scarce after creating VFs.
> Increase the GuC buffer cache size to 4M for PF so that we can fit GuC
> migration data (which currently maxes out at just under 4M) and use the
but the code below still uses 8M
> cache instead of allocating fresh BOs.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 46 ++++++-------------
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 3 ++
> drivers/gpu/drm/xe/xe_guc.c | 12 ++++-
> 3 files changed, 28 insertions(+), 33 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> index 4e26feb9c267f..04fad3126865c 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> @@ -11,7 +11,7 @@
> #include "xe_gt_sriov_pf_helpers.h"
> #include "xe_gt_sriov_pf_migration.h"
> #include "xe_gt_sriov_printk.h"
> -#include "xe_guc.h"
> +#include "xe_guc_buf.h"
> #include "xe_guc_ct.h"
> #include "xe_sriov.h"
> #include "xe_sriov_migration_data.h"
> @@ -57,73 +57,55 @@ static int pf_send_guc_query_vf_state_size(struct xe_gt *gt, unsigned int vfid)
>
> /* Return: number of state dwords saved or a negative error code on failure */
> static int pf_send_guc_save_vf_state(struct xe_gt *gt, unsigned int vfid,
> - void *buff, size_t size)
> + void *dst, size_t size)
> {
> const int ndwords = size / sizeof(u32);
> - struct xe_tile *tile = gt_to_tile(gt);
> - struct xe_device *xe = tile_to_xe(tile);
> struct xe_guc *guc = >->uc.guc;
> - struct xe_bo *bo;
> + CLASS(xe_guc_buf, buf)(&guc->buf, ndwords);
> int ret;
>
> xe_gt_assert(gt, size % sizeof(u32) == 0);
> xe_gt_assert(gt, size == ndwords * sizeof(u32));
>
> - bo = xe_bo_create_pin_map_novm(xe, tile,
> - ALIGN(size, PAGE_SIZE),
> - ttm_bo_type_kernel,
> - XE_BO_FLAG_SYSTEM |
> - XE_BO_FLAG_GGTT |
> - XE_BO_FLAG_GGTT_INVALIDATE, false);
> - if (IS_ERR(bo))
> - return PTR_ERR(bo);
> + if (!xe_guc_buf_is_valid(buf))
> + return -ENOBUFS;
> +
> + memset(xe_guc_buf_cpu_ptr(buf), 0, size);
hmm, I didn't find anything in the GuC spec saying this buffer must be zeroed, so why bother?
>
> ret = guc_action_vf_save_restore(guc, vfid, GUC_PF_OPCODE_VF_SAVE,
> - xe_bo_ggtt_addr(bo), ndwords);
> + xe_guc_buf_flush(buf), ndwords);
> if (!ret)
> ret = -ENODATA;
> else if (ret > ndwords)
> ret = -EPROTO;
> else if (ret > 0)
> - xe_map_memcpy_from(xe, buff, &bo->vmap, 0, ret * sizeof(u32));
> + memcpy(dst, xe_guc_buf_sync_read(buf), ret * sizeof(u32));
nit: given this usage, maybe one day we should add optimized variant that copies directly to dst?
xe_guc_buf_sync_into(buf, dst, size);
>
> - xe_bo_unpin_map_no_vm(bo);
> return ret;
> }
>
> /* Return: number of state dwords restored or a negative error code on failure */
> static int pf_send_guc_restore_vf_state(struct xe_gt *gt, unsigned int vfid,
> - const void *buff, size_t size)
> + const void *src, size_t size)
> {
> const int ndwords = size / sizeof(u32);
> - struct xe_tile *tile = gt_to_tile(gt);
> - struct xe_device *xe = tile_to_xe(tile);
> struct xe_guc *guc = >->uc.guc;
> - struct xe_bo *bo;
> + CLASS(xe_guc_buf_from_data, buf)(&guc->buf, src, size);
> int ret;
>
> xe_gt_assert(gt, size % sizeof(u32) == 0);
> xe_gt_assert(gt, size == ndwords * sizeof(u32));
>
> - bo = xe_bo_create_pin_map_novm(xe, tile,
> - ALIGN(size, PAGE_SIZE),
> - ttm_bo_type_kernel,
> - XE_BO_FLAG_SYSTEM |
> - XE_BO_FLAG_GGTT |
> - XE_BO_FLAG_GGTT_INVALIDATE, false);
> - if (IS_ERR(bo))
> - return PTR_ERR(bo);
> -
> - xe_map_memcpy_to(xe, &bo->vmap, 0, buff, size);
> + if (!xe_guc_buf_is_valid(buf))
> + return -ENOBUFS;
>
> ret = guc_action_vf_save_restore(guc, vfid, GUC_PF_OPCODE_VF_RESTORE,
> - xe_bo_ggtt_addr(bo), ndwords);
> + xe_guc_buf_flush(buf), ndwords);
> if (!ret)
> ret = -ENODATA;
> else if (ret > ndwords)
> ret = -EPROTO;
>
> - xe_bo_unpin_map_no_vm(bo);
> return ret;
> }
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> index e2d41750f863c..4f2f2783339c3 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> @@ -11,6 +11,9 @@
> struct xe_gt;
> struct xe_sriov_migration_data;
>
> +/* TODO: get this information by querying GuC in the future */
> +#define XE_GT_SRIOV_PF_MIGRATION_GUC_DATA_MAX_SIZE SZ_8M
so it's 8M or 4M ?
maybe wrap that into a function now
u32 xe_gt_sriov_pf_migration_guc_data_size(struct xe_gt *gt)
{
	if (xe_sriov_pf_migration_supported(gt_to_xe(gt)))
		return SZ_4M; /* TODO: ... */
	return 0;
}
> +
> int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
> int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
> diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
> index 7c65528859ecb..cd6ab277a7876 100644
> --- a/drivers/gpu/drm/xe/xe_guc.c
> +++ b/drivers/gpu/drm/xe/xe_guc.c
> @@ -24,6 +24,7 @@
> #include "xe_gt_printk.h"
> #include "xe_gt_sriov_vf.h"
> #include "xe_gt_throttle.h"
> +#include "xe_gt_sriov_pf_migration.h"
> #include "xe_guc_ads.h"
> #include "xe_guc_buf.h"
> #include "xe_guc_capture.h"
> @@ -40,6 +41,7 @@
> #include "xe_mmio.h"
> #include "xe_platform_types.h"
> #include "xe_sriov.h"
> +#include "xe_sriov_pf_migration.h"
> #include "xe_uc.h"
> #include "xe_uc_fw.h"
> #include "xe_wa.h"
> @@ -821,6 +823,14 @@ static int vf_guc_init_post_hwconfig(struct xe_guc *guc)
> return 0;
> }
>
> +static u32 guc_buf_cache_size(struct xe_guc *guc)
> +{
> + if (IS_SRIOV_PF(guc_to_xe(guc)) && xe_sriov_pf_migration_supported(guc_to_xe(guc)))
> + return XE_GT_SRIOV_PF_MIGRATION_GUC_DATA_MAX_SIZE;
then
	u32 size = XE_GUC_BUF_CACHE_DEFAULT_SIZE;

	if (IS_SRIOV_PF(guc_to_xe(guc)))
		size += xe_gt_sriov_pf_migration_guc_data_size(guc_to_gt(guc));
	return size;
> + else
> + return XE_GUC_BUF_CACHE_DEFAULT_SIZE;
> +}
> +
> /**
> * xe_guc_init_post_hwconfig - initialize GuC post hwconfig load
> * @guc: The GuC object
> @@ -860,7 +870,7 @@ int xe_guc_init_post_hwconfig(struct xe_guc *guc)
> if (ret)
> return ret;
>
> - ret = xe_guc_buf_cache_init(&guc->buf, XE_GUC_BUF_CACHE_DEFAULT_SIZE);
> + ret = xe_guc_buf_cache_init(&guc->buf, guc_buf_cache_size(guc));
> if (ret)
> return ret;
>
* Re: [PATCH v2 21/26] drm/xe/migrate: Add function to copy VRAM data in chunks
2025-10-21 22:41 ` [PATCH v2 21/26] drm/xe/migrate: Add function to copy VRAM data in chunks Michał Winiarski
@ 2025-10-23 19:29 ` Michal Wajdeczko
2025-10-30 6:07 ` Laguna, Lukasz
0 siblings, 1 reply; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-23 19:29 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> From: Lukasz Laguna <lukasz.laguna@intel.com>
>
> Introduce a new function to copy data between VRAM and sysmem objects.
> The existing xe_migrate_copy() is tailored for eviction and restore
> operations, which involve additional logic and operate on entire
> objects.
> The new xe_migrate_vram_copy_chunk() allows copying chunks of data to
> or from a dedicated buffer object, which is essential in the case of VF
> migration.
>
> Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_migrate.c | 134 ++++++++++++++++++++++++++++++--
> drivers/gpu/drm/xe/xe_migrate.h | 8 ++
> 2 files changed, 136 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
> index 3112c966c67d7..d30675707162b 100644
> --- a/drivers/gpu/drm/xe/xe_migrate.c
> +++ b/drivers/gpu/drm/xe/xe_migrate.c
> @@ -514,7 +514,7 @@ int xe_migrate_init(struct xe_migrate *m)
>
> static u64 max_mem_transfer_per_pass(struct xe_device *xe)
> {
> - if (!IS_DGFX(xe) && xe_device_has_flat_ccs(xe))
> + if ((!IS_DGFX(xe) || IS_SRIOV_PF(xe)) && xe_device_has_flat_ccs(xe))
being a PF is a permanent property of the device, while your expected usage is only during the handling of VF migration.
maybe it would be better to introduce a flag FORCE_CCS_LIMITED_TRANSFER and pass it to the migration calls only when really needed?
> return MAX_CCS_LIMITED_TRANSFER;
>
> return MAX_PREEMPTDISABLE_TRANSFER;
> @@ -1155,6 +1155,133 @@ struct xe_exec_queue *xe_migrate_exec_queue(struct xe_migrate *migrate)
> return migrate->q;
> }
>
> +/**
> + * xe_migrate_vram_copy_chunk() - Copy a chunk of a VRAM buffer object.
> + * @vram_bo: The VRAM buffer object.
> + * @vram_offset: The VRAM offset.
> + * @sysmem_bo: The sysmem buffer object.
> + * @sysmem_offset: The sysmem offset.
> + * @size: The size of VRAM chunk to copy.
> + * @dir: The direction of the copy operation.
> + *
> + * Copies a portion of a buffer object between VRAM and system memory.
> + * On Xe2 platforms that support flat CCS, VRAM data is decompressed when
> + * copying to system memory.
> + *
> + * Return: Pointer to a dma_fence representing the last copy batch, or
> + * an error pointer on failure. If there is a failure, any copy operation
> + * started by the function call has been synced.
> + */
> +struct dma_fence *xe_migrate_vram_copy_chunk(struct xe_bo *vram_bo, u64 vram_offset,
> + struct xe_bo *sysmem_bo, u64 sysmem_offset,
> + u64 size, enum xe_migrate_copy_dir dir)
> +{
> + struct xe_device *xe = xe_bo_device(vram_bo);
> + struct xe_tile *tile = vram_bo->tile;
> + struct xe_gt *gt = tile->primary_gt;
> + struct xe_migrate *m = tile->migrate;
> + struct dma_fence *fence = NULL;
> + struct ttm_resource *vram = vram_bo->ttm.resource;
> + struct ttm_resource *sysmem = sysmem_bo->ttm.resource;
> + struct xe_res_cursor vram_it, sysmem_it;
> + u64 vram_L0_ofs, sysmem_L0_ofs;
> + u32 vram_L0_pt, sysmem_L0_pt;
> + u64 vram_L0, sysmem_L0;
> + bool to_sysmem = (dir == XE_MIGRATE_COPY_TO_SRAM);
> + bool use_comp_pat = to_sysmem &&
> + GRAPHICS_VER(xe) >= 20 && xe_device_has_flat_ccs(xe);
> + int pass = 0;
> + int err;
> +
> + xe_assert(xe, IS_ALIGNED(vram_offset | sysmem_offset | size, PAGE_SIZE));
> + xe_assert(xe, xe_bo_is_vram(vram_bo));
> + xe_assert(xe, !xe_bo_is_vram(sysmem_bo));
> + xe_assert(xe, !range_overflows(vram_offset, size, (u64)vram_bo->ttm.base.size));
> + xe_assert(xe, !range_overflows(sysmem_offset, size, (u64)sysmem_bo->ttm.base.size));
> +
> + xe_res_first(vram, vram_offset, size, &vram_it);
> + xe_res_first_sg(xe_bo_sg(sysmem_bo), sysmem_offset, size, &sysmem_it);
> +
> + while (size) {
> + u32 pte_flags = PTE_UPDATE_FLAG_IS_VRAM;
> + u32 batch_size = 2; /* arb_clear() + MI_BATCH_BUFFER_END */
> + struct xe_sched_job *job;
> + struct xe_bb *bb;
> + u32 update_idx;
> + bool usm = xe->info.has_usm;
> + u32 avail_pts = max_mem_transfer_per_pass(xe) / LEVEL0_PAGE_TABLE_ENCODE_SIZE;
> +
> + sysmem_L0 = xe_migrate_res_sizes(m, &sysmem_it);
> + vram_L0 = min(xe_migrate_res_sizes(m, &vram_it), sysmem_L0);
> +
> + drm_dbg(&xe->drm, "Pass %u, size: %llu\n", pass++, vram_L0);
nit: there is xe_dbg()
> +
> + pte_flags |= use_comp_pat ? PTE_UPDATE_FLAG_IS_COMP_PTE : 0;
> + batch_size += pte_update_size(m, pte_flags, vram, &vram_it, &vram_L0,
> + &vram_L0_ofs, &vram_L0_pt, 0, 0, avail_pts);
> +
> + batch_size += pte_update_size(m, 0, sysmem, &sysmem_it, &vram_L0, &sysmem_L0_ofs,
> + &sysmem_L0_pt, 0, avail_pts, avail_pts);
> + batch_size += EMIT_COPY_DW;
> +
> + bb = xe_bb_new(gt, batch_size, usm);
> + if (IS_ERR(bb)) {
> + err = PTR_ERR(bb);
> + return ERR_PTR(err);
> + }
> +
> + if (xe_migrate_allow_identity(vram_L0, &vram_it))
> + xe_res_next(&vram_it, vram_L0);
> + else
> + emit_pte(m, bb, vram_L0_pt, true, use_comp_pat, &vram_it, vram_L0, vram);
> +
> + emit_pte(m, bb, sysmem_L0_pt, false, false, &sysmem_it, vram_L0, sysmem);
> +
> + bb->cs[bb->len++] = MI_BATCH_BUFFER_END;
> + update_idx = bb->len;
> +
> + if (to_sysmem)
> + emit_copy(gt, bb, vram_L0_ofs, sysmem_L0_ofs, vram_L0, XE_PAGE_SIZE);
> + else
> + emit_copy(gt, bb, sysmem_L0_ofs, vram_L0_ofs, vram_L0, XE_PAGE_SIZE);
> +
> + job = xe_bb_create_migration_job(m->q, bb, xe_migrate_batch_base(m, usm),
> + update_idx);
> + if (IS_ERR(job)) {
> + err = PTR_ERR(job);
> + goto err;
this goto inside 'while' loop is weird
> + }
> +
> + xe_sched_job_add_migrate_flush(job, MI_INVALIDATE_TLB);
> +
> + WARN_ON_ONCE(!dma_resv_test_signaled(vram_bo->ttm.base.resv,
> + DMA_RESV_USAGE_BOOKKEEP));
> + WARN_ON_ONCE(!dma_resv_test_signaled(sysmem_bo->ttm.base.resv,
> + DMA_RESV_USAGE_BOOKKEEP));
xe_WARN_ON_ONCE() ?
but why not use asserts() if we are sure that this shouldn't happen ?
> +
> + mutex_lock(&m->job_mutex);
scoped_guard(mutex) ?
> + xe_sched_job_arm(job);
> + dma_fence_put(fence);
> + fence = dma_fence_get(&job->drm.s_fence->finished);
> + xe_sched_job_push(job);
> +
> + dma_fence_put(m->fence);
> + m->fence = dma_fence_get(fence);
> + mutex_unlock(&m->job_mutex);
> +
> + xe_bb_free(bb, fence);
> + size -= vram_L0;
> + continue;
> +
> +err:
> + xe_bb_free(bb, NULL);
> +
> + return ERR_PTR(err);
> + }
> +
> + return fence;
> +}
> +
> static void emit_clear_link_copy(struct xe_gt *gt, struct xe_bb *bb, u64 src_ofs,
> u32 size, u32 pitch)
> {
> @@ -1852,11 +1979,6 @@ static bool xe_migrate_vram_use_pde(struct drm_pagemap_addr *sram_addr,
> return true;
> }
>
> -enum xe_migrate_copy_dir {
> - XE_MIGRATE_COPY_TO_VRAM,
> - XE_MIGRATE_COPY_TO_SRAM,
> -};
> -
> #define XE_CACHELINE_BYTES 64ull
> #define XE_CACHELINE_MASK (XE_CACHELINE_BYTES - 1)
>
> diff --git a/drivers/gpu/drm/xe/xe_migrate.h b/drivers/gpu/drm/xe/xe_migrate.h
> index 4fad324b62535..d7bcc6ad8464e 100644
> --- a/drivers/gpu/drm/xe/xe_migrate.h
> +++ b/drivers/gpu/drm/xe/xe_migrate.h
> @@ -28,6 +28,11 @@ struct xe_vma;
>
> enum xe_sriov_vf_ccs_rw_ctxs;
>
> +enum xe_migrate_copy_dir {
> + XE_MIGRATE_COPY_TO_VRAM,
> + XE_MIGRATE_COPY_TO_SRAM,
> +};
nit: it's time for xe_migrate_types.h ;)
but not as part of this series
> +
> /**
> * struct xe_migrate_pt_update_ops - Callbacks for the
> * xe_migrate_update_pgtables() function.
> @@ -131,6 +136,9 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q,
>
> struct xe_lrc *xe_migrate_lrc(struct xe_migrate *migrate);
> struct xe_exec_queue *xe_migrate_exec_queue(struct xe_migrate *migrate);
> +struct dma_fence *xe_migrate_vram_copy_chunk(struct xe_bo *vram_bo, u64 vram_offset,
> + struct xe_bo *sysmem_bo, u64 sysmem_offset,
> + u64 size, enum xe_migrate_copy_dir dir);
> int xe_migrate_access_memory(struct xe_migrate *m, struct xe_bo *bo,
> unsigned long offset, void *buf, int len,
> int write);
* Re: [PATCH v2 22/26] drm/xe/pf: Handle VRAM migration data as part of PF control
2025-10-21 22:41 ` [PATCH v2 22/26] drm/xe/pf: Handle VRAM migration data as part of PF control Michał Winiarski
2025-10-23 11:44 ` kernel test robot
@ 2025-10-23 19:54 ` Michal Wajdeczko
2025-10-29 8:54 ` Michał Winiarski
1 sibling, 1 reply; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-23 19:54 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> Connect the helpers to allow save and restore of VRAM migration data in
> stop_copy / resume device state.
>
> Co-developed-by: Lukasz Laguna <lukasz.laguna@intel.com>
> Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 18 ++
> .../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 2 +
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 222 ++++++++++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 6 +
> .../drm/xe/xe_gt_sriov_pf_migration_types.h | 3 +
> drivers/gpu/drm/xe/xe_sriov_pf_control.c | 3 +
> 6 files changed, 254 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> index e7156ad3d1839..680f2de44144b 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> @@ -191,6 +191,7 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
> CASE2STR(SAVE_DATA_GUC);
> CASE2STR(SAVE_DATA_GGTT);
> CASE2STR(SAVE_DATA_MMIO);
> + CASE2STR(SAVE_DATA_VRAM);
> CASE2STR(SAVE_DATA_DONE);
> CASE2STR(SAVE_FAILED);
> CASE2STR(SAVED);
> @@ -832,6 +833,7 @@ static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT);
> pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_MMIO);
> pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
> + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_VRAM);
> }
> }
>
> @@ -885,6 +887,19 @@ static int pf_handle_vf_save_data(struct xe_gt *gt, unsigned int vfid)
> ret = xe_gt_sriov_pf_migration_mmio_save(gt, vfid);
> if (ret)
> return ret;
> +
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_VRAM);
> + return -EAGAIN;
> + }
> +
> + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_VRAM)) {
> + if (xe_gt_sriov_pf_migration_vram_size(gt, vfid) > 0) {
> + ret = xe_gt_sriov_pf_migration_vram_save(gt, vfid);
> + if (ret == -EAGAIN)
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_VRAM);
> + if (ret)
> + return ret;
> + }
> }
>
> return 0;
> @@ -1100,6 +1115,9 @@ pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid)
> case XE_SRIOV_MIGRATION_DATA_TYPE_GUC:
> ret = xe_gt_sriov_pf_migration_guc_restore(gt, vfid, data);
> break;
> + case XE_SRIOV_MIGRATION_DATA_TYPE_VRAM:
> + ret = xe_gt_sriov_pf_migration_vram_restore(gt, vfid, data);
> + break;
> default:
> xe_gt_sriov_notice(gt, "Skipping VF%u unknown data type: %d\n", vfid, data->type);
> break;
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> index 9dfcebd5078ac..fba10136f7cc7 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> @@ -36,6 +36,7 @@
> * @XE_GT_SRIOV_STATE_SAVE_DATA_GUC: indicates PF needs to save VF GuC migration data.
> * @XE_GT_SRIOV_STATE_SAVE_DATA_GGTT: indicates PF needs to save VF GGTT migration data.
> * @XE_GT_SRIOV_STATE_SAVE_DATA_MMIO: indicates PF needs to save VF MMIO migration data.
> + * @XE_GT_SRIOV_STATE_SAVE_DATA_VRAM: indicates PF needs to save VF VRAM migration data.
> * @XE_GT_SRIOV_STATE_SAVE_DATA_DONE: indicates that all migration data was produced by Xe.
> * @XE_GT_SRIOV_STATE_SAVE_FAILED: indicates that VF save operation has failed.
> * @XE_GT_SRIOV_STATE_SAVED: indicates that VF data is saved.
> @@ -82,6 +83,7 @@ enum xe_gt_sriov_control_bits {
> XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
> XE_GT_SRIOV_STATE_SAVE_DATA_GGTT,
> XE_GT_SRIOV_STATE_SAVE_DATA_MMIO,
> + XE_GT_SRIOV_STATE_SAVE_DATA_VRAM,
> XE_GT_SRIOV_STATE_SAVE_DATA_DONE,
> XE_GT_SRIOV_STATE_SAVE_FAILED,
> XE_GT_SRIOV_STATE_SAVED,
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> index 41335b15ffdbe..2c6a86d98ee31 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> @@ -17,6 +17,7 @@
> #include "xe_gt_sriov_printk.h"
> #include "xe_guc_buf.h"
> #include "xe_guc_ct.h"
> +#include "xe_migrate.h"
> #include "xe_sriov.h"
> #include "xe_sriov_migration_data.h"
> #include "xe_sriov_pf_migration.h"
> @@ -485,6 +486,220 @@ int xe_gt_sriov_pf_migration_mmio_restore(struct xe_gt *gt, unsigned int vfid,
> return pf_restore_vf_mmio_mig_data(gt, vfid, data);
> }
>
> +/**
> + * xe_gt_sriov_pf_migration_vram_size() - Get the size of VF VRAM migration data.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: size in bytes or a negative error code on failure.
> + */
> +ssize_t xe_gt_sriov_pf_migration_vram_size(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (gt != xe_root_mmio_gt(gt_to_xe(gt)))
probably you mean
if (!xe_gt_is_main_type(gt))
> + return 0;
> +
> + return xe_gt_sriov_pf_config_get_lmem(gt, vfid);
> +}
> +
> +static struct dma_fence *__pf_save_restore_vram(struct xe_gt *gt, unsigned int vfid,
> + struct xe_bo *vram, u64 vram_offset,
> + struct xe_bo *sysmem, u64 sysmem_offset,
> + size_t size, bool save)
> +{
> + struct dma_fence *ret = NULL;
> + struct drm_exec exec;
> + int err;
> +
> + drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
> + drm_exec_until_all_locked(&exec) {
> + err = drm_exec_lock_obj(&exec, &vram->ttm.base);
> + drm_exec_retry_on_contention(&exec);
> + if (err) {
> + ret = ERR_PTR(err);
> + goto err;
> + }
> +
> + err = drm_exec_lock_obj(&exec, &sysmem->ttm.base);
> + drm_exec_retry_on_contention(&exec);
> + if (err) {
> + ret = ERR_PTR(err);
> + goto err;
> + }
> + }
> +
> + ret = xe_migrate_vram_copy_chunk(vram, vram_offset, sysmem, sysmem_offset, size,
> + save ? XE_MIGRATE_COPY_TO_SRAM : XE_MIGRATE_COPY_TO_VRAM);
> +
> +err:
> + drm_exec_fini(&exec);
> +
> + return ret;
> +}
> +
> +static int pf_save_vram_chunk(struct xe_gt *gt, unsigned int vfid,
> + struct xe_bo *src_vram, u64 src_vram_offset,
> + size_t size)
> +{
> + struct xe_sriov_migration_data *data;
> + struct dma_fence *fence;
> + int ret;
> +
> + data = xe_sriov_migration_data_alloc(gt_to_xe(gt));
> + if (!data)
> + return -ENOMEM;
> +
> + ret = xe_sriov_migration_data_init(data, gt->tile->id, gt->info.id,
> + XE_SRIOV_MIGRATION_DATA_TYPE_VRAM,
> + src_vram_offset, size);
> + if (ret)
> + goto fail;
> +
> + fence = __pf_save_restore_vram(gt, vfid,
> + src_vram, src_vram_offset,
> + data->bo, 0, size, true);
> +
> + ret = dma_fence_wait_timeout(fence, false, 5 * HZ);
> + dma_fence_put(fence);
> + if (!ret) {
> + ret = -ETIME;
> + goto fail;
> + }
> +
> + pf_dump_mig_data(gt, vfid, data);
> +
> + ret = xe_gt_sriov_pf_migration_save_produce(gt, vfid, data);
> + if (ret)
> + goto fail;
> +
> + return 0;
> +
> +fail:
> + xe_sriov_migration_data_free(data);
> + return ret;
> +}
> +
> +#define VF_VRAM_STATE_CHUNK_MAX_SIZE SZ_512M
> +static int pf_save_vf_vram_mig_data(struct xe_gt *gt, unsigned int vfid)
> +{
> + struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, vfid);
> + loff_t *offset = &migration->vram_save_offset;
> + struct xe_bo *vram;
> + size_t vram_size, chunk_size;
> + int ret;
> +
> + vram = xe_gt_sriov_pf_config_get_lmem_obj(gt, vfid);
> + if (!vram)
> + return -ENXIO;
no error message ?
> +
> + vram_size = xe_bo_size(vram);
> + chunk_size = min(vram_size - *offset, VF_VRAM_STATE_CHUNK_MAX_SIZE);
what if *offset > vram_size ?
> +
> + ret = pf_save_vram_chunk(gt, vfid, vram, *offset, chunk_size);
> + if (ret)
> + goto fail;
> +
> + *offset += chunk_size;
> +
> + xe_bo_put(vram);
> +
> + if (*offset < vram_size)
> + return -EAGAIN;
> +
> + return 0;
> +
> +fail:
> + xe_bo_put(vram);
> + xe_gt_sriov_err(gt, "Failed to save VF%u VRAM data (%pe)\n", vfid, ERR_PTR(ret));
> + return ret;
> +}
> +
> +static int pf_restore_vf_vram_mig_data(struct xe_gt *gt, unsigned int vfid,
> + struct xe_sriov_migration_data *data)
> +{
> + u64 end = data->hdr.offset + data->hdr.size;
> + struct dma_fence *fence;
> + struct xe_bo *vram;
> + size_t size;
> + int ret = 0;
> +
> + vram = xe_gt_sriov_pf_config_get_lmem_obj(gt, vfid);
> + if (!vram)
> + return -ENXIO;
no error message ? other errors are reported
> +
> + size = xe_bo_size(vram);
> +
> + if (end > size || end < data->hdr.size) {
> + ret = -EINVAL;
> + goto err;
> + }
> +
> + pf_dump_mig_data(gt, vfid, data);
> +
> + fence = __pf_save_restore_vram(gt, vfid, vram, data->hdr.offset,
> + data->bo, 0, data->hdr.size, false);
> + ret = dma_fence_wait_timeout(fence, false, 5 * HZ);
define this timeout at least as a macro (if not as a helper function, as it might be platform/settings specific)
> + dma_fence_put(fence);
> + if (!ret) {
> + ret = -ETIME;
> + goto err;
> + }
> +
> + return 0;
> +err:
> + xe_bo_put(vram);
> + xe_gt_sriov_err(gt, "Failed to restore VF%u VRAM data (%pe)\n", vfid, ERR_PTR(ret));
> + return ret;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_migration_vram_save() - Save VF VRAM migration data.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier (can't be 0)
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_migration_vram_save(struct xe_gt *gt, unsigned int vfid)
> +{
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> + xe_gt_assert(gt, vfid != PFID);
> + xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
> +
> + return pf_save_vf_vram_mig_data(gt, vfid);
> +}
> +
> +/**
> + * xe_gt_sriov_pf_migration_vram_restore() - Restore VF VRAM migration data.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier (can't be 0)
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_migration_vram_restore(struct xe_gt *gt, unsigned int vfid,
> + struct xe_sriov_migration_data *data)
> +{
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> + xe_gt_assert(gt, vfid != PFID);
> + xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
> +
> + return pf_restore_vf_vram_mig_data(gt, vfid, data);
> +}
> +
> +/**
> + * xe_gt_sriov_pf_migration_save_init() - Initialize per-GT migration related data.
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier (can't be 0)
> + */
> +void xe_gt_sriov_pf_migration_save_init(struct xe_gt *gt, unsigned int vfid)
> +{
> + pf_pick_gt_migration(gt, vfid)->vram_save_offset = 0;
> +}
> +
> /**
> * xe_gt_sriov_pf_migration_size() - Total size of migration data from all components within a GT.
> * @gt: the &xe_gt
> @@ -522,6 +737,13 @@ ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
> size += sizeof(struct xe_sriov_pf_migration_hdr);
> total += size;
>
> + size = xe_gt_sriov_pf_migration_vram_size(gt, vfid);
> + if (size < 0)
> + return size;
> + else if (size > 0)
"else" not needed
> + size += sizeof(struct xe_sriov_pf_migration_hdr);
> + total += size;
> +
> return total;
> }
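[As an aside for readers following the size accounting: the header size is only added for components that actually produce data, and since the error branch returns, no `else` is needed. A standalone sketch of the pattern, using made-up stand-in names rather than the real Xe symbols:]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in; the real header is
 * struct xe_sriov_pf_migration_hdr in the Xe driver. */
#define MIG_HDR_SIZE 16L

/* Accumulate the total migration size across components: a negative
 * component size is an error and is returned as-is; a header is only
 * counted for components that actually produce data. Note there is
 * no 'else' after the error return. */
static long total_migration_size(const long *component_sizes, int n)
{
	long total = 0;

	for (int i = 0; i < n; i++) {
		long size = component_sizes[i];

		if (size < 0)
			return size;
		if (size > 0)
			size += MIG_HDR_SIZE;
		total += size;
	}
	return total;
}
```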
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> index 24a233c4cd0bb..ca518eda5429f 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> @@ -27,6 +27,12 @@ ssize_t xe_gt_sriov_pf_migration_mmio_size(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_migration_mmio_save(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_migration_mmio_restore(struct xe_gt *gt, unsigned int vfid,
> struct xe_sriov_migration_data *data);
> +ssize_t xe_gt_sriov_pf_migration_vram_size(struct xe_gt *gt, unsigned int vfid);
> +int xe_gt_sriov_pf_migration_vram_save(struct xe_gt *gt, unsigned int vfid);
> +int xe_gt_sriov_pf_migration_vram_restore(struct xe_gt *gt, unsigned int vfid,
> + struct xe_sriov_migration_data *data);
> +
> +void xe_gt_sriov_pf_migration_save_init(struct xe_gt *gt, unsigned int vfid);
>
> ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> index 75d8b94cbbefb..39a940c9b0a4b 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> @@ -16,6 +16,9 @@
> struct xe_gt_sriov_migration_data {
> /** @ring: queue containing VF save / restore migration data */
> struct ptr_ring ring;
> +
> + /** @vram_save_offset: offset within VRAM, used for chunked VRAM save */
"last saved offset" ?
> + loff_t vram_save_offset;
> };
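[The chunked-save bookkeeping that `vram_save_offset` enables can be sketched in isolation roughly as follows; all types and names here are illustrative, not the actual Xe implementation:]

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Each call copies at most 'chunk' bytes starting at the saved
 * offset and advances it, so a large VRAM save can be split across
 * multiple migration data packets without holding it all in memory. */
struct vram_save_state {
	long long vram_save_offset;	/* mirrors vram_save_offset above */
};

static size_t save_next_chunk(struct vram_save_state *s,
			      const uint8_t *vram, size_t vram_size,
			      uint8_t *out, size_t chunk)
{
	size_t off = (size_t)s->vram_save_offset;
	size_t n;

	if (off >= vram_size)
		return 0;	/* everything already saved */

	n = vram_size - off < chunk ? vram_size - off : chunk;
	memcpy(out, vram + off, n);
	s->vram_save_offset += (long long)n;
	return n;
}
```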
>
> #endif
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> index c2768848daba1..aac8ecb861545 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> @@ -5,6 +5,7 @@
>
> #include "xe_device.h"
> #include "xe_gt_sriov_pf_control.h"
> +#include "xe_gt_sriov_pf_migration.h"
> #include "xe_sriov_migration_data.h"
> #include "xe_sriov_pf_control.h"
> #include "xe_sriov_printk.h"
> @@ -171,6 +172,8 @@ int xe_sriov_pf_control_trigger_save_vf(struct xe_device *xe, unsigned int vfid)
> return ret;
>
> for_each_gt(gt, xe, id) {
> + xe_gt_sriov_pf_migration_save_init(gt, vfid);
> +
> ret = xe_gt_sriov_pf_control_trigger_save_vf(gt, vfid);
> if (ret)
> return ret;
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [PATCH v2 24/26] drm/xe/pf: Enable SR-IOV VF migration for PTL and BMG
2025-10-21 22:41 ` [PATCH v2 24/26] drm/xe/pf: Enable SR-IOV VF migration for PTL and BMG Michał Winiarski
@ 2025-10-23 20:15 ` Michal Wajdeczko
0 siblings, 0 replies; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-23 20:15 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> All of the necessary building blocks are now in place for PTL and BMG to
> support SR-IOV VF migration.
> Enable the feature without the need to pass feature enabling debug flags
> for those platforms.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_device.h | 5 +++++
> drivers/gpu/drm/xe/xe_device_types.h | 2 ++
> drivers/gpu/drm/xe/xe_pci.c | 8 ++++++--
> drivers/gpu/drm/xe/xe_pci_types.h | 1 +
> drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 4 +++-
> 5 files changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_device.h b/drivers/gpu/drm/xe/xe_device.h
> index 32cc6323b7f64..0c4404c78227c 100644
> --- a/drivers/gpu/drm/xe/xe_device.h
> +++ b/drivers/gpu/drm/xe/xe_device.h
> @@ -152,6 +152,11 @@ static inline bool xe_device_has_sriov(struct xe_device *xe)
> return xe->info.has_sriov;
> }
>
> +static inline bool xe_device_has_sriov_vf_migration(struct xe_device *xe)
> +{
> + return xe->info.has_sriov_vf_migration;
> +}
> +
> static inline bool xe_device_has_msix(struct xe_device *xe)
> {
> return xe->irq.msix.nvec > 0;
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
> index 02c04ad7296e4..8973e17b9a359 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -311,6 +311,8 @@ struct xe_device {
> u8 has_range_tlb_inval:1;
> /** @info.has_sriov: Supports SR-IOV */
> u8 has_sriov:1;
> + /** @info.has_sriov_vf_migration: Supports SR-IOV VF migration */
> + u8 has_sriov_vf_migration:1;
> /** @info.has_usm: Device has unified shared memory support */
> u8 has_usm:1;
> /** @info.has_64bit_timestamp: Device supports 64-bit timestamps */
> diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> index c3136141a9536..d4f9ee9d020b2 100644
> --- a/drivers/gpu/drm/xe/xe_pci.c
> +++ b/drivers/gpu/drm/xe/xe_pci.c
> @@ -362,6 +362,7 @@ static const struct xe_device_desc bmg_desc = {
> .has_heci_cscfi = 1,
> .has_late_bind = true,
> .has_sriov = true,
> + .has_sriov_vf_migration = true,
> .max_gt_per_tile = 2,
> .needs_scratch = true,
> .subplatforms = (const struct xe_subplatform_desc[]) {
> @@ -378,6 +379,7 @@ static const struct xe_device_desc ptl_desc = {
> .has_display = true,
> .has_flat_ccs = 1,
> .has_sriov = true,
> + .has_sriov_vf_migration = true,
> .max_gt_per_tile = 2,
> .needs_scratch = true,
> .needs_shared_vf_gt_wq = true,
> @@ -657,6 +659,7 @@ static int xe_info_init_early(struct xe_device *xe,
> xe->info.has_pxp = desc->has_pxp;
> xe->info.has_sriov = xe_configfs_primary_gt_allowed(to_pci_dev(xe->drm.dev)) &&
> desc->has_sriov;
> + xe->info.has_sriov_vf_migration = desc->has_sriov_vf_migration;
> xe->info.skip_guc_pc = desc->skip_guc_pc;
> xe->info.skip_mtcfg = desc->skip_mtcfg;
> xe->info.skip_pcode = desc->skip_pcode;
> @@ -1020,9 +1023,10 @@ static int xe_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> xe_step_name(xe->info.step.media),
> xe_step_name(xe->info.step.basedie));
>
> - drm_dbg(&xe->drm, "SR-IOV support: %s (mode: %s)\n",
> + drm_dbg(&xe->drm, "SR-IOV support: %s (mode: %s) (VF migration: %s)\n",
> str_yes_no(xe_device_has_sriov(xe)),
> - xe_sriov_mode_to_string(xe_device_sriov_mode(xe)));
> + xe_sriov_mode_to_string(xe_device_sriov_mode(xe)),
> + str_yes_no(xe_device_has_sriov_vf_migration(xe)));
>
> err = xe_pm_init_early(xe);
> if (err)
> diff --git a/drivers/gpu/drm/xe/xe_pci_types.h b/drivers/gpu/drm/xe/xe_pci_types.h
> index a4451bdc79fb3..40f158b3ac890 100644
> --- a/drivers/gpu/drm/xe/xe_pci_types.h
> +++ b/drivers/gpu/drm/xe/xe_pci_types.h
> @@ -48,6 +48,7 @@ struct xe_device_desc {
> u8 has_mbx_power_limits:1;
> u8 has_pxp:1;
> u8 has_sriov:1;
> + u8 has_sriov_vf_migration:1;
> u8 needs_scratch:1;
> u8 skip_guc_pc:1;
> u8 skip_mtcfg:1;
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> index 88babec9c893e..a6cf3b57edba1 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> @@ -50,7 +50,9 @@ bool xe_sriov_pf_migration_supported(struct xe_device *xe)
>
> static bool pf_check_migration_support(struct xe_device *xe)
> {
> - /* XXX: for now this is for feature enabling only */
> + if (xe_device_has_sriov_vf_migration(xe))
> + return true;
but from the PF POV, are there any differences in migration between platforms which already have the .has_sriov flag?
and on the VF side we decided to just rely on the xe_has_memirq() flag, maybe we can do the same here on the PF side?
note that all pre-PTL platforms require the .force_probe flag anyway,
and that's why we also enabled the unconditional .has_sriov flag for them
btw, IIRC we should also check for a minimum GuC version on PTL for proper CCS migration,
IMO the PF shall reject VF migration on an older GuC
> +
> return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
> }
>
* Re: [PATCH v2 20/26] drm/xe/pf: Add helper to retrieve VF's LMEM object
2025-10-21 22:41 ` [PATCH v2 20/26] drm/xe/pf: Add helper to retrieve VF's LMEM object Michał Winiarski
@ 2025-10-23 20:25 ` Michal Wajdeczko
2025-10-28 23:40 ` Michał Winiarski
0 siblings, 1 reply; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-23 20:25 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> From: Lukasz Laguna <lukasz.laguna@intel.com>
>
> Instead of accessing VF's lmem_obj directly, introduce a helper function
> to make the access more convenient.
>
> Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 31 ++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h | 1 +
> 2 files changed, 32 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> index c857879e28fe5..28d648c386487 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> @@ -1643,6 +1643,37 @@ int xe_gt_sriov_pf_config_bulk_set_lmem(struct xe_gt *gt, unsigned int vfid,
> "LMEM", n, err);
> }
>
> +static struct xe_bo *pf_get_vf_config_lmem_obj(struct xe_gt *gt, unsigned int vfid)
> +{
> + struct xe_gt_sriov_config *config = pf_pick_vf_config(gt, vfid);
> +
> + return config->lmem_obj;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_config_get_lmem_obj - Take a reference to the struct &xe_bo backing VF LMEM.
* xe_gt_sriov_pf_config_get_lmem_obj() - Take ...
> + * @gt: the &xe_gt
> + * @vfid: the VF identifier
since you assert vfid below, add "(can't be 0)"
> + *
> + * This function can only be called on PF.
> + * The caller is responsible for calling xe_bo_put() on the returned object.
> + *
> + * Return: pointer to struct &xe_bo backing VF LMEM (if any).
> + */
> +struct xe_bo *xe_gt_sriov_pf_config_get_lmem_obj(struct xe_gt *gt, unsigned int vfid)
> +{
> + struct xe_bo *lmem_obj;
> +
> + xe_gt_assert(gt, vfid);
> +
> + mutex_lock(xe_gt_sriov_pf_master_mutex(gt));
> + lmem_obj = pf_get_vf_config_lmem_obj(gt, vfid);
> + xe_bo_get(lmem_obj);
> + mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
> +
> + return lmem_obj;
or just
{
guard(mutex)(xe_gt_sriov_pf_master_mutex(gt));
return xe_bo_get(pf_get_vf_config_lmem_obj(gt, vfid));
}
> +}
> +
> static u64 pf_query_free_lmem(struct xe_gt *gt)
> {
> struct xe_tile *tile = gt->tile;
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> index 6916b8f58ebf2..03c5dc0cd5fef 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> @@ -36,6 +36,7 @@ int xe_gt_sriov_pf_config_set_lmem(struct xe_gt *gt, unsigned int vfid, u64 size
> int xe_gt_sriov_pf_config_set_fair_lmem(struct xe_gt *gt, unsigned int vfid, unsigned int num_vfs);
> int xe_gt_sriov_pf_config_bulk_set_lmem(struct xe_gt *gt, unsigned int vfid, unsigned int num_vfs,
> u64 size);
> +struct xe_bo *xe_gt_sriov_pf_config_get_lmem_obj(struct xe_gt *gt, unsigned int vfid);
>
> u32 xe_gt_sriov_pf_config_get_exec_quantum(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_config_set_exec_quantum(struct xe_gt *gt, unsigned int vfid, u32 exec_quantum);
probably we should block VF reprovisioning during SAVE/RESTORE,
but that could be done later as a follow-up
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
* Re: [PATCH v2 15/26] drm/xe/pf: Handle GuC migration data as part of PF control
2025-10-21 22:41 ` [PATCH v2 15/26] drm/xe/pf: Handle GuC migration data as part of PF control Michał Winiarski
@ 2025-10-23 20:39 ` Michal Wajdeczko
2025-10-28 13:04 ` Michał Winiarski
0 siblings, 1 reply; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-23 20:39 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> Connect the helpers to allow save and restore of GuC migration data in
> stop_copy / resume device state.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 26 +++++++++++++++++--
> .../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 2 ++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 9 ++++++-
> 3 files changed, 34 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> index c159f35adcbe7..18f6e3028d4f0 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> @@ -188,6 +188,7 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
> CASE2STR(SAVE_WIP);
> CASE2STR(SAVE_PROCESS_DATA);
> CASE2STR(SAVE_WAIT_DATA);
> + CASE2STR(SAVE_DATA_GUC);
> CASE2STR(SAVE_DATA_DONE);
> CASE2STR(SAVE_FAILED);
> CASE2STR(SAVED);
> @@ -343,6 +344,7 @@ static void pf_exit_vf_mismatch(struct xe_gt *gt, unsigned int vfid)
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOP_FAILED);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_FAILED);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUME_FAILED);
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
this should be in one of the previous patches
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_FLR_FAILED);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED);
> @@ -824,6 +826,7 @@ static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
>
> pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
> pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA);
> + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
> pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
> }
> }
> @@ -848,6 +851,16 @@ static void pf_enter_vf_save_failed(struct xe_gt *gt, unsigned int vfid)
>
> static int pf_handle_vf_save_data(struct xe_gt *gt, unsigned int vfid)
> {
> + int ret;
> +
> + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC)) {
> + xe_gt_assert(gt, xe_gt_sriov_pf_migration_guc_size(gt, vfid) > 0);
> +
> + ret = xe_gt_sriov_pf_migration_guc_save(gt, vfid);
> + if (ret)
> + return ret;
> + }
> +
> return 0;
> }
>
> @@ -881,6 +894,7 @@ static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> {
> if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
> pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
> pf_enter_vf_wip(gt, vfid);
> pf_queue_vf(gt, vfid);
> return true;
> @@ -1046,14 +1060,22 @@ static int
> pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid)
> {
> struct xe_sriov_migration_data *data = xe_gt_sriov_pf_migration_restore_consume(gt, vfid);
> + int ret = 0;
>
> xe_gt_assert(gt, data);
>
> - xe_gt_sriov_notice(gt, "Skipping VF%u unknown data type: %d\n", vfid, data->type);
> + switch (data->type) {
> + case XE_SRIOV_MIGRATION_DATA_TYPE_GUC:
> + ret = xe_gt_sriov_pf_migration_guc_restore(gt, vfid, data);
> + break;
> + default:
> + xe_gt_sriov_notice(gt, "Skipping VF%u unknown data type: %d\n", vfid, data->type);
> + break;
> + }
>
> xe_sriov_migration_data_free(data);
>
> - return 0;
> + return ret;
> }
>
> static bool pf_handle_vf_restore(struct xe_gt *gt, unsigned int vfid)
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> index 35ceb2ff62110..8b951ee8a24fe 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> @@ -33,6 +33,7 @@
> * @XE_GT_SRIOV_STATE_SAVE_WIP: indicates that VF save operation is in progress.
> * @XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA: indicates that VF migration data is being produced.
> * @XE_GT_SRIOV_STATE_SAVE_WAIT_DATA: indicates that PF awaits for space in migration data ring.
> + * @XE_GT_SRIOV_STATE_SAVE_DATA_GUC: indicates PF needs to save VF GuC migration data.
> * @XE_GT_SRIOV_STATE_SAVE_DATA_DONE: indicates that all migration data was produced by Xe.
> * @XE_GT_SRIOV_STATE_SAVE_FAILED: indicates that VF save operation has failed.
> * @XE_GT_SRIOV_STATE_SAVED: indicates that VF data is saved.
> @@ -76,6 +77,7 @@ enum xe_gt_sriov_control_bits {
> XE_GT_SRIOV_STATE_SAVE_WIP,
> XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA,
> XE_GT_SRIOV_STATE_SAVE_WAIT_DATA,
> + XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
as DATA_GUC and the later-introduced DATA_GGTT/MMIO/VRAM are kind of sub-states of PROCESS_DATA,
better to keep them together:
XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA,
XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
XE_GT_SRIOV_STATE_SAVE_DATA_GGTT,
XE_GT_SRIOV_STATE_SAVE_DATA_MMIO,
XE_GT_SRIOV_STATE_SAVE_DATA_VRAM,
XE_GT_SRIOV_STATE_SAVE_DATA_DONE,
XE_GT_SRIOV_STATE_SAVE_WAIT_CONSUME,
and at some point you need to update the state diagram to include those DATA states
> XE_GT_SRIOV_STATE_SAVE_DATA_DONE,
> XE_GT_SRIOV_STATE_SAVE_FAILED,
> XE_GT_SRIOV_STATE_SAVED,
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> index 127162e8c66e8..594178fbe36d0 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> @@ -279,10 +279,17 @@ int xe_gt_sriov_pf_migration_guc_restore(struct xe_gt *gt, unsigned int vfid,
> ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
> {
> ssize_t total = 0;
> + ssize_t size;
>
> xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
>
> - /* Nothing to query yet - will be updated once per-GT migration data types are added */
> + size = xe_gt_sriov_pf_migration_guc_size(gt, vfid);
> + if (size < 0)
> + return size;
> + else if (size > 0)
"else" not needed
> + size += sizeof(struct xe_sriov_pf_migration_hdr);
> + total += size;
> +
> return total;
> }
>
* Re: [PATCH v2 16/26] drm/xe/pf: Add helpers for VF GGTT migration data handling
2025-10-21 22:41 ` [PATCH v2 16/26] drm/xe/pf: Add helpers for VF GGTT migration data handling Michał Winiarski
@ 2025-10-23 21:50 ` Michal Wajdeczko
2025-10-28 17:03 ` Michał Winiarski
2025-10-28 3:22 ` Tian, Kevin
1 sibling, 1 reply; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-23 21:50 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> In an upcoming change, the VF GGTT migration data will be handled as
> part of VF control state machine. Add the necessary helpers to allow the
> migration data transfer to/from the HW GGTT resource.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_ggtt.c | 100 +++++++++++++++++++++
> drivers/gpu/drm/xe/xe_ggtt.h | 3 +
> drivers/gpu/drm/xe/xe_ggtt_types.h | 2 +
> drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 44 +++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h | 5 ++
> 5 files changed, 154 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
> index 40680f0c49a17..99fe891c7939e 100644
> --- a/drivers/gpu/drm/xe/xe_ggtt.c
> +++ b/drivers/gpu/drm/xe/xe_ggtt.c
> @@ -151,6 +151,14 @@ static void xe_ggtt_set_pte_and_flush(struct xe_ggtt *ggtt, u64 addr, u64 pte)
> ggtt_update_access_counter(ggtt);
> }
>
> +static u64 xe_ggtt_get_pte(struct xe_ggtt *ggtt, u64 addr)
> +{
> + xe_tile_assert(ggtt->tile, !(addr & XE_PTE_MASK));
> + xe_tile_assert(ggtt->tile, addr < ggtt->size);
> +
> + return readq(&ggtt->gsm[addr >> XE_PTE_SHIFT]);
> +}
> +
> static void xe_ggtt_clear(struct xe_ggtt *ggtt, u64 start, u64 size)
> {
> u16 pat_index = tile_to_xe(ggtt->tile)->pat.idx[XE_CACHE_WB];
> @@ -233,16 +241,19 @@ void xe_ggtt_might_lock(struct xe_ggtt *ggtt)
> static const struct xe_ggtt_pt_ops xelp_pt_ops = {
> .pte_encode_flags = xelp_ggtt_pte_flags,
> .ggtt_set_pte = xe_ggtt_set_pte,
> + .ggtt_get_pte = xe_ggtt_get_pte,
> };
>
> static const struct xe_ggtt_pt_ops xelpg_pt_ops = {
> .pte_encode_flags = xelpg_ggtt_pte_flags,
> .ggtt_set_pte = xe_ggtt_set_pte,
> + .ggtt_get_pte = xe_ggtt_get_pte,
> };
>
> static const struct xe_ggtt_pt_ops xelpg_pt_wa_ops = {
> .pte_encode_flags = xelpg_ggtt_pte_flags,
> .ggtt_set_pte = xe_ggtt_set_pte_and_flush,
> + .ggtt_get_pte = xe_ggtt_get_pte,
> };
>
> static void __xe_ggtt_init_early(struct xe_ggtt *ggtt, u32 reserved)
> @@ -912,6 +923,22 @@ static void xe_ggtt_assign_locked(struct xe_ggtt *ggtt, const struct drm_mm_node
> xe_ggtt_invalidate(ggtt);
> }
>
> +/**
> + * xe_ggtt_pte_size() - Convert GGTT VMA size to page table entries size.
> + * @ggtt: the &xe_ggtt
> + * @size: GGTT VMA size in bytes
> + *
> + * Return: GGTT page table entries size in bytes.
> + */
> +size_t xe_ggtt_pte_size(struct xe_ggtt *ggtt, size_t size)
passing ggtt just for assert seems overkill
> +{
> + struct xe_device __maybe_unused *xe = tile_to_xe(ggtt->tile);
we try to avoid __maybe_unused
if you need xe/tile/gt just in the assert, then put to_xe/tile/gt inside it
> +
> + xe_assert(xe, size % XE_PAGE_SIZE == 0);
> +
> + return size / XE_PAGE_SIZE * sizeof(u64);
> +}
> +
> /**
> * xe_ggtt_assign - assign a GGTT region to the VF
> * @node: the &xe_ggtt_node to update
> @@ -927,6 +954,79 @@ void xe_ggtt_assign(const struct xe_ggtt_node *node, u16 vfid)
> xe_ggtt_assign_locked(node->ggtt, &node->base, vfid);
> mutex_unlock(&node->ggtt->lock);
> }
> +
> +/**
> + * xe_ggtt_node_save() - Save a &xe_ggtt_node to a buffer.
> + * @node: the &xe_ggtt_node to be saved
> + * @dst: destination buffer
> + * @size: destination buffer size in bytes
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_ggtt_node_save(struct xe_ggtt_node *node, void *dst, size_t size)
> +{
> + struct xe_ggtt *ggtt;
> + u64 start, end;
> + u64 *buf = dst;
> +
> + if (!node)
> + return -ENOENT;
> +
> + guard(mutex)(&node->ggtt->lock);
> +
> + ggtt = node->ggtt;
> + start = node->base.start;
> + end = start + node->base.size - 1;
> +
> + if (xe_ggtt_pte_size(ggtt, node->base.size) > size)
> + return -EINVAL;
> +
> + while (start < end) {
> + *buf++ = ggtt->pt_ops->ggtt_get_pte(ggtt, start) & ~GGTT_PTE_VFID;
up to this point the function is generic, non-IOV, so maybe leave the PTEs as-is and do not sanitize the VFID?
or, similar to node_load(), also pass vfid to enforce additional checks?
pte = ggtt->pt_ops->ggtt_get_pte(ggtt, start);
if (vfid != u64_get_bits(pte, GGTT_PTE_VFID))
return -EPERM;
then optionally sanitize using:
*buf++ = u64_replace_bits(pte, 0, GGTT_PTE_VFID);
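[For readers unfamiliar with the bitfield helpers: u64_replace_bits() just clears the field and ORs in the new value. A standalone approximation follows; the VFID bit placement below is invented for illustration, the real layout is defined by GGTT_PTE_VFID in the Xe headers:]

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative PTE layout only: here the VFID is assumed to sit in
 * bits [6:2]; the real field is GGTT_PTE_VFID in the Xe headers. */
#define PTE_VFID_SHIFT 2
#define PTE_VFID_MASK  (0x1fULL << PTE_VFID_SHIFT)

/* Open-coded equivalent of u64_replace_bits(pte, vfid, PTE_VFID_MASK):
 * clear the field, then OR in the (masked) new value. */
static uint64_t pte_replace_vfid(uint64_t pte, uint64_t vfid)
{
	return (pte & ~PTE_VFID_MASK) |
	       ((vfid << PTE_VFID_SHIFT) & PTE_VFID_MASK);
}

/* Open-coded equivalent of u64_get_bits(pte, PTE_VFID_MASK). */
static uint64_t pte_get_vfid(uint64_t pte)
{
	return (pte & PTE_VFID_MASK) >> PTE_VFID_SHIFT;
}
```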
> + start += XE_PAGE_SIZE;
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * xe_ggtt_node_load() - Load a &xe_ggtt_node from a buffer.
> + * @node: the &xe_ggtt_node to be loaded
> + * @src: source buffer
> + * @size: source buffer size in bytes
> + * @vfid: VF identifier
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_ggtt_node_load(struct xe_ggtt_node *node, const void *src, size_t size, u16 vfid)
> +{
> + u64 vfid_pte = xe_encode_vfid_pte(vfid);
> + const u64 *buf = src;
> + struct xe_ggtt *ggtt;
> + u64 start, end;
> +
> + if (!node)
> + return -ENOENT;
> +
> + guard(mutex)(&node->ggtt->lock);
> +
> + ggtt = node->ggtt;
> + start = node->base.start;
> + end = start + size - 1;
> +
> + if (xe_ggtt_pte_size(ggtt, node->base.size) != size)
> + return -EINVAL;
> +
> + while (start < end) {
> + ggtt->pt_ops->ggtt_set_pte(ggtt, start, (*buf & ~GGTT_PTE_VFID) | vfid_pte);
pte = u64_replace_bits(*buf++, vfid, GGTT_PTE_VFID));
ggtt_set_pte(ggtt, start, pte);
> + start += XE_PAGE_SIZE;
> + buf++;
> + }
> + xe_ggtt_invalidate(ggtt);
> +
> + return 0;
> +}
> +
> #endif
>
> /**
> diff --git a/drivers/gpu/drm/xe/xe_ggtt.h b/drivers/gpu/drm/xe/xe_ggtt.h
> index 75fc7a1efea76..5f55f80fe3adc 100644
> --- a/drivers/gpu/drm/xe/xe_ggtt.h
> +++ b/drivers/gpu/drm/xe/xe_ggtt.h
> @@ -42,7 +42,10 @@ int xe_ggtt_dump(struct xe_ggtt *ggtt, struct drm_printer *p);
> u64 xe_ggtt_print_holes(struct xe_ggtt *ggtt, u64 alignment, struct drm_printer *p);
>
> #ifdef CONFIG_PCI_IOV
> +size_t xe_ggtt_pte_size(struct xe_ggtt *ggtt, size_t size);
this could be a generic (not PCI-IOV only) inline helper or macro, here or in the .c file:
size_t to_xe_ggtt_pt_size(size_t size);
and then more elegant solution would be to expose
size_t xe_ggtt_node_pt_size(const struct xe_ggtt_node *node);
and yes, that would require additionally exposing something from gt_sriov_pf_config,
as the migration code doesn't have access to this node,
but maybe xe_gt_sriov_pf_config_ggtt_save() can be updated to also support a 'query' mode?
size_t xe_gt_sriov_pf_config_ggtt_save(gt, vfid, buf, size) -> bytes saved
size_t xe_gt_sriov_pf_config_ggtt_save(gt, vfid, NULL, 0) -> size to be saved
> void xe_ggtt_assign(const struct xe_ggtt_node *node, u16 vfid);
> +int xe_ggtt_node_save(struct xe_ggtt_node *node, void *dst, size_t size);
> +int xe_ggtt_node_load(struct xe_ggtt_node *node, const void *src, size_t size, u16 vfid);
> #endif
>
> #ifndef CONFIG_LOCKDEP
> diff --git a/drivers/gpu/drm/xe/xe_ggtt_types.h b/drivers/gpu/drm/xe/xe_ggtt_types.h
> index c5e999d58ff2a..dacd796f81844 100644
> --- a/drivers/gpu/drm/xe/xe_ggtt_types.h
> +++ b/drivers/gpu/drm/xe/xe_ggtt_types.h
> @@ -78,6 +78,8 @@ struct xe_ggtt_pt_ops {
> u64 (*pte_encode_flags)(struct xe_bo *bo, u16 pat_index);
> /** @ggtt_set_pte: Directly write into GGTT's PTE */
> void (*ggtt_set_pte)(struct xe_ggtt *ggtt, u64 addr, u64 pte);
> + /** @ggtt_get_pte: Directly read from GGTT's PTE */
> + u64 (*ggtt_get_pte)(struct xe_ggtt *ggtt, u64 addr);
> };
>
> #endif
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> index c0c0215c07036..c857879e28fe5 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> @@ -726,6 +726,50 @@ int xe_gt_sriov_pf_config_set_fair_ggtt(struct xe_gt *gt, unsigned int vfid,
> return xe_gt_sriov_pf_config_bulk_set_ggtt(gt, vfid, num_vfs, fair);
> }
>
> +/**
> + * xe_gt_sriov_pf_config_ggtt_save() - Save a VF provisioned GGTT data into a buffer.
> + * @gt: the &xe_gt
> + * @vfid: VF identifier (can't be 0)
> + * @buf: the GGTT data destination buffer
> + * @size: the size of the buffer
> + *
> + * This function can only be called on PF.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_config_ggtt_save(struct xe_gt *gt, unsigned int vfid,
> + void *buf, size_t size)
> +{
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> + xe_gt_assert(gt, vfid);
> +
> + guard(mutex)(xe_gt_sriov_pf_master_mutex(gt));
> +
> + return xe_ggtt_node_save(pf_pick_vf_config(gt, vfid)->ggtt_region, buf, size);
> +}
> +
> +/**
> + * xe_gt_sriov_pf_config_ggtt_restore() - Restore a VF provisioned GGTT data from a buffer.
> + * @gt: the &xe_gt
> + * @vfid: VF identifier (can't be 0)
> + * @buf: the GGTT data source buffer
> + * @size: the size of the buffer
> + *
> + * This function can only be called on PF.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_config_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
> + const void *buf, size_t size)
> +{
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> + xe_gt_assert(gt, vfid);
> +
> + guard(mutex)(xe_gt_sriov_pf_master_mutex(gt));
> +
> + return xe_ggtt_node_load(pf_pick_vf_config(gt, vfid)->ggtt_region, buf, size, vfid);
> +}
> +
> static u32 pf_get_min_spare_ctxs(struct xe_gt *gt)
> {
> /* XXX: preliminary */
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> index 513e6512a575b..6916b8f58ebf2 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> @@ -61,6 +61,11 @@ ssize_t xe_gt_sriov_pf_config_save(struct xe_gt *gt, unsigned int vfid, void *bu
> int xe_gt_sriov_pf_config_restore(struct xe_gt *gt, unsigned int vfid,
> const void *buf, size_t size);
>
> +int xe_gt_sriov_pf_config_ggtt_save(struct xe_gt *gt, unsigned int vfid,
> + void *buf, size_t size);
> +int xe_gt_sriov_pf_config_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
> + const void *buf, size_t size);
> +
> bool xe_gt_sriov_pf_config_is_empty(struct xe_gt *gt, unsigned int vfid);
>
> int xe_gt_sriov_pf_config_init(struct xe_gt *gt);
* Re: [PATCH v2 18/26] drm/xe/pf: Add helpers for VF MMIO migration data handling
2025-10-21 22:41 ` [PATCH v2 18/26] drm/xe/pf: Add helpers for VF MMIO migration data handling Michał Winiarski
@ 2025-10-23 22:10 ` Michal Wajdeczko
2025-10-28 23:37 ` Michał Winiarski
0 siblings, 1 reply; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-23 22:10 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, intel-xe, linux-kernel, kvm,
Matthew Brost
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> In an upcoming change, the VF MMIO migration data will be handled as
> part of VF control state machine. Add the necessary helpers to allow the
> migration data transfer to/from the VF MMIO registers.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 88 +++++++++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf.h | 6 ++
wrong place for those helpers
just promote xe_reg_vf_to_pf()
or maybe it can be done like this:
void xe_mmio_init_vf(struct xe_mmio *vf, const struct xe_mmio *pf, vfid);
then
struct xe_mmio mmio_vf;
xe_mmio_init_vf(&mmio_vf, >->mmio, vfid);
val = xe_mmio_read32(&mmio_vf, REG);
xe_mmio_write32(&mmio_vf, val, REG);
let me try to check this out
> 2 files changed, 94 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> index c4dda87b47cc8..31ee86166dfd0 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> @@ -194,6 +194,94 @@ static void pf_clear_vf_scratch_regs(struct xe_gt *gt, unsigned int vfid)
> }
> }
>
> +/**
> + * xe_gt_sriov_pf_mmio_vf_size - Get the size of VF MMIO register data.
> + * @gt: the &xe_gt
> + * @vfid: VF identifier
> + *
> + * Return: size in bytes.
> + */
> +size_t xe_gt_sriov_pf_mmio_vf_size(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (xe_gt_is_media_type(gt))
> + return MED_VF_SW_FLAG_COUNT * sizeof(u32);
> + else
> + return VF_SW_FLAG_COUNT * sizeof(u32);
> +}
> +
> +/**
> + * xe_gt_sriov_pf_mmio_vf_save - Save VF MMIO register values to a buffer.
> + * @gt: the &xe_gt
> + * @vfid: VF identifier
> + * @buf: destination buffer
> + * @size: destination buffer size in bytes
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_mmio_vf_save(struct xe_gt *gt, unsigned int vfid, void *buf, size_t size)
> +{
> + u32 stride = pf_get_vf_regs_stride(gt_to_xe(gt));
> + struct xe_reg scratch;
> + u32 *regs = buf;
> + int n, count;
> +
> + if (size != xe_gt_sriov_pf_mmio_vf_size(gt, vfid))
> + return -EINVAL;
> +
> + if (xe_gt_is_media_type(gt)) {
> + count = MED_VF_SW_FLAG_COUNT;
> + for (n = 0; n < count; n++) {
> + scratch = xe_reg_vf_to_pf(MED_VF_SW_FLAG(n), vfid, stride);
> + regs[n] = xe_mmio_read32(&gt->mmio, scratch);
> + }
> + } else {
> + count = VF_SW_FLAG_COUNT;
> + for (n = 0; n < count; n++) {
> + scratch = xe_reg_vf_to_pf(VF_SW_FLAG(n), vfid, stride);
> + regs[n] = xe_mmio_read32(&gt->mmio, scratch);
> + }
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_mmio_vf_restore - Restore VF MMIO register values from a buffer.
> + * @gt: the &xe_gt
> + * @vfid: VF identifier
> + * @buf: source buffer
> + * @size: source buffer size in bytes
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_mmio_vf_restore(struct xe_gt *gt, unsigned int vfid,
> + const void *buf, size_t size)
> +{
> + u32 stride = pf_get_vf_regs_stride(gt_to_xe(gt));
> + const u32 *regs = buf;
> + struct xe_reg scratch;
> + int n, count;
> +
> + if (size != xe_gt_sriov_pf_mmio_vf_size(gt, vfid))
> + return -EINVAL;
> +
> + if (xe_gt_is_media_type(gt)) {
> + count = MED_VF_SW_FLAG_COUNT;
> + for (n = 0; n < count; n++) {
> + scratch = xe_reg_vf_to_pf(MED_VF_SW_FLAG(n), vfid, stride);
> + xe_mmio_write32(&gt->mmio, scratch, regs[n]);
> + }
> + } else {
> + count = VF_SW_FLAG_COUNT;
> + for (n = 0; n < count; n++) {
> + scratch = xe_reg_vf_to_pf(VF_SW_FLAG(n), vfid, stride);
> + xe_mmio_write32(&gt->mmio, scratch, regs[n]);
> + }
> + }
> +
> + return 0;
> +}
> +
> /**
> * xe_gt_sriov_pf_sanitize_hw() - Reset hardware state related to a VF.
> * @gt: the &xe_gt
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
> index e7fde3f9937af..7f4f1fda5f77a 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
> @@ -6,6 +6,8 @@
> #ifndef _XE_GT_SRIOV_PF_H_
> #define _XE_GT_SRIOV_PF_H_
>
> +#include <linux/types.h>
> +
> struct xe_gt;
>
> #ifdef CONFIG_PCI_IOV
> @@ -16,6 +18,10 @@ void xe_gt_sriov_pf_init_hw(struct xe_gt *gt);
> void xe_gt_sriov_pf_sanitize_hw(struct xe_gt *gt, unsigned int vfid);
> void xe_gt_sriov_pf_stop_prepare(struct xe_gt *gt);
> void xe_gt_sriov_pf_restart(struct xe_gt *gt);
> +size_t xe_gt_sriov_pf_mmio_vf_size(struct xe_gt *gt, unsigned int vfid);
> +int xe_gt_sriov_pf_mmio_vf_save(struct xe_gt *gt, unsigned int vfid, void *buf, size_t size);
> +int xe_gt_sriov_pf_mmio_vf_restore(struct xe_gt *gt, unsigned int vfid,
> + const void *buf, size_t size);
> #else
> static inline int xe_gt_sriov_pf_init_early(struct xe_gt *gt)
> {
^ permalink raw reply [flat|nested] 72+ messages in thread
* RE: [PATCH v2 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-21 22:41 ` [PATCH v2 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics Michał Winiarski
2025-10-22 7:12 ` Christoph Hellwig
@ 2025-10-27 7:24 ` Tian, Kevin
2025-10-29 20:46 ` Winiarski, Michal
2025-10-27 7:26 ` Tian, Kevin
2 siblings, 1 reply; 72+ messages in thread
From: Tian, Kevin @ 2025-10-27 7:24 UTC (permalink / raw)
To: Winiarski, Michal, Alex Williamson, De Marchi, Lucas,
Thomas Hellström, Vivi, Rodrigo, Jason Gunthorpe,
Yishai Hadas, intel-xe@lists.freedesktop.org,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Brost, Matthew,
Wajdeczko, Michal
Cc: dri-devel@lists.freedesktop.org, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Laguna, Lukasz
> From: Winiarski, Michal <michal.winiarski@intel.com>
> Sent: Wednesday, October 22, 2025 6:42 AM
> +
> +/**
> + * struct xe_vfio_pci_migration_file - file used for reading / writing migration
> data
> + */
let's use the comment style in vfio, i.e. "/*" instead of "/**"
> +struct xe_vfio_pci_migration_file {
> + /** @filp: pointer to underlying &struct file */
> + struct file *filp;
> + /** @lock: serializes accesses to migration data */
> + struct mutex lock;
> + /** @xe_vdev: backpointer to &struct xe_vfio_pci_core_device */
> + struct xe_vfio_pci_core_device *xe_vdev;
above comments are obvious...
> +struct xe_vfio_pci_core_device {
> + /** @core_device: vendor-agnostic VFIO device */
> + struct vfio_pci_core_device core_device;
> +
> + /** @mig_state: current device migration state */
> + enum vfio_device_mig_state mig_state;
> +
> + /** @vfid: VF number used by PF, xe uses 1-based indexing for vfid */
> + unsigned int vfid;
is 1-based indexing a sw or hw requirement?
> +
> + /** @pf: pointer to driver_private of physical function */
> + struct pci_dev *pf;
> +
> + /** @fd: &struct xe_vfio_pci_migration_file for userspace to read/write migration data */
> + struct xe_vfio_pci_migration_file *fd;
s/fd/migf/, as 'fd' is an integer in all other places
btw it's risky w/o a lock protecting the state transition. See the usage of
state_mutex in other migration drivers.
> +static void xe_vfio_pci_reset_done(struct pci_dev *pdev)
> +{
> + struct xe_vfio_pci_core_device *xe_vdev = pci_get_drvdata(pdev);
> + int ret;
> +
> + ret = xe_sriov_vfio_wait_flr_done(xe_vdev->pf, xe_vdev->vfid);
> + if (ret)
> + dev_err(&pdev->dev, "Failed to wait for FLR: %d\n", ret);
why is there a device-specific wait for FLR done? suppose it's already
covered by the PCI core...
> +
> + xe_vfio_pci_reset(xe_vdev);
> +}
> +
> +static const struct pci_error_handlers xe_vfio_pci_err_handlers = {
> + .reset_done = xe_vfio_pci_reset_done,
> +};
missing ".error_detected"
> +static struct xe_vfio_pci_migration_file *
> +xe_vfio_pci_alloc_file(struct xe_vfio_pci_core_device *xe_vdev,
> + enum xe_vfio_pci_file_type type)
> +{
> + struct xe_vfio_pci_migration_file *migf;
> + const struct file_operations *fops;
> + int flags;
> +
> + migf = kzalloc(sizeof(*migf), GFP_KERNEL);
> + if (!migf)
> + return ERR_PTR(-ENOMEM);
> +
> + fops = type == XE_VFIO_FILE_SAVE ? &xe_vfio_pci_save_fops : &xe_vfio_pci_resume_fops;
> + flags = type == XE_VFIO_FILE_SAVE ? O_RDONLY : O_WRONLY;
> + migf->filp = anon_inode_getfile("xe_vfio_mig", fops, migf, flags);
> + if (IS_ERR(migf->filp)) {
> + kfree(migf);
> + return ERR_CAST(migf->filp);
> + }
> +
> + mutex_init(&migf->lock);
> + migf->xe_vdev = xe_vdev;
> + xe_vdev->fd = migf;
> +
> + stream_open(migf->filp->f_inode, migf->filp);
> +
> + return migf;
missing a get_file(). vfio core will do another fput() upon error.
see vfio_ioct_mig_return_fd()
> +}
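The underlying issue is reference counting: `anon_inode_getfile()` returns the file with a single reference, the driver keeps its own pointer to it, and the vfio core's error path does an `fput()` on the fd it was handed, so without an extra `get_file()` the driver's pointer is left dangling. A toy refcount model of why the extra get is needed (names are illustrative, not the kernel API):

```c
struct toy_file { int refcount; };

void toy_get_file(struct toy_file *f) { f->refcount++; }

/* returns 1 if the object is still alive after the put */
int toy_fput(struct toy_file *f) { return --f->refcount > 0; }

/*
 * Model of the alloc path: one reference comes from allocation, the
 * driver takes a second one before handing the file to the core, and
 * the core's error path drops its copy.  The driver's reference keeps
 * the object alive.
 */
int alloc_and_return(struct toy_file *f)
{
	f->refcount = 1;    /* reference from anon_inode_getfile() */
	toy_get_file(f);    /* extra reference for the fd given to the core */
	return toy_fput(f); /* core error path drops its reference */
}
```

Without the `toy_get_file()` step above, the final put would drop the count to zero while the driver still holds a pointer.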
> +
> +static struct file *
> +xe_vfio_set_state(struct xe_vfio_pci_core_device *xe_vdev, u32 new)
> +{
> + u32 cur = xe_vdev->mig_state;
> + int ret;
> +
> + dev_dbg(xe_vdev_to_dev(xe_vdev),
> + "state: %s->%s\n", vfio_dev_state_str(cur), vfio_dev_state_str(new));
> +
> + /*
> + * "STOP" handling is reused for "RUNNING_P2P", as the device doesn't have the capability to
> + * selectively block p2p DMA transfers.
> + * The device is not processing new workload requests when the VF is stopped, and both
> + * memory and MMIO communication channels are transferred to destination (where processing
> + * will be resumed).
> + */
> + if ((cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_STOP) ||
this is not required when P2P is supported. vfio_mig_get_next_state() will
find the right arc from RUNNING to RUNNING_P2P to STOP.
> + (cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_RUNNING_P2P)) {
> + ret = xe_sriov_vfio_stop(xe_vdev->pf, xe_vdev->vfid);
> + if (ret)
> + goto err;
> +
> + return NULL;
> + }
better to align with other drivers, s/stop/suspend/ and s/run/resume/
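For reference, the arc decomposition Kevin mentions above means the driver never sees a direct RUNNING→STOP request when it advertises P2P: the core's vfio_mig_get_next_state() (drivers/vfio/vfio_main.c) splits it into single steps through RUNNING_P2P. A deliberately simplified model of that stepping, not the real table-driven implementation:

```c
enum st { RUNNING, RUNNING_P2P, STOP };

/*
 * Simplified next-step function for a P2P-capable device: given the
 * current state and the final state userspace asked for, return the
 * single next state to transition to.
 */
enum st next_state(enum st cur, enum st final)
{
	if (cur == final)
		return cur;
	/* RUNNING <-> STOP always routes through RUNNING_P2P */
	if (cur == RUNNING || cur == STOP)
		return RUNNING_P2P;
	/* from RUNNING_P2P either end is reachable directly */
	return final;
}
```

The core calls the driver's set-state op once per returned step, which is why the combined `RUNNING && new == STOP` arc in the quoted code is unreachable when P2P is supported.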
> +
> + if ((cur == VFIO_DEVICE_STATE_RUNNING_P2P && new == VFIO_DEVICE_STATE_STOP) ||
> + (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RUNNING_P2P))
> + return NULL;
> +
> + if ((cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RUNNING) ||
> + (cur == VFIO_DEVICE_STATE_RUNNING_P2P && new == VFIO_DEVICE_STATE_RUNNING)) {
> + ret = xe_sriov_vfio_run(xe_vdev->pf, xe_vdev->vfid);
> + if (ret)
> + goto err;
> +
> + return NULL;
> + }
> +
> + if (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_STOP_COPY) {
> + struct xe_vfio_pci_migration_file *migf;
> +
> + migf = xe_vfio_pci_alloc_file(xe_vdev, XE_VFIO_FILE_SAVE);
> + if (IS_ERR(migf)) {
> + ret = PTR_ERR(migf);
> + goto err;
> + }
> +
> + ret = xe_sriov_vfio_stop_copy_enter(xe_vdev->pf, xe_vdev->vfid);
> + if (ret) {
> + fput(migf->filp);
> + goto err;
> + }
> +
> + return migf->filp;
> + }
> +
> + if ((cur == VFIO_DEVICE_STATE_STOP_COPY && new == VFIO_DEVICE_STATE_STOP)) {
> + if (xe_vdev->fd)
> + xe_vfio_pci_disable_file(xe_vdev->fd);
> +
> + xe_sriov_vfio_stop_copy_exit(xe_vdev->pf, xe_vdev->vfid);
> +
> + return NULL;
> + }
> +
> + if (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RESUMING) {
> + struct xe_vfio_pci_migration_file *migf;
> +
> + migf = xe_vfio_pci_alloc_file(xe_vdev, XE_VFIO_FILE_RESUME);
> + if (IS_ERR(migf)) {
> + ret = PTR_ERR(migf);
> + goto err;
> + }
> +
> + ret = xe_sriov_vfio_resume_enter(xe_vdev->pf, xe_vdev->vfid);
> + if (ret) {
> + fput(migf->filp);
> + goto err;
> + }
> +
> + return migf->filp;
> + }
> +
> + if (cur == VFIO_DEVICE_STATE_RESUMING && new == VFIO_DEVICE_STATE_STOP) {
> + if (xe_vdev->fd)
> + xe_vfio_pci_disable_file(xe_vdev->fd);
> +
> + xe_sriov_vfio_resume_exit(xe_vdev->pf, xe_vdev->vfid);
> +
> + return NULL;
> + }
> +
> + if (new == VFIO_DEVICE_STATE_ERROR)
> + xe_sriov_vfio_error(xe_vdev->pf, xe_vdev->vfid);
the ERROR state is not passed to the variant driver. You'll get -EINVAL
from vfio_mig_get_next_state(), so this is dead code.
If the PF driver needs to be notified, you could check the ret value instead.
> +static void xe_vfio_pci_migration_init(struct vfio_device *core_vdev)
> +{
> + struct xe_vfio_pci_core_device *xe_vdev =
> + container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
> + struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
> +
> + if (!xe_sriov_vfio_migration_supported(pdev->physfn))
> + return;
> +
> + /* vfid starts from 1 for xe */
> + xe_vdev->vfid = pci_iov_vf_id(pdev) + 1;
pci_iov_vf_id() returns an error if the device is not a VF. should be checked.
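A sketch of the check being asked for: `pci_iov_vf_id()` returns a zero-based VF index or a negative errno, so the conversion to xe's 1-based vfid should only happen after the error check. The stub below stands in for the real PCI call; the helper name is invented for illustration:

```c
#include <errno.h>

/* Stand-in for pci_iov_vf_id(): zero-based VF index, or -EINVAL for non-VFs. */
int stub_pci_iov_vf_id(int is_virtfn, int vf_index)
{
	return is_virtfn ? vf_index : -EINVAL;
}

/* Returns xe's 1-based vfid, or a negative errno. */
int xe_vfid_from_pdev(int is_virtfn, int vf_index)
{
	int id = stub_pci_iov_vf_id(is_virtfn, vf_index);

	if (id < 0)
		return id; /* propagate instead of computing a bogus vfid from it */
	return id + 1;     /* xe uses 1-based indexing for vfid */
}
```

Without the check, an error return of `-EINVAL` would silently become vfid 0 after the `+ 1`, wait no, it would become a large garbage value when stored in the unsigned `vfid` field.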
> +static int xe_vfio_pci_init_dev(struct vfio_device *core_vdev)
> +{
> + struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
> +
> + if (pdev->is_virtfn && strcmp(pdev->physfn->dev.driver->name, "xe") == 0)
> + xe_vfio_pci_migration_init(core_vdev);
I didn't see the point of checking the driver name.
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Intel Corporation");
please use the author name, as other drivers do
^ permalink raw reply [flat|nested] 72+ messages in thread
* RE: [PATCH v2 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-21 22:41 ` [PATCH v2 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics Michał Winiarski
2025-10-22 7:12 ` Christoph Hellwig
2025-10-27 7:24 ` Tian, Kevin
@ 2025-10-27 7:26 ` Tian, Kevin
2 siblings, 0 replies; 72+ messages in thread
From: Tian, Kevin @ 2025-10-27 7:26 UTC (permalink / raw)
To: Winiarski, Michal, Alex Williamson, De Marchi, Lucas,
Thomas Hellström, Vivi, Rodrigo, Jason Gunthorpe,
Yishai Hadas, intel-xe@lists.freedesktop.org,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Brost, Matthew,
Wajdeczko, Michal
Cc: dri-devel@lists.freedesktop.org, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Laguna, Lukasz
> From: Winiarski, Michal <michal.winiarski@intel.com>
> Sent: Wednesday, October 22, 2025 6:42 AM
> To: Alex Williamson <alex.williamson@redhat.com>;
You may need to run get_maintainer.pl again. Alex just got a new mail
address.
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [PATCH v2 03/26] drm/xe/pf: Add save/restore control state stubs and connect to debugfs
2025-10-22 22:31 ` Michal Wajdeczko
@ 2025-10-27 12:02 ` Michał Winiarski
0 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-27 12:02 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, dri-devel, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna
On Thu, Oct 23, 2025 at 12:31:47AM +0200, Michal Wajdeczko wrote:
>
>
> On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> > The states will be used by upcoming changes to produce (in case of save)
> > or consume (in case of resume) the VF migration data.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 248 ++++++++++++++++++
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h | 6 +
> > .../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 14 +
> > drivers/gpu/drm/xe/xe_sriov_pf_control.c | 96 +++++++
> > drivers/gpu/drm/xe/xe_sriov_pf_control.h | 4 +
> > drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 38 +++
> > 6 files changed, 406 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > index 2e6bd3d1fe1da..b770916e88e53 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > @@ -184,6 +184,12 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
> > CASE2STR(PAUSE_SAVE_GUC);
> > CASE2STR(PAUSE_FAILED);
> > CASE2STR(PAUSED);
> > + CASE2STR(SAVE_WIP);
> > + CASE2STR(SAVE_FAILED);
> > + CASE2STR(SAVED);
> > + CASE2STR(RESTORE_WIP);
> > + CASE2STR(RESTORE_FAILED);
> > + CASE2STR(RESTORED);
> > CASE2STR(RESUME_WIP);
> > CASE2STR(RESUME_SEND_RESUME);
> > CASE2STR(RESUME_FAILED);
> > @@ -208,6 +214,8 @@ static unsigned long pf_get_default_timeout(enum xe_gt_sriov_control_bits bit)
> > case XE_GT_SRIOV_STATE_FLR_WIP:
> > case XE_GT_SRIOV_STATE_FLR_RESET_CONFIG:
> > return 5 * HZ;
> > + case XE_GT_SRIOV_STATE_RESTORE_WIP:
> > + return 20 * HZ;
> > default:
> > return HZ;
> > }
> > @@ -329,6 +337,8 @@ static void pf_exit_vf_mismatch(struct xe_gt *gt, unsigned int vfid)
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_FAILED);
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUME_FAILED);
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_FLR_FAILED);
> > + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
> > + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED);
> > }
> >
> > #define pf_enter_vf_state_machine_bug(gt, vfid) ({ \
> > @@ -359,6 +369,8 @@ static void pf_queue_vf(struct xe_gt *gt, unsigned int vfid)
> >
> > static void pf_exit_vf_flr_wip(struct xe_gt *gt, unsigned int vfid);
> > static void pf_exit_vf_stop_wip(struct xe_gt *gt, unsigned int vfid);
> > +static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid);
> > +static void pf_exit_vf_restore_wip(struct xe_gt *gt, unsigned int vfid);
> > static void pf_exit_vf_pause_wip(struct xe_gt *gt, unsigned int vfid);
> > static void pf_exit_vf_resume_wip(struct xe_gt *gt, unsigned int vfid);
> >
> > @@ -380,6 +392,8 @@ static void pf_exit_vf_wip(struct xe_gt *gt, unsigned int vfid)
> >
> > pf_exit_vf_flr_wip(gt, vfid);
> > pf_exit_vf_stop_wip(gt, vfid);
> > + pf_exit_vf_save_wip(gt, vfid);
> > + pf_exit_vf_restore_wip(gt, vfid);
> > pf_exit_vf_pause_wip(gt, vfid);
> > pf_exit_vf_resume_wip(gt, vfid);
> >
> > @@ -399,6 +413,8 @@ static void pf_enter_vf_ready(struct xe_gt *gt, unsigned int vfid)
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOPPED);
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUMED);
> > + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED);
> > + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED);
> > pf_exit_vf_mismatch(gt, vfid);
> > pf_exit_vf_wip(gt, vfid);
> > }
> > @@ -675,6 +691,8 @@ static void pf_enter_vf_resumed(struct xe_gt *gt, unsigned int vfid)
> > {
> > pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUMED);
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> > + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED);
> > + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED);
> > pf_exit_vf_mismatch(gt, vfid);
> > pf_exit_vf_wip(gt, vfid);
> > }
> > @@ -753,6 +771,16 @@ int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid)
> > return -EPERM;
> > }
> >
> > + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
> > + xe_gt_sriov_dbg(gt, "VF%u save is in progress!\n", vfid);
> > + return -EBUSY;
> > + }
> > +
> > + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> > + xe_gt_sriov_dbg(gt, "VF%u restore is in progress!\n", vfid);
> > + return -EBUSY;
> > + }
> > +
> > if (!pf_enter_vf_resume_wip(gt, vfid)) {
> > xe_gt_sriov_dbg(gt, "VF%u resume already in progress!\n", vfid);
> > return -EALREADY;
> > @@ -776,6 +804,218 @@ int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid)
> > return -ECANCELED;
> > }
> >
> > +static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
> > +}
> > +
> > +static void pf_enter_vf_saved(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + if (!pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED))
> > + pf_enter_vf_state_machine_bug(gt, vfid);
> > +
> > + xe_gt_sriov_dbg(gt, "VF%u saved!\n", vfid);
>
> nit: you can move expect(PAUSED) here
Ok.
>
> > +
> > + pf_exit_vf_mismatch(gt, vfid);
> > + pf_exit_vf_wip(gt, vfid);
> > + pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> > +}
> > +
> > +static bool pf_handle_vf_save(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP))
> > + return false;
> > +
> > + pf_enter_vf_saved(gt, vfid);
> > +
> > + return true;
> > +}
> > +
> > +static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
> > + pf_enter_vf_wip(gt, vfid);
> > + pf_queue_vf(gt, vfid);
> > + return true;
> > + }
> > +
> > + return false;
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_control_trigger_save_vf() - Start an SR-IOV VF migration data save sequence.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * This function is for PF only.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_control_trigger_save_vf(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOPPED)) {
> > + xe_gt_sriov_dbg(gt, "VF%u is stopped!\n", vfid);
> > + return -EPERM;
> > + }
> > +
> > + if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED)) {
> > + xe_gt_sriov_dbg(gt, "VF%u is not paused!\n", vfid);
> > + return -EPERM;
> > + }
> > +
> > + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> > + xe_gt_sriov_dbg(gt, "VF%u restore is in progress!\n", vfid);
> > + return -EBUSY;
> > + }
> > +
> > + if (!pf_enter_vf_save_wip(gt, vfid)) {
> > + xe_gt_sriov_dbg(gt, "VF%u save already in progress!\n", vfid);
> > + return -EALREADY;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_control_finish_save_vf() - Complete a VF migration data save sequence.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * This function is for PF only.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_control_finish_save_vf(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + if (!pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED)) {
> > + pf_enter_vf_mismatch(gt, vfid);
> > + return -EIO;
> > + }
> > +
> > + pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> > +
> > + return 0;
> > +}
> > +
> > +static void pf_exit_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP);
> > +}
> > +
> > +static void pf_enter_vf_restored(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + if (!pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED))
> > + pf_enter_vf_state_machine_bug(gt, vfid);
> > +
> > + xe_gt_sriov_dbg(gt, "VF%u restored!\n", vfid);
> > +
> > + pf_exit_vf_mismatch(gt, vfid);
> > + pf_exit_vf_wip(gt, vfid);
> > + pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> > +}
> > +
> > +static bool pf_handle_vf_restore(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
> > + return false;
> > +
> > + pf_enter_vf_restored(gt, vfid);
> > +
> > + return true;
> > +}
> > +
> > +static bool pf_enter_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> > + pf_enter_vf_wip(gt, vfid);
> > + pf_queue_vf(gt, vfid);
> > + return true;
> > + }
> > +
> > + return false;
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_control_trigger_restore_vf() - Start an SR-IOV VF migration data restore sequence.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * This function is for PF only.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_control_trigger_restore_vf(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOPPED)) {
> > + xe_gt_sriov_dbg(gt, "VF%u is stopped!\n", vfid);
> > + return -EPERM;
> > + }
> > +
> > + if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED)) {
> > + xe_gt_sriov_dbg(gt, "VF%u is not paused!\n", vfid);
> > + return -EPERM;
> > + }
> > +
> > + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
> > + xe_gt_sriov_dbg(gt, "VF%u save is in progress!\n", vfid);
> > + return -EBUSY;
> > + }
> > +
> > + if (!pf_enter_vf_restore_wip(gt, vfid)) {
> > + xe_gt_sriov_dbg(gt, "VF%u restore already in progress!\n", vfid);
> > + return -EALREADY;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static int pf_wait_vf_restore_done(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + unsigned long timeout = pf_get_default_timeout(XE_GT_SRIOV_STATE_RESTORE_WIP);
> > + int err;
> > +
> > + err = pf_wait_vf_wip_done(gt, vfid, timeout);
> > + if (err) {
> > + xe_gt_sriov_notice(gt, "VF%u RESTORE didn't finish in %u ms (%pe)\n",
> > + vfid, jiffies_to_msecs(timeout), ERR_PTR(err));
> > + return err;
> > + }
> > +
> > + if (!pf_expect_vf_not_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED))
> > + return -EIO;
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_control_finish_restore_vf() - Complete a VF migration data restore sequence.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * This function is for PF only.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_control_finish_restore_vf(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + int ret;
> > +
> > + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> > + ret = pf_wait_vf_restore_done(gt, vfid);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + if (!pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED)) {
> > + pf_enter_vf_mismatch(gt, vfid);
> > + return -EIO;
> > + }
> > +
> > + pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> > +
> > + return 0;
> > +}
> > +
> > /**
> > * DOC: The VF STOP state machine
> > *
> > @@ -817,6 +1057,8 @@ static void pf_enter_vf_stopped(struct xe_gt *gt, unsigned int vfid)
> >
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUMED);
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> > + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED);
> > + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED);
> > pf_exit_vf_mismatch(gt, vfid);
> > pf_exit_vf_wip(gt, vfid);
> > }
> > @@ -1461,6 +1703,12 @@ static bool pf_process_vf_state_machine(struct xe_gt *gt, unsigned int vfid)
> > if (pf_exit_vf_pause_save_guc(gt, vfid))
> > return true;
> >
> > + if (pf_handle_vf_save(gt, vfid))
> > + return true;
> > +
> > + if (pf_handle_vf_restore(gt, vfid))
> > + return true;
> > +
> > if (pf_exit_vf_resume_send_resume(gt, vfid))
> > return true;
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> > index 8a72ef3778d47..abc233f6302ed 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> > @@ -14,8 +14,14 @@ struct xe_gt;
> > int xe_gt_sriov_pf_control_init(struct xe_gt *gt);
> > void xe_gt_sriov_pf_control_restart(struct xe_gt *gt);
> >
> > +bool xe_gt_sriov_pf_control_check_vf_data_wip(struct xe_gt *gt, unsigned int vfid);
> > +
> > int xe_gt_sriov_pf_control_pause_vf(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid);
> > +int xe_gt_sriov_pf_control_trigger_save_vf(struct xe_gt *gt, unsigned int vfid);
> > +int xe_gt_sriov_pf_control_finish_save_vf(struct xe_gt *gt, unsigned int vfid);
> > +int xe_gt_sriov_pf_control_trigger_restore_vf(struct xe_gt *gt, unsigned int vfid);
> > +int xe_gt_sriov_pf_control_finish_restore_vf(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_control_stop_vf(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_control_trigger_flr(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_control_sync_flr(struct xe_gt *gt, unsigned int vfid, bool sync);
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > index c80b7e77f1ad2..e113dc98b33ce 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > @@ -31,6 +31,12 @@
> > * @XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC: indicates that the PF needs to save the VF GuC state.
> > * @XE_GT_SRIOV_STATE_PAUSE_FAILED: indicates that a VF pause operation has failed.
> > * @XE_GT_SRIOV_STATE_PAUSED: indicates that the VF is paused.
> > + * @XE_GT_SRIOV_STATE_SAVE_WIP: indicates that VF save operation is in progress.
> > + * @XE_GT_SRIOV_STATE_SAVE_FAILED: indicates that VF save operation has failed.
> > + * @XE_GT_SRIOV_STATE_SAVED: indicates that VF data is saved.
> > + * @XE_GT_SRIOV_STATE_RESTORE_WIP: indicates that VF restore operation is in progress.
> > + * @XE_GT_SRIOV_STATE_RESTORE_FAILED: indicates that VF restore operation has failed.
> > + * @XE_GT_SRIOV_STATE_RESTORED: indicates that VF data is restored.
> > * @XE_GT_SRIOV_STATE_RESUME_WIP: indicates that a VF resume operation is in progress.
> > * @XE_GT_SRIOV_STATE_RESUME_SEND_RESUME: indicates that the PF is about to send RESUME command.
> > * @XE_GT_SRIOV_STATE_RESUME_FAILED: indicates that a VF resume operation has failed.
> > @@ -63,6 +69,14 @@ enum xe_gt_sriov_control_bits {
> > XE_GT_SRIOV_STATE_PAUSE_FAILED,
> > XE_GT_SRIOV_STATE_PAUSED,
> >
> > + XE_GT_SRIOV_STATE_SAVE_WIP,
> > + XE_GT_SRIOV_STATE_SAVE_FAILED,
> > + XE_GT_SRIOV_STATE_SAVED,
> > +
> > + XE_GT_SRIOV_STATE_RESTORE_WIP,
> > + XE_GT_SRIOV_STATE_RESTORE_FAILED,
> > + XE_GT_SRIOV_STATE_RESTORED,
> > +
> > XE_GT_SRIOV_STATE_RESUME_WIP,
> > XE_GT_SRIOV_STATE_RESUME_SEND_RESUME,
> > XE_GT_SRIOV_STATE_RESUME_FAILED,
>
it is easier to understand those states after patch 04/26 with diagrams,
and while there are small and hard-to-avoid overlaps between 03/26 and 04/26,
the patch itself LGTM, so
>
> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Thanks,
-Michał
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [PATCH v2 04/26] drm/xe/pf: Add data structures and handlers for migration rings
2025-10-22 22:06 ` Michal Wajdeczko
@ 2025-10-27 12:33 ` Michał Winiarski
0 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-27 12:33 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, dri-devel, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna
On Thu, Oct 23, 2025 at 12:06:05AM +0200, Michal Wajdeczko wrote:
>
>
> On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> > Migration data is queued in a per-GT ptr_ring to decouple the worker
> > responsible for handling the data transfer from the .read() and .write()
> > syscalls.
> > Add the data structures and handlers that will be used in future
> > commits.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 259 +++++++++++++++++-
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h | 6 +-
> > .../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 12 +
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 183 +++++++++++++
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 14 +
> > .../drm/xe/xe_gt_sriov_pf_migration_types.h | 11 +
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h | 3 +
> > drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 143 ++++++++++
> > drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 7 +
> > .../gpu/drm/xe/xe_sriov_pf_migration_types.h | 58 ++++
> > drivers/gpu/drm/xe/xe_sriov_pf_types.h | 3 +
> > 11 files changed, 684 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > index b770916e88e53..cad73fdaee93c 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > @@ -19,6 +19,7 @@
> > #include "xe_guc_ct.h"
> > #include "xe_sriov.h"
> > #include "xe_sriov_pf_control.h"
> > +#include "xe_sriov_pf_migration.h"
> > #include "xe_sriov_pf_service.h"
> > #include "xe_tile.h"
> >
> > @@ -185,9 +186,15 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
> > CASE2STR(PAUSE_FAILED);
> > CASE2STR(PAUSED);
> > CASE2STR(SAVE_WIP);
> > + CASE2STR(SAVE_PROCESS_DATA);
> > + CASE2STR(SAVE_WAIT_DATA);
> > + CASE2STR(SAVE_DATA_DONE);
> > CASE2STR(SAVE_FAILED);
> > CASE2STR(SAVED);
> > CASE2STR(RESTORE_WIP);
> > + CASE2STR(RESTORE_PROCESS_DATA);
> > + CASE2STR(RESTORE_WAIT_DATA);
> > + CASE2STR(RESTORE_DATA_DONE);
> > CASE2STR(RESTORE_FAILED);
> > CASE2STR(RESTORED);
> > CASE2STR(RESUME_WIP);
> > @@ -804,9 +811,50 @@ int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid)
> > return -ECANCELED;
> > }
> >
> > +/**
> > + * DOC: The VF SAVE state machine
> > + *
> > + * SAVE extends the PAUSED state.
> > + *
> > + * The VF SAVE state machine looks like::
> > + *
> > + * ....PAUSED....................................................
> > + * : :
> > + * : (O)<---------o :
> > + * : | \ :
> > + * : save (SAVED) (SAVE_FAILED) :
> > + * : | ^ ^ :
> > + * : | | | :
> > + * : ....V...............o...........o......SAVE_WIP......... :
> > + * : : | | | : :
> > + * : : | empty | : :
> > + * : : | | | : :
> > + * : : | | | : :
> > + * : : | DATA_DONE | : :
> > + * : : | ^ | : :
> > + * : : | | error : :
> > + * : : | no_data / : :
> > + * : : | / / : :
> > + * : : | / / : :
> > + * : : | / / : :
> > + * : : o---------->PROCESS_DATA<----consume : :
> > + * : : \ \ : :
> > + * : : \ \ : :
> > + * : : \ \ : :
> > + * : : ring_full----->WAIT_DATA : :
> > + * : : : :
> > + * : :......................................................: :
> > + * :............................................................:
>
> this will not render correctly (missing extra indent, RESTORE_WIP below is fine)
Ok.
>
> > + *
> > + * For the full state machine view, see `The VF state machine`_.
> > + */
> > static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> > {
> > - pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
> > + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
> > + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
> > + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA);
> > + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
> > + }
> > }
> >
> > static void pf_enter_vf_saved(struct xe_gt *gt, unsigned int vfid)
> > @@ -821,12 +869,39 @@ static void pf_enter_vf_saved(struct xe_gt *gt, unsigned int vfid)
> > pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> > }
> >
> > +static void pf_enter_vf_save_failed(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
> > + pf_exit_vf_wip(gt, vfid);
> > +}
> > +
> > +static int pf_handle_vf_save_data(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + return 0;
> > +}
> > +
> > static bool pf_handle_vf_save(struct xe_gt *gt, unsigned int vfid)
> > {
> > - if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP))
> > + int ret;
> > +
> > + if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA))
> > return false;
> >
> > - pf_enter_vf_saved(gt, vfid);
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA);
>
> this seems to be done too early
Yeah - I'll change this (and other save/restore related places) to the
suggested pattern.
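A minimal userspace sketch of that pattern (hypothetical names, plain
flag bits standing in for the atomic XE_GT_SRIOV_STATE_* control bits)
could look like:

```c
#include <stdbool.h>

/* Model sub-states as plain flag bits (the driver uses atomic bitops). */
enum {
	STATE_WAIT_DATA    = 1u << 0,
	STATE_PROCESS_DATA = 1u << 1,
};

static bool test_and_clear_bit_model(unsigned int *state, unsigned int bit)
{
	bool was_set = (*state & bit) != 0;

	*state &= ~bit;
	return was_set;
}

/*
 * Mirrors the suggested pf_exit_vf_wait(): the WAIT_DATA -> PROCESS_DATA
 * transition happens in exactly one helper, so a VF is never in both
 * sub-states at the same time. Only a VF that was actually waiting
 * enters PROCESS_DATA (and would then be re-queued via pf_queue_vf()).
 */
static bool exit_wait_enter_process(unsigned int *state)
{
	if (!test_and_clear_bit_model(state, STATE_WAIT_DATA))
		return false;
	*state |= STATE_PROCESS_DATA;
	return true;	/* caller would pf_queue_vf() here */
}
```

This is only a sketch of the transition discipline, not the driver code.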
>
> > + if (xe_gt_sriov_pf_migration_ring_full(gt, vfid)) {
>
> you should enter(WAIT_DATA) here
>
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
>
> and don't re-enter(PROCESS_DATA) as we shouldn't be in both sub-states at the same time
>
> transition from WAIT to PROCESS shall be done in
>
> pf_exit_vf_wait(gt, vf)
> {
> if (exit(WAIT))
> enter(PROCESS_DATA)
> queue
> }
>
> called from xe_gt_sriov_pf_control_process_save_data()
>
> > +
> > + return true;
> > + }
> > + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA);
> > +
> > + ret = pf_handle_vf_save_data(gt, vfid);
> > + if (ret == -EAGAIN)
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
> > + else if (ret)
> > + pf_enter_vf_save_failed(gt, vfid);
> > + else
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
> >
> > return true;
> > }
> > @@ -834,6 +909,7 @@ static bool pf_handle_vf_save(struct xe_gt *gt, unsigned int vfid)
> > static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> > {
> > if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
> > pf_enter_vf_wip(gt, vfid);
> > pf_queue_vf(gt, vfid);
> > return true;
> > @@ -842,6 +918,36 @@ static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> > return false;
> > }
> >
> > +/**
> > + * xe_gt_sriov_pf_control_check_save_data_done() - Check if all save migration data was produced.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * This function is for PF only.
> > + *
> > + * Return: true if all save migration data was produced, false otherwise.
> > + */
> > +bool xe_gt_sriov_pf_control_check_save_data_done(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + return pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_control_process_save_data() - Queue VF save migration data processing.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * This function is for PF only.
> > + */
> > +void xe_gt_sriov_pf_control_process_save_data(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + if (xe_gt_sriov_pf_control_check_save_data_done(gt, vfid))
> > + return;
> > +
> > + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA))
> > + pf_queue_vf(gt, vfid);
>
> this should wrapped into:
>
> exit_vf_wait_data()
>
> where actual transition to PROCESS will happen
>
> > +}
> > +
> > /**
> > * xe_gt_sriov_pf_control_trigger_save_vf() - Start an SR-IOV VF migration data save sequence.
> > * @gt: the &xe_gt
> > @@ -887,19 +993,62 @@ int xe_gt_sriov_pf_control_trigger_save_vf(struct xe_gt *gt, unsigned int vfid)
> > */
> > int xe_gt_sriov_pf_control_finish_save_vf(struct xe_gt *gt, unsigned int vfid)
> > {
> > - if (!pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED)) {
> > - pf_enter_vf_mismatch(gt, vfid);
> > + if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE)) {
> > + xe_gt_sriov_err(gt, "VF%u save is still in progress!\n", vfid);
> > return -EIO;
> > }
> >
> > pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> > + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
> > + pf_enter_vf_saved(gt, vfid);
> >
> > return 0;
> > }
> >
> > +/**
> > + * DOC: The VF RESTORE state machine
> > + *
> > + * RESTORE extends the PAUSED state.
> > + *
> > + * The VF RESTORE state machine looks like::
> > + *
> > + * ....PAUSED....................................................
> > + * : :
> > + * : (O)<---------o :
> > + * : | \ :
> > + * : restore (RESTORED) (RESTORE_FAILED) :
> > + * : | ^ ^ :
> > + * : | | | :
> > + * : ....V...............o...........o......RESTORE_WIP...... :
> > + * : : | | | : :
> > + * : : | empty | : :
> > + * : : | | | : :
> > + * : : | | | : :
> > + * : : | DATA_DONE | : :
> > + * : : | ^ | : :
> > + * : : | | error : :
> > + * : : | trailer / : :
> > + * : : | / / : :
> > + * : : | / / : :
> > + * : : | / / : :
> > + * : : o---------->PROCESS_DATA<----produce : :
> > + * : : \ \ : :
> > + * : : \ \ : :
> > + * : : \ \ : :
> > + * : : ring_empty---->WAIT_DATA : :
> > + * : : : :
> > + * : :......................................................: :
> > + * :............................................................:
> > + *
> > + * For the full state machine view, see `The VF state machine`_.
> > + */
> > static void pf_exit_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> > {
> > - pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP);
> > + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> > + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA);
> > + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA);
> > + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_DATA_DONE);
> > + }
> > }
> >
> > static void pf_enter_vf_restored(struct xe_gt *gt, unsigned int vfid)
> > @@ -914,12 +1063,50 @@ static void pf_enter_vf_restored(struct xe_gt *gt, unsigned int vfid)
> > pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> > }
> >
> > +static void pf_enter_vf_restore_failed(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED);
> > + pf_exit_vf_wip(gt, vfid);
> > +}
> > +
> > +static int
>
> no need to split the line
Ok.
>
> > +pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + struct xe_sriov_migration_data *data = xe_gt_sriov_pf_migration_restore_consume(gt, vfid);
> > +
> > + xe_gt_assert(gt, data);
> > +
> > + xe_gt_sriov_notice(gt, "Skipping VF%u unknown data type: %d\n", vfid, data->type);
> > +
> > + return 0;
> > +}
> > +
> > static bool pf_handle_vf_restore(struct xe_gt *gt, unsigned int vfid)
> > {
> > - if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
> > + int ret;
> > +
> > + if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA))
> > return false;
> >
> > - pf_enter_vf_restored(gt, vfid);
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA);
>
> maybe you shouldn't enter(WAIT_DATA) here
>
> > + if (xe_gt_sriov_pf_migration_ring_empty(gt, vfid)) {
>
> but here
>
> > + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_DATA_DONE)) {
>
> hmm, there should be no direct transition from WAIT_DATA to DONE
>
> > + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA);
> > + pf_enter_vf_restored(gt, vfid);
> > +
> > + return true;
> > + }
>
> or just here
>
> > +
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA);
>
> and transition back to PROCESS only on exit(WAIT) called below
>
> > + return true;
> > + }
> > + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA);
> > +
> > + ret = pf_handle_vf_restore_data(gt, vfid);
> > + if (ret)
> > + pf_enter_vf_restore_failed(gt, vfid);
> > + else
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA);
> >
> > return true;
> > }
> > @@ -927,6 +1114,7 @@ static bool pf_handle_vf_restore(struct xe_gt *gt, unsigned int vfid)
> > static bool pf_enter_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> > {
> > if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA);
> > pf_enter_vf_wip(gt, vfid);
> > pf_queue_vf(gt, vfid);
> > return true;
> > @@ -935,6 +1123,41 @@ static bool pf_enter_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> > return false;
> > }
> >
> > +/**
> > + * xe_gt_sriov_pf_control_restore_data_done() - Indicate the end of VF migration data stream.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * This function is for PF only.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_control_restore_data_done(struct xe_gt *gt, unsigned int vfid)
> > +{
>
> shouldn't we have additional state checks here?
>
> expect(RESTORE_WIP)
> expect(RESTORE_PROCESS_DATA) ?
>
> this one below just looks for one-time entry, but can we really enter anytime ?
expect(RESTORE_WIP) makes sense - I'll add it.
>
> > + if (!pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_DATA_DONE)) {
> > + pf_enter_vf_state_machine_bug(gt, vfid);
> > + return -EIO;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_control_process_restore_data() - Queue VF restore migration data processing.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * This function is for PF only.
> > + */
> > +void xe_gt_sriov_pf_control_process_restore_data(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + if (!pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
> > + pf_enter_vf_state_machine_bug(gt, vfid);
> > +
> > + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA))
> > + pf_queue_vf(gt, vfid);
>
> IMO the transition to PROCESS shall be also done as part of exit(WAIT_DATA)
>
> > +}
> > +
> > /**
> > * xe_gt_sriov_pf_control_trigger_restore_vf() - Start an SR-IOV VF migration data restore sequence.
> > * @gt: the &xe_gt
> > @@ -1000,11 +1223,9 @@ int xe_gt_sriov_pf_control_finish_restore_vf(struct xe_gt *gt, unsigned int vfid
> > {
> > int ret;
> >
> > - if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> > - ret = pf_wait_vf_restore_done(gt, vfid);
> > - if (ret)
> > - return ret;
> > - }
> > + ret = pf_wait_vf_restore_done(gt, vfid);
> > + if (ret)
> > + return ret;
> >
> > if (!pf_expect_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED)) {
> > pf_enter_vf_mismatch(gt, vfid);
> > @@ -1703,9 +1924,21 @@ static bool pf_process_vf_state_machine(struct xe_gt *gt, unsigned int vfid)
> > if (pf_exit_vf_pause_save_guc(gt, vfid))
> > return true;
> >
> > + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA)) {
> > + xe_gt_sriov_dbg_verbose(gt, "VF%u in %s\n", vfid,
> > + control_bit_to_string(XE_GT_SRIOV_STATE_SAVE_WAIT_DATA));
> > + return false;
> > + }
> > +
> > if (pf_handle_vf_save(gt, vfid))
> > return true;
> >
> > + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA)) {
> > + xe_gt_sriov_dbg_verbose(gt, "VF%u in %s\n", vfid,
> > + control_bit_to_string(XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA));
> > + return false;
> > + }
> > +
> > if (pf_handle_vf_restore(gt, vfid))
> > return true;
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> > index abc233f6302ed..6b1ab339e3b73 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> > @@ -14,12 +14,14 @@ struct xe_gt;
> > int xe_gt_sriov_pf_control_init(struct xe_gt *gt);
> > void xe_gt_sriov_pf_control_restart(struct xe_gt *gt);
> >
> > -bool xe_gt_sriov_pf_control_check_vf_data_wip(struct xe_gt *gt, unsigned int vfid);
> > -
> > int xe_gt_sriov_pf_control_pause_vf(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid);
> > +bool xe_gt_sriov_pf_control_check_save_data_done(struct xe_gt *gt, unsigned int vfid);
> > +void xe_gt_sriov_pf_control_process_save_data(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_control_trigger_save_vf(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_control_finish_save_vf(struct xe_gt *gt, unsigned int vfid);
> > +int xe_gt_sriov_pf_control_restore_data_done(struct xe_gt *gt, unsigned int vfid);
> > +void xe_gt_sriov_pf_control_process_restore_data(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_control_trigger_restore_vf(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_control_finish_restore_vf(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_control_stop_vf(struct xe_gt *gt, unsigned int vfid);
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > index e113dc98b33ce..6e19a8ea88f0b 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > @@ -32,9 +32,15 @@
> > * @XE_GT_SRIOV_STATE_PAUSE_FAILED: indicates that a VF pause operation has failed.
> > * @XE_GT_SRIOV_STATE_PAUSED: indicates that the VF is paused.
> > * @XE_GT_SRIOV_STATE_SAVE_WIP: indicates that VF save operation is in progress.
> > + * @XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA: indicates that VF migration data is being produced.
> > + * @XE_GT_SRIOV_STATE_SAVE_WAIT_DATA: indicates that the PF waits for space in the migration data ring.
> > + * @XE_GT_SRIOV_STATE_SAVE_DATA_DONE: indicates that all migration data was produced by Xe.
> > * @XE_GT_SRIOV_STATE_SAVE_FAILED: indicates that VF save operation has failed.
> > * @XE_GT_SRIOV_STATE_SAVED: indicates that VF data is saved.
> > * @XE_GT_SRIOV_STATE_RESTORE_WIP: indicates that VF restore operation is in progress.
> > + * @XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA: indicates that VF migration data is being consumed.
> > + * @XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA: indicates that the PF waits for data in the migration data ring.
> > + * @XE_GT_SRIOV_STATE_RESTORE_DATA_DONE: indicates that all migration data was produced by the user.
> > * @XE_GT_SRIOV_STATE_RESTORE_FAILED: indicates that VF restore operation has failed.
> > * @XE_GT_SRIOV_STATE_RESTORED: indicates that VF data is restored.
> > * @XE_GT_SRIOV_STATE_RESUME_WIP: indicates that a VF resume operation is in progress.
> > @@ -70,10 +76,16 @@ enum xe_gt_sriov_control_bits {
> > XE_GT_SRIOV_STATE_PAUSED,
> >
> > XE_GT_SRIOV_STATE_SAVE_WIP,
> > + XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA,
> > + XE_GT_SRIOV_STATE_SAVE_WAIT_DATA,
> > + XE_GT_SRIOV_STATE_SAVE_DATA_DONE,
> > XE_GT_SRIOV_STATE_SAVE_FAILED,
> > XE_GT_SRIOV_STATE_SAVED,
> >
> > XE_GT_SRIOV_STATE_RESTORE_WIP,
> > + XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA,
> > + XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA,
> > + XE_GT_SRIOV_STATE_RESTORE_DATA_DONE,
> > XE_GT_SRIOV_STATE_RESTORE_FAILED,
> > XE_GT_SRIOV_STATE_RESTORED,
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > index ca28f45aaf481..b6ffd982d6007 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > @@ -7,6 +7,7 @@
> >
> > #include "abi/guc_actions_sriov_abi.h"
> > #include "xe_bo.h"
> > +#include "xe_gt_sriov_pf_control.h"
> > #include "xe_gt_sriov_pf_helpers.h"
> > #include "xe_gt_sriov_pf_migration.h"
> > #include "xe_gt_sriov_printk.h"
> > @@ -15,6 +16,17 @@
> > #include "xe_sriov.h"
> > #include "xe_sriov_pf_migration.h"
> >
> > +#define XE_GT_SRIOV_PF_MIGRATION_RING_SIZE 5
> > +
> > +static struct xe_gt_sriov_migration_data *pf_pick_gt_migration(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> > + xe_gt_assert(gt, vfid != PFID);
> > + xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
> > +
> > + return >->sriov.pf.vfs[vfid].migration;
> > +}
> > +
> > /* Return: number of dwords saved/restored/required or a negative error code on failure */
> > static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
> > u64 addr, u32 ndwords)
> > @@ -382,6 +394,162 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
> > }
> > #endif /* CONFIG_DEBUG_FS */
> >
> > +/**
> > + * xe_gt_sriov_pf_migration_ring_empty() - Check if a migration ring is empty.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * Return: true if the ring is empty, otherwise false.
> > + */
> > +bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + return ptr_ring_empty(&pf_pick_gt_migration(gt, vfid)->ring);
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_migration_ring_full() - Check if a migration ring is full.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * Return: true if the ring is full, otherwise false.
> > + */
> > +bool xe_gt_sriov_pf_migration_ring_full(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + return ptr_ring_full(&pf_pick_gt_migration(gt, vfid)->ring);
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_migration_save_produce() - Add VF save data packet to migration ring.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier
> > + * @data: &xe_sriov_migration_data packet
> > + *
> > + * Called by the save migration data producer (PF SR-IOV Control worker) when
> > + * processing migration data.
> > + * Wakes up the save migration data consumer (userspace), which is potentially
> > + * waiting for data when the ring is empty.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_migration_save_produce(struct xe_gt *gt, unsigned int vfid,
> > + struct xe_sriov_migration_data *data)
> > +{
> > + int ret;
> > +
> > + ret = ptr_ring_produce(&pf_pick_gt_migration(gt, vfid)->ring, data);
> > + if (ret)
> > + return ret;
> > +
> > + wake_up_all(xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid));
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_migration_restore_consume() - Get VF restore data packet from migration ring.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * Called by the restore migration data consumer (PF SR-IOV Control worker) when
> > + * processing migration data.
> > + * Wakes up the restore migration data producer (userspace), which is
> > + * potentially waiting to add more data when the ring is full.
> > + *
> > + * Return: Pointer to &struct xe_sriov_migration_data on success,
> > + * NULL if ring is empty.
> > + */
> > +struct xe_sriov_migration_data *
> > +xe_gt_sriov_pf_migration_restore_consume(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, vfid);
> > + struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
> > + struct xe_sriov_migration_data *data;
> > +
> > + data = ptr_ring_consume(&migration->ring);
> > + if (data)
> > + wake_up_all(wq);
> > +
> > + return data;
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_migration_restore_produce() - Add VF restore data packet to migration ring.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier
> > + * @data: &xe_sriov_migration_data packet
> > + *
> > + * Called by the restore migration data producer (userspace) when processing
> > + * migration data.
> > + * If the ring is full, waits until there is space.
> > + * Queues the restore migration data consumer (PF SR-IOV Control worker), which
> > + * is potentially waiting for data when the ring is empty.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_migration_restore_produce(struct xe_gt *gt, unsigned int vfid,
> > + struct xe_sriov_migration_data *data)
> > +{
> > + struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
> > + struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, vfid);
> > + int ret;
> > +
> > + xe_gt_assert(gt, data->tile == gt->tile->id);
> > + xe_gt_assert(gt, data->gt == gt->info.id);
> > +
> > + while (1) {
>
> or for (;;)
Ok.
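For reference, the backpressure semantics of the bounded migration ring
can be modeled in a single-threaded userspace sketch (hypothetical
struct and helpers; the real loop above sleeps in
wait_event_interruptible() instead of returning when the ring is full):

```c
#include <stddef.h>
#include <errno.h>

#define RING_SIZE 5	/* matches XE_GT_SRIOV_PF_MIGRATION_RING_SIZE */

struct ring {
	void *slot[RING_SIZE];
	unsigned int head;	/* next slot to produce into */
	unsigned int tail;	/* next slot to consume from */
	unsigned int count;	/* occupied slots */
};

/* Like ptr_ring_produce(): fails when full, caller must wait and retry. */
static int ring_produce(struct ring *r, void *data)
{
	if (r->count == RING_SIZE)
		return -ENOSPC;
	r->slot[r->head] = data;
	r->head = (r->head + 1) % RING_SIZE;
	r->count++;
	return 0;
}

/* Like ptr_ring_consume(): returns NULL when empty. */
static void *ring_consume(struct ring *r)
{
	void *data;

	if (!r->count)
		return NULL;
	data = r->slot[r->tail];
	r->tail = (r->tail + 1) % RING_SIZE;
	r->count--;
	return data;
}
```

In the driver the producer retries after the consumer frees a slot and
wakes the waitqueue; here the retry is left to the caller.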
>
> > + ret = ptr_ring_produce(&migration->ring, data);
> > + if (!ret)
> > + break;
> > +
> > + ret = wait_event_interruptible(*wq, !ptr_ring_full(&migration->ring));
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + xe_gt_sriov_pf_control_process_restore_data(gt, vfid);
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_migration_save_consume() - Get VF save data packet from migration ring.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * Called by the save migration data consumer (userspace) when
> > + * processing migration data.
> > + * Queues the save migration data producer (PF SR-IOV Control worker), which is
> > + * potentially waiting to add more data when the ring is full.
> > + *
> > + * Return: Pointer to &struct xe_sriov_migration_data on success,
> > + * NULL if ring is empty and there's no more data available,
> > + * ERR_PTR(-EAGAIN) if the ring is empty, but data is still produced.
> > + */
> > +struct xe_sriov_migration_data *
> > +xe_gt_sriov_pf_migration_save_consume(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, vfid);
> > + struct xe_sriov_migration_data *data;
> > +
> > + data = ptr_ring_consume(&migration->ring);
> > + if (data) {
> > + xe_gt_sriov_pf_control_process_save_data(gt, vfid);
> > + return data;
> > + }
> > +
> > + if (xe_gt_sriov_pf_control_check_save_data_done(gt, vfid))
> > + return NULL;
> > +
> > + return ERR_PTR(-EAGAIN);
> > +}
> > +
> > +static void action_ring_cleanup(struct drm_device *dev, void *arg)
> > +{
> > + struct ptr_ring *r = arg;
> > +
> > + ptr_ring_cleanup(r, NULL);
> > +}
> > +
> > /**
> > * xe_gt_sriov_pf_migration_init() - Initialize support for VF migration.
> > * @gt: the &xe_gt
> > @@ -393,6 +561,7 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
> > int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
> > {
> > struct xe_device *xe = gt_to_xe(gt);
> > + unsigned int n, totalvfs;
> > int err;
> >
> > xe_gt_assert(gt, IS_SRIOV_PF(xe));
> > @@ -404,5 +573,19 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
> > if (err)
> > return err;
> >
> > + totalvfs = xe_sriov_pf_get_totalvfs(xe);
> > + for (n = 1; n <= totalvfs; n++) {
> > + struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, n);
> > +
> > + err = ptr_ring_init(&migration->ring,
> > + XE_GT_SRIOV_PF_MIGRATION_RING_SIZE, GFP_KERNEL);
> > + if (err)
> > + return err;
> > +
> > + err = drmm_add_action_or_reset(&xe->drm, action_ring_cleanup, &migration->ring);
>
> should we wait until drmm cleanup or devm cleanup ?
Worker is drmm, so I did follow that, but yeah, I guess it should match
pdev lifetime rather than DRM dev lifetime.
>
>
> > + if (err)
> > + return err;
> > + }
> > +
> > return 0;
> > }
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > index 09faeae00ddbb..9e67f18ded205 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > @@ -9,11 +9,25 @@
> > #include <linux/types.h>
> >
> > struct xe_gt;
> > +struct xe_sriov_migration_data;
> >
> > int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
> > int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
> >
> > +bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid);
> > +bool xe_gt_sriov_pf_migration_ring_full(struct xe_gt *gt, unsigned int vfid);
> > +
> > +int xe_gt_sriov_pf_migration_save_produce(struct xe_gt *gt, unsigned int vfid,
> > + struct xe_sriov_migration_data *data);
> > +struct xe_sriov_migration_data *
> > +xe_gt_sriov_pf_migration_restore_consume(struct xe_gt *gt, unsigned int vfid);
> > +
> > +int xe_gt_sriov_pf_migration_restore_produce(struct xe_gt *gt, unsigned int vfid,
> > + struct xe_sriov_migration_data *data);
> > +struct xe_sriov_migration_data *
> > +xe_gt_sriov_pf_migration_save_consume(struct xe_gt *gt, unsigned int vfid);
> > +
> > #ifdef CONFIG_DEBUG_FS
> > ssize_t xe_gt_sriov_pf_migration_read_guc_state(struct xe_gt *gt, unsigned int vfid,
> > char __user *buf, size_t count, loff_t *pos);
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> > index 9d672feac5f04..84be6fac16c8b 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> > @@ -7,6 +7,7 @@
> > #define _XE_GT_SRIOV_PF_MIGRATION_TYPES_H_
> >
> > #include <linux/mutex.h>
> > +#include <linux/ptr_ring.h>
> > #include <linux/types.h>
> >
> > /**
> > @@ -24,6 +25,16 @@ struct xe_gt_sriov_state_snapshot {
> > } guc;
> > };
> >
> > +/**
> > + * struct xe_gt_sriov_migration_data - GT-level per-VF migration data.
> > + *
> > + * Used by the PF driver to maintain per-VF migration data.
> > + */
> > +struct xe_gt_sriov_migration_data {
> > + /** @ring: queue containing VF save / restore migration data */
> > + struct ptr_ring ring;
> > +};
> > +
> > /**
> > * struct xe_gt_sriov_pf_migration - GT-level data.
> > *
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> > index a64a6835ad656..812e74d3f8f80 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> > @@ -33,6 +33,9 @@ struct xe_gt_sriov_metadata {
> >
> > /** @snapshot: snapshot of the VF state data */
> > struct xe_gt_sriov_state_snapshot snapshot;
> > +
> > + /** @migration: per-VF migration data. */
> > + struct xe_gt_sriov_migration_data migration;
> > };
> >
> > /**
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > index 8c523c392f98b..eaf581317bdef 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > @@ -3,8 +3,36 @@
> > * Copyright © 2025 Intel Corporation
> > */
> >
> > +#include <drm/drm_managed.h>
> > +
> > +#include "xe_device.h"
> > +#include "xe_gt_sriov_pf_control.h"
> > +#include "xe_gt_sriov_pf_migration.h"
> > +#include "xe_pm.h"
> > #include "xe_sriov.h"
> > +#include "xe_sriov_pf_helpers.h"
> > #include "xe_sriov_pf_migration.h"
> > +#include "xe_sriov_printk.h"
> > +
> > +static struct xe_sriov_pf_migration *pf_pick_migration(struct xe_device *xe, unsigned int vfid)
> > +{
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> > +
> > + return &xe->sriov.pf.vfs[vfid].migration;
> > +}
> > +
> > +/**
> > + * xe_sriov_pf_migration_waitqueue - Get waitqueue for migration.
> > + * @xe: the &xe_device
> > + * @vfid: the VF identifier
> > + *
> > + * Return: pointer to the migration waitqueue.
> > + */
> > +wait_queue_head_t *xe_sriov_pf_migration_waitqueue(struct xe_device *xe, unsigned int vfid)
> > +{
> > + return &pf_pick_migration(xe, vfid)->wq;
> > +}
> >
> > /**
> > * xe_sriov_pf_migration_supported() - Check if SR-IOV VF migration is supported by the device
> > @@ -33,9 +61,124 @@ static bool pf_check_migration_support(struct xe_device *xe)
> > */
> > int xe_sriov_pf_migration_init(struct xe_device *xe)
> > {
> > + unsigned int n, totalvfs;
> > +
> > xe_assert(xe, IS_SRIOV_PF(xe));
> >
> > xe->sriov.pf.migration.supported = pf_check_migration_support(xe);
> > + if (!xe_sriov_pf_migration_supported(xe))
> > + return 0;
> > +
> > + totalvfs = xe_sriov_pf_get_totalvfs(xe);
> > + for (n = 1; n <= totalvfs; n++) {
> > + struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, n);
> > +
> > + init_waitqueue_head(&migration->wq);
> > + }
> >
> > return 0;
> > }
> > +
> > +static bool pf_migration_data_ready(struct xe_device *xe, unsigned int vfid)
> > +{
> > + struct xe_gt *gt;
> > + u8 gt_id;
> > +
> > + for_each_gt(gt, xe, gt_id) {
> > + if (!xe_gt_sriov_pf_migration_ring_empty(gt, vfid) ||
> > + xe_gt_sriov_pf_control_check_save_data_done(gt, vfid))
> > + return true;
> > + }
> > +
> > + return false;
> > +}
> > +
> > +static struct xe_sriov_migration_data *
> > +pf_migration_consume(struct xe_device *xe, unsigned int vfid)
> > +{
> > + struct xe_sriov_migration_data *data;
> > + struct xe_gt *gt;
> > + u8 gt_id;
> > + bool more_data = false;
> > +
> > + for_each_gt(gt, xe, gt_id) {
> > + data = xe_gt_sriov_pf_migration_save_consume(gt, vfid);
> > + if (data && PTR_ERR(data) != -EAGAIN)
> > + return data;
> > + if (PTR_ERR(data) == -EAGAIN)
> > + more_data = true;
> > + }
> > +
> > + if (!more_data)
> > + return NULL;
> > +
> > + return ERR_PTR(-EAGAIN);
> > +}
> > +
> > +/**
> > + * xe_sriov_pf_migration_save_consume() - Consume a VF migration data packet from the device.
> > + * @xe: the &xe_device
> > + * @vfid: the VF identifier
> > + *
> > + * Called by the save migration data consumer (userspace) when
> > + * processing migration data.
> > + * If there is no migration data to process, waits until more data is available.
> > + *
> > + * Return: Pointer to &xe_sriov_migration_data on success,
> > + * NULL if ring is empty and no more migration data is expected,
> > + * ERR_PTR value in case of error.
> > + */
> > +struct xe_sriov_migration_data *
> > +xe_sriov_pf_migration_save_consume(struct xe_device *xe, unsigned int vfid)
> > +{
> > + struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, vfid);
> > + struct xe_sriov_migration_data *data;
> > + int ret;
> > +
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > +
> > + while (1) {
> > + data = pf_migration_consume(xe, vfid);
> > + if (PTR_ERR(data) != -EAGAIN)
> > + goto out;
>
> just
> break; ?
Ok.
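The intended per-GT aggregation in pf_migration_consume() can be
sketched in userspace (hypothetical names; AGAIN stands in for
ERR_PTR(-EAGAIN), function pointers stand in for the per-GT consume
helpers):

```c
#include <stddef.h>

#define AGAIN ((void *)-1L)	/* stand-in for ERR_PTR(-EAGAIN) */

typedef void *(*gt_consume_fn)(void);

/*
 * A real packet from any GT wins; if no GT had a packet but at least
 * one is still producing, report "try again"; otherwise the migration
 * data stream is complete.
 */
static void *consume_across_gts(gt_consume_fn *gts, int ngt)
{
	int more_data = 0;
	int i;

	for (i = 0; i < ngt; i++) {
		void *data = gts[i]();

		if (data && data != AGAIN)
			return data;	/* real packet */
		if (data == AGAIN)
			more_data = 1;	/* ring empty, producer still running */
	}
	return more_data ? AGAIN : NULL;
}
```

The device-level caller then waits on the migration waitqueue whenever
this returns the "try again" value.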
Thanks,
-Michał
* Re: [PATCH v2 05/26] drm/xe/pf: Add helpers for migration data allocation / free
2025-10-22 22:18 ` Michal Wajdeczko
@ 2025-10-27 12:47 ` Michał Winiarski
0 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-27 12:47 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, dri-devel, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna
On Thu, Oct 23, 2025 at 12:18:09AM +0200, Michal Wajdeczko wrote:
>
>
> On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> > Now that it's possible to free the packets - connect the restore
> > handling logic with the ring.
> > The helpers will also be used in upcoming changes that will start producing
> > migration data packets.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/Makefile | 1 +
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 7 +
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 29 +++-
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 1 +
> > drivers/gpu/drm/xe/xe_sriov_migration_data.c | 127 ++++++++++++++++++
> > drivers/gpu/drm/xe/xe_sriov_migration_data.h | 31 +++++
> > 6 files changed, 195 insertions(+), 1 deletion(-)
> > create mode 100644 drivers/gpu/drm/xe/xe_sriov_migration_data.c
> > create mode 100644 drivers/gpu/drm/xe/xe_sriov_migration_data.h
> >
> > diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> > index 89e5b26c27975..3d72db9e528e4 100644
> > --- a/drivers/gpu/drm/xe/Makefile
> > +++ b/drivers/gpu/drm/xe/Makefile
> > @@ -173,6 +173,7 @@ xe-$(CONFIG_PCI_IOV) += \
> > xe_lmtt_2l.o \
> > xe_lmtt_ml.o \
> > xe_pci_sriov.o \
> > + xe_sriov_migration_data.o \
> > xe_sriov_pf.o \
> > xe_sriov_pf_control.o \
> > xe_sriov_pf_debugfs.o \
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > index cad73fdaee93c..dd9bc9c99f78c 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > @@ -18,6 +18,7 @@
> > #include "xe_gt_sriov_printk.h"
> > #include "xe_guc_ct.h"
> > #include "xe_sriov.h"
> > +#include "xe_sriov_migration_data.h"
> > #include "xe_sriov_pf_control.h"
> > #include "xe_sriov_pf_migration.h"
> > #include "xe_sriov_pf_service.h"
> > @@ -851,6 +852,8 @@ int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid)
> > static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> > {
> > if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
> > + xe_gt_sriov_pf_migration_ring_free(gt, vfid);
> > +
> > pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
> > pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA);
> > pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
> > @@ -1045,6 +1048,8 @@ int xe_gt_sriov_pf_control_finish_save_vf(struct xe_gt *gt, unsigned int vfid)
> > static void pf_exit_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> > {
> > if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> > + xe_gt_sriov_pf_migration_ring_free(gt, vfid);
> > +
> > pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_PROCESS_DATA);
> > pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA);
> > pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_DATA_DONE);
> > @@ -1078,6 +1083,8 @@ pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid)
> >
> > xe_gt_sriov_notice(gt, "Skipping VF%u unknown data type: %d\n", vfid, data->type);
> >
> > + xe_sriov_migration_data_free(data);
> > +
> > return 0;
> > }
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > index b6ffd982d6007..8ba72165759b3 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > @@ -14,6 +14,7 @@
> > #include "xe_guc.h"
> > #include "xe_guc_ct.h"
> > #include "xe_sriov.h"
> > +#include "xe_sriov_migration_data.h"
> > #include "xe_sriov_pf_migration.h"
> >
> > #define XE_GT_SRIOV_PF_MIGRATION_RING_SIZE 5
> > @@ -418,6 +419,25 @@ bool xe_gt_sriov_pf_migration_ring_full(struct xe_gt *gt, unsigned int vfid)
> > return ptr_ring_full(&pf_pick_gt_migration(gt, vfid)->ring);
> > }
> >
> > +/**
> > + * xe_gt_sriov_pf_migration_ring_free() - Consume and free all data in migration ring
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier
> > + */
> > +void xe_gt_sriov_pf_migration_ring_free(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, vfid);
> > + struct xe_sriov_migration_data *data;
> > +
> > + if (ptr_ring_empty(&migration->ring))
> > + return;
> > +
> > + xe_gt_sriov_notice(gt, "VF%u unprocessed migration data left in the ring!\n", vfid);
> > +
> > + while ((data = ptr_ring_consume(&migration->ring)))
> > + xe_sriov_migration_data_free(data);
> > +}
> > +
> > /**
> > * xe_gt_sriov_pf_migration_save_produce() - Add VF save data packet to migration ring.
> > * @gt: the &xe_gt
> > @@ -543,11 +563,18 @@ xe_gt_sriov_pf_migration_save_consume(struct xe_gt *gt, unsigned int vfid)
> > return ERR_PTR(-EAGAIN);
> > }
> >
> > +static void pf_mig_data_destroy(void *ptr)
> > +{
> > + struct xe_sriov_migration_data *data = ptr;
> > +
> > + xe_sriov_migration_data_free(data);
> > +}
> > +
> > static void action_ring_cleanup(struct drm_device *dev, void *arg)
> > {
> > struct ptr_ring *r = arg;
> >
> > - ptr_ring_cleanup(r, NULL);
> > + ptr_ring_cleanup(r, pf_mig_data_destroy);
> > }
> >
> > /**
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > index 9e67f18ded205..1ed2248f0a17e 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > @@ -17,6 +17,7 @@ int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vf
> >
> > bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid);
> > bool xe_gt_sriov_pf_migration_ring_full(struct xe_gt *gt, unsigned int vfid);
> > +void xe_gt_sriov_pf_migration_ring_free(struct xe_gt *gt, unsigned int vfid);
> >
> > int xe_gt_sriov_pf_migration_save_produce(struct xe_gt *gt, unsigned int vfid,
> > struct xe_sriov_migration_data *data);
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
> > new file mode 100644
> > index 0000000000000..b04f9be3b7fed
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
> > @@ -0,0 +1,127 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#include "xe_bo.h"
> > +#include "xe_device.h"
> > +#include "xe_sriov_migration_data.h"
> > +
> > +static bool data_needs_bo(struct xe_sriov_migration_data *data)
> > +{
> > + return data->type == XE_SRIOV_MIGRATION_DATA_TYPE_VRAM;
> > +}
> > +
> > +/**
> > + * xe_sriov_migration_data_alloc() - Allocate migration data packet
> > + * @xe: the &xe_device
> > + *
> > + * Only allocates the "outer" structure, without initializing the migration
> > + * data backing storage.
> > + *
> > + * Return: Pointer to &xe_sriov_migration_data on success,
> > + * NULL in case of error.
> > + */
> > +struct xe_sriov_migration_data *
>
> no line split
Ok.
>
> > +xe_sriov_migration_data_alloc(struct xe_device *xe)
> > +{
> > + struct xe_sriov_migration_data *data;
> > +
> > + data = kzalloc(sizeof(*data), GFP_KERNEL);
> > + if (!data)
> > + return NULL;
> > +
> > + data->xe = xe;
> > + data->hdr_remaining = sizeof(data->hdr);
> > +
> > + return data;
> > +}
> > +
> > +/**
> > + * xe_sriov_migration_data_free() - Free migration data packet.
> > + * @data: the &xe_sriov_migration_data packet
> > + */
> > +void xe_sriov_migration_data_free(struct xe_sriov_migration_data *data)
> > +{
> > + if (data_needs_bo(data))
> > + xe_bo_unpin_map_no_vm(data->bo);
> > + else
> > + kvfree(data->buff);
> > +
> > + kfree(data);
> > +}
> > +
> > +static int mig_data_init(struct xe_sriov_migration_data *data)
> > +{
> > + struct xe_gt *gt = xe_device_get_gt(data->xe, data->gt);
> > +
> > + if (data->size == 0)
> > + return 0;
> > +
> > + if (data_needs_bo(data)) {
>
> struct xe_bo *bo;
> then
> bo = ...
>
> so will not have that long line
Ok.
>
> > + struct xe_bo *bo = xe_bo_create_pin_map_novm(data->xe, gt->tile,
> > + PAGE_ALIGN(data->size),
> > + ttm_bo_type_kernel,
> > + XE_BO_FLAG_SYSTEM | XE_BO_FLAG_PINNED,
> > + false);
> > + if (IS_ERR(bo))
> > + return PTR_ERR(bo);
> > +
> > + data->bo = bo;
> > + data->vaddr = bo->vmap.vaddr;
> > + } else {
> > + void *buff = kvzalloc(data->size, GFP_KERNEL);
> > +
> > + if (!buff)
> > + return -ENOMEM;
> > +
> > + data->buff = buff;
> > + data->vaddr = buff;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +#define XE_SRIOV_MIGRATION_DATA_SUPPORTED_VERSION 1
> > +/**
> > + * xe_sriov_migration_data_init() - Initialize the migration data header and backing storage.
> > + * @data: the &xe_sriov_migration_data packet
> > + * @tile_id: tile identifier
> > + * @gt_id: GT identifier
> > + * @type: &xe_sriov_migration_data_type
> > + * @offset: offset of data packet payload (within wider resource)
> > + * @size: size of data packet payload
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_migration_data_init(struct xe_sriov_migration_data *data, u8 tile_id, u8 gt_id,
> > + enum xe_sriov_migration_data_type type, loff_t offset, size_t size)
> > +{
> > + data->version = XE_SRIOV_MIGRATION_DATA_SUPPORTED_VERSION;
> > + data->type = type;
> > + data->tile = tile_id;
> > + data->gt = gt_id;
> > + data->offset = offset;
> > + data->size = size;
> > + data->remaining = size;
> > +
> > + return mig_data_init(data);
> > +}
> > +
> > +/**
> > + * xe_sriov_migration_data_init_from_hdr() - Initialize the migration data backing storage based on header.
> > + * @data: the &xe_sriov_migration_data packet
> > + *
> > + * Header data is expected to be filled prior to calling this function.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_migration_data_init_from_hdr(struct xe_sriov_migration_data *data)
> > +{
> > + if (data->version != XE_SRIOV_MIGRATION_DATA_SUPPORTED_VERSION)
> > + return -EINVAL;
> > +
> > + data->remaining = data->size;
> > +
> > + return mig_data_init(data);
> > +}
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
> > new file mode 100644
> > index 0000000000000..ef65dccddc035
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
> > @@ -0,0 +1,31 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#ifndef _XE_SRIOV_MIGRATION_DATA_H_
> > +#define _XE_SRIOV_MIGRATION_DATA_H_
> > +
> > +#include <linux/types.h>
> > +
> > +struct xe_device;
> > +
> > +enum xe_sriov_migration_data_type {
> > + /* Skipping 0 to catch uninitialized data */
> > + XE_SRIOV_MIGRATION_DATA_TYPE_DESCRIPTOR = 1,
> > + XE_SRIOV_MIGRATION_DATA_TYPE_TRAILER,
> > + XE_SRIOV_MIGRATION_DATA_TYPE_GGTT,
> > + XE_SRIOV_MIGRATION_DATA_TYPE_MMIO,
> > + XE_SRIOV_MIGRATION_DATA_TYPE_GUC,
> > + XE_SRIOV_MIGRATION_DATA_TYPE_VRAM,
> > +};
> > +
> > +struct xe_sriov_migration_data *
>
> no need for line split here
Ok.
> > +xe_sriov_migration_data_alloc(struct xe_device *xe);
> > +void xe_sriov_migration_data_free(struct xe_sriov_migration_data *snapshot);
> > +
> > +int xe_sriov_migration_data_init(struct xe_sriov_migration_data *data, u8 tile_id, u8 gt_id,
> > + enum xe_sriov_migration_data_type, loff_t offset, size_t size);
> > +int xe_sriov_migration_data_init_from_hdr(struct xe_sriov_migration_data *snapshot);
> > +
> > +#endif
>
> just few nits, otherwise LGTM
>
> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>
Thanks,
-Michał
^ permalink raw reply	[flat|nested] 72+ messages in thread
* Re: [PATCH v2 06/26] drm/xe/pf: Add support for encap/decap of bitstream to/from packet
2025-10-22 22:34 ` Michal Wajdeczko
@ 2025-10-27 13:27 ` Michał Winiarski
0 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-27 13:27 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, dri-devel, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna
On Thu, Oct 23, 2025 at 12:34:50AM +0200, Michal Wajdeczko wrote:
>
>
> On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> > Add debugfs handlers for migration state and handle bitstream
> > .read()/.write() to convert the bitstream to/from migration data
> > packets.
> > As descriptor/trailer are handled at this layer - add handling for both
> > save and restore side.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_sriov_migration_data.c | 336 ++++++++++++++++++
> > drivers/gpu/drm/xe/xe_sriov_migration_data.h | 5 +
> > drivers/gpu/drm/xe/xe_sriov_pf_control.c | 5 +
> > drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 35 ++
> > drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 54 +++
> > .../gpu/drm/xe/xe_sriov_pf_migration_types.h | 9 +
> > 6 files changed, 444 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
> > index b04f9be3b7fed..4cd6c6fc9ba18 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_migration_data.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
> > @@ -6,6 +6,44 @@
> > #include "xe_bo.h"
> > #include "xe_device.h"
> > #include "xe_sriov_migration_data.h"
> > +#include "xe_sriov_pf_helpers.h"
> > +#include "xe_sriov_pf_migration.h"
> > +#include "xe_sriov_printk.h"
> > +
> > +static struct mutex *pf_migration_mutex(struct xe_device *xe, unsigned int vfid)
> > +{
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
>
> other helpers have sep line here
Ok.
>
> > + return &xe->sriov.pf.vfs[vfid].migration.lock;
> > +}
> > +
> > +static struct xe_sriov_migration_data **pf_pick_pending(struct xe_device *xe, unsigned int vfid)
> > +{
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> > + lockdep_assert_held(pf_migration_mutex(xe, vfid));
> > +
> > + return &xe->sriov.pf.vfs[vfid].migration.pending;
> > +}
> > +
> > +static struct xe_sriov_migration_data **
> > +pf_pick_descriptor(struct xe_device *xe, unsigned int vfid)
> > +{
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> > + lockdep_assert_held(pf_migration_mutex(xe, vfid));
> > +
> > + return &xe->sriov.pf.vfs[vfid].migration.descriptor;
> > +}
> > +
> > +static struct xe_sriov_migration_data **pf_pick_trailer(struct xe_device *xe, unsigned int vfid)
> > +{
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> > + lockdep_assert_held(pf_migration_mutex(xe, vfid));
> > +
> > + return &xe->sriov.pf.vfs[vfid].migration.trailer;
> > +}
> >
> > static bool data_needs_bo(struct xe_sriov_migration_data *data)
> > {
> > @@ -43,6 +81,9 @@ xe_sriov_migration_data_alloc(struct xe_device *xe)
> > */
> > void xe_sriov_migration_data_free(struct xe_sriov_migration_data *data)
> > {
> > + if (IS_ERR_OR_NULL(data))
> > + return;
> > +
> > if (data_needs_bo(data))
> > xe_bo_unpin_map_no_vm(data->bo);
> > else
> > @@ -125,3 +166,298 @@ int xe_sriov_migration_data_init_from_hdr(struct xe_sriov_migration_data *data)
> >
> > return mig_data_init(data);
> > }
> > +
> > +static ssize_t vf_mig_data_hdr_read(struct xe_sriov_migration_data *data,
> > + char __user *buf, size_t len)
> > +{
> > + loff_t offset = sizeof(data->hdr) - data->hdr_remaining;
> > +
> > + if (!data->hdr_remaining)
> > + return -EINVAL;
> > +
> > + if (len > data->hdr_remaining)
> > + len = data->hdr_remaining;
> > +
> > + if (copy_to_user(buf, (void *)&data->hdr + offset, len))
> > + return -EFAULT;
> > +
> > + data->hdr_remaining -= len;
> > +
> > + return len;
> > +}
> > +
> > +static ssize_t vf_mig_data_read(struct xe_sriov_migration_data *data,
> > + char __user *buf, size_t len)
> > +{
> > + if (len > data->remaining)
> > + len = data->remaining;
> > +
> > + if (copy_to_user(buf, data->vaddr + (data->size - data->remaining), len))
> > + return -EFAULT;
> > +
> > + data->remaining -= len;
> > +
> > + return len;
> > +}
> > +
> > +static ssize_t __vf_mig_data_read_single(struct xe_sriov_migration_data **data,
> > + unsigned int vfid, char __user *buf, size_t len)
> > +{
> > + ssize_t copied = 0;
> > +
> > + if ((*data)->hdr_remaining)
> > + copied = vf_mig_data_hdr_read(*data, buf, len);
> > + else
> > + copied = vf_mig_data_read(*data, buf, len);
> > +
> > + if ((*data)->remaining == 0 && (*data)->hdr_remaining == 0) {
> > + xe_sriov_migration_data_free(*data);
> > + *data = NULL;
> > + }
> > +
> > + return copied;
> > +}
> > +
> > +static struct xe_sriov_migration_data **vf_mig_pick_data(struct xe_device *xe, unsigned int vfid)
> > +{
> > + struct xe_sriov_migration_data **data;
> > +
> > + data = pf_pick_descriptor(xe, vfid);
> > + if (*data)
> > + return data;
> > +
> > + data = pf_pick_pending(xe, vfid);
> > + if (!*data)
> > + *data = xe_sriov_pf_migration_save_consume(xe, vfid);
> > + if (*data)
> > + return data;
> > +
> > + data = pf_pick_trailer(xe, vfid);
> > + if (*data)
> > + return data;
> > +
> > + return ERR_PTR(-ENODATA);
> > +}
> > +
> > +static ssize_t vf_mig_data_read_single(struct xe_device *xe, unsigned int vfid,
> > + char __user *buf, size_t len)
> > +{
> > + struct xe_sriov_migration_data **data = vf_mig_pick_data(xe, vfid);
> > +
> > + if (IS_ERR_OR_NULL(data))
>
> vf_mig_pick_data() seems to never return NULL, so maybe just IS_ERR() ?
Ok.
>
> > + return PTR_ERR(data);
> > +
> > + return __vf_mig_data_read_single(data, vfid, buf, len);
> > +}
> > +
> > +/**
> > + * xe_sriov_migration_data_read() - Read migration data from the device.
> > + * @xe: the &xe_device
> > + * @vfid: the VF identifier
> > + * @buf: start address of userspace buffer
> > + * @len: requested read size from userspace
> > + *
> > + * Return: number of bytes that have been successfully read,
> > + * 0 if no more migration data is available,
> > + * -errno on failure.
> > + */
> > +ssize_t xe_sriov_migration_data_read(struct xe_device *xe, unsigned int vfid,
> > + char __user *buf, size_t len)
> > +{
> > + ssize_t ret, consumed = 0;
> > +
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > +
> > + scoped_cond_guard(mutex_intr, return -EINTR, pf_migration_mutex(xe, vfid)) {
> > + while (consumed < len) {
> > + ret = vf_mig_data_read_single(xe, vfid, buf, len - consumed);
> > + if (ret == -ENODATA)
> > + break;
> > + if (ret < 0)
> > + return ret;
> > +
> > + consumed += ret;
> > + buf += ret;
> > + }
> > + }
> > +
> > + return consumed;
> > +}
> > +
> > +static ssize_t vf_mig_hdr_write(struct xe_sriov_migration_data *data,
> > + const char __user *buf, size_t len)
> > +{
> > + loff_t offset = sizeof(data->hdr) - data->hdr_remaining;
> > + int ret;
> > +
> > + if (len > data->hdr_remaining)
> > + len = data->hdr_remaining;
> > +
> > + if (copy_from_user((void *)&data->hdr + offset, buf, len))
> > + return -EFAULT;
> > +
> > + data->hdr_remaining -= len;
> > +
> > + if (!data->hdr_remaining) {
> > + ret = xe_sriov_migration_data_init_from_hdr(data);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + return len;
> > +}
> > +
> > +static ssize_t vf_mig_data_write(struct xe_sriov_migration_data *data,
> > + const char __user *buf, size_t len)
> > +{
> > + if (len > data->remaining)
> > + len = data->remaining;
> > +
> > + if (copy_from_user(data->vaddr + (data->size - data->remaining), buf, len))
> > + return -EFAULT;
> > +
> > + data->remaining -= len;
> > +
> > + return len;
> > +}
> > +
> > +static ssize_t vf_mig_data_write_single(struct xe_device *xe, unsigned int vfid,
> > + const char __user *buf, size_t len)
> > +{
> > + struct xe_sriov_migration_data **data = pf_pick_pending(xe, vfid);
> > + int ret;
> > + ssize_t copied;
> > +
> > + if (IS_ERR_OR_NULL(*data)) {
> > + *data = xe_sriov_migration_data_alloc(xe);
> > + if (!*data)
> > + return -ENOMEM;
> > + }
> > +
> > + if ((*data)->hdr_remaining)
> > + copied = vf_mig_hdr_write(*data, buf, len);
> > + else
> > + copied = vf_mig_data_write(*data, buf, len);
> > +
> > + if ((*data)->hdr_remaining == 0 && (*data)->remaining == 0) {
> > + ret = xe_sriov_pf_migration_restore_produce(xe, vfid, *data);
> > + if (ret) {
> > + xe_sriov_migration_data_free(*data);
> > + return ret;
> > + }
> > +
> > + *data = NULL;
> > + }
> > +
> > + return copied;
> > +}
> > +
> > +/**
> > + * xe_sriov_migration_data_write() - Write migration data to the device.
> > + * @xe: the &xe_device
> > + * @vfid: the VF identifier
> > + * @buf: start address of userspace buffer
> > + * @len: requested write size from userspace
> > + *
> > + * Return: number of bytes that have been successfully written,
> > + * -errno on failure.
> > + */
> > +ssize_t xe_sriov_migration_data_write(struct xe_device *xe, unsigned int vfid,
> > + const char __user *buf, size_t len)
> > +{
> > + ssize_t ret, produced = 0;
> > +
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > +
> > + scoped_cond_guard(mutex_intr, return -EINTR, pf_migration_mutex(xe, vfid)) {
> > + while (produced < len) {
> > + ret = vf_mig_data_write_single(xe, vfid, buf, len - produced);
> > + if (ret < 0)
> > + return ret;
> > +
> > + produced += ret;
> > + buf += ret;
> > + }
> > + }
> > +
> > + return produced;
> > +}
> > +
> > +#define MIGRATION_DESCRIPTOR_DWORDS 0
> > +static int pf_descriptor_init(struct xe_device *xe, unsigned int vfid)
> > +{
> > + struct xe_sriov_migration_data **desc = pf_pick_descriptor(xe, vfid);
> > + struct xe_sriov_migration_data *data;
> > + int ret;
> > +
> > + data = xe_sriov_migration_data_alloc(xe);
> > + if (!data)
> > + return -ENOMEM;
> > +
> > + ret = xe_sriov_migration_data_init(data, 0, 0, XE_SRIOV_MIGRATION_DATA_TYPE_DESCRIPTOR,
> > + 0, MIGRATION_DESCRIPTOR_DWORDS * sizeof(u32));
> > + if (ret) {
> > + xe_sriov_migration_data_free(data);
> > + return ret;
> > + }
> > +
> > + *desc = data;
> > +
> > + return 0;
> > +}
> > +
> > +static void pf_pending_init(struct xe_device *xe, unsigned int vfid)
> > +{
> > + struct xe_sriov_migration_data **data = pf_pick_pending(xe, vfid);
> > +
> > + *data = NULL;
> > +}
> > +
> > +#define MIGRATION_TRAILER_SIZE 0
> > +static int pf_trailer_init(struct xe_device *xe, unsigned int vfid)
> > +{
> > + struct xe_sriov_migration_data **trailer = pf_pick_trailer(xe, vfid);
> > + struct xe_sriov_migration_data *data;
> > + int ret;
> > +
> > + data = xe_sriov_migration_data_alloc(xe);
> > + if (!data)
> > + return -ENOMEM;
> > +
> > + ret = xe_sriov_migration_data_init(data, 0, 0, XE_SRIOV_MIGRATION_DATA_TYPE_TRAILER,
> > + 0, MIGRATION_TRAILER_SIZE);
> > + if (ret) {
> > + xe_sriov_migration_data_free(data);
> > + return ret;
> > + }
> > +
> > + *trailer = data;
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * xe_sriov_migration_data_save_init() - Initialize the pending save migration data.
> > + * @xe: the &xe_device
> > + * @vfid: the VF identifier
> > + *
> > + * Return: 0 on success, -errno on failure.
> > + */
> > +int xe_sriov_migration_data_save_init(struct xe_device *xe, unsigned int vfid)
> > +{
> > + int ret;
> > +
> > + scoped_cond_guard(mutex_intr, return -EINTR, pf_migration_mutex(xe, vfid)) {
> > + ret = pf_descriptor_init(xe, vfid);
> > + if (ret)
> > + return ret;
> > +
> > + ret = pf_trailer_init(xe, vfid);
> > + if (ret)
> > + return ret;
> > +
> > + pf_pending_init(xe, vfid);
> > + }
> > +
> > + return 0;
> > +}
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
> > index ef65dccddc035..5cde6e9439677 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_migration_data.h
> > +++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
> > @@ -27,5 +27,10 @@ void xe_sriov_migration_data_free(struct xe_sriov_migration_data *snapshot);
> > int xe_sriov_migration_data_init(struct xe_sriov_migration_data *data, u8 tile_id, u8 gt_id,
> > enum xe_sriov_migration_data_type, loff_t offset, size_t size);
> > int xe_sriov_migration_data_init_from_hdr(struct xe_sriov_migration_data *snapshot);
> > +ssize_t xe_sriov_migration_data_read(struct xe_device *xe, unsigned int vfid,
> > + char __user *buf, size_t len);
> > +ssize_t xe_sriov_migration_data_write(struct xe_device *xe, unsigned int vfid,
> > + const char __user *buf, size_t len);
> > +int xe_sriov_migration_data_save_init(struct xe_device *xe, unsigned int vfid);
> >
> > #endif
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> > index 8d8a01faf5291..c2768848daba1 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> > @@ -5,6 +5,7 @@
> >
> > #include "xe_device.h"
> > #include "xe_gt_sriov_pf_control.h"
> > +#include "xe_sriov_migration_data.h"
> > #include "xe_sriov_pf_control.h"
> > #include "xe_sriov_printk.h"
> >
> > @@ -165,6 +166,10 @@ int xe_sriov_pf_control_trigger_save_vf(struct xe_device *xe, unsigned int vfid)
> > unsigned int id;
> > int ret;
> >
> > + ret = xe_sriov_migration_data_save_init(xe, vfid);
> > + if (ret)
> > + return ret;
> > +
> > for_each_gt(gt, xe, id) {
> > ret = xe_gt_sriov_pf_control_trigger_save_vf(gt, vfid);
> > if (ret)
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> > index e0e6340c49106..a9a28aec22421 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> > @@ -9,6 +9,7 @@
> > #include "xe_device.h"
> > #include "xe_device_types.h"
> > #include "xe_pm.h"
> > +#include "xe_sriov_migration_data.h"
> > #include "xe_sriov_pf.h"
> > #include "xe_sriov_pf_control.h"
> > #include "xe_sriov_pf_debugfs.h"
> > @@ -132,6 +133,7 @@ static void pf_populate_pf(struct xe_device *xe, struct dentry *pfdent)
> > * /sys/kernel/debug/dri/BDF/
> > * ├── sriov
> > * │ ├── vf1
> > + * │ │ ├── migration_data
> > * │ │ ├── pause
> > * │ │ ├── reset
> > * │ │ ├── resume
> > @@ -220,6 +222,38 @@ DEFINE_VF_CONTROL_ATTRIBUTE(reset_vf);
> > DEFINE_VF_CONTROL_ATTRIBUTE_RW(save_vf);
> > DEFINE_VF_CONTROL_ATTRIBUTE_RW(restore_vf);
> >
> > +static ssize_t data_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
> > +{
> > + struct dentry *dent = file_dentry(file)->d_parent;
> > + struct xe_device *xe = extract_xe(dent);
> > + unsigned int vfid = extract_vfid(dent);
> > +
> > + if (*pos)
> > + return -ESPIPE;
> > +
> > + return xe_sriov_migration_data_write(xe, vfid, buf, count);
> > +}
> > +
> > +static ssize_t data_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
> > +{
> > + struct dentry *dent = file_dentry(file)->d_parent;
> > + struct xe_device *xe = extract_xe(dent);
> > + unsigned int vfid = extract_vfid(dent);
> > +
> > + if (*ppos)
> > + return -ESPIPE;
> > +
> > + return xe_sriov_migration_data_read(xe, vfid, buf, count);
> > +}
> > +
> > +static const struct file_operations data_vf_fops = {
> > + .owner = THIS_MODULE,
> > + .open = simple_open,
> > + .write = data_write,
> > + .read = data_read,
> > + .llseek = default_llseek,
> > +};
> > +
> > static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> > {
> > debugfs_create_file("pause", 0200, vfdent, xe, &pause_vf_fops);
> > @@ -228,6 +262,7 @@ static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> > debugfs_create_file("reset", 0200, vfdent, xe, &reset_vf_fops);
> > debugfs_create_file("save", 0600, vfdent, xe, &save_vf_fops);
> > debugfs_create_file("restore", 0600, vfdent, xe, &restore_vf_fops);
> > + debugfs_create_file("migration_data", 0600, vfdent, xe, &data_vf_fops);
> > }
> >
> > static void pf_populate_with_tiles(struct xe_device *xe, struct dentry *dent, unsigned int vfid)
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > index eaf581317bdef..029e14f1ffa74 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > @@ -10,6 +10,7 @@
> > #include "xe_gt_sriov_pf_migration.h"
> > #include "xe_pm.h"
> > #include "xe_sriov.h"
> > +#include "xe_sriov_migration_data.h"
> > #include "xe_sriov_pf_helpers.h"
> > #include "xe_sriov_pf_migration.h"
> > #include "xe_sriov_printk.h"
> > @@ -53,6 +54,15 @@ static bool pf_check_migration_support(struct xe_device *xe)
> > return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
> > }
> >
> > +static void pf_migration_cleanup(struct drm_device *dev, void *arg)
> > +{
> > + struct xe_sriov_pf_migration *migration = arg;
> > +
> > + xe_sriov_migration_data_free(migration->pending);
> > + xe_sriov_migration_data_free(migration->trailer);
> > + xe_sriov_migration_data_free(migration->descriptor);
> > +}
> > +
> > /**
> > * xe_sriov_pf_migration_init() - Initialize support for SR-IOV VF migration.
> > * @xe: the &xe_device
> > @@ -62,6 +72,7 @@ static bool pf_check_migration_support(struct xe_device *xe)
> > int xe_sriov_pf_migration_init(struct xe_device *xe)
> > {
> > unsigned int n, totalvfs;
> > + int err;
> >
> > xe_assert(xe, IS_SRIOV_PF(xe));
> >
> > @@ -73,7 +84,15 @@ int xe_sriov_pf_migration_init(struct xe_device *xe)
> > for (n = 1; n <= totalvfs; n++) {
> > struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, n);
> >
> > + err = drmm_mutex_init(&xe->drm, &migration->lock);
> > + if (err)
> > + return err;
> > +
> > init_waitqueue_head(&migration->wq);
> > +
> > + err = drmm_add_action_or_reset(&xe->drm, pf_migration_cleanup, migration);
>
> shouldn't we use devm instead here ?
I'll switch it to devm.
>
> > + if (err)
> > + return err;
> > }
> >
> > return 0;
> > @@ -154,6 +173,36 @@ xe_sriov_pf_migration_save_consume(struct xe_device *xe, unsigned int vfid)
> > return data;
> > }
> >
> > +static int pf_handle_descriptor(struct xe_device *xe, unsigned int vfid,
> > + struct xe_sriov_migration_data *data)
> > +{
> > + if (data->tile != 0 || data->gt != 0)
> > + return -EINVAL;
> > +
> > + xe_sriov_migration_data_free(data);
> > +
> > + return 0;
> > +}
> > +
> > +static int pf_handle_trailer(struct xe_device *xe, unsigned int vfid,
> > + struct xe_sriov_migration_data *data)
> > +{
> > + struct xe_gt *gt;
> > + u8 gt_id;
> > +
> > + if (data->tile != 0 || data->gt != 0)
> > + return -EINVAL;
> > + if (data->offset != 0 || data->size != 0 || data->buff || data->bo)
> > + return -EINVAL;
> > +
> > + xe_sriov_migration_data_free(data);
> > +
> > + for_each_gt(gt, xe, gt_id)
> > + xe_gt_sriov_pf_control_restore_data_done(gt, vfid);
> > +
> > + return 0;
> > +}
> > +
> > /**
> > * xe_sriov_pf_migration_restore_produce() - Produce a VF migration data packet to the device.
> > * @xe: the &xe_device
> > @@ -173,6 +222,11 @@ int xe_sriov_pf_migration_restore_produce(struct xe_device *xe, unsigned int vfi
> >
> > xe_assert(xe, IS_SRIOV_PF(xe));
> >
> > + if (data->type == XE_SRIOV_MIGRATION_DATA_TYPE_DESCRIPTOR)
> > + return pf_handle_descriptor(xe, vfid, data);
> > + else if (data->type == XE_SRIOV_MIGRATION_DATA_TYPE_TRAILER)
>
> no need for "else" here
Ok.
Thanks,
-Michał
>
> > + return pf_handle_trailer(xe, vfid, data);
> > +
> > gt = xe_device_get_gt(xe, data->gt);
> > if (!gt || data->tile != gt->tile->id) {
> > xe_sriov_err_ratelimited(xe, "VF%d Invalid GT - tile:%u, GT:%u\n",
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> > index 2a45ee4e3ece8..8468e5eeb6d66 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> > @@ -7,6 +7,7 @@
> > #define _XE_SRIOV_PF_MIGRATION_TYPES_H_
> >
> > #include <linux/types.h>
> > +#include <linux/mutex_types.h>
> > #include <linux/wait.h>
> >
> > /**
> > @@ -53,6 +54,14 @@ struct xe_sriov_migration_data {
> > struct xe_sriov_pf_migration {
> > /** @wq: waitqueue used to avoid busy-waiting for snapshot production/consumption */
> > wait_queue_head_t wq;
> > + /** @lock: Mutex protecting the migration data */
> > + struct mutex lock;
> > + /** @pending: currently processed data packet of VF resource */
> > + struct xe_sriov_migration_data *pending;
> > + /** @trailer: data packet used to indicate the end of stream */
> > + struct xe_sriov_migration_data *trailer;
> > + /** @descriptor: data packet containing the metadata describing the device */
> > + struct xe_sriov_migration_data *descriptor;
> > };
> >
> > #endif
>
* Re: [PATCH v2 07/26] drm/xe/pf: Add minimalistic migration descriptor
2025-10-22 22:49 ` Michal Wajdeczko
@ 2025-10-27 14:52 ` Michał Winiarski
0 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-27 14:52 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, dri-devel, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna
On Thu, Oct 23, 2025 at 12:49:52AM +0200, Michal Wajdeczko wrote:
>
>
> On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> > The descriptor reuses the KLV format used by GuC and contains metadata
> > that can be used to quickly fail migration when source is incompatible
> > with destination.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_sriov_migration_data.c | 79 +++++++++++++++++++-
> > drivers/gpu/drm/xe/xe_sriov_migration_data.h | 2 +
> > drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 6 ++
> > 3 files changed, 86 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
> > index 4cd6c6fc9ba18..b58508c0c30f1 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_migration_data.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.c
> > @@ -5,6 +5,7 @@
> >
> > #include "xe_bo.h"
> > #include "xe_device.h"
> > +#include "xe_guc_klv_helpers.h"
> > #include "xe_sriov_migration_data.h"
> > #include "xe_sriov_pf_helpers.h"
> > #include "xe_sriov_pf_migration.h"
> > @@ -383,11 +384,18 @@ ssize_t xe_sriov_migration_data_write(struct xe_device *xe, unsigned int vfid,
> > return produced;
> > }
> >
> > -#define MIGRATION_DESCRIPTOR_DWORDS 0
> > +#define MIGRATION_KLV_DEVICE_DEVID_KEY 0xf001u
> > +#define MIGRATION_KLV_DEVICE_DEVID_LEN 1u
> > +#define MIGRATION_KLV_DEVICE_REVID_KEY 0xf002u
> > +#define MIGRATION_KLV_DEVICE_REVID_LEN 1u
> > +
> > +#define MIGRATION_DESCRIPTOR_DWORDS (GUC_KLV_LEN_MIN + MIGRATION_KLV_DEVICE_DEVID_LEN + \
> > + GUC_KLV_LEN_MIN + MIGRATION_KLV_DEVICE_REVID_LEN)
> > static size_t pf_descriptor_init(struct xe_device *xe, unsigned int vfid)
> > {
> > struct xe_sriov_migration_data **desc = pf_pick_descriptor(xe, vfid);
> > struct xe_sriov_migration_data *data;
> > + u32 *klvs;
> > int ret;
> >
> > data = xe_sriov_migration_data_alloc(xe);
> > @@ -401,11 +409,80 @@ static size_t pf_descriptor_init(struct xe_device *xe, unsigned int vfid)
> > return ret;
> > }
> >
> > + klvs = data->vaddr;
> > + *klvs++ = PREP_GUC_KLV_CONST(MIGRATION_KLV_DEVICE_DEVID_KEY,
> > + MIGRATION_KLV_DEVICE_DEVID_LEN);
> > + *klvs++ = xe->info.devid;
> > + *klvs++ = PREP_GUC_KLV_CONST(MIGRATION_KLV_DEVICE_REVID_KEY,
> > + MIGRATION_KLV_DEVICE_REVID_LEN);
> > + *klvs++ = xe->info.revid;
> > +
>
> maybe add assert that written KLVs match descriptor size?
I'll track the number of dwords written and verify it with an assert.
>
> > *desc = data;
> >
> > return 0;
> > }
> >
> > +/**
> > + * xe_sriov_migration_data_process_descriptor() - Process migration data descriptor.
> > + * @xe: the &xe_device
> > + * @vfid: the VF identifier
> > + * @data: the &struct xe_sriov_pf_migration_data containing the descriptor
> > + *
> > + * The descriptor uses the same KLV format as GuC, and contains metadata used for
> > + * checking migration data compatibility.
> > + *
> > + * Return: 0 on success, -errno on failure.
> > + */
> > +int xe_sriov_migration_data_process_descriptor(struct xe_device *xe, unsigned int vfid,
> > + struct xe_sriov_migration_data *data)
> > +{
> > + u32 num_dwords = data->size / sizeof(u32);
> > + u32 *klvs = data->vaddr;
> > +
> > + xe_assert(xe, data->type == XE_SRIOV_MIGRATION_DATA_TYPE_DESCRIPTOR);
> > + if (data->size % sizeof(u32) != 0)
>
> no need to compare against 0
Ok.
>
> if (data->size % sizeof(u32))
>
> > + return -EINVAL;
>
> for other errors we warn(), ok to be silent here?
Let's add:
xe_sriov_warn(xe, "Aborting migration, descriptor not in KLV format (size=%llu)\n",
data->size);
>
> > +
> > + while (num_dwords >= GUC_KLV_LEN_MIN) {
> > + u32 key = FIELD_GET(GUC_KLV_0_KEY, klvs[0]);
> > + u32 len = FIELD_GET(GUC_KLV_0_LEN, klvs[0]);
> > +
> > + klvs += GUC_KLV_LEN_MIN;
> > + num_dwords -= GUC_KLV_LEN_MIN;
> > +
>
> you should check len vs num_dwords here
Ok.
>
> > + switch (key) {
> > + case MIGRATION_KLV_DEVICE_DEVID_KEY:
> > + if (*klvs != xe->info.devid) {
> > + xe_sriov_warn(xe,
> > + "Aborting migration, devid mismatch %#04x!=%#04x\n",
>
> likely %#06x, as you need to count also "0x"
Ok.
>
> > + *klvs, xe->info.devid);
> > + return -ENODEV;
> > + }
> > + break;
> > + case MIGRATION_KLV_DEVICE_REVID_KEY:
> > + if (*klvs != xe->info.revid) {
> > + xe_sriov_warn(xe,
> > + "Aborting migration, revid mismatch %#04x!=%#04x\n",
> > + *klvs, xe->info.revid);
> > + return -ENODEV;
> > + }
> > + break;
> > + default:
> > + xe_sriov_dbg(xe,
> > + "Unknown migration descriptor key %#06x - skipping\n", key);
>
> also print len? and some initial hexdump to help with debug?
I'll replace it with:
xe_sriov_dbg(xe,
"Skipping unknown migration descriptor key %#06x (len=%#06x)\n",
key, len);
print_hex_dump_bytes("desc: ", DUMP_PREFIX_OFFSET, klvs,
min(SZ_64, len * sizeof(u32)));
>
> > + break;
> > + }
> > +
> > + if (len > num_dwords)
> > + return -EINVAL;
>
> this check should be earlier
Ok.
Thanks,
-Michał
>
> > +
> > + klvs += len;
> > + num_dwords -= len;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > static void pf_pending_init(struct xe_device *xe, unsigned int vfid)
> > {
> > struct xe_sriov_migration_data **data = pf_pick_pending(xe, vfid);
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
> > index 5cde6e9439677..e7f3b332124bc 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_migration_data.h
> > +++ b/drivers/gpu/drm/xe/xe_sriov_migration_data.h
> > @@ -31,6 +31,8 @@ ssize_t xe_sriov_migration_data_read(struct xe_device *xe, unsigned int vfid,
> > char __user *buf, size_t len);
> > ssize_t xe_sriov_migration_data_write(struct xe_device *xe, unsigned int vfid,
> > const char __user *buf, size_t len);
> > +int xe_sriov_migration_data_process_descriptor(struct xe_device *xe, unsigned int vfid,
> > + struct xe_sriov_migration_data *data);
> > int xe_sriov_migration_data_save_init(struct xe_device *xe, unsigned int vfid);
> >
> > #endif
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > index 029e14f1ffa74..0b4b237780102 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > @@ -176,9 +176,15 @@ xe_sriov_pf_migration_save_consume(struct xe_device *xe, unsigned int vfid)
> > static int pf_handle_descriptor(struct xe_device *xe, unsigned int vfid,
> > struct xe_sriov_migration_data *data)
> > {
> > + int ret;
> > +
> > if (data->tile != 0 || data->gt != 0)
> > return -EINVAL;
> >
> > + ret = xe_sriov_migration_data_process_descriptor(xe, vfid, data);
> > + if (ret)
> > + return ret;
> > +
> > xe_sriov_migration_data_free(data);
> >
> > return 0;
>
>
>
>
* RE: [PATCH v2 01/26] drm/xe/pf: Remove GuC version check for migration support
2025-10-21 22:41 ` [PATCH v2 01/26] drm/xe/pf: Remove GuC version check for migration support Michał Winiarski
@ 2025-10-28 2:33 ` Tian, Kevin
2025-10-28 8:06 ` Winiarski, Michal
0 siblings, 1 reply; 72+ messages in thread
From: Tian, Kevin @ 2025-10-28 2:33 UTC (permalink / raw)
To: Winiarski, Michal, Alex Williamson, De Marchi, Lucas,
Thomas Hellström, Vivi, Rodrigo, Jason Gunthorpe,
Yishai Hadas, intel-xe@lists.freedesktop.org,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Brost, Matthew,
Wajdeczko, Michal
Cc: dri-devel@lists.freedesktop.org, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Laguna, Lukasz
> From: Winiarski, Michal <michal.winiarski@intel.com>
> Sent: Wednesday, October 22, 2025 6:41 AM
>
> static bool pf_check_migration_support(struct xe_gt *gt)
> {
> - /* GuC 70.25 with save/restore v2 is required */
> - xe_gt_assert(gt, GUC_FIRMWARE_VER(&gt->uc.guc) >=
> MAKE_GUC_VER(70, 25, 0));
> -
> /* XXX: for now this is for feature enabling only */
> return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
Why put it under a debug option? Now you are sending a formal
series for merge, assuming good quality.
* RE: [PATCH v2 03/26] drm/xe/pf: Add save/restore control state stubs and connect to debugfs
2025-10-21 22:41 ` [PATCH v2 03/26] drm/xe/pf: Add save/restore control state stubs and connect to debugfs Michał Winiarski
2025-10-22 22:31 ` Michal Wajdeczko
@ 2025-10-28 3:06 ` Tian, Kevin
2025-10-28 8:02 ` Michal Wajdeczko
1 sibling, 1 reply; 72+ messages in thread
From: Tian, Kevin @ 2025-10-28 3:06 UTC (permalink / raw)
To: Winiarski, Michal, Alex Williamson, De Marchi, Lucas,
Thomas Hellström, Vivi, Rodrigo, Jason Gunthorpe,
Yishai Hadas, intel-xe@lists.freedesktop.org,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Brost, Matthew,
Wajdeczko, Michal
Cc: dri-devel@lists.freedesktop.org, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Laguna, Lukasz
> From: Winiarski, Michal <michal.winiarski@intel.com>
> Sent: Wednesday, October 22, 2025 6:41 AM
>
> +int xe_gt_sriov_pf_control_trigger_save_vf(struct xe_gt *gt, unsigned int
> vfid)
The prefix is too long. xe_gt_sriov_trigger_save_vf() or
xe_gt_trigger_save_vf() is sufficient.
* RE: [PATCH v2 16/26] drm/xe/pf: Add helpers for VF GGTT migration data handling
2025-10-21 22:41 ` [PATCH v2 16/26] drm/xe/pf: Add helpers for VF GGTT migration data handling Michał Winiarski
2025-10-23 21:50 ` Michal Wajdeczko
@ 2025-10-28 3:22 ` Tian, Kevin
2025-10-28 7:38 ` Michal Wajdeczko
1 sibling, 1 reply; 72+ messages in thread
From: Tian, Kevin @ 2025-10-28 3:22 UTC (permalink / raw)
To: Winiarski, Michal, Alex Williamson, De Marchi, Lucas,
Thomas Hellström, Vivi, Rodrigo, Jason Gunthorpe,
Yishai Hadas, intel-xe@lists.freedesktop.org,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Brost, Matthew,
Wajdeczko, Michal
Cc: dri-devel@lists.freedesktop.org, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Laguna, Lukasz
> From: Winiarski, Michal <michal.winiarski@intel.com>
> Sent: Wednesday, October 22, 2025 6:41 AM
>
> +int xe_ggtt_node_load(struct xe_ggtt_node *node, const void *src, size_t
> size, u16 vfid)
> +{
> + u64 vfid_pte = xe_encode_vfid_pte(vfid);
> + const u64 *buf = src;
> + struct xe_ggtt *ggtt;
> + u64 start, end;
> +
> + if (!node)
> + return -ENOENT;
> +
> + guard(mutex)(&node->ggtt->lock);
> +
> + ggtt = node->ggtt;
> + start = node->base.start;
> + end = start + size - 1;
> +
> + if (xe_ggtt_pte_size(ggtt, node->base.size) != size)
> + return -EINVAL;
> +
> + while (start < end) {
> + ggtt->pt_ops->ggtt_set_pte(ggtt, start, (*buf &
> ~GGTT_PTE_VFID) | vfid_pte);
> + start += XE_PAGE_SIZE;
> + buf++;
> + }
static u64 xe_encode_vfid_pte(u16 vfid)
{
return FIELD_PREP(GGTT_PTE_VFID, vfid) | XE_PAGE_PRESENT;
}
So the above loop blindly sets every GGTT entry to valid. Isn't the right
thing to carry the present bit from the src buffer?
* RE: [PATCH v2 25/26] drm/xe/pf: Export helpers for VFIO
2025-10-21 22:41 ` [PATCH v2 25/26] drm/xe/pf: Export helpers for VFIO Michał Winiarski
@ 2025-10-28 3:28 ` Tian, Kevin
0 siblings, 0 replies; 72+ messages in thread
From: Tian, Kevin @ 2025-10-28 3:28 UTC (permalink / raw)
To: Winiarski, Michal, Alex Williamson, De Marchi, Lucas,
Thomas Hellström, Vivi, Rodrigo, Jason Gunthorpe,
Yishai Hadas, intel-xe@lists.freedesktop.org,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Brost, Matthew,
Wajdeczko, Michal
Cc: dri-devel@lists.freedesktop.org, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Laguna, Lukasz
> From: Winiarski, Michal <michal.winiarski@intel.com>
> Sent: Wednesday, October 22, 2025 6:42 AM
> +
> +bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev);
> +int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_stop_copy_exit(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_resume_enter(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_resume_exit(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_error(struct pci_dev *pdev, unsigned int vfid);
> +ssize_t xe_sriov_vfio_data_read(struct pci_dev *pdev, unsigned int vfid,
> + char __user *buf, size_t len);
> +ssize_t xe_sriov_vfio_data_write(struct pci_dev *pdev, unsigned int vfid,
> + const char __user *buf, size_t len);
> +ssize_t xe_sriov_vfio_stop_copy_size(struct pci_dev *pdev, unsigned int
> vfid);
> +
None of those helpers ties to any VFIO-specific logic, so there is no need
to have 'vfio' explicitly in those function names.
* Re: [PATCH v2 16/26] drm/xe/pf: Add helpers for VF GGTT migration data handling
2025-10-28 3:22 ` Tian, Kevin
@ 2025-10-28 7:38 ` Michal Wajdeczko
0 siblings, 0 replies; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-28 7:38 UTC (permalink / raw)
To: Tian, Kevin, Winiarski, Michal, Alex Williamson, De Marchi, Lucas,
Thomas Hellström, Vivi, Rodrigo, Jason Gunthorpe,
Yishai Hadas, intel-xe@lists.freedesktop.org,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Brost, Matthew
Cc: dri-devel@lists.freedesktop.org, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Laguna, Lukasz
On 10/28/2025 4:22 AM, Tian, Kevin wrote:
>> From: Winiarski, Michal <michal.winiarski@intel.com>
>> Sent: Wednesday, October 22, 2025 6:41 AM
>>
>> +int xe_ggtt_node_load(struct xe_ggtt_node *node, const void *src, size_t
>> size, u16 vfid)
>> +{
>> + u64 vfid_pte = xe_encode_vfid_pte(vfid);
>> + const u64 *buf = src;
>> + struct xe_ggtt *ggtt;
>> + u64 start, end;
>> +
>> + if (!node)
>> + return -ENOENT;
>> +
>> + guard(mutex)(&node->ggtt->lock);
>> +
>> + ggtt = node->ggtt;
>> + start = node->base.start;
>> + end = start + size - 1;
>> +
>> + if (xe_ggtt_pte_size(ggtt, node->base.size) != size)
>> + return -EINVAL;
>> +
>> + while (start < end) {
>> + ggtt->pt_ops->ggtt_set_pte(ggtt, start, (*buf &
>> ~GGTT_PTE_VFID) | vfid_pte);
>> + start += XE_PAGE_SIZE;
>> + buf++;
>> + }
>
> static u64 xe_encode_vfid_pte(u16 vfid)
> {
> return FIELD_PREP(GGTT_PTE_VFID, vfid) | XE_PAGE_PRESENT;
> }
>
> So the above loop blindly sets every GGTT entry to valid. Isn't the right
> thing to carry the present bit from the src buffer?
VFs can't modify the VALID/PRESENT(0) bit, so it must always be set by the PF.
Bspec: 52395
* Re: [PATCH v2 03/26] drm/xe/pf: Add save/restore control state stubs and connect to debugfs
2025-10-28 3:06 ` Tian, Kevin
@ 2025-10-28 8:02 ` Michal Wajdeczko
0 siblings, 0 replies; 72+ messages in thread
From: Michal Wajdeczko @ 2025-10-28 8:02 UTC (permalink / raw)
To: Tian, Kevin, Winiarski, Michal, Alex Williamson, De Marchi, Lucas,
Thomas Hellström, Vivi, Rodrigo, Jason Gunthorpe,
Yishai Hadas, intel-xe@lists.freedesktop.org,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Brost, Matthew
Cc: dri-devel@lists.freedesktop.org, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Laguna, Lukasz
On 10/28/2025 4:06 AM, Tian, Kevin wrote:
>> From: Winiarski, Michal <michal.winiarski@intel.com>
>> Sent: Wednesday, October 22, 2025 6:41 AM
>>
>> +int xe_gt_sriov_pf_control_trigger_save_vf(struct xe_gt *gt, unsigned int
>> vfid)
>
> the prefix is too long. xe_gt_sriov_trigger_save_vf() or
> xe_gt_trigger_save_vf() is sufficient.
In the Xe driver we name functions based on the sub-component name:
xe_sriov_vfio.c
= xe|sriov|vfio
= Xe driver | SR-IOV feature | VFIO interface
xe_sriov_pf_control.c
= xe|sriov|pf|control
= Xe driver | SR-IOV feature | PF specific | control
xe_gt_sriov_pf_control.c
= xe|gt|sriov|pf|control
= Xe driver | GT-related | SR-IOV feature | PF specific | control
Only functions from the xe|sriov|vfio component will be exported
for use by the Xe VFIO driver (hence the vfio tag in their names); all
other functions will be internal to the Xe driver.
* Re: [PATCH v2 01/26] drm/xe/pf: Remove GuC version check for migration support
2025-10-28 2:33 ` Tian, Kevin
@ 2025-10-28 8:06 ` Winiarski, Michal
0 siblings, 0 replies; 72+ messages in thread
From: Winiarski, Michal @ 2025-10-28 8:06 UTC (permalink / raw)
To: Tian, Kevin
Cc: Alex Williamson, De Marchi, Lucas, Thomas Hellström,
Vivi, Rodrigo, Jason Gunthorpe, Yishai Hadas,
intel-xe@lists.freedesktop.org, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, Brost, Matthew, Wajdeczko, Michal,
dri-devel@lists.freedesktop.org, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Laguna, Lukasz
On Tue, Oct 28, 2025 at 03:33:22AM +0100, Tian, Kevin wrote:
> > From: Winiarski, Michal <michal.winiarski@intel.com>
> > Sent: Wednesday, October 22, 2025 6:41 AM
> >
> > static bool pf_check_migration_support(struct xe_gt *gt)
> > {
> > - /* GuC 70.25 with save/restore v2 is required */
> > - xe_gt_assert(gt, GUC_FIRMWARE_VER(&gt->uc.guc) >=
> > MAKE_GUC_VER(70, 25, 0));
> > -
> > /* XXX: for now this is for feature enabling only */
> > return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
>
> Why put it under a debug option? Now you are sending a formal
> series for merge, assuming good quality.
The need for the debug option is removed for specific platforms in patch
24/26, but I will drop it completely in v3.
-Michał
* Re: [PATCH v2 11/26] drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF migration
2025-10-23 17:37 ` Michal Wajdeczko
@ 2025-10-28 10:46 ` Michał Winiarski
0 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-28 10:46 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, dri-devel, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna
On Thu, Oct 23, 2025 at 07:37:48PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> > Contiguous PF GGTT VMAs can be scarce after creating VFs.
> > Increase the GuC buffer cache size to 4M for PF so that we can fit GuC
> > migration data (which currently maxes out at just under 4M) and use the
>
> but the code below still uses 8M
Yeah - turns out we need more than 4M (I did my math on one of the
structs, but there's actually additional data present), so let's just
stick to 8M for now.
>
> > cache instead of allocating fresh BOs.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 46 ++++++-------------
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 3 ++
> > drivers/gpu/drm/xe/xe_guc.c | 12 ++++-
> > 3 files changed, 28 insertions(+), 33 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > index 4e26feb9c267f..04fad3126865c 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > @@ -11,7 +11,7 @@
> > #include "xe_gt_sriov_pf_helpers.h"
> > #include "xe_gt_sriov_pf_migration.h"
> > #include "xe_gt_sriov_printk.h"
> > -#include "xe_guc.h"
> > +#include "xe_guc_buf.h"
> > #include "xe_guc_ct.h"
> > #include "xe_sriov.h"
> > #include "xe_sriov_migration_data.h"
> > @@ -57,73 +57,55 @@ static int pf_send_guc_query_vf_state_size(struct xe_gt *gt, unsigned int vfid)
> >
> > /* Return: number of state dwords saved or a negative error code on failure */
> > static int pf_send_guc_save_vf_state(struct xe_gt *gt, unsigned int vfid,
> > - void *buff, size_t size)
> > + void *dst, size_t size)
> > {
> > const int ndwords = size / sizeof(u32);
> > - struct xe_tile *tile = gt_to_tile(gt);
> > - struct xe_device *xe = tile_to_xe(tile);
> > struct xe_guc *guc = &gt->uc.guc;
> > - struct xe_bo *bo;
> > + CLASS(xe_guc_buf, buf)(&guc->buf, ndwords);
> > int ret;
> >
> > xe_gt_assert(gt, size % sizeof(u32) == 0);
> > xe_gt_assert(gt, size == ndwords * sizeof(u32));
> >
> > - bo = xe_bo_create_pin_map_novm(xe, tile,
> > - ALIGN(size, PAGE_SIZE),
> > - ttm_bo_type_kernel,
> > - XE_BO_FLAG_SYSTEM |
> > - XE_BO_FLAG_GGTT |
> > - XE_BO_FLAG_GGTT_INVALIDATE, false);
> > - if (IS_ERR(bo))
> > - return PTR_ERR(bo);
> > + if (!xe_guc_buf_is_valid(buf))
> > + return -ENOBUFS;
> > +
> > + memset(xe_guc_buf_cpu_ptr(buf), 0, size);
>
> hmm, I didn't find in the GuC spec that this buffer must be zeroed, so why bother?
That was found during testing; GuC actually expects the buffer to be
zeroed.
I'll ping folks to update the spec.
> >
> > ret = guc_action_vf_save_restore(guc, vfid, GUC_PF_OPCODE_VF_SAVE,
> > - xe_bo_ggtt_addr(bo), ndwords);
> > + xe_guc_buf_flush(buf), ndwords);
> > if (!ret)
> > ret = -ENODATA;
> > else if (ret > ndwords)
> > ret = -EPROTO;
> > else if (ret > 0)
> > - xe_map_memcpy_from(xe, buff, &bo->vmap, 0, ret * sizeof(u32));
> > + memcpy(dst, xe_guc_buf_sync_read(buf), ret * sizeof(u32));
>
> nit: given this usage, maybe one day we should add optimized variant that copies directly to dst?
>
> xe_guc_buf_sync_into(buf, dst, size);
>
> >
> > - xe_bo_unpin_map_no_vm(bo);
> > return ret;
> > }
> >
> > /* Return: number of state dwords restored or a negative error code on failure */
> > static int pf_send_guc_restore_vf_state(struct xe_gt *gt, unsigned int vfid,
> > - const void *buff, size_t size)
> > + const void *src, size_t size)
> > {
> > const int ndwords = size / sizeof(u32);
> > - struct xe_tile *tile = gt_to_tile(gt);
> > - struct xe_device *xe = tile_to_xe(tile);
> > struct xe_guc *guc = &gt->uc.guc;
> > - struct xe_bo *bo;
> > + CLASS(xe_guc_buf_from_data, buf)(&guc->buf, src, size);
> > int ret;
> >
> > xe_gt_assert(gt, size % sizeof(u32) == 0);
> > xe_gt_assert(gt, size == ndwords * sizeof(u32));
> >
> > - bo = xe_bo_create_pin_map_novm(xe, tile,
> > - ALIGN(size, PAGE_SIZE),
> > - ttm_bo_type_kernel,
> > - XE_BO_FLAG_SYSTEM |
> > - XE_BO_FLAG_GGTT |
> > - XE_BO_FLAG_GGTT_INVALIDATE, false);
> > - if (IS_ERR(bo))
> > - return PTR_ERR(bo);
> > -
> > - xe_map_memcpy_to(xe, &bo->vmap, 0, buff, size);
> > + if (!xe_guc_buf_is_valid(buf))
> > + return -ENOBUFS;
> >
> > ret = guc_action_vf_save_restore(guc, vfid, GUC_PF_OPCODE_VF_RESTORE,
> > - xe_bo_ggtt_addr(bo), ndwords);
> > + xe_guc_buf_flush(buf), ndwords);
> > if (!ret)
> > ret = -ENODATA;
> > else if (ret > ndwords)
> > ret = -EPROTO;
> >
> > - xe_bo_unpin_map_no_vm(bo);
> > return ret;
> > }
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > index e2d41750f863c..4f2f2783339c3 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > @@ -11,6 +11,9 @@
> > struct xe_gt;
> > struct xe_sriov_migration_data;
> >
> > +/* TODO: get this information by querying GuC in the future */
> > +#define XE_GT_SRIOV_PF_MIGRATION_GUC_DATA_MAX_SIZE SZ_8M
>
> so it's 8M or 4M ?
>
> maybe wrap that into function now
>
> u32 xe_gt_sriov_pf_migration_guc_data_size(struct xe_gt *gt)
> {
> if (xe_sriov_pf_migration_supported(gt_to_xe(gt)))
> return SZ_4M; /* TODO: ... */
> return 0;
> }
XE_GT_SRIOV_PF_MIGRATION_GUC_DATA_MAX_SIZE disappears from this header
as a result of previous changes, so the size calculation can be kept
static.
>
> > +
> > int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
> > int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
> > diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
> > index 7c65528859ecb..cd6ab277a7876 100644
> > --- a/drivers/gpu/drm/xe/xe_guc.c
> > +++ b/drivers/gpu/drm/xe/xe_guc.c
> > @@ -24,6 +24,7 @@
> > #include "xe_gt_printk.h"
> > #include "xe_gt_sriov_vf.h"
> > #include "xe_gt_throttle.h"
> > +#include "xe_gt_sriov_pf_migration.h"
> > #include "xe_guc_ads.h"
> > #include "xe_guc_buf.h"
> > #include "xe_guc_capture.h"
> > @@ -40,6 +41,7 @@
> > #include "xe_mmio.h"
> > #include "xe_platform_types.h"
> > #include "xe_sriov.h"
> > +#include "xe_sriov_pf_migration.h"
> > #include "xe_uc.h"
> > #include "xe_uc_fw.h"
> > #include "xe_wa.h"
> > @@ -821,6 +823,14 @@ static int vf_guc_init_post_hwconfig(struct xe_guc *guc)
> > return 0;
> > }
> >
> > +static u32 guc_buf_cache_size(struct xe_guc *guc)
> > +{
> > + if (IS_SRIOV_PF(guc_to_xe(guc)) && xe_sriov_pf_migration_supported(guc_to_xe(guc)))
> > + return XE_GT_SRIOV_PF_MIGRATION_GUC_DATA_MAX_SIZE;
>
> then
> u32 size = XE_GUC_BUF_CACHE_DEFAULT_SIZE;
>
> if (IS_SRIOV_PF(guc_to_xe(guc)))
> size += xe_gt_sriov_pf_migration_guc_data_size(guc_to_gt(guc));
>
> return size;
As the cache gets reused, we don't need to add anything to the default
(we should just replace the size with the new requirement for the
largest object size).
Thanks,
-Michał
>
> > + else
> > + return XE_GUC_BUF_CACHE_DEFAULT_SIZE;
> > +}
> > +
> > /**
> > * xe_guc_init_post_hwconfig - initialize GuC post hwconfig load
> > * @guc: The GuC object
> > @@ -860,7 +870,7 @@ int xe_guc_init_post_hwconfig(struct xe_guc *guc)
> > if (ret)
> > return ret;
> >
> > - ret = xe_guc_buf_cache_init(&guc->buf, XE_GUC_BUF_CACHE_DEFAULT_SIZE);
> > + ret = xe_guc_buf_cache_init(&guc->buf, guc_buf_cache_size(guc));
> > if (ret)
> > return ret;
> >
>
* Re: [PATCH v2 15/26] drm/xe/pf: Handle GuC migration data as part of PF control
2025-10-23 20:39 ` Michal Wajdeczko
@ 2025-10-28 13:04 ` Michał Winiarski
0 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-28 13:04 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, dri-devel, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna
On Thu, Oct 23, 2025 at 10:39:12PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> > Connect the helpers to allow save and restore of GuC migration data in
> > stop_copy / resume device state.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 26 +++++++++++++++++--
> > .../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 2 ++
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 9 ++++++-
> > 3 files changed, 34 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > index c159f35adcbe7..18f6e3028d4f0 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > @@ -188,6 +188,7 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
> > CASE2STR(SAVE_WIP);
> > CASE2STR(SAVE_PROCESS_DATA);
> > CASE2STR(SAVE_WAIT_DATA);
> > + CASE2STR(SAVE_DATA_GUC);
> > CASE2STR(SAVE_DATA_DONE);
> > CASE2STR(SAVE_FAILED);
> > CASE2STR(SAVED);
> > @@ -343,6 +344,7 @@ static void pf_exit_vf_mismatch(struct xe_gt *gt, unsigned int vfid)
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOP_FAILED);
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_FAILED);
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUME_FAILED);
> > + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
>
> this should be in one of the previous patch
It is - note that we're exiting this state twice :)
It's a leftover from previous revisions (at some point we were
introducing FAILED state here). I'll remove it.
>
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_FLR_FAILED);
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED);
> > @@ -824,6 +826,7 @@ static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> >
> > pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
> > pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WAIT_DATA);
> > + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
> > pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
> > }
> > }
> > @@ -848,6 +851,16 @@ static void pf_enter_vf_save_failed(struct xe_gt *gt, unsigned int vfid)
> >
> > static int pf_handle_vf_save_data(struct xe_gt *gt, unsigned int vfid)
> > {
> > + int ret;
> > +
> > + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC)) {
> > + xe_gt_assert(gt, xe_gt_sriov_pf_migration_guc_size(gt, vfid) > 0);
> > +
> > + ret = xe_gt_sriov_pf_migration_guc_save(gt, vfid);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > return 0;
> > }
> >
> > @@ -881,6 +894,7 @@ static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> > {
> > if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
> > pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA);
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
> > pf_enter_vf_wip(gt, vfid);
> > pf_queue_vf(gt, vfid);
> > return true;
> > @@ -1046,14 +1060,22 @@ static int
> > pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid)
> > {
> > struct xe_sriov_migration_data *data = xe_gt_sriov_pf_migration_restore_consume(gt, vfid);
> > + int ret = 0;
> >
> > xe_gt_assert(gt, data);
> >
> > - xe_gt_sriov_notice(gt, "Skipping VF%u unknown data type: %d\n", vfid, data->type);
> > + switch (data->type) {
> > + case XE_SRIOV_MIGRATION_DATA_TYPE_GUC:
> > + ret = xe_gt_sriov_pf_migration_guc_restore(gt, vfid, data);
> > + break;
> > + default:
> > + xe_gt_sriov_notice(gt, "Skipping VF%u unknown data type: %d\n", vfid, data->type);
> > + break;
> > + }
> >
> > xe_sriov_migration_data_free(data);
> >
> > - return 0;
> > + return ret;
> > }
> >
> > static bool pf_handle_vf_restore(struct xe_gt *gt, unsigned int vfid)
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > index 35ceb2ff62110..8b951ee8a24fe 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > @@ -33,6 +33,7 @@
> > * @XE_GT_SRIOV_STATE_SAVE_WIP: indicates that VF save operation is in progress.
> > * @XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA: indicates that VF migration data is being produced.
> > * @XE_GT_SRIOV_STATE_SAVE_WAIT_DATA: indicates that PF awaits for space in migration data ring.
> > + * @XE_GT_SRIOV_STATE_SAVE_DATA_GUC: indicates PF needs to save VF GuC migration data.
> > * @XE_GT_SRIOV_STATE_SAVE_DATA_DONE: indicates that all migration data was produced by Xe.
> > * @XE_GT_SRIOV_STATE_SAVE_FAILED: indicates that VF save operation has failed.
> > * @XE_GT_SRIOV_STATE_SAVED: indicates that VF data is saved.
> > @@ -76,6 +77,7 @@ enum xe_gt_sriov_control_bits {
> > XE_GT_SRIOV_STATE_SAVE_WIP,
> > XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA,
> > XE_GT_SRIOV_STATE_SAVE_WAIT_DATA,
> > + XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
>
> as DATA_GUC and the later-introduced DATA_GGTT/MMIO/VRAM are kind of sub-states of PROCESS_DATA,
> it's better to keep them together
>
> XE_GT_SRIOV_STATE_SAVE_PROCESS_DATA,
> XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
> XE_GT_SRIOV_STATE_SAVE_DATA_GGTT,
> XE_GT_SRIOV_STATE_SAVE_DATA_MMIO,
> XE_GT_SRIOV_STATE_SAVE_DATA_VRAM,
> XE_GT_SRIOV_STATE_SAVE_DATA_DONE,
> XE_GT_SRIOV_STATE_SAVE_WAIT_CONSUME,
>
> and at some point you need to update state diagram to include those DATA states
I'll extract this out of the control state machine, as it's conceptually
similar to the save_vram_offset introduced later in the series.
>
> > XE_GT_SRIOV_STATE_SAVE_DATA_DONE,
> > XE_GT_SRIOV_STATE_SAVE_FAILED,
> > XE_GT_SRIOV_STATE_SAVED,
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > index 127162e8c66e8..594178fbe36d0 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > @@ -279,10 +279,17 @@ int xe_gt_sriov_pf_migration_guc_restore(struct xe_gt *gt, unsigned int vfid,
> > ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
> > {
> > ssize_t total = 0;
> > + ssize_t size;
> >
> > xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> >
> > - /* Nothing to query yet - will be updated once per-GT migration data types are added */
> > + size = xe_gt_sriov_pf_migration_guc_size(gt, vfid);
> > + if (size < 0)
> > + return size;
> > + else if (size > 0)
>
> "else" not needed
Ok.
Thanks,
-Michał
> > + size += sizeof(struct xe_sriov_pf_migration_hdr);
> > + total += size;
> > +
> > return total;
> > }
> >
>
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [PATCH v2 16/26] drm/xe/pf: Add helpers for VF GGTT migration data handling
2025-10-23 21:50 ` Michal Wajdeczko
@ 2025-10-28 17:03 ` Michał Winiarski
0 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-28 17:03 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, dri-devel, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna
On Thu, Oct 23, 2025 at 11:50:28PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> > In an upcoming change, the VF GGTT migration data will be handled as
> > part of VF control state machine. Add the necessary helpers to allow the
> > migration data transfer to/from the HW GGTT resource.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_ggtt.c | 100 +++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_ggtt.h | 3 +
> > drivers/gpu/drm/xe/xe_ggtt_types.h | 2 +
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 44 +++++++++
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h | 5 ++
> > 5 files changed, 154 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
> > index 40680f0c49a17..99fe891c7939e 100644
> > --- a/drivers/gpu/drm/xe/xe_ggtt.c
> > +++ b/drivers/gpu/drm/xe/xe_ggtt.c
> > @@ -151,6 +151,14 @@ static void xe_ggtt_set_pte_and_flush(struct xe_ggtt *ggtt, u64 addr, u64 pte)
> > ggtt_update_access_counter(ggtt);
> > }
> >
> > +static u64 xe_ggtt_get_pte(struct xe_ggtt *ggtt, u64 addr)
> > +{
> > + xe_tile_assert(ggtt->tile, !(addr & XE_PTE_MASK));
> > + xe_tile_assert(ggtt->tile, addr < ggtt->size);
> > +
> > + return readq(&ggtt->gsm[addr >> XE_PTE_SHIFT]);
> > +}
> > +
> > static void xe_ggtt_clear(struct xe_ggtt *ggtt, u64 start, u64 size)
> > {
> > u16 pat_index = tile_to_xe(ggtt->tile)->pat.idx[XE_CACHE_WB];
> > @@ -233,16 +241,19 @@ void xe_ggtt_might_lock(struct xe_ggtt *ggtt)
> > static const struct xe_ggtt_pt_ops xelp_pt_ops = {
> > .pte_encode_flags = xelp_ggtt_pte_flags,
> > .ggtt_set_pte = xe_ggtt_set_pte,
> > + .ggtt_get_pte = xe_ggtt_get_pte,
> > };
> >
> > static const struct xe_ggtt_pt_ops xelpg_pt_ops = {
> > .pte_encode_flags = xelpg_ggtt_pte_flags,
> > .ggtt_set_pte = xe_ggtt_set_pte,
> > + .ggtt_get_pte = xe_ggtt_get_pte,
> > };
> >
> > static const struct xe_ggtt_pt_ops xelpg_pt_wa_ops = {
> > .pte_encode_flags = xelpg_ggtt_pte_flags,
> > .ggtt_set_pte = xe_ggtt_set_pte_and_flush,
> > + .ggtt_get_pte = xe_ggtt_get_pte,
> > };
> >
> > static void __xe_ggtt_init_early(struct xe_ggtt *ggtt, u32 reserved)
> > @@ -912,6 +923,22 @@ static void xe_ggtt_assign_locked(struct xe_ggtt *ggtt, const struct drm_mm_node
> > xe_ggtt_invalidate(ggtt);
> > }
> >
> > +/**
> > + * xe_ggtt_pte_size() - Convert GGTT VMA size to page table entries size.
> > + * @ggtt: the &xe_ggtt
> > + * @size: GGTT VMA size in bytes
> > + *
> > + * Return: GGTT page table entries size in bytes.
> > + */
> > +size_t xe_ggtt_pte_size(struct xe_ggtt *ggtt, size_t size)
>
> passing ggtt just for assert seems overkill
>
> > +{
> > + struct xe_device __maybe_unused *xe = tile_to_xe(ggtt->tile);
>
> we try to avoid __maybe_unused
>
> if you need xe/tile/gt just in the assert, then put to_xe/tile/gt inside it
It will go away after restructuring it to pass the node instead.
>
> > +
> > + xe_assert(xe, size % XE_PAGE_SIZE == 0);
> > +
> > + return size / XE_PAGE_SIZE * sizeof(u64);
> > +}
> > +
> > /**
> > * xe_ggtt_assign - assign a GGTT region to the VF
> > * @node: the &xe_ggtt_node to update
> > @@ -927,6 +954,79 @@ void xe_ggtt_assign(const struct xe_ggtt_node *node, u16 vfid)
> > xe_ggtt_assign_locked(node->ggtt, &node->base, vfid);
> > mutex_unlock(&node->ggtt->lock);
> > }
> > +
> > +/**
> > + * xe_ggtt_node_save() - Save a &xe_ggtt_node to a buffer.
> > + * @node: the &xe_ggtt_node to be saved
> > + * @dst: destination buffer
> > + * @size: destination buffer size in bytes
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_ggtt_node_save(struct xe_ggtt_node *node, void *dst, size_t size)
> > +{
> > + struct xe_ggtt *ggtt;
> > + u64 start, end;
> > + u64 *buf = dst;
> > +
> > + if (!node)
> > + return -ENOENT;
> > +
> > + guard(mutex)(&node->ggtt->lock);
> > +
> > + ggtt = node->ggtt;
> > + start = node->base.start;
> > + end = start + node->base.size - 1;
> > +
> > + if (xe_ggtt_pte_size(ggtt, node->base.size) > size)
> > + return -EINVAL;
> > +
> > + while (start < end) {
> > + *buf++ = ggtt->pt_ops->ggtt_get_pte(ggtt, start) & ~GGTT_PTE_VFID;
>
> up to this point function is generic, non-IOV, so maybe leave PTEs as-is and do not sanitize VFID ?
>
> or, similar to node_load(), also pass vfid to enforce additional checks ?
>
> pte = ggtt->pt_ops->ggtt_get_pte(ggtt, start);
> if (vfid != u64_get_bits(pte, GGTT_PTE_VFID))
> return -EPERM;
>
> then optionally sanitize using:
>
> *buf++ = u64_replace_bits(pte, 0, GGTT_PTE_VFID);
>
I'll go with check & sanitize.
>
>
> > + start += XE_PAGE_SIZE;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * xe_ggtt_node_load() - Load a &xe_ggtt_node from a buffer.
> > + * @node: the &xe_ggtt_node to be loaded
> > + * @src: source buffer
> > + * @size: source buffer size in bytes
> > + * @vfid: VF identifier
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_ggtt_node_load(struct xe_ggtt_node *node, const void *src, size_t size, u16 vfid)
> > +{
> > + u64 vfid_pte = xe_encode_vfid_pte(vfid);
> > + const u64 *buf = src;
> > + struct xe_ggtt *ggtt;
> > + u64 start, end;
> > +
> > + if (!node)
> > + return -ENOENT;
> > +
> > + guard(mutex)(&node->ggtt->lock);
> > +
> > + ggtt = node->ggtt;
> > + start = node->base.start;
> > + end = start + size - 1;
> > +
> > + if (xe_ggtt_pte_size(ggtt, node->base.size) != size)
> > + return -EINVAL;
> > +
> > + while (start < end) {
> > + ggtt->pt_ops->ggtt_set_pte(ggtt, start, (*buf & ~GGTT_PTE_VFID) | vfid_pte);
>
> pte = u64_replace_bits(*buf++, vfid, GGTT_PTE_VFID));
> ggtt_set_pte(ggtt, start, pte);
>
Ok.
> > + start += XE_PAGE_SIZE;
> > + buf++;
> > + }
> > + xe_ggtt_invalidate(ggtt);
> > +
> > + return 0;
> > +}
> > +
> > #endif
> >
> > /**
> > diff --git a/drivers/gpu/drm/xe/xe_ggtt.h b/drivers/gpu/drm/xe/xe_ggtt.h
> > index 75fc7a1efea76..5f55f80fe3adc 100644
> > --- a/drivers/gpu/drm/xe/xe_ggtt.h
> > +++ b/drivers/gpu/drm/xe/xe_ggtt.h
> > @@ -42,7 +42,10 @@ int xe_ggtt_dump(struct xe_ggtt *ggtt, struct drm_printer *p);
> > u64 xe_ggtt_print_holes(struct xe_ggtt *ggtt, u64 alignment, struct drm_printer *p);
> >
> > #ifdef CONFIG_PCI_IOV
> > +size_t xe_ggtt_pte_size(struct xe_ggtt *ggtt, size_t size);
>
> this could be generic (non PCI-IOV only) inline helper or macro here or in .c
>
> size_t to_xe_ggtt_pt_size(size_t size);
>
> and then more elegant solution would be to expose
>
> size_t xe_ggtt_node_pt_size(const struct xe_ggtt_node *node);
>
> and yes, that would require to additionally expose something from gt_sriov_pf_config
> as migration code doesn't have access to this node,
>
> but maybe xe_gt_sriov_pf_config_ggtt_save() can be updated to also support 'query' mode?
>
> size_t xe_gt_sriov_pf_config_ggtt_save(gt, vfid, buf, size) -> bytes saved
> size_t xe_gt_sriov_pf_config_ggtt_save(gt, vfid, NULL, 0) -> size to be saved
Ok.
I'll go with passing NULL and 0 to query.
Thanks,
-Michał
>
>
> > void xe_ggtt_assign(const struct xe_ggtt_node *node, u16 vfid);
> > +int xe_ggtt_node_save(struct xe_ggtt_node *node, void *dst, size_t size);
> > +int xe_ggtt_node_load(struct xe_ggtt_node *node, const void *src, size_t size, u16 vfid);
> > #endif
> >
> > #ifndef CONFIG_LOCKDEP
> > diff --git a/drivers/gpu/drm/xe/xe_ggtt_types.h b/drivers/gpu/drm/xe/xe_ggtt_types.h
> > index c5e999d58ff2a..dacd796f81844 100644
> > --- a/drivers/gpu/drm/xe/xe_ggtt_types.h
> > +++ b/drivers/gpu/drm/xe/xe_ggtt_types.h
> > @@ -78,6 +78,8 @@ struct xe_ggtt_pt_ops {
> > u64 (*pte_encode_flags)(struct xe_bo *bo, u16 pat_index);
> > /** @ggtt_set_pte: Directly write into GGTT's PTE */
> > void (*ggtt_set_pte)(struct xe_ggtt *ggtt, u64 addr, u64 pte);
> > + /** @ggtt_get_pte: Directly read from GGTT's PTE */
> > + u64 (*ggtt_get_pte)(struct xe_ggtt *ggtt, u64 addr);
> > };
> >
> > #endif
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> > index c0c0215c07036..c857879e28fe5 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> > @@ -726,6 +726,50 @@ int xe_gt_sriov_pf_config_set_fair_ggtt(struct xe_gt *gt, unsigned int vfid,
> > return xe_gt_sriov_pf_config_bulk_set_ggtt(gt, vfid, num_vfs, fair);
> > }
> >
> > +/**
> > + * xe_gt_sriov_pf_config_ggtt_save() - Save a VF provisioned GGTT data into a buffer.
> > + * @gt: the &xe_gt
> > + * @vfid: VF identifier (can't be 0)
> > + * @buf: the GGTT data destination buffer
> > + * @size: the size of the buffer
> > + *
> > + * This function can only be called on PF.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_config_ggtt_save(struct xe_gt *gt, unsigned int vfid,
> > + void *buf, size_t size)
> > +{
> > + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> > + xe_gt_assert(gt, vfid);
> > +
> > + guard(mutex)(xe_gt_sriov_pf_master_mutex(gt));
> > +
> > + return xe_ggtt_node_save(pf_pick_vf_config(gt, vfid)->ggtt_region, buf, size);
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_config_ggtt_restore() - Restore a VF provisioned GGTT data from a buffer.
> > + * @gt: the &xe_gt
> > + * @vfid: VF identifier (can't be 0)
> > + * @buf: the GGTT data source buffer
> > + * @size: the size of the buffer
> > + *
> > + * This function can only be called on PF.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_config_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
> > + const void *buf, size_t size)
> > +{
> > + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> > + xe_gt_assert(gt, vfid);
> > +
> > + guard(mutex)(xe_gt_sriov_pf_master_mutex(gt));
> > +
> > + return xe_ggtt_node_load(pf_pick_vf_config(gt, vfid)->ggtt_region, buf, size, vfid);
> > +}
> > +
> > static u32 pf_get_min_spare_ctxs(struct xe_gt *gt)
> > {
> > /* XXX: preliminary */
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> > index 513e6512a575b..6916b8f58ebf2 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> > @@ -61,6 +61,11 @@ ssize_t xe_gt_sriov_pf_config_save(struct xe_gt *gt, unsigned int vfid, void *bu
> > int xe_gt_sriov_pf_config_restore(struct xe_gt *gt, unsigned int vfid,
> > const void *buf, size_t size);
> >
> > +int xe_gt_sriov_pf_config_ggtt_save(struct xe_gt *gt, unsigned int vfid,
> > + void *buf, size_t size);
> > +int xe_gt_sriov_pf_config_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
> > + const void *buf, size_t size);
> > +
> > bool xe_gt_sriov_pf_config_is_empty(struct xe_gt *gt, unsigned int vfid);
> >
> > int xe_gt_sriov_pf_config_init(struct xe_gt *gt);
>

* Re: [PATCH v2 18/26] drm/xe/pf: Add helpers for VF MMIO migration data handling
2025-10-23 22:10 ` Michal Wajdeczko
@ 2025-10-28 23:37 ` Michał Winiarski
0 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-28 23:37 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, dri-devel, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna
On Fri, Oct 24, 2025 at 12:10:31AM +0200, Michal Wajdeczko wrote:
>
>
> On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> > In an upcoming change, the VF MMIO migration data will be handled as
> > part of VF control state machine. Add the necessary helpers to allow the
> > migration data transfer to/from the VF MMIO registers.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 88 +++++++++++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_gt_sriov_pf.h | 6 ++
>
> wrong place for those helpers
>
> just promote xe_reg_vf_to_pf()
>
> or maybe it can be done like this:
>
> void xe_mmio_init_vf(struct xe_mmio *vf, const struct xe_mmio *pf, vfid);
>
> then
>
> struct xe_mmio mmio_vf;
>
> xe_mmio_init_vf(&mmio_vf, &gt->mmio, vfid);
> val = xe_mmio_read32(&mmio_vf, REG);
> xe_mmio_write32(&mmio_vf, val, REG);
>
> let me try check this out
With:
4504e78068924 ("drm/xe/pf: Access VF's register using dedicated MMIO view")
the helpers are no longer necessary.
I'll move the logic to xe_gt_sriov_pf_migration.c and drop this patch
completely, moving everything into:
drm/xe/pf: Handle MMIO migration data as part of PF control
Thanks,
-Michał
>
>
> > 2 files changed, 94 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> > index c4dda87b47cc8..31ee86166dfd0 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> > @@ -194,6 +194,94 @@ static void pf_clear_vf_scratch_regs(struct xe_gt *gt, unsigned int vfid)
> > }
> > }
> >
> > +/**
> > + * xe_gt_sriov_pf_mmio_vf_size - Get the size of VF MMIO register data.
> > + * @gt: the &xe_gt
> > + * @vfid: VF identifier
> > + *
> > + * Return: size in bytes.
> > + */
> > +size_t xe_gt_sriov_pf_mmio_vf_size(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + if (xe_gt_is_media_type(gt))
> > + return MED_VF_SW_FLAG_COUNT * sizeof(u32);
> > + else
> > + return VF_SW_FLAG_COUNT * sizeof(u32);
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_mmio_vf_save - Save VF MMIO register values to a buffer.
> > + * @gt: the &xe_gt
> > + * @vfid: VF identifier
> > + * @buf: destination buffer
> > + * @size: destination buffer size in bytes
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_mmio_vf_save(struct xe_gt *gt, unsigned int vfid, void *buf, size_t size)
> > +{
> > + u32 stride = pf_get_vf_regs_stride(gt_to_xe(gt));
> > + struct xe_reg scratch;
> > + u32 *regs = buf;
> > + int n, count;
> > +
> > + if (size != xe_gt_sriov_pf_mmio_vf_size(gt, vfid))
> > + return -EINVAL;
> > +
> > + if (xe_gt_is_media_type(gt)) {
> > + count = MED_VF_SW_FLAG_COUNT;
> > + for (n = 0; n < count; n++) {
> > + scratch = xe_reg_vf_to_pf(MED_VF_SW_FLAG(n), vfid, stride);
> > + regs[n] = xe_mmio_read32(&gt->mmio, scratch);
> > + }
> > + } else {
> > + count = VF_SW_FLAG_COUNT;
> > + for (n = 0; n < count; n++) {
> > + scratch = xe_reg_vf_to_pf(VF_SW_FLAG(n), vfid, stride);
> > + regs[n] = xe_mmio_read32(&gt->mmio, scratch);
> > + }
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_mmio_vf_restore - Restore VF MMIO register values from a buffer.
> > + * @gt: the &xe_gt
> > + * @vfid: VF identifier
> > + * @buf: source buffer
> > + * @size: source buffer size in bytes
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_mmio_vf_restore(struct xe_gt *gt, unsigned int vfid,
> > + const void *buf, size_t size)
> > +{
> > + u32 stride = pf_get_vf_regs_stride(gt_to_xe(gt));
> > + const u32 *regs = buf;
> > + struct xe_reg scratch;
> > + int n, count;
> > +
> > + if (size != xe_gt_sriov_pf_mmio_vf_size(gt, vfid))
> > + return -EINVAL;
> > +
> > + if (xe_gt_is_media_type(gt)) {
> > + count = MED_VF_SW_FLAG_COUNT;
> > + for (n = 0; n < count; n++) {
> > + scratch = xe_reg_vf_to_pf(MED_VF_SW_FLAG(n), vfid, stride);
> > + xe_mmio_write32(&gt->mmio, scratch, regs[n]);
> > + }
> > + } else {
> > + count = VF_SW_FLAG_COUNT;
> > + for (n = 0; n < count; n++) {
> > + scratch = xe_reg_vf_to_pf(VF_SW_FLAG(n), vfid, stride);
> > + xe_mmio_write32(&gt->mmio, scratch, regs[n]);
> > + }
> > + }
> > +
> > + return 0;
> > +}
> > +
> > /**
> > * xe_gt_sriov_pf_sanitize_hw() - Reset hardware state related to a VF.
> > * @gt: the &xe_gt
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
> > index e7fde3f9937af..7f4f1fda5f77a 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
> > @@ -6,6 +6,8 @@
> > #ifndef _XE_GT_SRIOV_PF_H_
> > #define _XE_GT_SRIOV_PF_H_
> >
> > +#include <linux/types.h>
> > +
> > struct xe_gt;
> >
> > #ifdef CONFIG_PCI_IOV
> > @@ -16,6 +18,10 @@ void xe_gt_sriov_pf_init_hw(struct xe_gt *gt);
> > void xe_gt_sriov_pf_sanitize_hw(struct xe_gt *gt, unsigned int vfid);
> > void xe_gt_sriov_pf_stop_prepare(struct xe_gt *gt);
> > void xe_gt_sriov_pf_restart(struct xe_gt *gt);
> > +size_t xe_gt_sriov_pf_mmio_vf_size(struct xe_gt *gt, unsigned int vfid);
> > +int xe_gt_sriov_pf_mmio_vf_save(struct xe_gt *gt, unsigned int vfid, void *buf, size_t size);
> > +int xe_gt_sriov_pf_mmio_vf_restore(struct xe_gt *gt, unsigned int vfid,
> > + const void *buf, size_t size);
> > #else
> > static inline int xe_gt_sriov_pf_init_early(struct xe_gt *gt)
> > {
>
* Re: [PATCH v2 20/26] drm/xe/pf: Add helper to retrieve VF's LMEM object
2025-10-23 20:25 ` Michal Wajdeczko
@ 2025-10-28 23:40 ` Michał Winiarski
0 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-28 23:40 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, dri-devel, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna
On Thu, Oct 23, 2025 at 10:25:08PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> > From: Lukasz Laguna <lukasz.laguna@intel.com>
> >
> > Instead of accessing VF's lmem_obj directly, introduce a helper function
> > to make the access more convenient.
> >
> > Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 31 ++++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h | 1 +
> > 2 files changed, 32 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> > index c857879e28fe5..28d648c386487 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> > @@ -1643,6 +1643,37 @@ int xe_gt_sriov_pf_config_bulk_set_lmem(struct xe_gt *gt, unsigned int vfid,
> > "LMEM", n, err);
> > }
> >
> > +static struct xe_bo *pf_get_vf_config_lmem_obj(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + struct xe_gt_sriov_config *config = pf_pick_vf_config(gt, vfid);
> > +
> > + return config->lmem_obj;
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_config_get_lmem_obj - Take a reference to the struct &xe_bo backing VF LMEM.
>
> * xe_gt_sriov_pf_config_get_lmem_obj() - Take ...
Ok.
>
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier
>
> since you assert vfid below, add "(can't be 0)"
Ok.
>
> > + *
> > + * This function can only be called on PF.
> > + * The caller is responsible for calling xe_bo_put() on the returned object.
> > + *
> > + * Return: pointer to struct &xe_bo backing VF LMEM (if any).
> > + */
> > +struct xe_bo *xe_gt_sriov_pf_config_get_lmem_obj(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + struct xe_bo *lmem_obj;
> > +
> > + xe_gt_assert(gt, vfid);
> > +
> > + mutex_lock(xe_gt_sriov_pf_master_mutex(gt));
> > + lmem_obj = pf_get_vf_config_lmem_obj(gt, vfid);
> > + xe_bo_get(lmem_obj);
> > + mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
> > +
> > + return lmem_obj;
>
> or just
>
> {
> guard(mutex)(xe_gt_sriov_pf_master_mutex(gt));
>
> return xe_bo_get(pf_get_vf_config_lmem_obj(gt, vfid));
> }
Ok.
>
> > +}
> > +
> > static u64 pf_query_free_lmem(struct xe_gt *gt)
> > {
> > struct xe_tile *tile = gt->tile;
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> > index 6916b8f58ebf2..03c5dc0cd5fef 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> > @@ -36,6 +36,7 @@ int xe_gt_sriov_pf_config_set_lmem(struct xe_gt *gt, unsigned int vfid, u64 size
> > int xe_gt_sriov_pf_config_set_fair_lmem(struct xe_gt *gt, unsigned int vfid, unsigned int num_vfs);
> > int xe_gt_sriov_pf_config_bulk_set_lmem(struct xe_gt *gt, unsigned int vfid, unsigned int num_vfs,
> > u64 size);
> > +struct xe_bo *xe_gt_sriov_pf_config_get_lmem_obj(struct xe_gt *gt, unsigned int vfid);
> >
> > u32 xe_gt_sriov_pf_config_get_exec_quantum(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_config_set_exec_quantum(struct xe_gt *gt, unsigned int vfid, u32 exec_quantum);
>
> probably we should block VF's reprovisioning during the SAVE/RESTORE,
> but that could be done later as follow up
Yeah - I ended up leaving it out of this series.
But the general work-in-progress idea was to block provisioning access
from userspace if we're in any WIP state.
>
> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>
Thanks,
-Michał
* Re: [PATCH v2 22/26] drm/xe/pf: Handle VRAM migration data as part of PF control
2025-10-23 19:54 ` Michal Wajdeczko
@ 2025-10-29 8:54 ` Michał Winiarski
0 siblings, 0 replies; 72+ messages in thread
From: Michał Winiarski @ 2025-10-29 8:54 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe,
linux-kernel, kvm, Matthew Brost, dri-devel, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna
On Thu, Oct 23, 2025 at 09:54:02PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/22/2025 12:41 AM, Michał Winiarski wrote:
> > Connect the helpers to allow save and restore of VRAM migration data in
> > stop_copy / resume device state.
> >
> > Co-developed-by: Lukasz Laguna <lukasz.laguna@intel.com>
> > Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 18 ++
> > .../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 2 +
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 222 ++++++++++++++++++
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 6 +
> > .../drm/xe/xe_gt_sriov_pf_migration_types.h | 3 +
> > drivers/gpu/drm/xe/xe_sriov_pf_control.c | 3 +
> > 6 files changed, 254 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > index e7156ad3d1839..680f2de44144b 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > @@ -191,6 +191,7 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
> > CASE2STR(SAVE_DATA_GUC);
> > CASE2STR(SAVE_DATA_GGTT);
> > CASE2STR(SAVE_DATA_MMIO);
> > + CASE2STR(SAVE_DATA_VRAM);
> > CASE2STR(SAVE_DATA_DONE);
> > CASE2STR(SAVE_FAILED);
> > CASE2STR(SAVED);
> > @@ -832,6 +833,7 @@ static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> > pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT);
> > pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_MMIO);
> > pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_DONE);
> > + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_VRAM);
> > }
> > }
> >
> > @@ -885,6 +887,19 @@ static int pf_handle_vf_save_data(struct xe_gt *gt, unsigned int vfid)
> > ret = xe_gt_sriov_pf_migration_mmio_save(gt, vfid);
> > if (ret)
> > return ret;
> > +
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_VRAM);
> > + return -EAGAIN;
> > + }
> > +
> > + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_VRAM)) {
> > + if (xe_gt_sriov_pf_migration_vram_size(gt, vfid) > 0) {
> > + ret = xe_gt_sriov_pf_migration_vram_save(gt, vfid);
> > + if (ret == -EAGAIN)
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_VRAM);
> > + if (ret)
> > + return ret;
> > + }
> > }
> >
> > return 0;
> > @@ -1100,6 +1115,9 @@ pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid)
> > case XE_SRIOV_MIGRATION_DATA_TYPE_GUC:
> > ret = xe_gt_sriov_pf_migration_guc_restore(gt, vfid, data);
> > break;
> > + case XE_SRIOV_MIGRATION_DATA_TYPE_VRAM:
> > + ret = xe_gt_sriov_pf_migration_vram_restore(gt, vfid, data);
> > + break;
> > default:
> > xe_gt_sriov_notice(gt, "Skipping VF%u unknown data type: %d\n", vfid, data->type);
> > break;
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > index 9dfcebd5078ac..fba10136f7cc7 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > @@ -36,6 +36,7 @@
> > * @XE_GT_SRIOV_STATE_SAVE_DATA_GUC: indicates PF needs to save VF GuC migration data.
> > * @XE_GT_SRIOV_STATE_SAVE_DATA_GGTT: indicates PF needs to save VF GGTT migration data.
> > * @XE_GT_SRIOV_STATE_SAVE_DATA_MMIO: indicates PF needs to save VF MMIO migration data.
> > + * @XE_GT_SRIOV_STATE_SAVE_DATA_VRAM: indicates PF needs to save VF VRAM migration data.
> > * @XE_GT_SRIOV_STATE_SAVE_DATA_DONE: indicates that all migration data was produced by Xe.
> > * @XE_GT_SRIOV_STATE_SAVE_FAILED: indicates that VF save operation has failed.
> > * @XE_GT_SRIOV_STATE_SAVED: indicates that VF data is saved.
> > @@ -82,6 +83,7 @@ enum xe_gt_sriov_control_bits {
> > XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
> > XE_GT_SRIOV_STATE_SAVE_DATA_GGTT,
> > XE_GT_SRIOV_STATE_SAVE_DATA_MMIO,
> > + XE_GT_SRIOV_STATE_SAVE_DATA_VRAM,
> > XE_GT_SRIOV_STATE_SAVE_DATA_DONE,
> > XE_GT_SRIOV_STATE_SAVE_FAILED,
> > XE_GT_SRIOV_STATE_SAVED,
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > index 41335b15ffdbe..2c6a86d98ee31 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > @@ -17,6 +17,7 @@
> > #include "xe_gt_sriov_printk.h"
> > #include "xe_guc_buf.h"
> > #include "xe_guc_ct.h"
> > +#include "xe_migrate.h"
> > #include "xe_sriov.h"
> > #include "xe_sriov_migration_data.h"
> > #include "xe_sriov_pf_migration.h"
> > @@ -485,6 +486,220 @@ int xe_gt_sriov_pf_migration_mmio_restore(struct xe_gt *gt, unsigned int vfid,
> > return pf_restore_vf_mmio_mig_data(gt, vfid, data);
> > }
> >
> > +/**
> > + * xe_gt_sriov_pf_migration_vram_size() - Get the size of VF VRAM migration data.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * This function is for PF only.
> > + *
> > + * Return: size in bytes or a negative error code on failure.
> > + */
> > +ssize_t xe_gt_sriov_pf_migration_vram_size(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + if (gt != xe_root_mmio_gt(gt_to_xe(gt)))
>
> probably you mean
>
> if (!xe_gt_is_main_type(gt))
Yes - I'll use it.
>
> > + return 0;
> > +
> > + return xe_gt_sriov_pf_config_get_lmem(gt, vfid);
> > +}
> > +
> > +static struct dma_fence *__pf_save_restore_vram(struct xe_gt *gt, unsigned int vfid,
> > + struct xe_bo *vram, u64 vram_offset,
> > + struct xe_bo *sysmem, u64 sysmem_offset,
> > + size_t size, bool save)
> > +{
> > + struct dma_fence *ret = NULL;
> > + struct drm_exec exec;
> > + int err;
> > +
> > + drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
> > + drm_exec_until_all_locked(&exec) {
> > + err = drm_exec_lock_obj(&exec, &vram->ttm.base);
> > + drm_exec_retry_on_contention(&exec);
> > + if (err) {
> > + ret = ERR_PTR(err);
> > + goto err;
> > + }
> > +
> > + err = drm_exec_lock_obj(&exec, &sysmem->ttm.base);
> > + drm_exec_retry_on_contention(&exec);
> > + if (err) {
> > + ret = ERR_PTR(err);
> > + goto err;
> > + }
> > + }
> > +
> > + ret = xe_migrate_vram_copy_chunk(vram, vram_offset, sysmem, sysmem_offset, size,
> > + save ? XE_MIGRATE_COPY_TO_SRAM : XE_MIGRATE_COPY_TO_VRAM);
> > +
> > +err:
> > + drm_exec_fini(&exec);
> > +
> > + return ret;
> > +}
> > +
> > +static int pf_save_vram_chunk(struct xe_gt *gt, unsigned int vfid,
> > + struct xe_bo *src_vram, u64 src_vram_offset,
> > + size_t size)
> > +{
> > + struct xe_sriov_migration_data *data;
> > + struct dma_fence *fence;
> > + int ret;
> > +
> > + data = xe_sriov_migration_data_alloc(gt_to_xe(gt));
> > + if (!data)
> > + return -ENOMEM;
> > +
> > + ret = xe_sriov_migration_data_init(data, gt->tile->id, gt->info.id,
> > + XE_SRIOV_MIGRATION_DATA_TYPE_VRAM,
> > + src_vram_offset, size);
> > + if (ret)
> > + goto fail;
> > +
> > + fence = __pf_save_restore_vram(gt, vfid,
> > + src_vram, src_vram_offset,
> > + data->bo, 0, size, true);
> > +
> > + ret = dma_fence_wait_timeout(fence, false, 5 * HZ);
> > + dma_fence_put(fence);
> > + if (!ret) {
> > + ret = -ETIME;
> > + goto fail;
> > + }
> > +
> > + pf_dump_mig_data(gt, vfid, data);
> > +
> > + ret = xe_gt_sriov_pf_migration_save_produce(gt, vfid, data);
> > + if (ret)
> > + goto fail;
> > +
> > + return 0;
> > +
> > +fail:
> > + xe_sriov_migration_data_free(data);
> > + return ret;
> > +}
> > +
> > +#define VF_VRAM_STATE_CHUNK_MAX_SIZE SZ_512M
> > +static int pf_save_vf_vram_mig_data(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + struct xe_gt_sriov_migration_data *migration = pf_pick_gt_migration(gt, vfid);
> > + loff_t *offset = &migration->vram_save_offset;
> > + struct xe_bo *vram;
> > + size_t vram_size, chunk_size;
> > + int ret;
> > +
> > + vram = xe_gt_sriov_pf_config_get_lmem_obj(gt, vfid);
> > + if (!vram)
> > + return -ENXIO;
>
> no error message ?
Well, this should be converted to assert after we block re-provisioning,
so I don't think it's needed (or know what to print out :) ).
>
> > +
> > + vram_size = xe_bo_size(vram);
> > + chunk_size = min(vram_size - *offset, VF_VRAM_STATE_CHUNK_MAX_SIZE);
>
> what if *offset > vram_size ?
We control the offset - I'll add an assert.
>
> > +
> > + ret = pf_save_vram_chunk(gt, vfid, vram, *offset, chunk_size);
> > + if (ret)
> > + goto fail;
> > +
> > + *offset += chunk_size;
> > +
> > + xe_bo_put(vram);
> > +
> > + if (*offset < vram_size)
> > + return -EAGAIN;
> > +
> > + return 0;
> > +
> > +fail:
> > + xe_bo_put(vram);
> > + xe_gt_sriov_err(gt, "Failed to save VF%u VRAM data (%pe)\n", vfid, ERR_PTR(ret));
> > + return ret;
> > +}
> > +
> > +static int pf_restore_vf_vram_mig_data(struct xe_gt *gt, unsigned int vfid,
> > + struct xe_sriov_migration_data *data)
> > +{
> > + u64 end = data->hdr.offset + data->hdr.size;
> > + struct dma_fence *fence;
> > + struct xe_bo *vram;
> > + size_t size;
> > + int ret = 0;
> > +
> > + vram = xe_gt_sriov_pf_config_get_lmem_obj(gt, vfid);
> > + if (!vram)
> > + return -ENXIO;
>
> no error message ? other errors are reported
Same as above.
>
> > +
> > + size = xe_bo_size(vram);
> > +
> > + if (end > size || end < data->hdr.size) {
> > + ret = -EINVAL;
> > + goto err;
> > + }
> > +
> > + pf_dump_mig_data(gt, vfid, data);
> > +
> > + fence = __pf_save_restore_vram(gt, vfid, vram, data->hdr.offset,
> > + data->bo, 0, data->hdr.size, false);
> > + ret = dma_fence_wait_timeout(fence, false, 5 * HZ);
>
> define this timeout at least as macro (if not as helper function, as this might be platform/settings specific)
I'll use a macro for now.
>
> > + dma_fence_put(fence);
> > + if (!ret) {
> > + ret = -ETIME;
> > + goto err;
> > + }
> > +
> > + return 0;
> > +err:
> > + xe_bo_put(vram);
> > + xe_gt_sriov_err(gt, "Failed to restore VF%u VRAM data (%pe)\n", vfid, ERR_PTR(ret));
> > + return ret;
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_migration_vram_save() - Save VF VRAM migration data.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier (can't be 0)
> > + *
> > + * This function is for PF only.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_migration_vram_save(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> > + xe_gt_assert(gt, vfid != PFID);
> > + xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
> > +
> > + return pf_save_vf_vram_mig_data(gt, vfid);
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_migration_vram_restore() - Restore VF VRAM migration data.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier (can't be 0)
> > + *
> > + * This function is for PF only.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_migration_vram_restore(struct xe_gt *gt, unsigned int vfid,
> > + struct xe_sriov_migration_data *data)
> > +{
> > + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> > + xe_gt_assert(gt, vfid != PFID);
> > + xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
> > +
> > + return pf_restore_vf_vram_mig_data(gt, vfid, data);
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_migration_save_init() - Initialize per-GT migration related data.
> > + * @gt: the &xe_gt
> > + * @vfid: the VF identifier (can't be 0)
> > + */
> > +void xe_gt_sriov_pf_migration_save_init(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + pf_pick_gt_migration(gt, vfid)->vram_save_offset = 0;
> > +}
> > +
> > /**
> > * xe_gt_sriov_pf_migration_size() - Total size of migration data from all components within a GT.
> > * @gt: the &xe_gt
> > @@ -522,6 +737,13 @@ ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
> > size += sizeof(struct xe_sriov_pf_migration_hdr);
> > total += size;
> >
> > + size = xe_gt_sriov_pf_migration_vram_size(gt, vfid);
> > + if (size < 0)
> > + return size;
> > + else if (size > 0)
>
> "else" not needed
Ok.
>
> > + size += sizeof(struct xe_sriov_pf_migration_hdr);
> > + total += size;
> > +
> > return total;
> > }
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > index 24a233c4cd0bb..ca518eda5429f 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > @@ -27,6 +27,12 @@ ssize_t xe_gt_sriov_pf_migration_mmio_size(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_migration_mmio_save(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_migration_mmio_restore(struct xe_gt *gt, unsigned int vfid,
> > struct xe_sriov_migration_data *data);
> > +ssize_t xe_gt_sriov_pf_migration_vram_size(struct xe_gt *gt, unsigned int vfid);
> > +int xe_gt_sriov_pf_migration_vram_save(struct xe_gt *gt, unsigned int vfid);
> > +int xe_gt_sriov_pf_migration_vram_restore(struct xe_gt *gt, unsigned int vfid,
> > + struct xe_sriov_migration_data *data);
> > +
> > +void xe_gt_sriov_pf_migration_save_init(struct xe_gt *gt, unsigned int vfid);
> >
> > ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> > index 75d8b94cbbefb..39a940c9b0a4b 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> > @@ -16,6 +16,9 @@
> > struct xe_gt_sriov_migration_data {
> > /** @ring: queue containing VF save / restore migration data */
> > struct ptr_ring ring;
> > +
> > + /** @vram_save_offset: offset within VRAM, used for chunked VRAM save */
>
> "last saved offset" ?
Ok.
Thanks,
-Michał
> > + loff_t vram_save_offset;
> > };
> >
> > #endif
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> > index c2768848daba1..aac8ecb861545 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> > @@ -5,6 +5,7 @@
> >
> > #include "xe_device.h"
> > #include "xe_gt_sriov_pf_control.h"
> > +#include "xe_gt_sriov_pf_migration.h"
> > #include "xe_sriov_migration_data.h"
> > #include "xe_sriov_pf_control.h"
> > #include "xe_sriov_printk.h"
> > @@ -171,6 +172,8 @@ int xe_sriov_pf_control_trigger_save_vf(struct xe_device *xe, unsigned int vfid)
> > return ret;
> >
> > for_each_gt(gt, xe, id) {
> > + xe_gt_sriov_pf_migration_save_init(gt, vfid);
> > +
> > ret = xe_gt_sriov_pf_control_trigger_save_vf(gt, vfid);
> > if (ret)
> > return ret;
>
* Re: [PATCH v2 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-27 7:24 ` Tian, Kevin
@ 2025-10-29 20:46 ` Winiarski, Michal
0 siblings, 0 replies; 72+ messages in thread
From: Winiarski, Michal @ 2025-10-29 20:46 UTC (permalink / raw)
To: Tian, Kevin
Cc: Alex Williamson, De Marchi, Lucas, Thomas Hellström,
Vivi, Rodrigo, Jason Gunthorpe, Yishai Hadas,
intel-xe@lists.freedesktop.org, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, Brost, Matthew, Wajdeczko, Michal,
dri-devel@lists.freedesktop.org, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Laguna, Lukasz
On Mon, Oct 27, 2025 at 08:24:37AM +0100, Tian, Kevin wrote:
> > From: Winiarski, Michal <michal.winiarski@intel.com>
> > Sent: Wednesday, October 22, 2025 6:42 AM
> > +
> > +/**
> > + * struct xe_vfio_pci_migration_file - file used for reading / writing migration
> > data
> > + */
>
> let's use the comment style in vfio, i.e. "/*" instead of "/**"
It's a kernel-doc format (it's also used in vfio in some places).
I'll drop it though - because of the comments below.
>
> > +struct xe_vfio_pci_migration_file {
> > + /** @filp: pointer to underlying &struct file */
> > + struct file *filp;
> > + /** @lock: serializes accesses to migration data */
> > + struct mutex lock;
> > + /** @xe_vdev: backpointer to &struct xe_vfio_pci_core_device */
> > + struct xe_vfio_pci_core_device *xe_vdev;
>
> above comments are obvious...
Ok - will keep it simple and drop the obvious ones.
>
> > +struct xe_vfio_pci_core_device {
> > + /** @core_device: vendor-agnostic VFIO device */
> > + struct vfio_pci_core_device core_device;
> > +
> > + /** @mig_state: current device migration state */
> > + enum vfio_device_mig_state mig_state;
> > +
> > + /** @vfid: VF number used by PF, xe uses 1-based indexing for vfid
> > */
> > + unsigned int vfid;
>
> is 1-based indexing a sw or hw requirement?
HW/FW components are using 1-based indexing.
I'll update the comment to state that.
>
> > +
> > + /** @pf: pointer to driver_private of physical function */
> > + struct pci_dev *pf;
> > +
> > + /** @fd: &struct xe_vfio_pci_migration_file for userspace to
> > read/write migration data */
> > + struct xe_vfio_pci_migration_file *fd;
>
> s/fd/migf/, as 'fd' is integer in all other places
Ok.
>
> btw it's risky w/o a lock protecting the state transition. See the usage of
> state_mutex in other migration drivers.
It's a gap - I'll introduce a state_mutex.
>
> > +static void xe_vfio_pci_reset_done(struct pci_dev *pdev)
> > +{
> > + struct xe_vfio_pci_core_device *xe_vdev = pci_get_drvdata(pdev);
> > + int ret;
> > +
> > + ret = xe_sriov_vfio_wait_flr_done(xe_vdev->pf, xe_vdev->vfid);
> > + if (ret)
> > + dev_err(&pdev->dev, "Failed to wait for FLR: %d\n", ret);
>
> why is there a device specific wait for flr done? suppose it's already
> covered by pci core...
No, unfortunately some of the state is cleared by the PF driver after the
PCI-level VF FLR is already done.
More info on that is available in patch 23:
"VF FLR requires additional processing done by PF driver.
The processing is done after FLR is already finished from PCIe
perspective.
In order to avoid a scenario where migration state transitions while
PF processing is still in progress, additional synchronization
point is needed.
Add a helper that will be used as part of VF driver struct
pci_error_handlers .reset_done() callback."
I'll add a comment, so that it's available here as well.
>
> > +
> > + xe_vfio_pci_reset(xe_vdev);
> > +}
> > +
> > +static const struct pci_error_handlers xe_vfio_pci_err_handlers = {
> > + .reset_done = xe_vfio_pci_reset_done,
> > +};
>
> missing ".error_detected "
Ok. I'll use the generic one - vfio_pci_core_aer_err_detected().
>
> > +static struct xe_vfio_pci_migration_file *
> > +xe_vfio_pci_alloc_file(struct xe_vfio_pci_core_device *xe_vdev,
> > + enum xe_vfio_pci_file_type type)
> > +{
> > + struct xe_vfio_pci_migration_file *migf;
> > + const struct file_operations *fops;
> > + int flags;
> > +
> > + migf = kzalloc(sizeof(*migf), GFP_KERNEL);
> > + if (!migf)
> > + return ERR_PTR(-ENOMEM);
> > +
> > + fops = type == XE_VFIO_FILE_SAVE ? &xe_vfio_pci_save_fops :
> > &xe_vfio_pci_resume_fops;
> > + flags = type == XE_VFIO_FILE_SAVE ? O_RDONLY : O_WRONLY;
> > + migf->filp = anon_inode_getfile("xe_vfio_mig", fops, migf, flags);
> > + if (IS_ERR(migf->filp)) {
> > + kfree(migf);
> > + return ERR_CAST(migf->filp);
> > + }
> > +
> > + mutex_init(&migf->lock);
> > + migf->xe_vdev = xe_vdev;
> > + xe_vdev->fd = migf;
> > +
> > + stream_open(migf->filp->f_inode, migf->filp);
> > +
> > + return migf;
>
> miss a get_file(). vfio core will do another fput() upon error.
>
> see vfio_ioct_mig_return_fd()
Ok. I'll take a ref on both STOP_COPY and RESUMING transition.
>
> > +}
> > +
> > +static struct file *
> > +xe_vfio_set_state(struct xe_vfio_pci_core_device *xe_vdev, u32 new)
> > +{
> > + u32 cur = xe_vdev->mig_state;
> > + int ret;
> > +
> > + dev_dbg(xe_vdev_to_dev(xe_vdev),
> > + "state: %s->%s\n", vfio_dev_state_str(cur),
> > vfio_dev_state_str(new));
> > +
> > + /*
> > + * "STOP" handling is reused for "RUNNING_P2P", as the device
> > doesn't have the capability to
> > + * selectively block p2p DMA transfers.
> > + * The device is not processing new workload requests when the VF is
> > stopped, and both
> > + * memory and MMIO communication channels are transferred to
> > destination (where processing
> > + * will be resumed).
> > + */
> > + if ((cur == VFIO_DEVICE_STATE_RUNNING && new ==
> > VFIO_DEVICE_STATE_STOP) ||
>
> this is not required when P2P is supported. vfio_mig_get_next_state() will
> find the right arc from RUNNING to RUNNING_P2P to STOP.
I'll remove both states (RUNNING -> STOP, STOP -> RUNNING).
>
> > + (cur == VFIO_DEVICE_STATE_RUNNING && new ==
> > VFIO_DEVICE_STATE_RUNNING_P2P)) {
> > + ret = xe_sriov_vfio_stop(xe_vdev->pf, xe_vdev->vfid);
> > + if (ret)
> > + goto err;
> > +
> > + return NULL;
> > + }
>
> better to align with other drivers, s/stop/suspend/ and s/run/resume/
This will collide with resume_enter / resume_exit for actual migration
data loading.
I'll use suspend_device / resume_device, and resume_data_enter /
resume_data_exit.
>
> > +
> > + if ((cur == VFIO_DEVICE_STATE_RUNNING_P2P && new ==
> > VFIO_DEVICE_STATE_STOP) ||
> > + (cur == VFIO_DEVICE_STATE_STOP && new ==
> > VFIO_DEVICE_STATE_RUNNING_P2P))
> > + return NULL;
> > +
> > + if ((cur == VFIO_DEVICE_STATE_STOP && new ==
> > VFIO_DEVICE_STATE_RUNNING) ||
> > + (cur == VFIO_DEVICE_STATE_RUNNING_P2P && new ==
> > VFIO_DEVICE_STATE_RUNNING)) {
> > + ret = xe_sriov_vfio_run(xe_vdev->pf, xe_vdev->vfid);
> > + if (ret)
> > + goto err;
> > +
> > + return NULL;
> > + }
> > +
> > + if (cur == VFIO_DEVICE_STATE_STOP && new ==
> > VFIO_DEVICE_STATE_STOP_COPY) {
> > + struct xe_vfio_pci_migration_file *migf;
> > +
> > + migf = xe_vfio_pci_alloc_file(xe_vdev, XE_VFIO_FILE_SAVE);
> > + if (IS_ERR(migf)) {
> > + ret = PTR_ERR(migf);
> > + goto err;
> > + }
> > +
> > + ret = xe_sriov_vfio_stop_copy_enter(xe_vdev->pf, xe_vdev-
> > >vfid);
> > + if (ret) {
> > + fput(migf->filp);
> > + goto err;
> > + }
> > +
> > + return migf->filp;
> > + }
> > +
> > + if ((cur == VFIO_DEVICE_STATE_STOP_COPY && new ==
> > VFIO_DEVICE_STATE_STOP)) {
> > + if (xe_vdev->fd)
> > + xe_vfio_pci_disable_file(xe_vdev->fd);
> > +
> > + xe_sriov_vfio_stop_copy_exit(xe_vdev->pf, xe_vdev->vfid);
> > +
> > + return NULL;
> > + }
> > +
> > + if (cur == VFIO_DEVICE_STATE_STOP && new ==
> > VFIO_DEVICE_STATE_RESUMING) {
> > + struct xe_vfio_pci_migration_file *migf;
> > +
> > + migf = xe_vfio_pci_alloc_file(xe_vdev,
> > XE_VFIO_FILE_RESUME);
> > + if (IS_ERR(migf)) {
> > + ret = PTR_ERR(migf);
> > + goto err;
> > + }
> > +
> > + ret = xe_sriov_vfio_resume_enter(xe_vdev->pf, xe_vdev-
> > >vfid);
> > + if (ret) {
> > + fput(migf->filp);
> > + goto err;
> > + }
> > +
> > + return migf->filp;
> > + }
> > +
> > + if (cur == VFIO_DEVICE_STATE_RESUMING && new ==
> > VFIO_DEVICE_STATE_STOP) {
> > + if (xe_vdev->fd)
> > + xe_vfio_pci_disable_file(xe_vdev->fd);
> > +
> > + xe_sriov_vfio_resume_exit(xe_vdev->pf, xe_vdev->vfid);
> > +
> > + return NULL;
> > + }
> > +
> > + if (new == VFIO_DEVICE_STATE_ERROR)
> > + xe_sriov_vfio_error(xe_vdev->pf, xe_vdev->vfid);
>
> the ERROR state is not passed to the variant driver. You'll get -EINVAL
> from vfio_mig_get_next_state(). so this is dead code.
>
> If the pf driver needs to be notified, you could check the ret value instead.
Ok. I'll do that instead.
>
> > +static void xe_vfio_pci_migration_init(struct vfio_device *core_vdev)
> > +{
> > + struct xe_vfio_pci_core_device *xe_vdev =
> > + container_of(core_vdev, struct xe_vfio_pci_core_device,
> > core_device.vdev);
> > + struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
> > +
> > + if (!xe_sriov_vfio_migration_supported(pdev->physfn))
> > + return;
> > +
> > + /* vfid starts from 1 for xe */
> > + xe_vdev->vfid = pci_iov_vf_id(pdev) + 1;
>
> pci_iov_vf_id() returns error if it's not vf. should be checked.
Ok.
>
> > +static int xe_vfio_pci_init_dev(struct vfio_device *core_vdev)
> > +{
> > + struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
> > +
> > + if (pdev->is_virtfn && strcmp(pdev->physfn->dev.driver->name, "xe")
> > == 0)
> > + xe_vfio_pci_migration_init(core_vdev);
>
> I didn't see the point of checking the driver name.
It will go away after transitioning to xe_pci_get_pf_xe_device().
>
> > +
> > +MODULE_LICENSE("GPL");
> > +MODULE_AUTHOR("Intel Corporation");
>
> please use the author name, as other drivers do
Ok.
Thanks,
-Michał
* Re: [PATCH v2 21/26] drm/xe/migrate: Add function to copy of VRAM data in chunks
2025-10-23 19:29 ` Michal Wajdeczko
@ 2025-10-30 6:07 ` Laguna, Lukasz
0 siblings, 0 replies; 72+ messages in thread
From: Laguna, Lukasz @ 2025-10-30 6:07 UTC (permalink / raw)
To: Michal Wajdeczko, Michał Winiarski, Alex Williamson,
Lucas De Marchi, Thomas Hellström, Rodrigo Vivi,
Jason Gunthorpe, Yishai Hadas, Kevin Tian, intel-xe, linux-kernel,
kvm, Matthew Brost
Cc: dri-devel, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter
On 10/23/2025 21:29, Michal Wajdeczko wrote:
>
> On 10/22/2025 12:41 AM, Michał Winiarski wrote:
>> From: Lukasz Laguna <lukasz.laguna@intel.com>
>>
>> Introduce a new function to copy data between VRAM and sysmem objects.
>> The existing xe_migrate_copy() is tailored for eviction and restore
>> operations, which involves additional logic and operates on entire
>> objects.
>> The xe_migrate_vram_copy_chunk() allows copying chunks of data to or
>> from a dedicated buffer object, which is essential in case of VF
>> migration.
>>
>> Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
>> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_migrate.c | 134 ++++++++++++++++++++++++++++++--
>> drivers/gpu/drm/xe/xe_migrate.h | 8 ++
>> 2 files changed, 136 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
>> index 3112c966c67d7..d30675707162b 100644
>> --- a/drivers/gpu/drm/xe/xe_migrate.c
>> +++ b/drivers/gpu/drm/xe/xe_migrate.c
>> @@ -514,7 +514,7 @@ int xe_migrate_init(struct xe_migrate *m)
>>
>> static u64 max_mem_transfer_per_pass(struct xe_device *xe)
>> {
>> - if (!IS_DGFX(xe) && xe_device_has_flat_ccs(xe))
>> + if ((!IS_DGFX(xe) || IS_SRIOV_PF(xe)) && xe_device_has_flat_ccs(xe))
> being a PF is a permanent case, while your expected usage is only during the handling of the VF migration.
>
> maybe it would be better to introduce flag FORCE_CCS_LIMITED_TRANSFER and pass it to the migration calls when really needed ?
I don't think this change is necessary anymore since we removed support
for raw CCS copy. I'll revert these updates. I tested the copy with 8M
blocks on BMG, and it worked fine.
>
>> return MAX_CCS_LIMITED_TRANSFER;
>>
>> return MAX_PREEMPTDISABLE_TRANSFER;
>> @@ -1155,6 +1155,133 @@ struct xe_exec_queue *xe_migrate_exec_queue(struct xe_migrate *migrate)
>> return migrate->q;
>> }
>>
>> +/**
>> + * xe_migrate_vram_copy_chunk() - Copy a chunk of a VRAM buffer object.
>> + * @vram_bo: The VRAM buffer object.
>> + * @vram_offset: The VRAM offset.
>> + * @sysmem_bo: The sysmem buffer object.
>> + * @sysmem_offset: The sysmem offset.
>> + * @size: The size of VRAM chunk to copy.
>> + * @dir: The direction of the copy operation.
>> + *
>> + * Copies a portion of a buffer object between VRAM and system memory.
>> + * On Xe2 platforms that support flat CCS, VRAM data is decompressed when
>> + * copying to system memory.
>> + *
>> + * Return: Pointer to a dma_fence representing the last copy batch, or
>> + * an error pointer on failure. If there is a failure, any copy operation
>> + * started by the function call has been synced.
>> + */
>> +struct dma_fence *xe_migrate_vram_copy_chunk(struct xe_bo *vram_bo, u64 vram_offset,
>> + struct xe_bo *sysmem_bo, u64 sysmem_offset,
>> + u64 size, enum xe_migrate_copy_dir dir)
>> +{
>> + struct xe_device *xe = xe_bo_device(vram_bo);
>> + struct xe_tile *tile = vram_bo->tile;
>> + struct xe_gt *gt = tile->primary_gt;
>> + struct xe_migrate *m = tile->migrate;
>> + struct dma_fence *fence = NULL;
>> + struct ttm_resource *vram = vram_bo->ttm.resource;
>> + struct ttm_resource *sysmem = sysmem_bo->ttm.resource;
>> + struct xe_res_cursor vram_it, sysmem_it;
>> + u64 vram_L0_ofs, sysmem_L0_ofs;
>> + u32 vram_L0_pt, sysmem_L0_pt;
>> + u64 vram_L0, sysmem_L0;
>> + bool to_sysmem = (dir == XE_MIGRATE_COPY_TO_SRAM);
>> + bool use_comp_pat = to_sysmem &&
>> + GRAPHICS_VER(xe) >= 20 && xe_device_has_flat_ccs(xe);
>> + int pass = 0;
>> + int err;
>> +
>> + xe_assert(xe, IS_ALIGNED(vram_offset | sysmem_offset | size, PAGE_SIZE));
>> + xe_assert(xe, xe_bo_is_vram(vram_bo));
>> + xe_assert(xe, !xe_bo_is_vram(sysmem_bo));
>> + xe_assert(xe, !range_overflows(vram_offset, size, (u64)vram_bo->ttm.base.size));
>> + xe_assert(xe, !range_overflows(sysmem_offset, size, (u64)sysmem_bo->ttm.base.size));
>> +
>> + xe_res_first(vram, vram_offset, size, &vram_it);
>> + xe_res_first_sg(xe_bo_sg(sysmem_bo), sysmem_offset, size, &sysmem_it);
>> +
>> + while (size) {
>> + u32 pte_flags = PTE_UPDATE_FLAG_IS_VRAM;
>> + u32 batch_size = 2; /* arb_clear() + MI_BATCH_BUFFER_END */
>> + struct xe_sched_job *job;
>> + struct xe_bb *bb;
>> + u32 update_idx;
>> + bool usm = xe->info.has_usm;
>> + u32 avail_pts = max_mem_transfer_per_pass(xe) / LEVEL0_PAGE_TABLE_ENCODE_SIZE;
>> +
>> + sysmem_L0 = xe_migrate_res_sizes(m, &sysmem_it);
>> + vram_L0 = min(xe_migrate_res_sizes(m, &vram_it), sysmem_L0);
>> +
>> + drm_dbg(&xe->drm, "Pass %u, size: %llu\n", pass++, vram_L0);
> nit: there is xe_dbg()
Ok
>
>> +
>> + pte_flags |= use_comp_pat ? PTE_UPDATE_FLAG_IS_COMP_PTE : 0;
>> + batch_size += pte_update_size(m, pte_flags, vram, &vram_it, &vram_L0,
>> + &vram_L0_ofs, &vram_L0_pt, 0, 0, avail_pts);
>> +
>> + batch_size += pte_update_size(m, 0, sysmem, &sysmem_it, &vram_L0, &sysmem_L0_ofs,
>> + &sysmem_L0_pt, 0, avail_pts, avail_pts);
>> + batch_size += EMIT_COPY_DW;
>> +
>> + bb = xe_bb_new(gt, batch_size, usm);
>> + if (IS_ERR(bb)) {
>> + err = PTR_ERR(bb);
>> + return ERR_PTR(err);
>> + }
>> +
>> + if (xe_migrate_allow_identity(vram_L0, &vram_it))
>> + xe_res_next(&vram_it, vram_L0);
>> + else
>> + emit_pte(m, bb, vram_L0_pt, true, use_comp_pat, &vram_it, vram_L0, vram);
>> +
>> + emit_pte(m, bb, sysmem_L0_pt, false, false, &sysmem_it, vram_L0, sysmem);
>> +
>> + bb->cs[bb->len++] = MI_BATCH_BUFFER_END;
>> + update_idx = bb->len;
>> +
>> + if (to_sysmem)
>> + emit_copy(gt, bb, vram_L0_ofs, sysmem_L0_ofs, vram_L0, XE_PAGE_SIZE);
>> + else
>> + emit_copy(gt, bb, sysmem_L0_ofs, vram_L0_ofs, vram_L0, XE_PAGE_SIZE);
>> +
>> + job = xe_bb_create_migration_job(m->q, bb, xe_migrate_batch_base(m, usm),
>> + update_idx);
>> + if (IS_ERR(job)) {
>> + err = PTR_ERR(job);
>> + goto err;
> this goto inside 'while' loop is weird
Good point
>
>> + }
>> +
>> + xe_sched_job_add_migrate_flush(job, MI_INVALIDATE_TLB);
>> +
>> + WARN_ON_ONCE(!dma_resv_test_signaled(vram_bo->ttm.base.resv,
>> + DMA_RESV_USAGE_BOOKKEEP));
>> + WARN_ON_ONCE(!dma_resv_test_signaled(sysmem_bo->ttm.base.resv,
>> + DMA_RESV_USAGE_BOOKKEEP));
> xe_WARN_ON_ONCE() ?
>
> but why not use asserts() if we are sure that this shouldn't happen ?
Ok, I'll use asserts
>
>> +
>> + mutex_lock(&m->job_mutex);
> scoped_guard(mutex) ?
Ok
>
>> + xe_sched_job_arm(job);
>> + dma_fence_put(fence);
>> + fence = dma_fence_get(&job->drm.s_fence->finished);
>> + xe_sched_job_push(job);
>> +
>> + dma_fence_put(m->fence);
>> + m->fence = dma_fence_get(fence);
>> + mutex_unlock(&m->job_mutex);
>> +
>> + xe_bb_free(bb, fence);
>> + size -= vram_L0;
>> + continue;
>> +
>> +err:
>> + xe_bb_free(bb, NULL);
>> +
>> + return ERR_PTR(err);
>> + }
>> +
>> + return fence;
>> +}
>> +
>> static void emit_clear_link_copy(struct xe_gt *gt, struct xe_bb *bb, u64 src_ofs,
>> u32 size, u32 pitch)
>> {
>> @@ -1852,11 +1979,6 @@ static bool xe_migrate_vram_use_pde(struct drm_pagemap_addr *sram_addr,
>> return true;
>> }
>>
>> -enum xe_migrate_copy_dir {
>> - XE_MIGRATE_COPY_TO_VRAM,
>> - XE_MIGRATE_COPY_TO_SRAM,
>> -};
>> -
>> #define XE_CACHELINE_BYTES 64ull
>> #define XE_CACHELINE_MASK (XE_CACHELINE_BYTES - 1)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_migrate.h b/drivers/gpu/drm/xe/xe_migrate.h
>> index 4fad324b62535..d7bcc6ad8464e 100644
>> --- a/drivers/gpu/drm/xe/xe_migrate.h
>> +++ b/drivers/gpu/drm/xe/xe_migrate.h
>> @@ -28,6 +28,11 @@ struct xe_vma;
>>
>> enum xe_sriov_vf_ccs_rw_ctxs;
>>
>> +enum xe_migrate_copy_dir {
>> + XE_MIGRATE_COPY_TO_VRAM,
>> + XE_MIGRATE_COPY_TO_SRAM,
>> +};
> nit: it's time for xe_migrate_types.h ;)
>
> but not as part of this series
>
>> +
>> /**
>> * struct xe_migrate_pt_update_ops - Callbacks for the
>> * xe_migrate_update_pgtables() function.
>> @@ -131,6 +136,9 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q,
>>
>> struct xe_lrc *xe_migrate_lrc(struct xe_migrate *migrate);
>> struct xe_exec_queue *xe_migrate_exec_queue(struct xe_migrate *migrate);
>> +struct dma_fence *xe_migrate_vram_copy_chunk(struct xe_bo *vram_bo, u64 vram_offset,
>> + struct xe_bo *sysmem_bo, u64 sysmem_offset,
>> + u64 size, enum xe_migrate_copy_dir dir);
>> int xe_migrate_access_memory(struct xe_migrate *m, struct xe_bo *bo,
>> unsigned long offset, void *buf, int len,
>> int write);