* [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration
@ 2025-10-11 19:38 Michał Winiarski
2025-10-11 19:38 ` [PATCH 01/26] drm/xe/pf: Remove GuC version check for migration support Michał Winiarski
` (25 more replies)
0 siblings, 26 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
Xe is a DRM driver supporting Intel GPUs. For SR-IOV capable devices,
it enables the creation of SR-IOV VFs.
This series adds an xe-vfio-pci driver variant that interacts with the Xe
driver to control VF device state and read/write migration data,
extending regular vfio-pci functionality with the VFIO migration
capability.
The driver doesn't expose PRE_COPY support, as currently supported
hardware lacks the capability to track dirty pages.
While the Xe driver already had the capability to manage VF device state,
management of migration data still needed to be implemented and
constitutes the majority of this series.
The migration data is processed asynchronously by the Xe driver, and is
organized into multiple migration data packet types representing the
hardware interfaces of the device (GGTT / MMIO / GuC FW / VRAM).
Since the VRAM can potentially be larger than available system memory,
it is copied in multiple chunks. The metadata needed for migration
compatibility decisions is added as part of a descriptor packet (currently
limited to PCI device ID / revision).
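Each packet starts with a common header describing its origin and
payload; roughly (taken from the xe_sriov_pf_migration_types.h added
later in the series, with field annotations inferred from its usage):

  struct xe_sriov_pf_migration_hdr {
          u8 version;     /* migration stream version */
          u8 type;        /* packet type (descriptor / GGTT / MMIO / GuC FW / VRAM) */
          u8 tile;        /* source tile ID */
          u8 gt;          /* source GT ID */
          u32 flags;
          u64 offset;     /* chunk offset within the resource (e.g. VRAM) */
          u64 size;       /* payload size in bytes */
  } __packed;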
The Xe driver abstracts away the internals of packet processing and takes
care of tracking the position within individual packets.
The API exported to VFIO mirrors the API that VFIO exposes to
userspace: a simple .read()/.write().
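As a rough sketch, the save path in the variant driver then boils down
to forwarding userspace reads (the function name below is illustrative
only - the actual interface lives in include/drm/intel/xe_sriov_vfio.h):

  static ssize_t xe_vfio_pci_save_read(struct xe_device *xe, unsigned int vfid,
                                       char __user *buf, size_t len)
  {
          /* Xe tracks the position within migration packets internally */
          return xe_sriov_vfio_data_read(xe, vfid, buf, len); /* assumed name */
  }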
Note that some of the VF resources are not virtualized (e.g. GGTT - the
GFX device global virtual address space). This means that the VF driver
needs to be aware that migration has occurred in order to properly
relocate (patching or re-emitting data that contains references to GGTT
addresses) before resuming operation.
The code to handle that is already present in upstream Linux and in
production VF drivers for other OSes.
Lukasz Laguna (2):
drm/xe/pf: Add helper to retrieve VF's LMEM object
drm/xe/migrate: Add function for raw copy of VRAM and CCS
Michał Winiarski (24):
drm/xe/pf: Remove GuC version check for migration support
drm/xe: Move migration support to device-level struct
drm/xe/pf: Add save/restore control state stubs and connect to debugfs
drm/xe/pf: Extract migration mutex out of its struct
drm/xe/pf: Add data structures and handlers for migration rings
drm/xe/pf: Add helpers for migration data allocation / free
drm/xe/pf: Add support for encap/decap of bitstream to/from packet
drm/xe/pf: Add minimalistic migration descriptor
drm/xe/pf: Expose VF migration data size over debugfs
drm/xe: Add sa/guc_buf_cache sync interface
drm/xe: Allow the caller to pass guc_buf_cache size
drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF
migration
drm/xe/pf: Remove GuC migration data save/restore from GT debugfs
drm/xe/pf: Don't save GuC VF migration data on pause
drm/xe/pf: Switch VF migration GuC save/restore to struct migration
data
drm/xe/pf: Handle GuC migration data as part of PF control
drm/xe/pf: Add helpers for VF GGTT migration data handling
drm/xe/pf: Handle GGTT migration data as part of PF control
drm/xe/pf: Add helpers for VF MMIO migration data handling
drm/xe/pf: Handle MMIO migration data as part of PF control
drm/xe/pf: Handle VRAM migration data as part of PF control
drm/xe/pf: Add wait helper for VF FLR
drm/xe/pf: Export helpers for VFIO
vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
MAINTAINERS | 7 +
drivers/gpu/drm/xe/Makefile | 4 +
drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c | 2 +-
drivers/gpu/drm/xe/xe_ggtt.c | 92 ++
drivers/gpu/drm/xe/xe_ggtt.h | 2 +
drivers/gpu/drm/xe/xe_ggtt_types.h | 2 +
drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 88 ++
drivers/gpu/drm/xe/xe_gt_sriov_pf.h | 19 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 94 ++
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h | 6 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 436 ++++++++-
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h | 7 +
.../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 23 +-
drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 47 -
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 901 ++++++++++++++----
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 34 +-
.../drm/xe/xe_gt_sriov_pf_migration_types.h | 27 +-
drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h | 6 +-
drivers/gpu/drm/xe/xe_guc.c | 4 +-
drivers/gpu/drm/xe/xe_guc_buf.c | 15 +-
drivers/gpu/drm/xe/xe_guc_buf.h | 3 +-
drivers/gpu/drm/xe/xe_migrate.c | 214 ++++-
drivers/gpu/drm/xe/xe_migrate.h | 4 +
drivers/gpu/drm/xe/xe_sa.c | 21 +
drivers/gpu/drm/xe/xe_sa.h | 1 +
drivers/gpu/drm/xe/xe_sriov_pf.c | 6 +
drivers/gpu/drm/xe/xe_sriov_pf_control.c | 125 +++
drivers/gpu/drm/xe/xe_sriov_pf_control.h | 5 +
drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 117 +++
drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 281 ++++++
drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 48 +
.../gpu/drm/xe/xe_sriov_pf_migration_data.c | 566 +++++++++++
.../gpu/drm/xe/xe_sriov_pf_migration_data.h | 39 +
.../gpu/drm/xe/xe_sriov_pf_migration_types.h | 46 +
drivers/gpu/drm/xe/xe_sriov_pf_types.h | 8 +
drivers/gpu/drm/xe/xe_sriov_vfio.c | 252 +++++
drivers/vfio/pci/Kconfig | 2 +
drivers/vfio/pci/Makefile | 2 +
drivers/vfio/pci/xe/Kconfig | 12 +
drivers/vfio/pci/xe/Makefile | 3 +
drivers/vfio/pci/xe/main.c | 470 +++++++++
include/drm/intel/xe_sriov_vfio.h | 28 +
42 files changed, 3747 insertions(+), 322 deletions(-)
create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration.c
create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration.h
create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
create mode 100644 drivers/gpu/drm/xe/xe_sriov_vfio.c
create mode 100644 drivers/vfio/pci/xe/Kconfig
create mode 100644 drivers/vfio/pci/xe/Makefile
create mode 100644 drivers/vfio/pci/xe/main.c
create mode 100644 include/drm/intel/xe_sriov_vfio.h
--
2.50.1
* [PATCH 01/26] drm/xe/pf: Remove GuC version check for migration support
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-12 18:31 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 02/26] drm/xe: Move migration support to device-level struct Michał Winiarski
` (24 subsequent siblings)
25 siblings, 1 reply; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
Since commit 4eb0aab6e4434 ("drm/xe/guc: Bump minimum required GuC
version to v70.29.2"), the minimum GuC version required by the driver
is v70.29.2, which should already include everything that we need for
migration.
Remove the version check.
Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index 44cc612b0a752..a5bf327ef8889 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -384,9 +384,6 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
static bool pf_check_migration_support(struct xe_gt *gt)
{
- /* GuC 70.25 with save/restore v2 is required */
- xe_gt_assert(gt, GUC_FIRMWARE_VER(&gt->uc.guc) >= MAKE_GUC_VER(70, 25, 0));
-
/* XXX: for now this is for feature enabling only */
return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
}
--
2.50.1
* [PATCH 02/26] drm/xe: Move migration support to device-level struct
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
2025-10-11 19:38 ` [PATCH 01/26] drm/xe/pf: Remove GuC version check for migration support Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-12 18:58 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 03/26] drm/xe/pf: Add save/restore control state stubs and connect to debugfs Michał Winiarski
` (23 subsequent siblings)
25 siblings, 1 reply; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
Upcoming changes will allow users to control VF state and obtain its
migration data with a device-level granularity (not tile/gt).
Change the data structures to reflect that and move the GT-level
migration init to happen after device-level init.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/Makefile | 1 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 12 +-----
.../drm/xe/xe_gt_sriov_pf_migration_types.h | 3 --
drivers/gpu/drm/xe/xe_sriov_pf.c | 5 +++
drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 43 +++++++++++++++++++
drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 27 ++++++++++++
.../gpu/drm/xe/xe_sriov_pf_migration_types.h | 0
drivers/gpu/drm/xe/xe_sriov_pf_types.h | 5 +++
8 files changed, 83 insertions(+), 13 deletions(-)
create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration.c
create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration.h
create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 84321fad32658..71f685a315dca 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -176,6 +176,7 @@ xe-$(CONFIG_PCI_IOV) += \
xe_sriov_pf.o \
xe_sriov_pf_control.o \
xe_sriov_pf_debugfs.o \
+ xe_sriov_pf_migration.o \
xe_sriov_pf_service.o \
xe_tile_sriov_pf_debugfs.o
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index a5bf327ef8889..ca28f45aaf481 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -13,6 +13,7 @@
#include "xe_guc.h"
#include "xe_guc_ct.h"
#include "xe_sriov.h"
+#include "xe_sriov_pf_migration.h"
/* Return: number of dwords saved/restored/required or a negative error code on failure */
static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
@@ -115,8 +116,7 @@ static int pf_send_guc_restore_vf_state(struct xe_gt *gt, unsigned int vfid,
static bool pf_migration_supported(struct xe_gt *gt)
{
- xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
- return gt->sriov.pf.migration.supported;
+ return xe_sriov_pf_migration_supported(gt_to_xe(gt));
}
static struct mutex *pf_migration_mutex(struct xe_gt *gt)
@@ -382,12 +382,6 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
}
#endif /* CONFIG_DEBUG_FS */
-static bool pf_check_migration_support(struct xe_gt *gt)
-{
- /* XXX: for now this is for feature enabling only */
- return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
-}
-
/**
* xe_gt_sriov_pf_migration_init() - Initialize support for VF migration.
* @gt: the &xe_gt
@@ -403,8 +397,6 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
xe_gt_assert(gt, IS_SRIOV_PF(xe));
- gt->sriov.pf.migration.supported = pf_check_migration_support(gt);
-
if (!pf_migration_supported(gt))
return 0;
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
index 1f3110b6d44fa..9d672feac5f04 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
@@ -30,9 +30,6 @@ struct xe_gt_sriov_state_snapshot {
* Used by the PF driver to maintain non-VF specific per-GT data.
*/
struct xe_gt_sriov_pf_migration {
- /** @supported: indicates whether the feature is supported */
- bool supported;
-
/** @snapshot_lock: protects all VFs snapshots */
struct mutex snapshot_lock;
};
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf.c b/drivers/gpu/drm/xe/xe_sriov_pf.c
index bc1ab9ee31d92..95743c7af8050 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf.c
@@ -15,6 +15,7 @@
#include "xe_sriov.h"
#include "xe_sriov_pf.h"
#include "xe_sriov_pf_helpers.h"
+#include "xe_sriov_pf_migration.h"
#include "xe_sriov_pf_service.h"
#include "xe_sriov_printk.h"
@@ -101,6 +102,10 @@ int xe_sriov_pf_init_early(struct xe_device *xe)
if (err)
return err;
+ err = xe_sriov_pf_migration_init(xe);
+ if (err)
+ return err;
+
xe_sriov_pf_service_init(xe);
return 0;
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
new file mode 100644
index 0000000000000..cf6a210d5597a
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include "xe_sriov.h"
+#include "xe_sriov_pf_migration.h"
+
+/**
+ * xe_sriov_pf_migration_supported() - Check if SR-IOV VF migration is supported by the device
+ * @xe: the &struct xe_device
+ *
+ * Return: true if migration is supported, false otherwise
+ */
+bool xe_sriov_pf_migration_supported(struct xe_device *xe)
+{
+ xe_assert(xe, IS_SRIOV_PF(xe));
+
+ return xe->sriov.pf.migration.supported;
+}
+
+static bool pf_check_migration_support(struct xe_device *xe)
+{
+ xe_assert(xe, IS_SRIOV_PF(xe));
+
+ /* XXX: for now this is for feature enabling only */
+ return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
+}
+
+/**
+ * xe_sriov_pf_migration_init() - Initialize support for SR-IOV VF migration.
+ * @xe: the &struct xe_device
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_pf_migration_init(struct xe_device *xe)
+{
+ xe_assert(xe, IS_SRIOV_PF(xe));
+
+ xe->sriov.pf.migration.supported = pf_check_migration_support(xe);
+
+ return 0;
+}
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
new file mode 100644
index 0000000000000..d3058b6682192
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_SRIOV_PF_MIGRATION_H_
+#define _XE_SRIOV_PF_MIGRATION_H_
+
+#include <linux/types.h>
+
+struct xe_device;
+
+#ifdef CONFIG_PCI_IOV
+int xe_sriov_pf_migration_init(struct xe_device *xe);
+bool xe_sriov_pf_migration_supported(struct xe_device *xe);
+#else
+static inline int xe_sriov_pf_migration_init(struct xe_device *xe)
+{
+ return 0;
+}
+static inline bool xe_sriov_pf_migration_supported(struct xe_device *xe)
+{
+ return false;
+}
+#endif
+
+#endif
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
new file mode 100644
index 0000000000000..e69de29bb2d1d
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
index 956a88f9f213d..2d2fcc0a2f258 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_types.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
@@ -32,6 +32,11 @@ struct xe_device_pf {
/** @driver_max_vfs: Maximum number of VFs supported by the driver. */
u16 driver_max_vfs;
+ struct {
+ /** @migration.supported: indicates whether VF migration feature is supported */
+ bool supported;
+ } migration;
+
/** @master_lock: protects all VFs configurations across GTs */
struct mutex master_lock;
--
2.50.1
* [PATCH 03/26] drm/xe/pf: Add save/restore control state stubs and connect to debugfs
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
2025-10-11 19:38 ` [PATCH 01/26] drm/xe/pf: Remove GuC version check for migration support Michał Winiarski
2025-10-11 19:38 ` [PATCH 02/26] drm/xe: Move migration support to device-level struct Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-12 20:09 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 04/26] drm/xe/pf: Extract migration mutex out of its struct Michał Winiarski
` (22 subsequent siblings)
25 siblings, 1 reply; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
The states will be used by upcoming changes to produce (in case of save)
or consume (in case of restore) the VF migration data.
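A minimal sketch of how the new entry points are expected to compose
(both are added by this patch; error handling trimmed):

  err = xe_sriov_pf_control_save_vf(xe, vfid);      /* enter SAVE_WIP on all GTs */
  if (!err)
          err = xe_sriov_pf_control_wait_save_vf(xe, vfid); /* block until SAVED */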
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 270 ++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h | 6 +
.../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 17 ++
drivers/gpu/drm/xe/xe_sriov_pf_control.c | 96 +++++++
drivers/gpu/drm/xe/xe_sriov_pf_control.h | 4 +
drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 38 +++
6 files changed, 431 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
index 2e6bd3d1fe1da..44df984278548 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
@@ -184,6 +184,13 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
CASE2STR(PAUSE_SAVE_GUC);
CASE2STR(PAUSE_FAILED);
CASE2STR(PAUSED);
+ CASE2STR(MIGRATION_DATA_WIP);
+ CASE2STR(SAVE_WIP);
+ CASE2STR(SAVE_FAILED);
+ CASE2STR(SAVED);
+ CASE2STR(RESTORE_WIP);
+ CASE2STR(RESTORE_FAILED);
+ CASE2STR(RESTORED);
CASE2STR(RESUME_WIP);
CASE2STR(RESUME_SEND_RESUME);
CASE2STR(RESUME_FAILED);
@@ -207,6 +214,8 @@ static unsigned long pf_get_default_timeout(enum xe_gt_sriov_control_bits bit)
return HZ / 2;
case XE_GT_SRIOV_STATE_FLR_WIP:
case XE_GT_SRIOV_STATE_FLR_RESET_CONFIG:
+ case XE_GT_SRIOV_STATE_SAVE_WIP:
+ case XE_GT_SRIOV_STATE_RESTORE_WIP:
return 5 * HZ;
default:
return HZ;
@@ -359,6 +368,10 @@ static void pf_queue_vf(struct xe_gt *gt, unsigned int vfid)
static void pf_exit_vf_flr_wip(struct xe_gt *gt, unsigned int vfid);
static void pf_exit_vf_stop_wip(struct xe_gt *gt, unsigned int vfid);
+static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid);
+static void pf_exit_vf_saved(struct xe_gt *gt, unsigned int vfid);
+static void pf_exit_vf_restore_wip(struct xe_gt *gt, unsigned int vfid);
+static void pf_exit_vf_restored(struct xe_gt *gt, unsigned int vfid);
static void pf_exit_vf_pause_wip(struct xe_gt *gt, unsigned int vfid);
static void pf_exit_vf_resume_wip(struct xe_gt *gt, unsigned int vfid);
@@ -380,6 +393,8 @@ static void pf_exit_vf_wip(struct xe_gt *gt, unsigned int vfid)
pf_exit_vf_flr_wip(gt, vfid);
pf_exit_vf_stop_wip(gt, vfid);
+ pf_exit_vf_save_wip(gt, vfid);
+ pf_exit_vf_restore_wip(gt, vfid);
pf_exit_vf_pause_wip(gt, vfid);
pf_exit_vf_resume_wip(gt, vfid);
@@ -399,6 +414,8 @@ static void pf_enter_vf_ready(struct xe_gt *gt, unsigned int vfid)
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOPPED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUMED);
+ pf_exit_vf_saved(gt, vfid);
+ pf_exit_vf_restored(gt, vfid);
pf_exit_vf_mismatch(gt, vfid);
pf_exit_vf_wip(gt, vfid);
}
@@ -675,6 +692,8 @@ static void pf_enter_vf_resumed(struct xe_gt *gt, unsigned int vfid)
{
pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUMED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
+ pf_exit_vf_saved(gt, vfid);
+ pf_exit_vf_restored(gt, vfid);
pf_exit_vf_mismatch(gt, vfid);
pf_exit_vf_wip(gt, vfid);
}
@@ -776,6 +795,249 @@ int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid)
return -ECANCELED;
}
+/**
+ * xe_gt_sriov_pf_control_check_vf_data_wip - check if new SR-IOV VF migration data is expected
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: true when new migration data is expected to be produced, false otherwise
+ */
+bool xe_gt_sriov_pf_control_check_vf_data_wip(struct xe_gt *gt, unsigned int vfid)
+{
+ return pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP);
+}
+
+static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
+{
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
+}
+
+static void pf_enter_vf_saved(struct xe_gt *gt, unsigned int vfid)
+{
+ if (!pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED))
+ pf_enter_vf_state_machine_bug(gt, vfid);
+
+ xe_gt_sriov_info(gt, "VF%u saved!\n", vfid);
+
+ pf_exit_vf_mismatch(gt, vfid);
+ pf_exit_vf_wip(gt, vfid);
+}
+
+static void pf_exit_vf_saved(struct xe_gt *gt, unsigned int vfid)
+{
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED);
+}
+
+static bool pf_handle_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
+{
+ if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP))
+ return false;
+
+ pf_exit_vf_save_wip(gt, vfid);
+ pf_enter_vf_saved(gt, vfid);
+
+ return true;
+}
+
+static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
+{
+ if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP);
+ pf_exit_vf_restored(gt, vfid);
+ pf_enter_vf_wip(gt, vfid);
+ pf_queue_vf(gt, vfid);
+ return true;
+ }
+
+ return false;
+}
+
+/**
+ * xe_gt_sriov_pf_control_save_vf - Save SR-IOV VF migration data.
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_control_save_vf(struct xe_gt *gt, unsigned int vfid)
+{
+ if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOPPED)) {
+ xe_gt_sriov_dbg(gt, "VF%u is stopped!\n", vfid);
+ return -EPERM;
+ }
+
+ if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED)) {
+ xe_gt_sriov_dbg(gt, "VF%u is not paused!\n", vfid);
+ return -EPERM;
+ }
+
+ if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
+ xe_gt_sriov_dbg(gt, "VF%u restore is in progress!\n", vfid);
+ return -EPERM;
+ }
+
+ if (!pf_enter_vf_save_wip(gt, vfid)) {
+ xe_gt_sriov_dbg(gt, "VF%u save already in progress!\n", vfid);
+ return -EALREADY;
+ }
+
+ return 0;
+}
+
+static int pf_wait_vf_save_done(struct xe_gt *gt, unsigned int vfid)
+{
+ unsigned long timeout = pf_get_default_timeout(XE_GT_SRIOV_STATE_SAVE_WIP);
+ int err;
+
+ err = pf_wait_vf_wip_done(gt, vfid, timeout);
+ if (err) {
+ xe_gt_sriov_notice(gt, "VF%u SAVE didn't finish in %u ms (%pe)\n",
+ vfid, jiffies_to_msecs(timeout), ERR_PTR(err));
+ return err;
+ }
+
+ if (!pf_expect_vf_not_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED))
+ return -EIO;
+
+ return 0;
+}
+
+/**
+ * xe_gt_sriov_pf_control_wait_save_done() - Wait for a VF Save to complete
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_control_wait_save_done(struct xe_gt *gt, unsigned int vfid)
+{
+ if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP))
+ return 0;
+
+ return pf_wait_vf_save_done(gt, vfid);
+}
+
+static void pf_exit_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
+{
+ pf_expect_vf_not_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP);
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP);
+}
+
+static void pf_enter_vf_restored(struct xe_gt *gt, unsigned int vfid)
+{
+ if (!pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED))
+ pf_enter_vf_state_machine_bug(gt, vfid);
+
+ xe_gt_sriov_info(gt, "VF%u restored!\n", vfid);
+
+ pf_exit_vf_mismatch(gt, vfid);
+ pf_exit_vf_wip(gt, vfid);
+}
+
+static void pf_exit_vf_restored(struct xe_gt *gt, unsigned int vfid)
+{
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED);
+}
+
+static bool pf_handle_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
+{
+ if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
+ return false;
+
+ pf_exit_vf_restore_wip(gt, vfid);
+ pf_enter_vf_restored(gt, vfid);
+
+ return true;
+}
+
+static bool pf_enter_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
+{
+ if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP);
+ pf_exit_vf_saved(gt, vfid);
+ pf_enter_vf_wip(gt, vfid);
+ pf_queue_vf(gt, vfid);
+ return true;
+ }
+
+ return false;
+}
+
+/**
+ * xe_gt_sriov_pf_control_restore_vf - Restore SR-IOV VF migration data.
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_control_restore_vf(struct xe_gt *gt, unsigned int vfid)
+{
+ if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOPPED)) {
+ xe_gt_sriov_dbg(gt, "VF%u is stopped!\n", vfid);
+ return -EPERM;
+ }
+
+ if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED)) {
+ xe_gt_sriov_dbg(gt, "VF%u is not paused!\n", vfid);
+ return -EPERM;
+ }
+
+ if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
+ xe_gt_sriov_dbg(gt, "VF%u save is in progress!\n", vfid);
+ return -EPERM;
+ }
+
+ if (!pf_enter_vf_restore_wip(gt, vfid)) {
+ xe_gt_sriov_dbg(gt, "VF%u restore already in progress!\n", vfid);
+ return -EALREADY;
+ }
+
+ return 0;
+}
+
+static int pf_wait_vf_restore_done(struct xe_gt *gt, unsigned int vfid)
+{
+ unsigned long timeout = pf_get_default_timeout(XE_GT_SRIOV_STATE_RESTORE_WIP);
+ int err;
+
+ err = pf_wait_vf_wip_done(gt, vfid, timeout);
+ if (err) {
+ xe_gt_sriov_notice(gt, "VF%u RESTORE didn't finish in %u ms (%pe)\n",
+ vfid, jiffies_to_msecs(timeout), ERR_PTR(err));
+ return err;
+ }
+
+ if (!pf_expect_vf_not_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED))
+ return -EIO;
+
+ return 0;
+}
+
+/**
+ * xe_gt_sriov_pf_control_wait_restore_done() - Wait for a VF Restore to complete
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_control_wait_restore_done(struct xe_gt *gt, unsigned int vfid)
+{
+ if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
+ return 0;
+
+ return pf_wait_vf_restore_done(gt, vfid);
+}
+
/**
* DOC: The VF STOP state machine
*
@@ -817,6 +1079,8 @@ static void pf_enter_vf_stopped(struct xe_gt *gt, unsigned int vfid)
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUMED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
+ pf_exit_vf_saved(gt, vfid);
+ pf_exit_vf_restored(gt, vfid);
pf_exit_vf_mismatch(gt, vfid);
pf_exit_vf_wip(gt, vfid);
}
@@ -1461,6 +1725,12 @@ static bool pf_process_vf_state_machine(struct xe_gt *gt, unsigned int vfid)
if (pf_exit_vf_pause_save_guc(gt, vfid))
return true;
+ if (pf_handle_vf_save_wip(gt, vfid))
+ return true;
+
+ if (pf_handle_vf_restore_wip(gt, vfid))
+ return true;
+
if (pf_exit_vf_resume_send_resume(gt, vfid))
return true;
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
index 8a72ef3778d47..2e121e8132dcf 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
@@ -14,8 +14,14 @@ struct xe_gt;
int xe_gt_sriov_pf_control_init(struct xe_gt *gt);
void xe_gt_sriov_pf_control_restart(struct xe_gt *gt);
+bool xe_gt_sriov_pf_control_check_vf_data_wip(struct xe_gt *gt, unsigned int vfid);
+
int xe_gt_sriov_pf_control_pause_vf(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_control_save_vf(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_control_wait_save_done(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_control_restore_vf(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_control_wait_restore_done(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_control_stop_vf(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_control_trigger_flr(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_control_sync_flr(struct xe_gt *gt, unsigned int vfid, bool sync);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
index c80b7e77f1ad2..02b517533ee8a 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
@@ -31,6 +31,13 @@
* @XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC: indicates that the PF needs to save the VF GuC state.
* @XE_GT_SRIOV_STATE_PAUSE_FAILED: indicates that a VF pause operation has failed.
* @XE_GT_SRIOV_STATE_PAUSED: indicates that the VF is paused.
* @XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP: indicates that new data is expected in the migration ring.
* @XE_GT_SRIOV_STATE_SAVE_WIP: indicates that a VF save operation is in progress.
* @XE_GT_SRIOV_STATE_SAVE_FAILED: indicates that a VF save operation has failed.
* @XE_GT_SRIOV_STATE_SAVED: indicates that the VF is saved.
* @XE_GT_SRIOV_STATE_RESTORE_WIP: indicates that a VF restore operation is in progress.
* @XE_GT_SRIOV_STATE_RESTORE_FAILED: indicates that a VF restore operation has failed.
* @XE_GT_SRIOV_STATE_RESTORED: indicates that the VF is restored.
* @XE_GT_SRIOV_STATE_RESUME_WIP: indicates the a VF resume operation is in progress.
* @XE_GT_SRIOV_STATE_RESUME_SEND_RESUME: indicates that the PF is about to send RESUME command.
* @XE_GT_SRIOV_STATE_RESUME_FAILED: indicates that a VF resume operation has failed.
@@ -63,6 +70,16 @@ enum xe_gt_sriov_control_bits {
XE_GT_SRIOV_STATE_PAUSE_FAILED,
XE_GT_SRIOV_STATE_PAUSED,
+ XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP,
+
+ XE_GT_SRIOV_STATE_SAVE_WIP,
+ XE_GT_SRIOV_STATE_SAVE_FAILED,
+ XE_GT_SRIOV_STATE_SAVED,
+
+ XE_GT_SRIOV_STATE_RESTORE_WIP,
+ XE_GT_SRIOV_STATE_RESTORE_FAILED,
+ XE_GT_SRIOV_STATE_RESTORED,
+
XE_GT_SRIOV_STATE_RESUME_WIP,
XE_GT_SRIOV_STATE_RESUME_SEND_RESUME,
XE_GT_SRIOV_STATE_RESUME_FAILED,
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
index 416d00a03fbb7..e64c7b56172c6 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
@@ -149,3 +149,99 @@ int xe_sriov_pf_control_sync_flr(struct xe_device *xe, unsigned int vfid)
return 0;
}
+
+/**
+ * xe_sriov_pf_control_save_vf - Save VF migration data on all GTs.
+ * @xe: the &xe_device
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_pf_control_save_vf(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_gt *gt;
+ unsigned int id;
+ int ret;
+
+ for_each_gt(gt, xe, id) {
+ ret = xe_gt_sriov_pf_control_save_vf(gt, vfid);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+/**
+ * xe_sriov_pf_control_wait_save_vf - Wait until VF migration data has been saved on all GTs
+ * @xe: the &xe_device
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_pf_control_wait_save_vf(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_gt *gt;
+ u8 id;
+ int ret;
+
+ for_each_gt(gt, xe, id) {
+ ret = xe_gt_sriov_pf_control_wait_save_done(gt, vfid);
+ if (ret)
+ break;
+ }
+
+ return ret;
+}
+
+/**
+ * xe_sriov_pf_control_restore_vf - Restore VF migration data on all GTs.
+ * @xe: the &xe_device
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_pf_control_restore_vf(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_gt *gt;
+ u8 id;
+ int ret;
+
+ for_each_gt(gt, xe, id) {
+ ret = xe_gt_sriov_pf_control_restore_vf(gt, vfid);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+/**
+ * xe_sriov_pf_control_wait_restore_vf - Wait until VF migration data has been restored on all GTs
+ * @xe: the &xe_device
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_pf_control_wait_restore_vf(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_gt *gt;
+ u8 id;
+ int ret;
+
+ for_each_gt(gt, xe, id) {
+ ret = xe_gt_sriov_pf_control_wait_restore_done(gt, vfid);
+ if (ret)
+ break;
+ }
+
+ return ret;
+}
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_sriov_pf_control.h
index 2d52d0ac1b28f..512fd21d87c1e 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_control.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.h
@@ -13,5 +13,9 @@ int xe_sriov_pf_control_resume_vf(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_stop_vf(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_reset_vf(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_sync_flr(struct xe_device *xe, unsigned int vfid);
+int xe_sriov_pf_control_save_vf(struct xe_device *xe, unsigned int vfid);
+int xe_sriov_pf_control_wait_save_vf(struct xe_device *xe, unsigned int vfid);
+int xe_sriov_pf_control_restore_vf(struct xe_device *xe, unsigned int vfid);
+int xe_sriov_pf_control_wait_restore_vf(struct xe_device *xe, unsigned int vfid);
#endif
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
index 97636ed86fb8b..74eeabef91c57 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
@@ -75,11 +75,31 @@ static void pf_populate_pf(struct xe_device *xe, struct dentry *pfdent)
* │ │ ├── reset
* │ │ ├── resume
* │ │ ├── stop
+ * │ │ ├── save
+ * │ │ ├── restore
* │ │ :
* │ ├── vf2
* │ │ ├── ...
*/
+static int from_file_read_to_vf_call(struct seq_file *s,
+ int (*call)(struct xe_device *, unsigned int))
+{
+ struct dentry *dent = file_dentry(s->file)->d_parent;
+ struct xe_device *xe = extract_xe(dent);
+ unsigned int vfid = extract_vfid(dent);
+ int ret;
+
+ xe_pm_runtime_get(xe);
+ ret = call(xe, vfid);
+ xe_pm_runtime_put(xe);
+
+ if (ret < 0)
+ return ret;
+
+ return s->count;
+}
+
static ssize_t from_file_write_to_vf_call(struct file *file, const char __user *userbuf,
size_t count, loff_t *ppos,
int (*call)(struct xe_device *, unsigned int))
@@ -118,10 +138,26 @@ static ssize_t OP##_write(struct file *file, const char __user *userbuf, \
} \
DEFINE_SHOW_STORE_ATTRIBUTE(OP)
+#define DEFINE_VF_RW_CONTROL_ATTRIBUTE(OP) \
+static int OP##_show(struct seq_file *s, void *unused) \
+{ \
+ return from_file_read_to_vf_call(s, \
+ xe_sriov_pf_control_wait_##OP); \
+} \
+static ssize_t OP##_write(struct file *file, const char __user *userbuf, \
+ size_t count, loff_t *ppos) \
+{ \
+ return from_file_write_to_vf_call(file, userbuf, count, ppos, \
+ xe_sriov_pf_control_##OP); \
+} \
+DEFINE_SHOW_STORE_ATTRIBUTE(OP)
+
DEFINE_VF_CONTROL_ATTRIBUTE(pause_vf);
DEFINE_VF_CONTROL_ATTRIBUTE(resume_vf);
DEFINE_VF_CONTROL_ATTRIBUTE(stop_vf);
DEFINE_VF_CONTROL_ATTRIBUTE(reset_vf);
+DEFINE_VF_RW_CONTROL_ATTRIBUTE(save_vf);
+DEFINE_VF_RW_CONTROL_ATTRIBUTE(restore_vf);
static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
{
@@ -129,6 +165,8 @@ static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
debugfs_create_file("resume", 0200, vfdent, xe, &resume_vf_fops);
debugfs_create_file("stop", 0200, vfdent, xe, &stop_vf_fops);
debugfs_create_file("reset", 0200, vfdent, xe, &reset_vf_fops);
+ debugfs_create_file("save", 0600, vfdent, xe, &save_vf_fops);
+ debugfs_create_file("restore", 0600, vfdent, xe, &restore_vf_fops);
}
static void pf_populate_with_tiles(struct xe_device *xe, struct dentry *dent, unsigned int vfid)
--
2.50.1
* [PATCH 04/26] drm/xe/pf: Extract migration mutex out of its struct
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (2 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 03/26] drm/xe/pf: Add save/restore control state stubs and connect to debugfs Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-12 19:08 ` Matthew Brost
2025-10-11 19:38 ` [PATCH 05/26] drm/xe/pf: Add data structures and handlers for migration rings Michał Winiarski
` (21 subsequent siblings)
25 siblings, 1 reply; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
As part of upcoming changes, the struct xe_gt_sriov_pf_migration will be
used as a per-VF data structure.
The mutex (which is currently the only member of this structure) will
have slightly different semantics.
Extract the mutex to free up the struct name and simplify the future
changes.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 4 ++--
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h | 2 --
drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h | 2 +-
3 files changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index ca28f45aaf481..f8604b172963e 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -122,7 +122,7 @@ static bool pf_migration_supported(struct xe_gt *gt)
static struct mutex *pf_migration_mutex(struct xe_gt *gt)
{
xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
- return &gt->sriov.pf.migration.snapshot_lock;
+ return &gt->sriov.pf.snapshot_lock;
}
static struct xe_gt_sriov_state_snapshot *pf_pick_vf_snapshot(struct xe_gt *gt,
@@ -400,7 +400,7 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
if (!pf_migration_supported(gt))
return 0;
- err = drmm_mutex_init(&xe->drm, &gt->sriov.pf.migration.snapshot_lock);
+ err = drmm_mutex_init(&xe->drm, &gt->sriov.pf.snapshot_lock);
if (err)
return err;
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
index 9d672feac5f04..fdc5a31dd8989 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
@@ -30,8 +30,6 @@ struct xe_gt_sriov_state_snapshot {
* Used by the PF driver to maintain non-VF specific per-GT data.
*/
struct xe_gt_sriov_pf_migration {
- /** @snapshot_lock: protects all VFs snapshots */
- struct mutex snapshot_lock;
};
#endif
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
index a64a6835ad656..9a856da379d39 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
@@ -58,7 +58,7 @@ struct xe_gt_sriov_pf {
struct xe_gt_sriov_pf_service service;
struct xe_gt_sriov_pf_control control;
struct xe_gt_sriov_pf_policy policy;
- struct xe_gt_sriov_pf_migration migration;
+ struct mutex snapshot_lock;
struct xe_gt_sriov_spare_config spare;
struct xe_gt_sriov_metadata *vfs;
};
--
2.50.1
* [PATCH 05/26] drm/xe/pf: Add data structures and handlers for migration rings
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (3 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 04/26] drm/xe/pf: Extract migration mutex out of its struct Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-12 21:06 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 06/26] drm/xe/pf: Add helpers for migration data allocation / free Michał Winiarski
` (20 subsequent siblings)
25 siblings, 1 reply; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
Migration data is queued in a per-GT ptr_ring to decouple the worker
responsible for handling the data transfer from the .read()/.write()
syscalls.
Add the data structures and handlers that will be used in future
commits.
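A sketch of the intended flow around the ring (both helpers are added
below; error handling trimmed):

  /* producer side, e.g. the PF worker - blocks while the ring is full: */
  err = xe_gt_sriov_pf_migration_ring_produce(gt, vfid, data);

  /* consumer side, backing .read() - blocks while all rings are empty: */
  data = xe_sriov_pf_migration_consume(xe, vfid);
  if (IS_ERR(data))
          return PTR_ERR(data); /* -ENODATA once no more data is expected */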
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 4 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 163 ++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 9 +
.../drm/xe/xe_gt_sriov_pf_migration_types.h | 5 +-
drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h | 3 +
drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 147 ++++++++++++++++
drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 20 +++
.../gpu/drm/xe/xe_sriov_pf_migration_types.h | 37 ++++
drivers/gpu/drm/xe/xe_sriov_pf_types.h | 3 +
9 files changed, 390 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
index 44df984278548..16a88e7599f6d 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
@@ -19,6 +19,7 @@
#include "xe_guc_ct.h"
#include "xe_sriov.h"
#include "xe_sriov_pf_control.h"
+#include "xe_sriov_pf_migration.h"
#include "xe_sriov_pf_service.h"
#include "xe_tile.h"
@@ -388,6 +389,8 @@ static bool pf_enter_vf_wip(struct xe_gt *gt, unsigned int vfid)
static void pf_exit_vf_wip(struct xe_gt *gt, unsigned int vfid)
{
+ struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
+
if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_WIP)) {
struct xe_gt_sriov_control_state *cs = pf_pick_vf_control(gt, vfid);
@@ -399,6 +402,7 @@ static void pf_exit_vf_wip(struct xe_gt *gt, unsigned int vfid)
pf_exit_vf_resume_wip(gt, vfid);
complete_all(&cs->done);
+ wake_up_all(wq);
}
}
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index f8604b172963e..af5952f42fff1 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -7,6 +7,7 @@
#include "abi/guc_actions_sriov_abi.h"
#include "xe_bo.h"
+#include "xe_gt_sriov_pf_control.h"
#include "xe_gt_sriov_pf_helpers.h"
#include "xe_gt_sriov_pf_migration.h"
#include "xe_gt_sriov_printk.h"
@@ -15,6 +16,17 @@
#include "xe_sriov.h"
#include "xe_sriov_pf_migration.h"
+#define XE_GT_SRIOV_PF_MIGRATION_RING_TIMEOUT (HZ * 20)
+#define XE_GT_SRIOV_PF_MIGRATION_RING_SIZE 5
+
+static struct xe_gt_sriov_pf_migration *pf_pick_gt_migration(struct xe_gt *gt, unsigned int vfid)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
+
+ return &gt->sriov.pf.vfs[vfid].migration;
+}
+
/* Return: number of dwords saved/restored/required or a negative error code on failure */
static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
u64 addr, u32 ndwords)
@@ -382,6 +394,142 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
}
#endif /* CONFIG_DEBUG_FS */
+/**
+ * xe_gt_sriov_pf_migration_ring_empty() - Check if a migration ring is empty
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * Return: true if the ring is empty, otherwise false.
+ */
+bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid)
+{
+ return ptr_ring_empty(&pf_pick_gt_migration(gt, vfid)->ring);
+}
+
+/**
+ * xe_gt_sriov_pf_migration_ring_produce() - Add migration data packet to migration ring
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ * @data: &struct xe_sriov_pf_migration_data packet
+ *
+ * If the ring is full, wait until there is space in the ring.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_migration_ring_produce(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data)
+{
+ struct xe_gt_sriov_pf_migration *migration = pf_pick_gt_migration(gt, vfid);
+ struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
+ unsigned long timeout = XE_GT_SRIOV_PF_MIGRATION_RING_TIMEOUT;
+ int ret;
+
+ xe_gt_assert(gt, data->tile == gt->tile->id);
+ xe_gt_assert(gt, data->gt == gt->info.id);
+
+ while (1) {
+ ret = ptr_ring_produce(&migration->ring, data);
+ if (ret == 0) {
+ wake_up_all(wq);
+ break;
+ }
+
+ if (!xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid))
+ return -EINVAL;
+
+ ret = wait_event_interruptible_timeout(*wq,
+ !ptr_ring_full(&migration->ring),
+ timeout);
+ if (ret == 0)
+ return -ETIMEDOUT;
+
+ timeout = ret;
+ }
+
+ return ret;
+}
+
+/**
+ * xe_gt_sriov_pf_migration_ring_consume() - Get migration data packet from migration ring
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * If the ring is empty, wait until there are new migration data packets to process.
+ *
+ * Return: Pointer to &struct xe_sriov_pf_migration_data on success,
+ * ERR_PTR(-ENODATA) if ring is empty and no more migration data is expected,
+ * ERR_PTR value in case of error.
+ */
+struct xe_sriov_pf_migration_data *
+xe_gt_sriov_pf_migration_ring_consume(struct xe_gt *gt, unsigned int vfid)
+{
+ struct xe_gt_sriov_pf_migration *migration = pf_pick_gt_migration(gt, vfid);
+ struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
+ unsigned long timeout = XE_GT_SRIOV_PF_MIGRATION_RING_TIMEOUT;
+ struct xe_sriov_pf_migration_data *data;
+ int ret;
+
+ while (1) {
+ data = ptr_ring_consume(&migration->ring);
+ if (data) {
+ wake_up_all(wq);
+ break;
+ }
+
+ if (!xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid))
+ return ERR_PTR(-ENODATA);
+
+ ret = wait_event_interruptible_timeout(*wq,
+ !ptr_ring_empty(&migration->ring) ||
+ !xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid),
+ timeout);
+ if (ret == 0)
+ return ERR_PTR(-ETIMEDOUT);
+
+ timeout = ret;
+ }
+
+ return data;
+}
+
+/**
+ * xe_gt_sriov_pf_migration_ring_consume_nowait() - Get migration data packet from migration ring
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * Similar to xe_gt_sriov_pf_migration_consume(), but doesn't wait until more data is available.
+ *
+ * Return: Pointer to &struct xe_sriov_pf_migration_data on success,
+ * ERR_PTR(-EAGAIN) if ring is empty but migration data is expected,
+ * ERR_PTR(-ENODATA) if ring is empty and no more migration data is expected,
+ * ERR_PTR value in case of error.
+ */
+struct xe_sriov_pf_migration_data *
+xe_gt_sriov_pf_migration_ring_consume_nowait(struct xe_gt *gt, unsigned int vfid)
+{
+ struct xe_gt_sriov_pf_migration *migration = pf_pick_gt_migration(gt, vfid);
+ struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
+ struct xe_sriov_pf_migration_data *data;
+
+ data = ptr_ring_consume(&migration->ring);
+ if (data) {
+ wake_up_all(wq);
+ return data;
+ }
+
+ if (!xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid))
+ return ERR_PTR(-ENODATA);
+
+ return ERR_PTR(-EAGAIN);
+}
+
+static void pf_gt_migration_cleanup(struct drm_device *dev, void *arg)
+{
+ struct xe_gt_sriov_pf_migration *migration = arg;
+
+ ptr_ring_cleanup(&migration->ring, NULL);
+}
+
/**
* xe_gt_sriov_pf_migration_init() - Initialize support for VF migration.
* @gt: the &xe_gt
@@ -393,6 +541,7 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
{
struct xe_device *xe = gt_to_xe(gt);
+ unsigned int n, totalvfs;
int err;
xe_gt_assert(gt, IS_SRIOV_PF(xe));
@@ -404,5 +553,19 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
if (err)
return err;
+ totalvfs = xe_sriov_pf_get_totalvfs(xe);
+ for (n = 0; n <= totalvfs; n++) {
+ struct xe_gt_sriov_pf_migration *migration = pf_pick_gt_migration(gt, n);
+
+ err = ptr_ring_init(&migration->ring,
+ XE_GT_SRIOV_PF_MIGRATION_RING_SIZE, GFP_KERNEL);
+ if (err)
+ return err;
+
+ err = drmm_add_action_or_reset(&xe->drm, pf_gt_migration_cleanup, migration);
+ if (err)
+ return err;
+ }
+
return 0;
}
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
index 09faeae00ddbb..1e4dc46413823 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
@@ -9,11 +9,20 @@
#include <linux/types.h>
struct xe_gt;
+struct xe_sriov_pf_migration_data;
int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
+bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_migration_ring_produce(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data);
+struct xe_sriov_pf_migration_data *
+xe_gt_sriov_pf_migration_ring_consume(struct xe_gt *gt, unsigned int vfid);
+struct xe_sriov_pf_migration_data *
+xe_gt_sriov_pf_migration_ring_consume_nowait(struct xe_gt *gt, unsigned int vfid);
+
#ifdef CONFIG_DEBUG_FS
ssize_t xe_gt_sriov_pf_migration_read_guc_state(struct xe_gt *gt, unsigned int vfid,
char __user *buf, size_t count, loff_t *pos);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
index fdc5a31dd8989..8434689372082 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
@@ -7,6 +7,7 @@
#define _XE_GT_SRIOV_PF_MIGRATION_TYPES_H_
#include <linux/mutex.h>
+#include <linux/ptr_ring.h>
#include <linux/types.h>
/**
@@ -27,9 +28,11 @@ struct xe_gt_sriov_state_snapshot {
/**
* struct xe_gt_sriov_pf_migration - GT-level data.
*
- * Used by the PF driver to maintain non-VF specific per-GT data.
+ * Used by the PF driver to maintain per-VF migration data.
*/
struct xe_gt_sriov_pf_migration {
+ /** @ring: queue containing VF save / restore migration data */
+ struct ptr_ring ring;
};
#endif
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
index 9a856da379d39..fbb08f8030f7f 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
@@ -33,6 +33,9 @@ struct xe_gt_sriov_metadata {
/** @snapshot: snapshot of the VF state data */
struct xe_gt_sriov_state_snapshot snapshot;
+
+ /** @migration: per-VF migration data */
+ struct xe_gt_sriov_pf_migration migration;
};
/**
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
index cf6a210d5597a..347682f29a03c 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
@@ -4,7 +4,35 @@
*/
#include "xe_sriov.h"
+#include <drm/drm_managed.h>
+
+#include "xe_device.h"
+#include "xe_gt_sriov_pf_control.h"
+#include "xe_gt_sriov_pf_migration.h"
+#include "xe_pm.h"
+#include "xe_sriov_pf_helpers.h"
#include "xe_sriov_pf_migration.h"
+#include "xe_sriov_printk.h"
+
+static struct xe_sriov_pf_migration *pf_pick_migration(struct xe_device *xe, unsigned int vfid)
+{
+ xe_assert(xe, IS_SRIOV_PF(xe));
+ xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
+
+ return &xe->sriov.pf.vfs[vfid].migration;
+}
+
+/**
+ * xe_sriov_pf_migration_waitqueue - Get waitqueue for migration
+ * @xe: the &struct xe_device
+ * @vfid: the VF identifier
+ *
+ * Return: pointer to the migration waitqueue.
+ */
+wait_queue_head_t *xe_sriov_pf_migration_waitqueue(struct xe_device *xe, unsigned int vfid)
+{
+ return &pf_pick_migration(xe, vfid)->wq;
+}
/**
* xe_sriov_pf_migration_supported() - Check if SR-IOV VF migration is supported by the device
@@ -35,9 +63,126 @@ static bool pf_check_migration_support(struct xe_device *xe)
*/
int xe_sriov_pf_migration_init(struct xe_device *xe)
{
+ unsigned int n, totalvfs;
+
xe_assert(xe, IS_SRIOV_PF(xe));
xe->sriov.pf.migration.supported = pf_check_migration_support(xe);
+ if (!xe_sriov_pf_migration_supported(xe))
+ return 0;
+
+ totalvfs = xe_sriov_pf_get_totalvfs(xe);
+ for (n = 1; n <= totalvfs; n++) {
+ struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, n);
+
+ init_waitqueue_head(&migration->wq);
+ }
return 0;
}
+
+static bool pf_migration_empty(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_gt *gt;
+ u8 gt_id;
+
+ for_each_gt(gt, xe, gt_id) {
+ if (!xe_gt_sriov_pf_migration_ring_empty(gt, vfid))
+ return false;
+ }
+
+ return true;
+}
+
+static struct xe_sriov_pf_migration_data *
+pf_migration_consume(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_sriov_pf_migration_data *data;
+ struct xe_gt *gt;
+ u8 gt_id;
+ bool no_data = true;
+
+ for_each_gt(gt, xe, gt_id) {
+ data = xe_gt_sriov_pf_migration_ring_consume_nowait(gt, vfid);
+
+ if (!IS_ERR(data))
+ return data;
+ else if (PTR_ERR(data) == -EAGAIN)
+ no_data = false;
+ }
+
+ if (no_data)
+ return ERR_PTR(-ENODATA);
+
+ return ERR_PTR(-EAGAIN);
+}
+
+/**
+ * xe_sriov_pf_migration_consume() - Consume a SR-IOV VF migration data packet from the device
+ * @xe: the &struct xe_device
+ * @vfid: the VF identifier
+ *
+ * If there is no migration data to process, wait until more data is available.
+ *
+ * Return: Pointer to &struct xe_sriov_pf_migration_data on success,
+ * ERR_PTR(-ENODATA) if ring is empty and no more migration data is expected,
+ * ERR_PTR value in case of error.
+ */
+struct xe_sriov_pf_migration_data *
+xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, vfid);
+ unsigned long timeout = HZ * 5;
+ struct xe_sriov_pf_migration_data *data;
+ int ret;
+
+ if (!IS_SRIOV_PF(xe))
+ return ERR_PTR(-ENODEV);
+
+ while (1) {
+ data = pf_migration_consume(xe, vfid);
+ if (!IS_ERR(data) || PTR_ERR(data) != -EAGAIN)
+ goto out;
+
+ ret = wait_event_interruptible_timeout(migration->wq,
+ !pf_migration_empty(xe, vfid),
+ timeout);
+ if (ret == 0) {
+ xe_sriov_warn(xe, "VF%d Timed out waiting for migration data\n", vfid);
+ return ERR_PTR(-ETIMEDOUT);
+ }
+
+ timeout = ret;
+ }
+
+out:
+ return data;
+}
+
+/**
+ * xe_sriov_pf_migration_produce() - Produce a SR-IOV VF migration data packet for device to process
+ * @xe: the &struct xe_device
+ * @vfid: the VF identifier
+ * @data: VF migration data
+ *
+ * If the underlying data structure is full, wait until there is space.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data)
+{
+ struct xe_gt *gt;
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ gt = xe_device_get_gt(xe, data->gt);
+ if (!gt || data->tile != gt->tile->id) {
+ xe_sriov_err_ratelimited(xe, "VF%d Unknown GT - tile_id:%d, gt_id:%d\n",
+ vfid, data->tile, data->gt);
+ return -EINVAL;
+ }
+
+ return xe_gt_sriov_pf_migration_ring_produce(gt, vfid, data);
+}
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
index d3058b6682192..f2020ba19c2da 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
@@ -7,12 +7,18 @@
#define _XE_SRIOV_PF_MIGRATION_H_
#include <linux/types.h>
+#include <linux/wait.h>
struct xe_device;
#ifdef CONFIG_PCI_IOV
int xe_sriov_pf_migration_init(struct xe_device *xe);
bool xe_sriov_pf_migration_supported(struct xe_device *xe);
+struct xe_sriov_pf_migration_data *
+xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid);
+int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data);
+wait_queue_head_t *xe_sriov_pf_migration_waitqueue(struct xe_device *xe, unsigned int vfid);
#else
static inline int xe_sriov_pf_migration_init(struct xe_device *xe)
{
@@ -22,6 +28,20 @@ static inline bool xe_sriov_pf_migration_supported(struct xe_device *xe)
{
return false;
}
+static inline struct xe_sriov_pf_migration_data *
+xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid)
+{
+ return ERR_PTR(-ENODEV);
+}
+static inline int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data)
+{
+ return -ENODEV;
+}
+static inline wait_queue_head_t *
+xe_sriov_pf_migration_waitqueue(struct xe_device *xe, unsigned int vfid)
+{
+ return ERR_PTR(-ENODEV);
+}
#endif
#endif
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
index e69de29bb2d1d..80fdea32b884a 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_SRIOV_PF_MIGRATION_TYPES_H_
+#define _XE_SRIOV_PF_MIGRATION_TYPES_H_
+
+#include <linux/types.h>
+#include <linux/wait.h>
+
+struct xe_sriov_pf_migration_data {
+ /** @xe: the &struct xe_device this packet belongs to */
+ struct xe_device *xe;
+ /** @vaddr: CPU pointer to the payload backing store */
+ void *vaddr;
+ /** @remaining: payload bytes still to be processed */
+ size_t remaining;
+ /** @hdr_remaining: header bytes still to be processed */
+ size_t hdr_remaining;
+ union {
+ /** @bo: buffer object backing VRAM/CCS payloads */
+ struct xe_bo *bo;
+ /** @buff: kvmalloc'ed buffer backing all other payloads */
+ void *buff;
+ };
+ __struct_group(xe_sriov_pf_migration_hdr, hdr, __packed,
+ u8 version;
+ u8 type;
+ u8 tile;
+ u8 gt;
+ u32 flags;
+ u64 offset;
+ u64 size;
+ );
+};
+
+struct xe_sriov_pf_migration {
+ /** @wq: waitqueue used to avoid busy-waiting for snapshot production/consumption */
+ wait_queue_head_t wq;
+};
+
+#endif
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
index 2d2fcc0a2f258..b3ae21a5a0490 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_types.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
@@ -9,6 +9,7 @@
#include <linux/mutex.h>
#include <linux/types.h>
+#include "xe_sriov_pf_migration_types.h"
#include "xe_sriov_pf_service_types.h"
/**
@@ -17,6 +18,8 @@
struct xe_sriov_metadata {
/** @version: negotiated VF/PF ABI version */
struct xe_sriov_pf_service_version version;
+ /** @migration: migration data */
+ struct xe_sriov_pf_migration migration;
};
/**
--
2.50.1
^ permalink raw reply related [flat|nested] 82+ messages in thread
* [PATCH 06/26] drm/xe/pf: Add helpers for migration data allocation / free
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (4 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 05/26] drm/xe/pf: Add data structures and handlers for migration rings Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-12 19:12 ` Matthew Brost
2025-10-13 10:15 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 07/26] drm/xe/pf: Add support for encap/decap of bitstream to/from packet Michał Winiarski
` (19 subsequent siblings)
25 siblings, 2 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
Add helpers to allocate and free migration data packets and, now that
freeing packets is possible, connect the restore handling logic with
the ring.
The helpers will also be used in upcoming changes that start producing
migration data packets.
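For orientation, below is a minimal sketch of the intended save-side
lifecycle of a packet (not part of the patch; tile_id, gt_id and size
are placeholder values):

    struct xe_sriov_pf_migration_data *data;
    int err;

    data = xe_sriov_pf_migration_data_alloc(xe);
    if (!data)
            return -ENOMEM;

    /* Fill the header and allocate the backing store (BO or kvmalloc) */
    err = xe_sriov_pf_migration_data_init(data, tile_id, gt_id,
                                          XE_SRIOV_MIG_DATA_GUC, 0, size);
    if (err) {
            xe_sriov_pf_migration_data_free(data);
            return err;
    }

    /* ... the producer copies device state into data->vaddr ... */

    /* On success, ownership moves to the ring and its consumer */
    err = xe_sriov_pf_migration_produce(xe, vfid, data);
    if (err)
            xe_sriov_pf_migration_data_free(data);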
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/Makefile | 1 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 48 ++++++-
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 10 +-
drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 1 +
.../gpu/drm/xe/xe_sriov_pf_migration_data.c | 135 ++++++++++++++++++
.../gpu/drm/xe/xe_sriov_pf_migration_data.h | 32 +++++
6 files changed, 224 insertions(+), 3 deletions(-)
create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index 71f685a315dca..e253d65366de4 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -177,6 +177,7 @@ xe-$(CONFIG_PCI_IOV) += \
xe_sriov_pf_control.o \
xe_sriov_pf_debugfs.o \
xe_sriov_pf_migration.o \
+ xe_sriov_pf_migration_data.o \
xe_sriov_pf_service.o \
xe_tile_sriov_pf_debugfs.o
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
index 16a88e7599f6d..04a4e92133c2e 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
@@ -20,6 +20,7 @@
#include "xe_sriov.h"
#include "xe_sriov_pf_control.h"
#include "xe_sriov_pf_migration.h"
+#include "xe_sriov_pf_migration_data.h"
#include "xe_sriov_pf_service.h"
#include "xe_tile.h"
@@ -949,14 +950,55 @@ static void pf_exit_vf_restored(struct xe_gt *gt, unsigned int vfid)
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED);
}
+static void pf_enter_vf_restore_failed(struct xe_gt *gt, unsigned int vfid)
+{
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED);
+ pf_exit_vf_wip(gt, vfid);
+}
+
+static int pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data)
+{
+ switch (data->type) {
+ default:
+ xe_gt_sriov_notice(gt, "VF%u invalid migration data type: %d\n", vfid, data->type);
+ return -EINVAL;
+ }
+}
+
static bool pf_handle_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
{
+ struct xe_sriov_pf_migration_data *data;
+ int ret;
+
if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
return false;
- pf_exit_vf_restore_wip(gt, vfid);
- pf_enter_vf_restored(gt, vfid);
+ data = xe_gt_sriov_pf_migration_ring_consume(gt, vfid);
+ if (IS_ERR(data)) {
+ if (PTR_ERR(data) == -ENODATA &&
+ !xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid)) {
+ pf_exit_vf_restore_wip(gt, vfid);
+ pf_enter_vf_restored(gt, vfid);
+ } else {
+ pf_enter_vf_restore_failed(gt, vfid);
+ }
+ return false;
+ }
+
+ xe_gt_assert(gt, gt->info.id == data->gt);
+ xe_gt_assert(gt, gt->tile->id == data->tile);
+
+ ret = pf_handle_vf_restore_data(gt, vfid, data);
+ if (ret) {
+ xe_gt_sriov_err(gt, "VF%u failed to restore data type: %d (%d)\n",
+ vfid, data->type, ret);
+ xe_sriov_pf_migration_data_free(data);
+ pf_enter_vf_restore_failed(gt, vfid);
+ return false;
+ }
+ xe_sriov_pf_migration_data_free(data);
return true;
}
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index af5952f42fff1..582aaf062cbd4 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -15,6 +15,7 @@
#include "xe_guc_ct.h"
#include "xe_sriov.h"
#include "xe_sriov_pf_migration.h"
+#include "xe_sriov_pf_migration_data.h"
#define XE_GT_SRIOV_PF_MIGRATION_RING_TIMEOUT (HZ * 20)
#define XE_GT_SRIOV_PF_MIGRATION_RING_SIZE 5
@@ -523,11 +524,18 @@ xe_gt_sriov_pf_migration_ring_consume_nowait(struct xe_gt *gt, unsigned int vfid
return ERR_PTR(-EAGAIN);
}
+static void pf_mig_data_destroy(void *ptr)
+{
+ struct xe_sriov_pf_migration_data *data = ptr;
+
+ xe_sriov_pf_migration_data_free(data);
+}
+
static void pf_gt_migration_cleanup(struct drm_device *dev, void *arg)
{
struct xe_gt_sriov_pf_migration *migration = arg;
- ptr_ring_cleanup(&migration->ring, NULL);
+ ptr_ring_cleanup(&migration->ring, pf_mig_data_destroy);
}
/**
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
index 347682f29a03c..d39cee66589b5 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
@@ -12,6 +12,7 @@
#include "xe_pm.h"
#include "xe_sriov_pf_helpers.h"
#include "xe_sriov_pf_migration.h"
+#include "xe_sriov_pf_migration_data.h"
#include "xe_sriov_printk.h"
static struct xe_sriov_pf_migration *pf_pick_migration(struct xe_device *xe, unsigned int vfid)
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
new file mode 100644
index 0000000000000..cfc6b512c6674
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
@@ -0,0 +1,135 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include "xe_bo.h"
+#include "xe_device.h"
+#include "xe_sriov_pf_migration_data.h"
+
+static bool data_needs_bo(struct xe_sriov_pf_migration_data *data)
+{
+ unsigned int type = data->type;
+
+ return type == XE_SRIOV_MIG_DATA_CCS ||
+ type == XE_SRIOV_MIG_DATA_VRAM;
+}
+
+/**
+ * xe_sriov_pf_migration_data_alloc() - Allocate migration data packet
+ * @xe: the &struct xe_device
+ *
+ * Only allocates the "outer" structure, without initializing the migration
+ * data backing storage.
+ *
+ * Return: Pointer to &struct xe_sriov_pf_migration_data on success,
+ * NULL in case of error.
+ */
+struct xe_sriov_pf_migration_data *
+xe_sriov_pf_migration_data_alloc(struct xe_device *xe)
+{
+ struct xe_sriov_pf_migration_data *data;
+
+ data = kzalloc(sizeof(*data), GFP_KERNEL);
+ if (!data)
+ return NULL;
+
+ data->xe = xe;
+ data->hdr_remaining = sizeof(data->hdr);
+
+ return data;
+}
+
+/**
+ * xe_sriov_pf_migration_data_free() - Free migration data packet
+ * @data: the &struct xe_sriov_pf_migration_data packet
+ */
+void xe_sriov_pf_migration_data_free(struct xe_sriov_pf_migration_data *data)
+{
+ if (data_needs_bo(data)) {
+ if (data->bo)
+ xe_bo_unpin_map_no_vm(data->bo);
+ } else {
+ kvfree(data->buff);
+ }
+
+ kfree(data);
+}
+
+static int mig_data_init(struct xe_sriov_pf_migration_data *data)
+{
+ struct xe_gt *gt = xe_device_get_gt(data->xe, data->gt);
+
+ if (!gt || data->tile != gt->tile->id)
+ return -EINVAL;
+
+ if (data->size == 0)
+ return 0;
+
+ if (data_needs_bo(data)) {
+ struct xe_bo *bo = xe_bo_create_pin_map_novm(data->xe, gt->tile,
+ PAGE_ALIGN(data->size),
+ ttm_bo_type_kernel,
+ XE_BO_FLAG_SYSTEM | XE_BO_FLAG_PINNED,
+ false);
+ if (IS_ERR(bo))
+ return PTR_ERR(bo);
+
+ data->bo = bo;
+ data->vaddr = bo->vmap.vaddr;
+ } else {
+ void *buff = kvzalloc(data->size, GFP_KERNEL);
+
+ if (!buff)
+ return -ENOMEM;
+
+ data->buff = buff;
+ data->vaddr = buff;
+ }
+
+ return 0;
+}
+
+/**
+ * xe_sriov_pf_migration_data_init() - Initialize the migration data header and backing storage
+ * @data: the &struct xe_sriov_pf_migration_data packet
+ * @tile_id: tile identifier
+ * @gt_id: GT identifier
+ * @type: &enum xe_sriov_pf_migration_data_type
+ * @offset: offset of data packet payload (within wider resource)
+ * @size: size of data packet payload
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_pf_migration_data_init(struct xe_sriov_pf_migration_data *data, u8 tile_id, u8 gt_id,
+ unsigned int type, loff_t offset, size_t size)
+{
+ xe_assert(data->xe, type < XE_SRIOV_MIG_DATA_MAX);
+ data->version = 1;
+ data->type = type;
+ data->tile = tile_id;
+ data->gt = gt_id;
+ data->offset = offset;
+ data->size = size;
+ data->remaining = size;
+
+ return mig_data_init(data);
+}
+
+/**
+ * xe_sriov_pf_migration_data_init_from_hdr() - Initialize the migration data backing storage based on header
+ * @data: the &struct xe_sriov_pf_migration_data packet
+ *
+ * Header data is expected to be filled prior to calling this function.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_pf_migration_data_init_from_hdr(struct xe_sriov_pf_migration_data *data)
+{
+ if (WARN_ON(data->hdr_remaining))
+ return -EINVAL;
+
+ data->remaining = data->size;
+
+ return mig_data_init(data);
+}
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
new file mode 100644
index 0000000000000..1dde4cfcdbc47
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_SRIOV_PF_MIGRATION_DATA_H_
+#define _XE_SRIOV_PF_MIGRATION_DATA_H_
+
+#include <linux/types.h>
+
+struct xe_device;
+
+enum xe_sriov_pf_migration_data_type {
+ XE_SRIOV_MIG_DATA_DESCRIPTOR = 1,
+ XE_SRIOV_MIG_DATA_TRAILER,
+ XE_SRIOV_MIG_DATA_GGTT,
+ XE_SRIOV_MIG_DATA_MMIO,
+ XE_SRIOV_MIG_DATA_GUC,
+ XE_SRIOV_MIG_DATA_CCS,
+ XE_SRIOV_MIG_DATA_VRAM,
+ XE_SRIOV_MIG_DATA_MAX,
+};
+
+struct xe_sriov_pf_migration_data *
+xe_sriov_pf_migration_data_alloc(struct xe_device *xe);
+void xe_sriov_pf_migration_data_free(struct xe_sriov_pf_migration_data *data);
+
+int xe_sriov_pf_migration_data_init(struct xe_sriov_pf_migration_data *data, u8 tile_id, u8 gt_id,
+ unsigned int type, loff_t offset, size_t size);
+int xe_sriov_pf_migration_data_init_from_hdr(struct xe_sriov_pf_migration_data *data);
+
+#endif
--
2.50.1
^ permalink raw reply related [flat|nested] 82+ messages in thread
* [PATCH 07/26] drm/xe/pf: Add support for encap/decap of bitstream to/from packet
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (5 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 06/26] drm/xe/pf: Add helpers for migration data allocation / free Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-11 22:28 ` kernel test robot
2025-10-13 10:46 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 08/26] drm/xe/pf: Add minimalistic migration descriptor Michał Winiarski
` (18 subsequent siblings)
25 siblings, 2 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
Add debugfs handlers for migration state that convert between the
bitstream exposed via .read()/.write() and migration data packets.
As the descriptor and trailer are handled at this layer, add handling
for both the save and the restore side.
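For reference, the stream handled by these hooks is a plain
concatenation of packets, each a packed header followed by its payload.
A hedged userspace mirror of the layout, with field order and widths
taken from the header group added earlier in the series (the struct
name here is mine):

    #include <stdint.h>

    /* Mirrors the __packed hdr group of struct xe_sriov_pf_migration_data */
    struct mig_hdr {
            uint8_t  version;   /* currently 1 */
            uint8_t  type;      /* enum xe_sriov_pf_migration_data_type */
            uint8_t  tile;      /* source tile identifier */
            uint8_t  gt;        /* source GT identifier */
            uint32_t flags;
            uint64_t offset;    /* payload offset within the wider resource */
            uint64_t size;      /* payload bytes that follow; 0 for the trailer */
    } __attribute__((packed));

    /*
     * Stream layout:
     *   [hdr][payload] [hdr][payload] ... [hdr(TRAILER, size == 0)]
     *
     * Reads and writes may be split at arbitrary byte boundaries; the
     * driver resumes mid-header or mid-payload via hdr_remaining/remaining.
     */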
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 18 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h | 1 +
drivers/gpu/drm/xe/xe_sriov_pf.c | 1 +
drivers/gpu/drm/xe/xe_sriov_pf_control.c | 5 +
drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 45 +++
drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 56 +++
.../gpu/drm/xe/xe_sriov_pf_migration_data.c | 353 ++++++++++++++++++
.../gpu/drm/xe/xe_sriov_pf_migration_data.h | 5 +
.../gpu/drm/xe/xe_sriov_pf_migration_types.h | 9 +
9 files changed, 493 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
index 04a4e92133c2e..092d3d710bca1 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
@@ -814,6 +814,23 @@ bool xe_gt_sriov_pf_control_check_vf_data_wip(struct xe_gt *gt, unsigned int vfi
return pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP);
}
+/**
+ * xe_gt_sriov_pf_control_vf_data_eof() - indicate the end of SR-IOV VF migration data production
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ */
+void xe_gt_sriov_pf_control_vf_data_eof(struct xe_gt *gt, unsigned int vfid)
+{
+ struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
+
+ if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP))
+ pf_enter_vf_state_machine_bug(gt, vfid);
+
+ wake_up_all(wq);
+}
+
static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
{
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
@@ -840,6 +857,7 @@ static bool pf_handle_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP))
return false;
+ xe_gt_sriov_pf_control_vf_data_eof(gt, vfid);
pf_exit_vf_save_wip(gt, vfid);
pf_enter_vf_saved(gt, vfid);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
index 2e121e8132dcf..caf20dd063b1b 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
@@ -15,6 +15,7 @@ int xe_gt_sriov_pf_control_init(struct xe_gt *gt);
void xe_gt_sriov_pf_control_restart(struct xe_gt *gt);
bool xe_gt_sriov_pf_control_check_vf_data_wip(struct xe_gt *gt, unsigned int vfid);
+void xe_gt_sriov_pf_control_vf_data_eof(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_control_pause_vf(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid);
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf.c b/drivers/gpu/drm/xe/xe_sriov_pf.c
index 95743c7af8050..5d115627f3f2f 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf.c
@@ -16,6 +16,7 @@
#include "xe_sriov_pf.h"
#include "xe_sriov_pf_helpers.h"
#include "xe_sriov_pf_migration.h"
+#include "xe_sriov_pf_migration_data.h"
#include "xe_sriov_pf_service.h"
#include "xe_sriov_printk.h"
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
index e64c7b56172c6..10e1f18aa8b11 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
@@ -6,6 +6,7 @@
#include "xe_device.h"
#include "xe_gt_sriov_pf_control.h"
#include "xe_sriov_pf_control.h"
+#include "xe_sriov_pf_migration_data.h"
#include "xe_sriov_printk.h"
/**
@@ -165,6 +166,10 @@ int xe_sriov_pf_control_save_vf(struct xe_device *xe, unsigned int vfid)
unsigned int id;
int ret;
+ ret = xe_sriov_pf_migration_data_save_init(xe, vfid);
+ if (ret)
+ return ret;
+
for_each_gt(gt, xe, id) {
ret = xe_gt_sriov_pf_control_save_vf(gt, vfid);
if (ret)
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
index 74eeabef91c57..ce780719760a6 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
@@ -13,6 +13,7 @@
#include "xe_sriov_pf_control.h"
#include "xe_sriov_pf_debugfs.h"
#include "xe_sriov_pf_helpers.h"
+#include "xe_sriov_pf_migration_data.h"
#include "xe_sriov_pf_service.h"
#include "xe_sriov_printk.h"
#include "xe_tile_sriov_pf_debugfs.h"
@@ -71,6 +72,7 @@ static void pf_populate_pf(struct xe_device *xe, struct dentry *pfdent)
* /sys/kernel/debug/dri/BDF/
* ├── sriov
* │ ├── vf1
+ * │ │ ├── migration_data
* │ │ ├── pause
* │ │ ├── reset
* │ │ ├── resume
@@ -159,6 +161,48 @@ DEFINE_VF_CONTROL_ATTRIBUTE(reset_vf);
DEFINE_VF_RW_CONTROL_ATTRIBUTE(save_vf);
DEFINE_VF_RW_CONTROL_ATTRIBUTE(restore_vf);
+static ssize_t data_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
+{
+ struct dentry *dent = file_dentry(file);
+ struct dentry *vfdentry = dent->d_parent;
+ struct dentry *sriov_dentry = vfdentry->d_parent;
+ unsigned int vfid = (uintptr_t)vfdentry->d_inode->i_private;
+ struct xe_device *xe = sriov_dentry->d_inode->i_private;
+
+ xe_assert(xe, vfid);
+ xe_sriov_pf_assert_vfid(xe, vfid);
+
+ if (*pos)
+ return -ESPIPE;
+
+ return xe_sriov_pf_migration_data_write(xe, vfid, buf, count);
+}
+
+static ssize_t data_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
+{
+ struct dentry *dent = file_dentry(file);
+ struct dentry *vfdentry = dent->d_parent;
+ struct dentry *sriov_dentry = vfdentry->d_parent;
+ unsigned int vfid = (uintptr_t)vfdentry->d_inode->i_private;
+ struct xe_device *xe = sriov_dentry->d_inode->i_private;
+
+ xe_assert(xe, vfid);
+ xe_sriov_pf_assert_vfid(xe, vfid);
+
+ if (*ppos)
+ return -ESPIPE;
+
+ return xe_sriov_pf_migration_data_read(xe, vfid, buf, count);
+}
+
+static const struct file_operations data_vf_fops = {
+ .owner = THIS_MODULE,
+ .open = simple_open,
+ .write = data_write,
+ .read = data_read,
+ .llseek = default_llseek,
+};
+
static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
{
debugfs_create_file("pause", 0200, vfdent, xe, &pause_vf_fops);
@@ -167,6 +211,7 @@ static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
debugfs_create_file("reset", 0200, vfdent, xe, &reset_vf_fops);
debugfs_create_file("save", 0600, vfdent, xe, &save_vf_fops);
debugfs_create_file("restore", 0600, vfdent, xe, &restore_vf_fops);
+ debugfs_create_file("migration_data", 0600, vfdent, xe, &data_vf_fops);
}
static void pf_populate_with_tiles(struct xe_device *xe, struct dentry *dent, unsigned int vfid)
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
index d39cee66589b5..9cc178126cbdc 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
@@ -56,6 +56,18 @@ static bool pf_check_migration_support(struct xe_device *xe)
return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
}
+static void pf_migration_cleanup(struct drm_device *dev, void *arg)
+{
+ struct xe_sriov_pf_migration *migration = arg;
+
+ if (!IS_ERR_OR_NULL(migration->pending))
+ xe_sriov_pf_migration_data_free(migration->pending);
+ if (!IS_ERR_OR_NULL(migration->trailer))
+ xe_sriov_pf_migration_data_free(migration->trailer);
+ if (!IS_ERR_OR_NULL(migration->descriptor))
+ xe_sriov_pf_migration_data_free(migration->descriptor);
+}
+
/**
* xe_sriov_pf_migration_init() - Initialize support for SR-IOV VF migration.
* @xe: the &struct xe_device
@@ -65,6 +77,7 @@ static bool pf_check_migration_support(struct xe_device *xe)
int xe_sriov_pf_migration_init(struct xe_device *xe)
{
unsigned int n, totalvfs;
+ int err;
xe_assert(xe, IS_SRIOV_PF(xe));
@@ -76,7 +89,15 @@ int xe_sriov_pf_migration_init(struct xe_device *xe)
for (n = 1; n <= totalvfs; n++) {
struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, n);
+ err = drmm_mutex_init(&xe->drm, &migration->lock);
+ if (err)
+ return err;
+
init_waitqueue_head(&migration->wq);
+
+ err = drmm_add_action_or_reset(&xe->drm, pf_migration_cleanup, migration);
+ if (err)
+ return err;
}
return 0;
@@ -162,6 +183,36 @@ xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid)
return data;
}
+static int pf_handle_descriptor(struct xe_device *xe, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data)
+{
+ if (data->tile != 0 || data->gt != 0)
+ return -EINVAL;
+
+ xe_sriov_pf_migration_data_free(data);
+
+ return 0;
+}
+
+static int pf_handle_trailer(struct xe_device *xe, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data)
+{
+ struct xe_gt *gt;
+ u8 gt_id;
+
+ if (data->tile != 0 || data->gt != 0)
+ return -EINVAL;
+ if (data->offset != 0 || data->size != 0 || data->buff || data->bo)
+ return -EINVAL;
+
+ xe_sriov_pf_migration_data_free(data);
+
+ for_each_gt(gt, xe, gt_id)
+ xe_gt_sriov_pf_control_vf_data_eof(gt, vfid);
+
+ return 0;
+}
+
/**
* xe_sriov_pf_migration_produce() - Produce an SR-IOV VF migration data packet for the device to process
* @xe: the &struct xe_device
@@ -180,6 +231,11 @@ int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
if (!IS_SRIOV_PF(xe))
return -ENODEV;
+ if (data->type == XE_SRIOV_MIG_DATA_DESCRIPTOR)
+ return pf_handle_descriptor(xe, vfid, data);
+ else if (data->type == XE_SRIOV_MIG_DATA_TRAILER)
+ return pf_handle_trailer(xe, vfid, data);
+
gt = xe_device_get_gt(xe, data->gt);
if (!gt || data->tile != gt->tile->id) {
xe_sriov_err_ratelimited(xe, "VF%d Unknown GT - tile_id:%d, gt_id:%d\n",
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
index cfc6b512c6674..9a2777dcf9a6b 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
@@ -5,7 +5,45 @@
#include "xe_bo.h"
#include "xe_device.h"
+#include "xe_sriov_pf_helpers.h"
+#include "xe_sriov_pf_migration.h"
#include "xe_sriov_pf_migration_data.h"
+#include "xe_sriov_printk.h"
+
+static struct mutex *pf_migration_mutex(struct xe_device *xe, unsigned int vfid)
+{
+ xe_assert(xe, IS_SRIOV_PF(xe));
+ xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
+ return &xe->sriov.pf.vfs[vfid].migration.lock;
+}
+
+static struct xe_sriov_pf_migration_data **pf_pick_pending(struct xe_device *xe, unsigned int vfid)
+{
+ xe_assert(xe, IS_SRIOV_PF(xe));
+ xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
+ lockdep_assert_held(pf_migration_mutex(xe, vfid));
+
+ return &xe->sriov.pf.vfs[vfid].migration.pending;
+}
+
+static struct xe_sriov_pf_migration_data **
+pf_pick_descriptor(struct xe_device *xe, unsigned int vfid)
+{
+ xe_assert(xe, IS_SRIOV_PF(xe));
+ xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
+ lockdep_assert_held(pf_migration_mutex(xe, vfid));
+
+ return &xe->sriov.pf.vfs[vfid].migration.descriptor;
+}
+
+static struct xe_sriov_pf_migration_data **pf_pick_trailer(struct xe_device *xe, unsigned int vfid)
+{
+ xe_assert(xe, IS_SRIOV_PF(xe));
+ xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
+ lockdep_assert_held(pf_migration_mutex(xe, vfid));
+
+ return &xe->sriov.pf.vfs[vfid].migration.trailer;
+}
static bool data_needs_bo(struct xe_sriov_pf_migration_data *data)
{
@@ -133,3 +171,326 @@ int xe_sriov_pf_migration_data_init_from_hdr(struct xe_sriov_pf_migration_data *
return mig_data_init(data);
}
+
+static ssize_t vf_mig_data_hdr_read(struct xe_sriov_pf_migration_data *data,
+ char __user *buf, size_t len)
+{
+ loff_t offset = sizeof(data->hdr) - data->hdr_remaining;
+
+ if (!data->hdr_remaining)
+ return -EINVAL;
+
+ if (len > data->hdr_remaining)
+ len = data->hdr_remaining;
+
+ if (copy_to_user(buf, (void *)&data->hdr + offset, len))
+ return -EFAULT;
+
+ data->hdr_remaining -= len;
+
+ return len;
+}
+
+static ssize_t vf_mig_data_read(struct xe_sriov_pf_migration_data *data,
+ char __user *buf, size_t len)
+{
+ if (len > data->remaining)
+ len = data->remaining;
+
+ if (copy_to_user(buf, data->vaddr + (data->size - data->remaining), len))
+ return -EFAULT;
+
+ data->remaining -= len;
+
+ return len;
+}
+
+static ssize_t __vf_mig_data_read_single(struct xe_sriov_pf_migration_data **data,
+ unsigned int vfid, char __user *buf, size_t len)
+{
+ ssize_t copied = 0;
+
+ if ((*data)->hdr_remaining)
+ copied = vf_mig_data_hdr_read(*data, buf, len);
+ else
+ copied = vf_mig_data_read(*data, buf, len);
+
+ if ((*data)->remaining == 0 && (*data)->hdr_remaining == 0) {
+ xe_sriov_pf_migration_data_free(*data);
+ *data = NULL;
+ }
+
+ return copied;
+}
+
+static struct xe_sriov_pf_migration_data **vf_mig_pick_data(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_sriov_pf_migration_data **data;
+
+ data = pf_pick_descriptor(xe, vfid);
+ if (*data)
+ return data;
+
+ data = pf_pick_pending(xe, vfid);
+ if (*data == NULL)
+ *data = xe_sriov_pf_migration_consume(xe, vfid);
+ if (!IS_ERR_OR_NULL(*data))
+ return data;
+ else if (IS_ERR(*data) && PTR_ERR(*data) != -ENODATA)
+ return data;
+
+ data = pf_pick_trailer(xe, vfid);
+ if (*data)
+ return data;
+
+ return ERR_PTR(-ENODATA);
+}
+
+static ssize_t vf_mig_data_read_single(struct xe_device *xe, unsigned int vfid,
+ char __user *buf, size_t len)
+{
+ struct xe_sriov_pf_migration_data **data = vf_mig_pick_data(xe, vfid);
+
+ if (IS_ERR(data))
+ return PTR_ERR(data);
+
+ /* An error consumed from the ring is parked in the pending slot */
+ if (IS_ERR(*data)) {
+ ssize_t err = PTR_ERR(*data);
+
+ *data = NULL;
+ return err;
+ }
+
+ return __vf_mig_data_read_single(data, vfid, buf, len);
+}
+
+/**
+ * xe_sriov_pf_migration_data_read() - Read migration data from the device
+ * @xe: the &struct xe_device
+ * @vfid: the VF identifier
+ * @buf: start address of userspace buffer
+ * @len: requested read size from userspace
+ *
+ * Return: number of bytes that have been successfully read,
+ * 0 if no more migration data is available,
+ * -errno on failure
+ */
+ssize_t xe_sriov_pf_migration_data_read(struct xe_device *xe, unsigned int vfid,
+ char __user *buf, size_t len)
+{
+ ssize_t ret, consumed = 0;
+
+ xe_assert(xe, IS_SRIOV_PF(xe));
+
+ ret = mutex_lock_interruptible(pf_migration_mutex(xe, vfid));
+ if (ret)
+ return ret;
+
+ while (consumed < len) {
+ ret = vf_mig_data_read_single(xe, vfid, buf, len - consumed);
+ if (ret == -ENODATA)
+ goto out;
+ if (ret < 0) {
+ mutex_unlock(pf_migration_mutex(xe, vfid));
+ return ret;
+ }
+
+ consumed += ret;
+ buf += ret;
+ }
+
+out:
+ mutex_unlock(pf_migration_mutex(xe, vfid));
+ return consumed;
+}
+
+static ssize_t vf_mig_hdr_write(struct xe_sriov_pf_migration_data *data,
+ const char __user *buf, size_t len)
+{
+ loff_t offset = sizeof(data->hdr) - data->hdr_remaining;
+ int ret;
+
+ if (WARN_ON(!data->hdr_remaining))
+ return -EINVAL;
+
+ if (len > data->hdr_remaining)
+ len = data->hdr_remaining;
+
+ if (copy_from_user((void *)&data->hdr + offset, buf, len))
+ return -EFAULT;
+
+ data->hdr_remaining -= len;
+
+ if (!data->hdr_remaining) {
+ ret = xe_sriov_pf_migration_data_init_from_hdr(data);
+ if (ret)
+ return ret;
+ }
+
+ return len;
+}
+
+static ssize_t vf_mig_data_write(struct xe_sriov_pf_migration_data *data,
+ const char __user *buf, size_t len)
+{
+ if (len > data->remaining)
+ len = data->remaining;
+
+ if (copy_from_user(data->vaddr + (data->size - data->remaining), buf, len))
+ return -EFAULT;
+
+ data->remaining -= len;
+
+ return len;
+}
+
+static ssize_t vf_mig_data_write_single(struct xe_device *xe, unsigned int vfid,
+ const char __user *buf, size_t len)
+{
+ struct xe_sriov_pf_migration_data **data = pf_pick_pending(xe, vfid);
+ int ret;
+ ssize_t copied;
+
+ if (IS_ERR_OR_NULL(*data)) {
+ *data = xe_sriov_pf_migration_data_alloc(xe);
+ if (*data == NULL)
+ return -ENOMEM;
+ }
+
+ if ((*data)->hdr_remaining)
+ copied = vf_mig_hdr_write(*data, buf, len);
+ else
+ copied = vf_mig_data_write(*data, buf, len);
+
+ if ((*data)->hdr_remaining == 0 && (*data)->remaining == 0) {
+ ret = xe_sriov_pf_migration_produce(xe, vfid, *data);
+ if (ret) {
+ xe_sriov_pf_migration_data_free(*data);
+ return ret;
+ }
+
+ *data = NULL;
+ }
+
+ return copied;
+}
+
+/**
+ * xe_sriov_pf_migration_data_write() - Write migration data to the device
+ * @xe: the &struct xe_device
+ * @vfid: the VF identifier
+ * @buf: start address of userspace buffer
+ * @len: requested write size from userspace
+ *
+ * Return: number of bytes that have been successfully written,
+ * -errno on failure
+ */
+ssize_t xe_sriov_pf_migration_data_write(struct xe_device *xe, unsigned int vfid,
+ const char __user *buf, size_t len)
+{
+ ssize_t ret, produced = 0;
+
+ xe_assert(xe, IS_SRIOV_PF(xe));
+
+ ret = mutex_lock_interruptible(pf_migration_mutex(xe, vfid));
+ if (ret)
+ return ret;
+
+ while (produced < len) {
+ ret = vf_mig_data_write_single(xe, vfid, buf, len - produced);
+ if (ret < 0) {
+ mutex_unlock(pf_migration_mutex(xe, vfid));
+ return ret;
+ }
+
+ produced += ret;
+ buf += ret;
+ }
+
+ mutex_unlock(pf_migration_mutex(xe, vfid));
+ return produced;
+}
+
+#define MIGRATION_DESC_SIZE 4
+static int pf_desc_init(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_sriov_pf_migration_data **desc = pf_pick_descriptor(xe, vfid);
+ struct xe_sriov_pf_migration_data *data;
+ int ret;
+
+ data = xe_sriov_pf_migration_data_alloc(xe);
+ if (!data)
+ return -ENOMEM;
+
+ ret = xe_sriov_pf_migration_data_init(data, 0, 0, XE_SRIOV_MIG_DATA_DESCRIPTOR,
+ 0, MIGRATION_DESC_SIZE);
+ if (ret) {
+ xe_sriov_pf_migration_data_free(data);
+ return ret;
+ }
+
+ *desc = data;
+
+ return 0;
+}
+
+static void pf_pending_init(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_sriov_pf_migration_data **data = pf_pick_pending(xe, vfid);
+
+ *data = NULL;
+}
+
+#define MIGRATION_TRAILER_SIZE 0
+static int pf_trailer_init(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_sriov_pf_migration_data **trailer = pf_pick_trailer(xe, vfid);
+ struct xe_sriov_pf_migration_data *data;
+ int ret;
+
+ data = xe_sriov_pf_migration_data_alloc(xe);
+ if (!data)
+ return -ENOMEM;
+
+ ret = xe_sriov_pf_migration_data_init(data, 0, 0, XE_SRIOV_MIG_DATA_TRAILER,
+ 0, MIGRATION_TRAILER_SIZE);
+ if (ret) {
+ xe_sriov_pf_migration_data_free(data);
+ return ret;
+ }
+
+ *trailer = data;
+
+ return 0;
+}
+
+/**
+ * xe_sriov_pf_migration_data_save_init() - Initialize the pending save migration data.
+ * @xe: the &struct xe_device
+ * @vfid: the VF identifier
+ *
+ * Return: 0 on success, -errno on failure
+ */
+int xe_sriov_pf_migration_data_save_init(struct xe_device *xe, unsigned int vfid)
+{
+ int ret;
+
+ ret = mutex_lock_interruptible(pf_migration_mutex(xe, vfid));
+ if (ret)
+ return ret;
+
+ ret = pf_desc_init(xe, vfid);
+ if (ret)
+ goto out;
+
+ ret = pf_trailer_init(xe, vfid);
+ if (ret)
+ goto out;
+
+ pf_pending_init(xe, vfid);
+
+out:
+ mutex_unlock(pf_migration_mutex(xe, vfid));
+ return ret;
+}
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
index 1dde4cfcdbc47..5b96c7f224002 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
@@ -28,5 +28,10 @@ void xe_sriov_pf_migration_data_free(struct xe_sriov_pf_migration_data *snapshot
int xe_sriov_pf_migration_data_init(struct xe_sriov_pf_migration_data *data, u8 tile_id, u8 gt_id,
unsigned int type, loff_t offset, size_t size);
int xe_sriov_pf_migration_data_init_from_hdr(struct xe_sriov_pf_migration_data *snapshot);
+ssize_t xe_sriov_pf_migration_data_read(struct xe_device *xe, unsigned int vfid,
+ char __user *buf, size_t len);
+ssize_t xe_sriov_pf_migration_data_write(struct xe_device *xe, unsigned int vfid,
+ const char __user *buf, size_t len);
+int xe_sriov_pf_migration_data_save_init(struct xe_device *xe, unsigned int vfid);
#endif
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
index 80fdea32b884a..c5d75bb7f39c0 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
@@ -7,6 +7,7 @@
#define _XE_SRIOV_PF_MIGRATION_TYPES_H_
+#include <linux/mutex_types.h>
#include <linux/types.h>
#include <linux/wait.h>
struct xe_sriov_pf_migration_data {
@@ -38,6 +39,14 @@ struct xe_sriov_pf_migration {
struct xe_sriov_pf_migration {
/** @wq: waitqueue used to avoid busy-waiting for snapshot production/consumption */
wait_queue_head_t wq;
+ /** @lock: Mutex protecting the migration data */
+ struct mutex lock;
+ /** @pending: currently processed data packet of VF resource */
+ struct xe_sriov_pf_migration_data *pending;
+ /** @trailer: data packet used to indicate the end of stream */
+ struct xe_sriov_pf_migration_data *trailer;
+ /** @descriptor: data packet containing the metadata describing the device */
+ struct xe_sriov_pf_migration_data *descriptor;
};
#endif
--
2.50.1
^ permalink raw reply related [flat|nested] 82+ messages in thread
* [PATCH 08/26] drm/xe/pf: Add minimalistic migration descriptor
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (6 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 07/26] drm/xe/pf: Add support for encap/decap of bitstream to/from packet Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-11 22:52 ` kernel test robot
2025-10-13 10:56 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 09/26] drm/xe/pf: Expose VF migration data size over debugfs Michał Winiarski
` (17 subsequent siblings)
25 siblings, 2 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
The descriptor reuses the KLV format used by GuC and contains metadata
that can be used to quickly fail migration when the source is
incompatible with the destination.
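As a worked example, assuming the usual GuC KLV packing (16-bit key in
the upper half of the first dword, value length in dwords in the lower
half), a two-key descriptor would be laid out like this (devid/revid
values are hypothetical):

    #include <stdint.h>

    #define KLV_HDR(key, len)  (((uint32_t)(key) << 16) | ((uint32_t)(len) & 0xffff))

    /* Keys as defined in this patch; values are made up for illustration */
    static const uint32_t desc[] = {
            KLV_HDR(0xf001, 1), 0x56a0,  /* MIGRATION_KLV_DEVICE_DEVID */
            KLV_HDR(0xf002, 1), 0x08,    /* MIGRATION_KLV_DEVICE_REVID */
    };

The restore side walks the dwords, validates the keys it knows about
against the local device and skips unknown ones, which keeps the format
extensible.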
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 6 +-
.../gpu/drm/xe/xe_sriov_pf_migration_data.c | 82 ++++++++++++++++++-
.../gpu/drm/xe/xe_sriov_pf_migration_data.h | 2 +
3 files changed, 87 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
index 9cc178126cbdc..a0cfac456ba0b 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
@@ -186,10 +186,16 @@ xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid)
static int pf_handle_descriptor(struct xe_device *xe, unsigned int vfid,
struct xe_sriov_pf_migration_data *data)
{
+ int ret;
+
if (data->tile != 0 || data->gt != 0)
return -EINVAL;
+ ret = xe_sriov_pf_migration_data_process_desc(xe, vfid, data);
+ if (ret)
+ return ret;
+
xe_sriov_pf_migration_data_free(data);
return 0;
}
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
index 9a2777dcf9a6b..307b16b027a5e 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
@@ -5,6 +5,7 @@
#include "xe_bo.h"
#include "xe_device.h"
+#include "xe_guc_klv_helpers.h"
#include "xe_sriov_pf_helpers.h"
#include "xe_sriov_pf_migration.h"
#include "xe_sriov_pf_migration_data.h"
@@ -404,11 +405,17 @@ ssize_t xe_sriov_pf_migration_data_write(struct xe_device *xe, unsigned int vfid
return produced;
}
-#define MIGRATION_DESC_SIZE 4
+#define MIGRATION_KLV_DEVICE_DEVID_KEY 0xf001u
+#define MIGRATION_KLV_DEVICE_DEVID_LEN 1u
+#define MIGRATION_KLV_DEVICE_REVID_KEY 0xf002u
+#define MIGRATION_KLV_DEVICE_REVID_LEN 1u
+
+#define MIGRATION_DESC_DWORDS 4
static int pf_desc_init(struct xe_device *xe, unsigned int vfid)
{
struct xe_sriov_pf_migration_data **desc = pf_pick_descriptor(xe, vfid);
struct xe_sriov_pf_migration_data *data;
+ u32 *klvs;
int ret;
data = xe_sriov_pf_migration_data_alloc(xe);
@@ -416,17 +423,91 @@ static int pf_desc_init(struct xe_device *xe, unsigned int vfid)
return -ENOMEM;
ret = xe_sriov_pf_migration_data_init(data, 0, 0, XE_SRIOV_MIG_DATA_DESCRIPTOR,
- 0, MIGRATION_DESC_SIZE);
+ 0, MIGRATION_DESC_DWORDS * sizeof(u32));
if (ret) {
xe_sriov_pf_migration_data_free(data);
return ret;
}
+ klvs = data->vaddr;
+ *klvs++ = PREP_GUC_KLV_CONST(MIGRATION_KLV_DEVICE_DEVID_KEY,
+ MIGRATION_KLV_DEVICE_DEVID_LEN);
+ *klvs++ = xe->info.devid;
+ *klvs++ = PREP_GUC_KLV_CONST(MIGRATION_KLV_DEVICE_REVID_KEY,
+ MIGRATION_KLV_DEVICE_REVID_LEN);
+ *klvs++ = xe->info.revid;
+
*desc = data;
return 0;
}
+/**
+ * xe_sriov_pf_migration_data_process_desc() - Process migration data descriptor.
+ * @xe: the &struct xe_device
+ * @vfid: the VF identifier
+ * @data: the &struct xe_sriov_pf_migration_data containing the descriptor
+ *
+ * The descriptor uses the same KLV format as GuC, and contains metadata used for
+ * checking migration data compatibility.
+ *
+ * Return: 0 on success, -errno on failure
+ */
+int xe_sriov_pf_migration_data_process_desc(struct xe_device *xe, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data)
+{
+ u32 num_dwords = data->size / sizeof(u32);
+ u32 *klvs = data->vaddr;
+
+ xe_assert(xe, data->type == XE_SRIOV_MIG_DATA_DESCRIPTOR);
+ if (data->size % sizeof(u32) != 0)
+ return -EINVAL;
+
+ while (num_dwords >= GUC_KLV_LEN_MIN) {
+ u32 key = FIELD_GET(GUC_KLV_0_KEY, klvs[0]);
+ u32 len = FIELD_GET(GUC_KLV_0_LEN, klvs[0]);
+
+ klvs += GUC_KLV_LEN_MIN;
+ num_dwords -= GUC_KLV_LEN_MIN;
+
+ /* Validate the value length before dereferencing the value */
+ if (len > num_dwords)
+ return -EINVAL;
+
+ switch (key) {
+ case MIGRATION_KLV_DEVICE_DEVID_KEY:
+ if (len < MIGRATION_KLV_DEVICE_DEVID_LEN)
+ return -EINVAL;
+ if (*klvs != xe->info.devid) {
+ xe_sriov_info(xe,
+ "Aborting migration, devid mismatch %#04x!=%#04x\n",
+ *klvs, xe->info.devid);
+ return -ENODEV;
+ }
+ break;
+ case MIGRATION_KLV_DEVICE_REVID_KEY:
+ if (len < MIGRATION_KLV_DEVICE_REVID_LEN)
+ return -EINVAL;
+ if (*klvs != xe->info.revid) {
+ xe_sriov_info(xe,
+ "Aborting migration, revid mismatch %#04x!=%#04x\n",
+ *klvs, xe->info.revid);
+ return -ENODEV;
+ }
+ break;
+ default:
+ xe_sriov_dbg(xe,
+ "Unknown migration descriptor key %#06x - skipping\n", key);
+ break;
+ }
+
+ klvs += len;
+ num_dwords -= len;
+ }
+
+ return 0;
+}
+
static void pf_pending_init(struct xe_device *xe, unsigned int vfid)
{
struct xe_sriov_pf_migration_data **data = pf_pick_pending(xe, vfid);
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
index 5b96c7f224002..7cfd61005c00f 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
@@ -32,6 +32,8 @@ ssize_t xe_sriov_pf_migration_data_read(struct xe_device *xe, unsigned int vfid,
char __user *buf, size_t len);
ssize_t xe_sriov_pf_migration_data_write(struct xe_device *xe, unsigned int vfid,
const char __user *buf, size_t len);
+int xe_sriov_pf_migration_data_process_desc(struct xe_device *xe, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data);
int xe_sriov_pf_migration_data_save_init(struct xe_device *xe, unsigned int vfid);
#endif
--
2.50.1
^ permalink raw reply related [flat|nested] 82+ messages in thread
* [PATCH 09/26] drm/xe/pf: Expose VF migration data size over debugfs
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (7 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 08/26] drm/xe/pf: Add minimalistic migration descriptor Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-12 19:15 ` Matthew Brost
2025-10-13 11:04 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 10/26] drm/xe: Add sa/guc_buf_cache sync interface Michał Winiarski
` (16 subsequent siblings)
25 siblings, 2 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
The size is normally used to decide when to stop the device (mainly
while it is in the PRE_COPY state).
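A hedged userspace sketch of that usage, polling the new debugfs file
(the path follows the sriov/vfN layout shown earlier in the series; the
BDF is a placeholder):

    #include <stdio.h>

    /* Returns the pending migration data size in bytes, or -1 on error */
    static long vf_migration_size(const char *bdf, int vf)
    {
            char path[128];
            long size = -1;
            FILE *f;

            snprintf(path, sizeof(path),
                     "/sys/kernel/debug/dri/%s/sriov/vf%d/migration_size",
                     bdf, vf);
            f = fopen(path, "r");
            if (!f)
                    return -1;
            if (fscanf(f, "%ld", &size) != 1)
                    size = -1;
            fclose(f);
            return size;
    }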
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 18 ++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 2 ++
drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 34 +++++++++++++++++++
drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 30 ++++++++++++++++
drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 1 +
5 files changed, 85 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index 582aaf062cbd4..50f09994e2854 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -395,6 +395,24 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
}
#endif /* CONFIG_DEBUG_FS */
+/**
+ * xe_gt_sriov_pf_migration_size() - Total size of migration data from all components within a GT
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: total migration data size in bytes or a negative error code on failure.
+ */
+ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
+{
+ ssize_t total = 0;
+
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+
+ return total;
+}
+
/**
* xe_gt_sriov_pf_migration_ring_empty() - Check if a migration ring is empty
* @gt: the &struct xe_gt
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
index 1e4dc46413823..e5298d35d7d7e 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
@@ -15,6 +15,8 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
+ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
+
bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_ring_produce(struct xe_gt *gt, unsigned int vfid,
struct xe_sriov_pf_migration_data *data);
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
index ce780719760a6..b06e893fe54cf 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
@@ -13,6 +13,7 @@
#include "xe_sriov_pf_control.h"
#include "xe_sriov_pf_debugfs.h"
#include "xe_sriov_pf_helpers.h"
+#include "xe_sriov_pf_migration.h"
#include "xe_sriov_pf_migration_data.h"
#include "xe_sriov_pf_service.h"
#include "xe_sriov_printk.h"
@@ -203,6 +204,38 @@ static const struct file_operations data_vf_fops = {
.llseek = default_llseek,
};
+static ssize_t size_read(struct file *file, char __user *ubuf, size_t count, loff_t *ppos)
+{
+ struct dentry *dent = file_dentry(file);
+ struct dentry *vfdentry = dent->d_parent;
+ struct dentry *sriov_dentry = vfdentry->d_parent;
+ unsigned int vfid = (uintptr_t)vfdentry->d_inode->i_private;
+ struct xe_device *xe = sriov_dentry->d_inode->i_private;
+ char buf[21];
+ ssize_t ret;
+ int len;
+
+ xe_assert(xe, vfid);
+ xe_sriov_pf_assert_vfid(xe, vfid);
+
+ xe_pm_runtime_get(xe);
+ ret = xe_sriov_pf_migration_size(xe, vfid);
+ xe_pm_runtime_put(xe);
+ if (ret < 0)
+ return ret;
+
+ len = scnprintf(buf, sizeof(buf), "%zd\n", ret);
+
+ return simple_read_from_buffer(ubuf, count, ppos, buf, len);
+}
+
+static const struct file_operations size_vf_fops = {
+ .owner = THIS_MODULE,
+ .open = simple_open,
+ .read = size_read,
+ .llseek = default_llseek,
+};
+
static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
{
debugfs_create_file("pause", 0200, vfdent, xe, &pause_vf_fops);
@@ -212,6 +245,7 @@ static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
debugfs_create_file("save", 0600, vfdent, xe, &save_vf_fops);
debugfs_create_file("restore", 0600, vfdent, xe, &restore_vf_fops);
debugfs_create_file("migration_data", 0600, vfdent, xe, &data_vf_fops);
+ debugfs_create_file("migration_size", 0400, vfdent, xe, &size_vf_fops);
}
static void pf_populate_with_tiles(struct xe_device *xe, struct dentry *dent, unsigned int vfid)
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
index a0cfac456ba0b..6b247581dec65 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
@@ -249,3 +249,31 @@ int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
return xe_gt_sriov_pf_migration_ring_produce(gt, vfid, data);
}
+
+/**
+ * xe_sriov_pf_migration_size() - Total size of migration data from all components within a device
+ * @xe: the &struct xe_device
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: total migration data size in bytes or a negative error code on failure.
+ */
+ssize_t xe_sriov_pf_migration_size(struct xe_device *xe, unsigned int vfid)
+{
+ ssize_t size = 0;
+ struct xe_gt *gt;
+ ssize_t ret;
+ u8 gt_id;
+
+ xe_assert(xe, IS_SRIOV_PF(xe));
+
+ for_each_gt(gt, xe, gt_id) {
+ ret = xe_gt_sriov_pf_migration_size(gt, vfid);
+ if (ret < 0)
+ return ret;
+ size += ret;
+ }
+
+ return size;
+}
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
index f2020ba19c2da..887ea3e9632bd 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
@@ -14,6 +14,7 @@ struct xe_device;
#ifdef CONFIG_PCI_IOV
int xe_sriov_pf_migration_init(struct xe_device *xe);
bool xe_sriov_pf_migration_supported(struct xe_device *xe);
+ssize_t xe_sriov_pf_migration_size(struct xe_device *xe, unsigned int vfid);
struct xe_sriov_pf_migration_data *
xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
--
2.50.1
^ permalink raw reply related [flat|nested] 82+ messages in thread
* [PATCH 10/26] drm/xe: Add sa/guc_buf_cache sync interface
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (8 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 09/26] drm/xe/pf: Expose VF migration data size over debugfs Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-12 18:06 ` Matthew Brost
2025-10-13 11:20 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 11/26] drm/xe: Allow the caller to pass guc_buf_cache size Michał Winiarski
` (15 subsequent siblings)
25 siblings, 2 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
In upcoming changes the cached buffers are going to be used to read data
produced by the GuC. Add a counterpart to flush that synchronizes the
CPU side of the suballocation with the GPU data, and propagate the
interface to the GuC Buffer Cache.
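A minimal sketch of the intended flush/sync pairing, mirroring how the
save path later in this series uses it (buf, dst and len are
placeholders):

    /* CPU -> GPU: publish CPU-side writes before handing the buffer to GuC */
    u64 gpu_addr = xe_guc_buf_flush(buf);

    /* ... GuC action runs and writes its output into the buffer ... */

    /* GPU -> CPU: pull the GuC-written data back into the CPU mapping */
    xe_guc_buf_sync(buf);
    memcpy(dst, xe_guc_buf_cpu_ptr(buf), len);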
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_guc_buf.c | 9 +++++++++
drivers/gpu/drm/xe/xe_guc_buf.h | 1 +
drivers/gpu/drm/xe/xe_sa.c | 21 +++++++++++++++++++++
drivers/gpu/drm/xe/xe_sa.h | 1 +
4 files changed, 32 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_guc_buf.c b/drivers/gpu/drm/xe/xe_guc_buf.c
index 502ca3a4ee606..1be26145f0b98 100644
--- a/drivers/gpu/drm/xe/xe_guc_buf.c
+++ b/drivers/gpu/drm/xe/xe_guc_buf.c
@@ -127,6 +127,15 @@ u64 xe_guc_buf_flush(const struct xe_guc_buf buf)
return xe_sa_bo_gpu_addr(buf.sa);
}
+/**
+ * xe_guc_buf_sync() - Copy the data from the GPU memory to the sub-allocation.
+ * @buf: the &xe_guc_buf to sync
+ */
+void xe_guc_buf_sync(const struct xe_guc_buf buf)
+{
+ xe_sa_bo_sync(buf.sa);
+}
+
/**
* xe_guc_buf_cpu_ptr() - Obtain a CPU pointer to the sub-allocation.
* @buf: the &xe_guc_buf to query
diff --git a/drivers/gpu/drm/xe/xe_guc_buf.h b/drivers/gpu/drm/xe/xe_guc_buf.h
index 0d67604d96bdd..fe6b5ffe0d6eb 100644
--- a/drivers/gpu/drm/xe/xe_guc_buf.h
+++ b/drivers/gpu/drm/xe/xe_guc_buf.h
@@ -31,6 +31,7 @@ static inline bool xe_guc_buf_is_valid(const struct xe_guc_buf buf)
void *xe_guc_buf_cpu_ptr(const struct xe_guc_buf buf);
u64 xe_guc_buf_flush(const struct xe_guc_buf buf);
+void xe_guc_buf_sync(const struct xe_guc_buf buf);
u64 xe_guc_buf_gpu_addr(const struct xe_guc_buf buf);
u64 xe_guc_cache_gpu_addr_from_ptr(struct xe_guc_buf_cache *cache, const void *ptr, u32 size);
diff --git a/drivers/gpu/drm/xe/xe_sa.c b/drivers/gpu/drm/xe/xe_sa.c
index fedd017d6dd36..2115789c2bfb7 100644
--- a/drivers/gpu/drm/xe/xe_sa.c
+++ b/drivers/gpu/drm/xe/xe_sa.c
@@ -110,6 +110,10 @@ struct drm_suballoc *__xe_sa_bo_new(struct xe_sa_manager *sa_manager, u32 size,
return drm_suballoc_new(&sa_manager->base, size, gfp, true, 0);
}
+/**
+ * xe_sa_bo_flush_write() - Copy the data from the sub-allocation to the GPU memory.
+ * @sa_bo: the &drm_suballoc to flush
+ */
void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo)
{
struct xe_sa_manager *sa_manager = to_xe_sa_manager(sa_bo->manager);
@@ -123,6 +127,23 @@ void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo)
drm_suballoc_size(sa_bo));
}
+/**
+ * xe_sa_bo_sync() - Copy the data from GPU memory to the sub-allocation.
+ * @sa_bo: the &drm_suballoc to sync
+ */
+void xe_sa_bo_sync(struct drm_suballoc *sa_bo)
+{
+ struct xe_sa_manager *sa_manager = to_xe_sa_manager(sa_bo->manager);
+ struct xe_device *xe = tile_to_xe(sa_manager->bo->tile);
+
+ if (!sa_manager->bo->vmap.is_iomem)
+ return;
+
+ xe_map_memcpy_from(xe, xe_sa_bo_cpu_addr(sa_bo), &sa_manager->bo->vmap,
+ drm_suballoc_soffset(sa_bo),
+ drm_suballoc_size(sa_bo));
+}
+
void xe_sa_bo_free(struct drm_suballoc *sa_bo,
struct dma_fence *fence)
{
diff --git a/drivers/gpu/drm/xe/xe_sa.h b/drivers/gpu/drm/xe/xe_sa.h
index 99dbf0eea5402..28fd8bb6450c2 100644
--- a/drivers/gpu/drm/xe/xe_sa.h
+++ b/drivers/gpu/drm/xe/xe_sa.h
@@ -37,6 +37,7 @@ static inline struct drm_suballoc *xe_sa_bo_new(struct xe_sa_manager *sa_manager
}
void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo);
+void xe_sa_bo_sync(struct drm_suballoc *sa_bo);
void xe_sa_bo_free(struct drm_suballoc *sa_bo, struct dma_fence *fence);
static inline struct xe_sa_manager *
--
2.50.1
^ permalink raw reply related [flat|nested] 82+ messages in thread
* [PATCH 11/26] drm/xe: Allow the caller to pass guc_buf_cache size
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (9 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 10/26] drm/xe: Add sa/guc_buf_cache sync interface Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-11 23:35 ` kernel test robot
2025-10-13 11:08 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 12/26] drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF migration Michał Winiarski
` (14 subsequent siblings)
25 siblings, 2 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
An upcoming change will use the GuC buffer cache as the place where GuC
migration data is stored, and the memory requirement for that is larger
than for indirect data.
Allow the caller to pass the size based on the intended use case.
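For illustration only, a call site could then size the cache per use
case, e.g. (the PF value here is a placeholder; the actual sizing lands
in a later patch of this series):

    err = xe_guc_buf_cache_init(&guc->buf,
                                IS_SRIOV_PF(xe) ? SZ_8M : SZ_8K);
    if (err)
            return err;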
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c | 2 +-
drivers/gpu/drm/xe/xe_guc.c | 4 ++--
drivers/gpu/drm/xe/xe_guc_buf.c | 6 +++---
drivers/gpu/drm/xe/xe_guc_buf.h | 2 +-
4 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c b/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c
index d266882adc0e0..c273ce8087f56 100644
--- a/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c
+++ b/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c
@@ -72,7 +72,7 @@ static int guc_buf_test_init(struct kunit *test)
kunit_activate_static_stub(test, xe_managed_bo_create_pin_map,
replacement_xe_managed_bo_create_pin_map);
- KUNIT_ASSERT_EQ(test, 0, xe_guc_buf_cache_init(&guc->buf));
+ KUNIT_ASSERT_EQ(test, 0, xe_guc_buf_cache_init(&guc->buf, SZ_8K));
test->priv = &guc->buf;
return 0;
diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
index d94490979adc0..ccc7c60ae9b77 100644
--- a/drivers/gpu/drm/xe/xe_guc.c
+++ b/drivers/gpu/drm/xe/xe_guc.c
@@ -809,7 +809,7 @@ static int vf_guc_init_post_hwconfig(struct xe_guc *guc)
if (err)
return err;
- err = xe_guc_buf_cache_init(&guc->buf);
+ err = xe_guc_buf_cache_init(&guc->buf, SZ_8K);
if (err)
return err;
@@ -857,7 +857,7 @@ int xe_guc_init_post_hwconfig(struct xe_guc *guc)
if (ret)
return ret;
- ret = xe_guc_buf_cache_init(&guc->buf);
+ ret = xe_guc_buf_cache_init(&guc->buf, SZ_8K);
if (ret)
return ret;
diff --git a/drivers/gpu/drm/xe/xe_guc_buf.c b/drivers/gpu/drm/xe/xe_guc_buf.c
index 1be26145f0b98..418ada00b99e3 100644
--- a/drivers/gpu/drm/xe/xe_guc_buf.c
+++ b/drivers/gpu/drm/xe/xe_guc_buf.c
@@ -28,16 +28,17 @@ static struct xe_gt *cache_to_gt(struct xe_guc_buf_cache *cache)
* @cache: the &xe_guc_buf_cache to initialize
+ * @size: size of the underlying buffer, in bytes
*
* The Buffer Cache allows to obtain a reusable buffer that can be used to pass
- * indirect H2G data to GuC without a need to create a ad-hoc allocation.
+ * data to GuC or read data from GuC without a need to create an ad-hoc allocation.
*
* Return: 0 on success or a negative error code on failure.
*/
-int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache)
+int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache, u32 size)
{
struct xe_gt *gt = cache_to_gt(cache);
struct xe_sa_manager *sam;
- sam = __xe_sa_bo_manager_init(gt_to_tile(gt), SZ_8K, 0, sizeof(u32));
+ sam = __xe_sa_bo_manager_init(gt_to_tile(gt), size, 0, sizeof(u32));
if (IS_ERR(sam))
return PTR_ERR(sam);
cache->sam = sam;
diff --git a/drivers/gpu/drm/xe/xe_guc_buf.h b/drivers/gpu/drm/xe/xe_guc_buf.h
index fe6b5ffe0d6eb..fe5cf3b183497 100644
--- a/drivers/gpu/drm/xe/xe_guc_buf.h
+++ b/drivers/gpu/drm/xe/xe_guc_buf.h
@@ -11,7 +11,7 @@
#include "xe_guc_buf_types.h"
-int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache);
+int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache, u32 size);
u32 xe_guc_buf_cache_dwords(struct xe_guc_buf_cache *cache);
struct xe_guc_buf xe_guc_buf_reserve(struct xe_guc_buf_cache *cache, u32 dwords);
struct xe_guc_buf xe_guc_buf_from_data(struct xe_guc_buf_cache *cache,
--
2.50.1
^ permalink raw reply related [flat|nested] 82+ messages in thread
* [PATCH 12/26] drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF migration
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (10 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 11/26] drm/xe: Allow the caller to pass guc_buf_cache size Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-13 11:27 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 13/26] drm/xe/pf: Remove GuC migration data save/restore from GT debugfs Michał Winiarski
` (13 subsequent siblings)
25 siblings, 1 reply; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
Contiguous PF GGTT VMAs can be scarce after creating VFs.
Increase the GuC buffer cache size to 8M for PF so that we can fit GuC
migration data (which currently maxes out at just over 4M) and use the
cache instead of allocating fresh BOs.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
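For reference, a condensed sketch of the scoped-buffer pattern adopted below,
using only calls from this patch; the suballocation is returned to the cache
automatically when 'buf' goes out of scope:
	CLASS(xe_guc_buf, buf)(&guc->buf, ndwords);
	if (!xe_guc_buf_is_valid(buf))
		return -ENOBUFS;
	/* fill or zero the CPU-side staging area */
	memset(xe_guc_buf_cpu_ptr(buf), 0, size);
	/* xe_guc_buf_flush() pushes the CPU writes and returns the GGTT
	 * address that is handed to the GuC action */
	ret = guc_action_vf_save_restore(guc, vfid, GUC_PF_OPCODE_VF_SAVE,
					 xe_guc_buf_flush(buf), ndwords);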
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 54 +++++++------------
drivers/gpu/drm/xe/xe_guc.c | 2 +-
2 files changed, 20 insertions(+), 36 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index 50f09994e2854..8b96eff8df93b 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -11,7 +11,7 @@
#include "xe_gt_sriov_pf_helpers.h"
#include "xe_gt_sriov_pf_migration.h"
#include "xe_gt_sriov_printk.h"
-#include "xe_guc.h"
+#include "xe_guc_buf.h"
#include "xe_guc_ct.h"
#include "xe_sriov.h"
#include "xe_sriov_pf_migration.h"
@@ -57,73 +57,57 @@ static int pf_send_guc_query_vf_state_size(struct xe_gt *gt, unsigned int vfid)
/* Return: number of state dwords saved or a negative error code on failure */
static int pf_send_guc_save_vf_state(struct xe_gt *gt, unsigned int vfid,
- void *buff, size_t size)
+ void *dst, size_t size)
{
const int ndwords = size / sizeof(u32);
- struct xe_tile *tile = gt_to_tile(gt);
- struct xe_device *xe = tile_to_xe(tile);
struct xe_guc *guc = &gt->uc.guc;
- struct xe_bo *bo;
+ CLASS(xe_guc_buf, buf)(&guc->buf, ndwords);
int ret;
xe_gt_assert(gt, size % sizeof(u32) == 0);
xe_gt_assert(gt, size == ndwords * sizeof(u32));
- bo = xe_bo_create_pin_map_novm(xe, tile,
- ALIGN(size, PAGE_SIZE),
- ttm_bo_type_kernel,
- XE_BO_FLAG_SYSTEM |
- XE_BO_FLAG_GGTT |
- XE_BO_FLAG_GGTT_INVALIDATE, false);
- if (IS_ERR(bo))
- return PTR_ERR(bo);
+ if (!xe_guc_buf_is_valid(buf))
+ return -ENOBUFS;
+
+ memset(xe_guc_buf_cpu_ptr(buf), 0, size);
ret = guc_action_vf_save_restore(guc, vfid, GUC_PF_OPCODE_VF_SAVE,
- xe_bo_ggtt_addr(bo), ndwords);
- if (!ret)
+ xe_guc_buf_flush(buf), ndwords);
+ if (!ret) {
ret = -ENODATA;
- else if (ret > ndwords)
+ } else if (ret > ndwords) {
ret = -EPROTO;
- else if (ret > 0)
- xe_map_memcpy_from(xe, buff, &bo->vmap, 0, ret * sizeof(u32));
+ } else if (ret > 0) {
+ xe_guc_buf_sync(buf);
+ memcpy(dst, xe_guc_buf_cpu_ptr(buf), ret * sizeof(u32));
+ }
- xe_bo_unpin_map_no_vm(bo);
return ret;
}
/* Return: number of state dwords restored or a negative error code on failure */
static int pf_send_guc_restore_vf_state(struct xe_gt *gt, unsigned int vfid,
- const void *buff, size_t size)
+ const void *src, size_t size)
{
const int ndwords = size / sizeof(u32);
- struct xe_tile *tile = gt_to_tile(gt);
- struct xe_device *xe = tile_to_xe(tile);
struct xe_guc *guc = &gt->uc.guc;
- struct xe_bo *bo;
+ CLASS(xe_guc_buf_from_data, buf)(&guc->buf, src, size);
int ret;
xe_gt_assert(gt, size % sizeof(u32) == 0);
xe_gt_assert(gt, size == ndwords * sizeof(u32));
- bo = xe_bo_create_pin_map_novm(xe, tile,
- ALIGN(size, PAGE_SIZE),
- ttm_bo_type_kernel,
- XE_BO_FLAG_SYSTEM |
- XE_BO_FLAG_GGTT |
- XE_BO_FLAG_GGTT_INVALIDATE, false);
- if (IS_ERR(bo))
- return PTR_ERR(bo);
-
- xe_map_memcpy_to(xe, &bo->vmap, 0, buff, size);
+ if (!xe_guc_buf_is_valid(buf))
+ return -ENOBUFS;
ret = guc_action_vf_save_restore(guc, vfid, GUC_PF_OPCODE_VF_RESTORE,
- xe_bo_ggtt_addr(bo), ndwords);
+ xe_guc_buf_flush(buf), ndwords);
if (!ret)
ret = -ENODATA;
else if (ret > ndwords)
ret = -EPROTO;
- xe_bo_unpin_map_no_vm(bo);
return ret;
}
diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
index ccc7c60ae9b77..71ca06d1af62b 100644
--- a/drivers/gpu/drm/xe/xe_guc.c
+++ b/drivers/gpu/drm/xe/xe_guc.c
@@ -857,7 +857,7 @@ int xe_guc_init_post_hwconfig(struct xe_guc *guc)
if (ret)
return ret;
- ret = xe_guc_buf_cache_init(&guc->buf, SZ_8K);
+ ret = xe_guc_buf_cache_init(&guc->buf, IS_SRIOV_PF(guc_to_xe(guc)) ? SZ_8M : SZ_8K);
if (ret)
return ret;
--
2.50.1
* [PATCH 13/26] drm/xe/pf: Remove GuC migration data save/restore from GT debugfs
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (11 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 12/26] drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF migration Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-13 11:36 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 14/26] drm/xe/pf: Don't save GuC VF migration data on pause Michał Winiarski
` (12 subsequent siblings)
25 siblings, 1 reply; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
In upcoming changes, SR-IOV VF migration data will be extended beyond
GuC data and exported to userspace through the VFIO interface (via a
vendor-specific variant driver) and a device-level debugfs interface.
Remove the GT-level debugfs interface.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 47 ---------------------
1 file changed, 47 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
index c026a3910e7e3..c2b27dab13aa8 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
@@ -320,9 +320,6 @@ static const struct {
{ "stop", xe_gt_sriov_pf_control_stop_vf },
{ "pause", xe_gt_sriov_pf_control_pause_vf },
{ "resume", xe_gt_sriov_pf_control_resume_vf },
-#ifdef CONFIG_DRM_XE_DEBUG_SRIOV
- { "restore!", xe_gt_sriov_pf_migration_restore_guc_state },
-#endif
};
static ssize_t control_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
@@ -386,47 +383,6 @@ static const struct file_operations control_ops = {
.llseek = default_llseek,
};
-/*
- * /sys/kernel/debug/dri/BDF/
- * ├── sriov
- * : ├── vf1
- * : ├── tile0
- * : ├── gt0
- * : ├── guc_state
- */
-
-static ssize_t guc_state_read(struct file *file, char __user *buf,
- size_t count, loff_t *pos)
-{
- struct dentry *dent = file_dentry(file);
- struct dentry *parent = dent->d_parent;
- struct xe_gt *gt = extract_gt(parent);
- unsigned int vfid = extract_vfid(parent);
-
- return xe_gt_sriov_pf_migration_read_guc_state(gt, vfid, buf, count, pos);
-}
-
-static ssize_t guc_state_write(struct file *file, const char __user *buf,
- size_t count, loff_t *pos)
-{
- struct dentry *dent = file_dentry(file);
- struct dentry *parent = dent->d_parent;
- struct xe_gt *gt = extract_gt(parent);
- unsigned int vfid = extract_vfid(parent);
-
- if (*pos)
- return -EINVAL;
-
- return xe_gt_sriov_pf_migration_write_guc_state(gt, vfid, buf, count);
-}
-
-static const struct file_operations guc_state_ops = {
- .owner = THIS_MODULE,
- .read = guc_state_read,
- .write = guc_state_write,
- .llseek = default_llseek,
-};
-
/*
* /sys/kernel/debug/dri/BDF/
* ├── sriov
@@ -561,9 +517,6 @@ static void pf_populate_gt(struct xe_gt *gt, struct dentry *dent, unsigned int v
/* for testing/debugging purposes only! */
if (IS_ENABLED(CONFIG_DRM_XE_DEBUG)) {
- debugfs_create_file("guc_state",
- IS_ENABLED(CONFIG_DRM_XE_DEBUG_SRIOV) ? 0600 : 0400,
- dent, NULL, &guc_state_ops);
debugfs_create_file("config_blob",
IS_ENABLED(CONFIG_DRM_XE_DEBUG_SRIOV) ? 0600 : 0400,
dent, NULL, &config_blob_ops);
--
2.50.1
* [PATCH 14/26] drm/xe/pf: Don't save GuC VF migration data on pause
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (12 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 13/26] drm/xe/pf: Remove GuC migration data save/restore from GT debugfs Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-13 11:42 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 15/26] drm/xe/pf: Switch VF migration GuC save/restore to struct migration data Michał Winiarski
` (11 subsequent siblings)
25 siblings, 1 reply; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
In upcoming changes, the GuC VF migration data will be handled as part
of separate SAVE/RESTORE states in the VF control state machine.
Remove it from the PAUSE state.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 39 +------------------
.../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 2 -
2 files changed, 2 insertions(+), 39 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
index 092d3d710bca1..6ece775b2e80e 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
@@ -183,7 +183,6 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
CASE2STR(PAUSE_SEND_PAUSE);
CASE2STR(PAUSE_WAIT_GUC);
CASE2STR(PAUSE_GUC_DONE);
- CASE2STR(PAUSE_SAVE_GUC);
CASE2STR(PAUSE_FAILED);
CASE2STR(PAUSED);
CASE2STR(MIGRATION_DATA_WIP);
@@ -451,8 +450,7 @@ static void pf_enter_vf_ready(struct xe_gt *gt, unsigned int vfid)
* : PAUSE_GUC_DONE o-----restart
* : | :
* : | o---<--busy :
- * : v / / :
- * : PAUSE_SAVE_GUC :
+ * : / :
* : / :
* : / :
* :....o..............o...............o...........:
@@ -472,7 +470,6 @@ static void pf_exit_vf_pause_wip(struct xe_gt *gt, unsigned int vfid)
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_SEND_PAUSE);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_WAIT_GUC);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_GUC_DONE);
- pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC);
}
}
@@ -503,41 +500,12 @@ static void pf_enter_vf_pause_rejected(struct xe_gt *gt, unsigned int vfid)
pf_enter_vf_pause_failed(gt, vfid);
}
-static void pf_enter_vf_pause_save_guc(struct xe_gt *gt, unsigned int vfid)
-{
- if (!pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC))
- pf_enter_vf_state_machine_bug(gt, vfid);
-}
-
-static bool pf_exit_vf_pause_save_guc(struct xe_gt *gt, unsigned int vfid)
-{
- int err;
-
- if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC))
- return false;
-
- err = xe_gt_sriov_pf_migration_save_guc_state(gt, vfid);
- if (err) {
- /* retry if busy */
- if (err == -EBUSY) {
- pf_enter_vf_pause_save_guc(gt, vfid);
- return true;
- }
- /* give up on error */
- if (err == -EIO)
- pf_enter_vf_mismatch(gt, vfid);
- }
-
- pf_enter_vf_pause_completed(gt, vfid);
- return true;
-}
-
static bool pf_exit_vf_pause_guc_done(struct xe_gt *gt, unsigned int vfid)
{
if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_GUC_DONE))
return false;
- pf_enter_vf_pause_save_guc(gt, vfid);
+ pf_enter_vf_pause_completed(gt, vfid);
return true;
}
@@ -1788,9 +1756,6 @@ static bool pf_process_vf_state_machine(struct xe_gt *gt, unsigned int vfid)
if (pf_exit_vf_pause_guc_done(gt, vfid))
return true;
- if (pf_exit_vf_pause_save_guc(gt, vfid))
- return true;
-
if (pf_handle_vf_save_wip(gt, vfid))
return true;
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
index 02b517533ee8a..68ec9d1fc3daf 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
@@ -28,7 +28,6 @@
* @XE_GT_SRIOV_STATE_PAUSE_SEND_PAUSE: indicates that the PF is about to send a PAUSE command.
* @XE_GT_SRIOV_STATE_PAUSE_WAIT_GUC: indicates that the PF awaits for a response from the GuC.
* @XE_GT_SRIOV_STATE_PAUSE_GUC_DONE: indicates that the PF has received a response from the GuC.
- * @XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC: indicates that the PF needs to save the VF GuC state.
* @XE_GT_SRIOV_STATE_PAUSE_FAILED: indicates that a VF pause operation has failed.
* @XE_GT_SRIOV_STATE_PAUSED: indicates that the VF is paused.
* @XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP: indicates that the new data is expected in migration ring.
@@ -66,7 +65,6 @@ enum xe_gt_sriov_control_bits {
XE_GT_SRIOV_STATE_PAUSE_SEND_PAUSE,
XE_GT_SRIOV_STATE_PAUSE_WAIT_GUC,
XE_GT_SRIOV_STATE_PAUSE_GUC_DONE,
- XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC,
XE_GT_SRIOV_STATE_PAUSE_FAILED,
XE_GT_SRIOV_STATE_PAUSED,
--
2.50.1
* [PATCH 15/26] drm/xe/pf: Switch VF migration GuC save/restore to struct migration data
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (13 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 14/26] drm/xe/pf: Don't save GuC VF migration data on pause Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 16/26] drm/xe/pf: Handle GuC migration data as part of PF control Michał Winiarski
` (10 subsequent siblings)
25 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
In upcoming changes, the GuC VF migration data will be handled as part
of separate SAVE/RESTORE states in the VF control state machine.
Now that the data is decoupled from both the guc_state debugfs and the
PAUSE state, we can safely remove struct xe_gt_sriov_state_snapshot and
modify the GuC save/restore functions to operate on struct
xe_sriov_pf_migration_data.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
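For reference, the resulting save path condensed from
pf_save_vf_guc_mig_data() below (error handling elided; all identifiers are
from this series):
	data = xe_sriov_pf_migration_data_alloc(gt_to_xe(gt));
	ret = xe_sriov_pf_migration_data_init(data, gt->tile->id, gt->info.id,
					      XE_SRIOV_MIG_DATA_GUC, 0, size);
	/* the GuC fills data->vaddr; trim data->size / data->remaining to
	 * what was actually written, then hand the packet to the consumer */
	ret = xe_gt_sriov_pf_migration_ring_produce(gt, vfid, data);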
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 266 +++++-------------
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 13 +-
.../drm/xe/xe_gt_sriov_pf_migration_types.h | 17 --
drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h | 3 -
4 files changed, 81 insertions(+), 218 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index 8b96eff8df93b..e1031465e65c4 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -28,6 +28,15 @@ static struct xe_gt_sriov_pf_migration *pf_pick_gt_migration(struct xe_gt *gt, u
return &gt->sriov.pf.vfs[vfid].migration;
}
+static void pf_dump_mig_data(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data)
+{
+ if (IS_ENABLED(CONFIG_DRM_XE_DEBUG_SRIOV)) {
+ print_hex_dump_bytes("mig_data: ", DUMP_PREFIX_OFFSET,
+ data->vaddr, min(SZ_64, data->size));
+ }
+}
+
/* Return: number of dwords saved/restored/required or a negative error code on failure */
static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
u64 addr, u32 ndwords)
@@ -47,7 +56,7 @@ static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
}
/* Return: size of the state in dwords or a negative error code on failure */
-static int pf_send_guc_query_vf_state_size(struct xe_gt *gt, unsigned int vfid)
+static int pf_send_guc_query_vf_mig_data_size(struct xe_gt *gt, unsigned int vfid)
{
int ret;
@@ -56,8 +65,8 @@ static int pf_send_guc_query_vf_state_size(struct xe_gt *gt, unsigned int vfid)
}
/* Return: number of state dwords saved or a negative error code on failure */
-static int pf_send_guc_save_vf_state(struct xe_gt *gt, unsigned int vfid,
- void *dst, size_t size)
+static int pf_send_guc_save_vf_mig_data(struct xe_gt *gt, unsigned int vfid,
+ void *dst, size_t size)
{
const int ndwords = size / sizeof(u32);
struct xe_guc *guc = &gt->uc.guc;
@@ -87,8 +96,8 @@ static int pf_send_guc_save_vf_state(struct xe_gt *gt, unsigned int vfid,
}
/* Return: number of state dwords restored or a negative error code on failure */
-static int pf_send_guc_restore_vf_state(struct xe_gt *gt, unsigned int vfid,
- const void *src, size_t size)
+static int pf_send_guc_restore_vf_mig_data(struct xe_gt *gt, unsigned int vfid,
+ const void *src, size_t size)
{
const int ndwords = size / sizeof(u32);
struct xe_guc *guc = &gt->uc.guc;
@@ -116,120 +125,67 @@ static bool pf_migration_supported(struct xe_gt *gt)
return xe_sriov_pf_migration_supported(gt_to_xe(gt));
}
-static struct mutex *pf_migration_mutex(struct xe_gt *gt)
-{
- xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
- return &gt->sriov.pf.snapshot_lock;
-}
-
-static struct xe_gt_sriov_state_snapshot *pf_pick_vf_snapshot(struct xe_gt *gt,
- unsigned int vfid)
+static int pf_save_vf_guc_mig_data(struct xe_gt *gt, unsigned int vfid)
{
- xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
- xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
- lockdep_assert_held(pf_migration_mutex(gt));
-
- return &gt->sriov.pf.vfs[vfid].snapshot;
-}
-
-static unsigned int pf_snapshot_index(struct xe_gt *gt, struct xe_gt_sriov_state_snapshot *snapshot)
-{
- return container_of(snapshot, struct xe_gt_sriov_metadata, snapshot) - gt->sriov.pf.vfs;
-}
-
-static void pf_free_guc_state(struct xe_gt *gt, struct xe_gt_sriov_state_snapshot *snapshot)
-{
- struct xe_device *xe = gt_to_xe(gt);
-
- drmm_kfree(&xe->drm, snapshot->guc.buff);
- snapshot->guc.buff = NULL;
- snapshot->guc.size = 0;
-}
-
-static int pf_alloc_guc_state(struct xe_gt *gt,
- struct xe_gt_sriov_state_snapshot *snapshot,
- size_t size)
-{
- struct xe_device *xe = gt_to_xe(gt);
- void *p;
-
- pf_free_guc_state(gt, snapshot);
-
- if (!size)
- return -ENODATA;
-
- if (size % sizeof(u32))
- return -EINVAL;
-
- if (size > SZ_2M)
- return -EFBIG;
-
- p = drmm_kzalloc(&xe->drm, size, GFP_KERNEL);
- if (!p)
- return -ENOMEM;
-
- snapshot->guc.buff = p;
- snapshot->guc.size = size;
- return 0;
-}
-
-static void pf_dump_guc_state(struct xe_gt *gt, struct xe_gt_sriov_state_snapshot *snapshot)
-{
- if (IS_ENABLED(CONFIG_DRM_XE_DEBUG_SRIOV)) {
- unsigned int vfid __maybe_unused = pf_snapshot_index(gt, snapshot);
-
- xe_gt_sriov_dbg_verbose(gt, "VF%u GuC state is %zu dwords:\n",
- vfid, snapshot->guc.size / sizeof(u32));
- print_hex_dump_bytes("state: ", DUMP_PREFIX_OFFSET,
- snapshot->guc.buff, min(SZ_64, snapshot->guc.size));
- }
-}
-
-static int pf_save_vf_guc_state(struct xe_gt *gt, unsigned int vfid)
-{
- struct xe_gt_sriov_state_snapshot *snapshot = pf_pick_vf_snapshot(gt, vfid);
+ struct xe_sriov_pf_migration_data *data;
size_t size;
int ret;
- ret = pf_send_guc_query_vf_state_size(gt, vfid);
+ ret = pf_send_guc_query_vf_mig_data_size(gt, vfid);
if (ret < 0)
goto fail;
+
size = ret * sizeof(u32);
- xe_gt_sriov_dbg_verbose(gt, "VF%u state size is %d dwords (%zu bytes)\n", vfid, ret, size);
+ xe_gt_sriov_dbg_verbose(gt, "VF%u GuC state size is %d dwords (%zu bytes)\n",
+ vfid, ret, size);
- ret = pf_alloc_guc_state(gt, snapshot, size);
- if (ret < 0)
+ data = xe_sriov_pf_migration_data_alloc(gt_to_xe(gt));
+ if (!data) {
+ ret = -ENOMEM;
goto fail;
+ }
- ret = pf_send_guc_save_vf_state(gt, vfid, snapshot->guc.buff, size);
+ ret = xe_sriov_pf_migration_data_init(data, gt->tile->id, gt->info.id,
+ XE_SRIOV_MIG_DATA_GUC, 0, size);
+ if (ret)
+ goto fail_free;
+
+ ret = pf_send_guc_save_vf_mig_data(gt, vfid, data->vaddr, size);
if (ret < 0)
- goto fail;
+ goto fail_free;
size = ret * sizeof(u32);
xe_gt_assert(gt, size);
- xe_gt_assert(gt, size <= snapshot->guc.size);
- snapshot->guc.size = size;
+ xe_gt_assert(gt, size <= data->size);
+ data->size = size;
+ data->remaining = size;
+
+ pf_dump_mig_data(gt, vfid, data);
+
+ ret = xe_gt_sriov_pf_migration_ring_produce(gt, vfid, data);
+ if (ret)
+ goto fail_free;
- pf_dump_guc_state(gt, snapshot);
return 0;
+fail_free:
+ xe_sriov_pf_migration_data_free(data);
fail:
- xe_gt_sriov_dbg(gt, "Unable to save VF%u state (%pe)\n", vfid, ERR_PTR(ret));
- pf_free_guc_state(gt, snapshot);
+ xe_gt_sriov_err(gt, "Unable to save VF%u GuC data (%pe)\n", vfid, ERR_PTR(ret));
return ret;
}
/**
- * xe_gt_sriov_pf_migration_save_guc_state() - Take a GuC VF state snapshot.
- * @gt: the &xe_gt
+ * xe_gt_sriov_pf_migration_guc_size() - Get the size of VF GuC migration data.
+ * @gt: the &struct xe_gt
* @vfid: the VF identifier
*
* This function is for PF only.
*
- * Return: 0 on success or a negative error code on failure.
+ * Return: size in bytes or a negative error code on failure.
*/
-int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid)
+ssize_t xe_gt_sriov_pf_migration_guc_size(struct xe_gt *gt, unsigned int vfid)
{
- int err;
+ ssize_t size;
xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
xe_gt_assert(gt, vfid != PFID);
@@ -238,48 +194,24 @@ int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid)
if (!pf_migration_supported(gt))
return -ENOPKG;
- mutex_lock(pf_migration_mutex(gt));
- err = pf_save_vf_guc_state(gt, vfid);
- mutex_unlock(pf_migration_mutex(gt));
+ size = pf_send_guc_query_vf_mig_data_size(gt, vfid);
+ if (size >= 0)
+ size *= sizeof(u32);
- return err;
-}
-
-static int pf_restore_vf_guc_state(struct xe_gt *gt, unsigned int vfid)
-{
- struct xe_gt_sriov_state_snapshot *snapshot = pf_pick_vf_snapshot(gt, vfid);
- int ret;
-
- if (!snapshot->guc.size)
- return -ENODATA;
-
- xe_gt_sriov_dbg_verbose(gt, "restoring %zu dwords of VF%u GuC state\n",
- snapshot->guc.size / sizeof(u32), vfid);
- ret = pf_send_guc_restore_vf_state(gt, vfid, snapshot->guc.buff, snapshot->guc.size);
- if (ret < 0)
- goto fail;
-
- xe_gt_sriov_dbg_verbose(gt, "restored %d dwords of VF%u GuC state\n", ret, vfid);
- return 0;
-
-fail:
- xe_gt_sriov_dbg(gt, "Failed to restore VF%u GuC state (%pe)\n", vfid, ERR_PTR(ret));
- return ret;
+ return size;
}
/**
- * xe_gt_sriov_pf_migration_restore_guc_state() - Restore a GuC VF state.
- * @gt: the &xe_gt
+ * xe_gt_sriov_pf_migration_guc_save() - Save VF GuC migration data.
+ * @gt: the &struct xe_gt
* @vfid: the VF identifier
*
* This function is for PF only.
*
* Return: 0 on success or a negative error code on failure.
*/
-int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid)
+int xe_gt_sriov_pf_migration_guc_save(struct xe_gt *gt, unsigned int vfid)
{
- int ret;
-
xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
xe_gt_assert(gt, vfid != PFID);
xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
@@ -287,75 +219,44 @@ int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vf
if (!pf_migration_supported(gt))
return -ENOPKG;
- mutex_lock(pf_migration_mutex(gt));
- ret = pf_restore_vf_guc_state(gt, vfid);
- mutex_unlock(pf_migration_mutex(gt));
-
- return ret;
+ return pf_save_vf_guc_mig_data(gt, vfid);
}
-#ifdef CONFIG_DEBUG_FS
-/**
- * xe_gt_sriov_pf_migration_read_guc_state() - Read a GuC VF state.
- * @gt: the &xe_gt
- * @vfid: the VF identifier
- * @buf: the user space buffer to read to
- * @count: the maximum number of bytes to read
- * @pos: the current position in the buffer
- *
- * This function is for PF only.
- *
- * This function reads up to @count bytes from the saved VF GuC state buffer
- * at offset @pos into the user space address starting at @buf.
- *
- * Return: the number of bytes read or a negative error code on failure.
- */
-ssize_t xe_gt_sriov_pf_migration_read_guc_state(struct xe_gt *gt, unsigned int vfid,
- char __user *buf, size_t count, loff_t *pos)
+static int pf_restore_vf_guc_state(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data)
{
- struct xe_gt_sriov_state_snapshot *snapshot;
- ssize_t ret;
+ int ret;
- xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
- xe_gt_assert(gt, vfid != PFID);
- xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
+ xe_gt_assert(gt, data->size);
- if (!pf_migration_supported(gt))
- return -ENOPKG;
+ xe_gt_sriov_dbg_verbose(gt, "restoring %lld dwords of VF%u GuC state\n",
+ data->size / sizeof(u32), vfid);
+ pf_dump_mig_data(gt, vfid, data);
- mutex_lock(pf_migration_mutex(gt));
- snapshot = pf_pick_vf_snapshot(gt, vfid);
- if (snapshot->guc.size)
- ret = simple_read_from_buffer(buf, count, pos, snapshot->guc.buff,
- snapshot->guc.size);
- else
- ret = -ENODATA;
- mutex_unlock(pf_migration_mutex(gt));
+ ret = pf_send_guc_restore_vf_mig_data(gt, vfid, data->vaddr, data->size);
+ if (ret < 0)
+ goto fail;
+
+ xe_gt_sriov_dbg_verbose(gt, "restored %d dwords of VF%u GuC state\n", ret, vfid);
+ return 0;
+fail:
+ xe_gt_sriov_dbg(gt, "Failed to restore VF%u GuC state (%pe)\n", vfid, ERR_PTR(ret));
return ret;
}
/**
- * xe_gt_sriov_pf_migration_write_guc_state() - Write a GuC VF state.
- * @gt: the &xe_gt
+ * xe_gt_sriov_pf_migration_guc_restore() - Restore VF GuC migration data.
+ * @gt: the &struct xe_gt
* @vfid: the VF identifier
- * @buf: the user space buffer with GuC VF state
- * @size: the size of GuC VF state (in bytes)
*
* This function is for PF only.
*
- * This function reads @size bytes of the VF GuC state stored at user space
- * address @buf and writes it into a internal VF state buffer.
- *
- * Return: the number of bytes used or a negative error code on failure.
+ * Return: 0 on success or a negative error code on failure.
*/
-ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int vfid,
- const char __user *buf, size_t size)
+int xe_gt_sriov_pf_migration_guc_restore(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data)
{
- struct xe_gt_sriov_state_snapshot *snapshot;
- loff_t pos = 0;
- ssize_t ret;
-
xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
xe_gt_assert(gt, vfid != PFID);
xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
@@ -363,21 +264,8 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
if (!pf_migration_supported(gt))
return -ENOPKG;
- mutex_lock(pf_migration_mutex(gt));
- snapshot = pf_pick_vf_snapshot(gt, vfid);
- ret = pf_alloc_guc_state(gt, snapshot, size);
- if (!ret) {
- ret = simple_write_to_buffer(snapshot->guc.buff, size, &pos, buf, size);
- if (ret < 0)
- pf_free_guc_state(gt, snapshot);
- else
- pf_dump_guc_state(gt, snapshot);
- }
- mutex_unlock(pf_migration_mutex(gt));
-
- return ret;
+ return pf_restore_vf_guc_state(gt, vfid, data);
}
-#endif /* CONFIG_DEBUG_FS */
/**
* xe_gt_sriov_pf_migration_size() - Total size of migration data from all components within a GT
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
index e5298d35d7d7e..5df64449232bc 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
@@ -12,8 +12,10 @@ struct xe_gt;
struct xe_sriov_pf_migration_data;
int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
-int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
-int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
+ssize_t xe_gt_sriov_pf_migration_guc_size(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_migration_guc_save(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_migration_guc_restore(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data);
ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
@@ -25,11 +27,4 @@ xe_gt_sriov_pf_migration_ring_consume(struct xe_gt *gt, unsigned int vfid);
struct xe_sriov_pf_migration_data *
xe_gt_sriov_pf_migration_ring_consume_nowait(struct xe_gt *gt, unsigned int vfid);
-#ifdef CONFIG_DEBUG_FS
-ssize_t xe_gt_sriov_pf_migration_read_guc_state(struct xe_gt *gt, unsigned int vfid,
- char __user *buf, size_t count, loff_t *pos);
-ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int vfid,
- const char __user *buf, size_t count);
-#endif
-
#endif
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
index 8434689372082..aa8f349e6092b 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
@@ -6,24 +6,7 @@
#ifndef _XE_GT_SRIOV_PF_MIGRATION_TYPES_H_
#define _XE_GT_SRIOV_PF_MIGRATION_TYPES_H_
-#include <linux/mutex.h>
#include <linux/ptr_ring.h>
-#include <linux/types.h>
-
-/**
- * struct xe_gt_sriov_state_snapshot - GT-level per-VF state snapshot data.
- *
- * Used by the PF driver to maintain per-VF migration data.
- */
-struct xe_gt_sriov_state_snapshot {
- /** @guc: GuC VF state snapshot */
- struct {
- /** @guc.buff: buffer with the VF state */
- u32 *buff;
- /** @guc.size: size of the buffer (must be dwords aligned) */
- u32 size;
- } guc;
-};
/**
* struct xe_gt_sriov_pf_migration - GT-level data.
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
index fbb08f8030f7f..b5feb6a54e434 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
@@ -31,9 +31,6 @@ struct xe_gt_sriov_metadata {
/** @version: negotiated VF/PF ABI version */
struct xe_gt_sriov_pf_service_version version;
- /** @snapshot: snapshot of the VF state data */
- struct xe_gt_sriov_state_snapshot snapshot;
-
/** @migration: */
struct xe_gt_sriov_pf_migration migration;
};
--
2.50.1
* [PATCH 16/26] drm/xe/pf: Handle GuC migration data as part of PF control
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (14 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 15/26] drm/xe/pf: Switch VF migration GuC save/restore to struct migration data Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-13 11:56 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 17/26] drm/xe/pf: Add helpers for VF GGTT migration data handling Michał Winiarski
` (9 subsequent siblings)
25 siblings, 1 reply; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
Connect the helpers so that GuC migration data is saved and restored as
part of the stop_copy / resume device states.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
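For reference, a condensed sketch of the SAVE_WIP handling added below; the
control state machine processes one migration data unit per pass and requeues
itself (a sketch condensed from the patch, not the full handler):
	if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC)) {
		if (xe_gt_sriov_pf_migration_guc_save(gt, vfid))
			goto err;	/* enters SAVE_FAILED */
		return true;		/* more work may be pending */
	}
	/* no per-component SAVE_DATA_* bits left: emit EOF, enter SAVED */
	xe_gt_sriov_pf_control_vf_data_eof(gt, vfid);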
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 28 ++++++++++++++++++-
.../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 1 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 8 ++++++
3 files changed, 36 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
index 6ece775b2e80e..f73a3bf40037c 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
@@ -187,6 +187,7 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
CASE2STR(PAUSED);
CASE2STR(MIGRATION_DATA_WIP);
CASE2STR(SAVE_WIP);
+ CASE2STR(SAVE_DATA_GUC);
CASE2STR(SAVE_FAILED);
CASE2STR(SAVED);
CASE2STR(RESTORE_WIP);
@@ -338,6 +339,7 @@ static void pf_exit_vf_mismatch(struct xe_gt *gt, unsigned int vfid)
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOP_FAILED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_FAILED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUME_FAILED);
+ pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_FLR_FAILED);
}
@@ -801,6 +803,7 @@ void xe_gt_sriov_pf_control_vf_data_eof(struct xe_gt *gt, unsigned int vfid)
static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
{
+ pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
}
@@ -820,16 +823,35 @@ static void pf_exit_vf_saved(struct xe_gt *gt, unsigned int vfid)
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED);
}
+static void pf_enter_vf_save_failed(struct xe_gt *gt, unsigned int vfid)
+{
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
+ pf_exit_vf_wip(gt, vfid);
+}
+
static bool pf_handle_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
{
+ int ret;
+
if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP))
return false;
+ if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC)) {
+ ret = xe_gt_sriov_pf_migration_guc_save(gt, vfid);
+ if (ret)
+ goto err;
+ return true;
+ }
+
xe_gt_sriov_pf_control_vf_data_eof(gt, vfid);
pf_exit_vf_save_wip(gt, vfid);
pf_enter_vf_saved(gt, vfid);
return true;
+
+err:
+ pf_enter_vf_save_failed(gt, vfid);
+ return false;
}
static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
@@ -838,6 +860,8 @@ static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP);
pf_exit_vf_restored(gt, vfid);
pf_enter_vf_wip(gt, vfid);
+ if (xe_gt_sriov_pf_migration_guc_size(gt, vfid) > 0)
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
pf_queue_vf(gt, vfid);
return true;
}
@@ -946,6 +970,8 @@ static int pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid,
struct xe_sriov_pf_migration_data *data)
{
switch (data->type) {
+ case XE_SRIOV_MIG_DATA_GUC:
+ return xe_gt_sriov_pf_migration_guc_restore(gt, vfid, data);
default:
xe_gt_sriov_notice(gt, "Skipping VF%u invalid data type: %d\n", vfid, data->type);
pf_enter_vf_restore_failed(gt, vfid);
@@ -996,7 +1022,7 @@ static bool pf_enter_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP);
pf_exit_vf_saved(gt, vfid);
pf_enter_vf_wip(gt, vfid);
- pf_enter_vf_restored(gt, vfid);
+ pf_queue_vf(gt, vfid);
return true;
}
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
index 68ec9d1fc3daf..b9787c425d9f6 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
@@ -71,6 +71,7 @@ enum xe_gt_sriov_control_bits {
XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP,
XE_GT_SRIOV_STATE_SAVE_WIP,
+ XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
XE_GT_SRIOV_STATE_SAVE_FAILED,
XE_GT_SRIOV_STATE_SAVED,
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index e1031465e65c4..0c10284f0b09a 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -279,9 +279,17 @@ int xe_gt_sriov_pf_migration_guc_restore(struct xe_gt *gt, unsigned int vfid,
ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
{
ssize_t total = 0;
+ ssize_t size;
xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ size = xe_gt_sriov_pf_migration_guc_size(gt, vfid);
+ if (size < 0)
+ return size;
+ else if (size > 0)
+ size += sizeof(struct xe_sriov_pf_migration_hdr);
+ total += size;
+
return total;
}
--
2.50.1
* [PATCH 17/26] drm/xe/pf: Add helpers for VF GGTT migration data handling
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (15 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 16/26] drm/xe/pf: Handle GuC migration data as part of PF control Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-13 12:17 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 18/26] drm/xe/pf: Handle GGTT migration data as part of PF control Michał Winiarski
` (8 subsequent siblings)
25 siblings, 1 reply; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
In an upcoming change, the VF GGTT migration data will be handled as
part of the VF control state machine. Add the necessary helpers to allow
migration data transfer to/from the HW GGTT resource.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
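The GGTT migration data is one u64 PTE per GGTT page, so the size arithmetic
used throughout converts between the two representations (a sketch with
hypothetical names ggtt_bytes/buf_bytes; XE_PAGE_SIZE is the GGTT page size):
	/* bytes of PTE data needed to save a GGTT span of ggtt_bytes */
	size_t buf_bytes = ggtt_bytes / XE_PAGE_SIZE * sizeof(u64);
	/* and back: the GGTT span covered by a PTE buffer of buf_bytes */
	size_t ggtt_span = buf_bytes / sizeof(u64) * XE_PAGE_SIZE;
The PTEs are saved with the GGTT_PTE_VFID bits cleared, and the target VF's
VFID is re-encoded into each PTE on restore.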
drivers/gpu/drm/xe/xe_ggtt.c | 92 ++++++++++++++++++++++
drivers/gpu/drm/xe/xe_ggtt.h | 2 +
drivers/gpu/drm/xe/xe_ggtt_types.h | 2 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 64 +++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h | 5 ++
5 files changed, 165 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
index aca7ae5489b91..89c0ad56c6a8a 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.c
+++ b/drivers/gpu/drm/xe/xe_ggtt.c
@@ -138,6 +138,14 @@ static void xe_ggtt_set_pte_and_flush(struct xe_ggtt *ggtt, u64 addr, u64 pte)
ggtt_update_access_counter(ggtt);
}
+static u64 xe_ggtt_get_pte(struct xe_ggtt *ggtt, u64 addr)
+{
+ xe_tile_assert(ggtt->tile, !(addr & XE_PTE_MASK));
+ xe_tile_assert(ggtt->tile, addr < ggtt->size);
+
+ return readq(&ggtt->gsm[addr >> XE_PTE_SHIFT]);
+}
+
static void xe_ggtt_clear(struct xe_ggtt *ggtt, u64 start, u64 size)
{
u16 pat_index = tile_to_xe(ggtt->tile)->pat.idx[XE_CACHE_WB];
@@ -220,16 +228,19 @@ void xe_ggtt_might_lock(struct xe_ggtt *ggtt)
static const struct xe_ggtt_pt_ops xelp_pt_ops = {
.pte_encode_flags = xelp_ggtt_pte_flags,
.ggtt_set_pte = xe_ggtt_set_pte,
+ .ggtt_get_pte = xe_ggtt_get_pte,
};
static const struct xe_ggtt_pt_ops xelpg_pt_ops = {
.pte_encode_flags = xelpg_ggtt_pte_flags,
.ggtt_set_pte = xe_ggtt_set_pte,
+ .ggtt_get_pte = xe_ggtt_get_pte,
};
static const struct xe_ggtt_pt_ops xelpg_pt_wa_ops = {
.pte_encode_flags = xelpg_ggtt_pte_flags,
.ggtt_set_pte = xe_ggtt_set_pte_and_flush,
+ .ggtt_get_pte = xe_ggtt_get_pte,
};
static void __xe_ggtt_init_early(struct xe_ggtt *ggtt, u32 reserved)
@@ -914,6 +925,87 @@ void xe_ggtt_assign(const struct xe_ggtt_node *node, u16 vfid)
xe_ggtt_assign_locked(node->ggtt, &node->base, vfid);
mutex_unlock(&node->ggtt->lock);
}
+
+/**
+ * xe_ggtt_node_save - Save a &struct xe_ggtt_node to a buffer
+ * @node: the &struct xe_ggtt_node to be saved
+ * @dst: destination buffer
+ * @size: size of the GGTT span to save, in bytes
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_ggtt_node_save(struct xe_ggtt_node *node, void *dst, size_t size)
+{
+ struct xe_ggtt *ggtt;
+ u64 start, end;
+ u64 *buf = dst;
+
+ if (!node || !node->ggtt)
+ return -ENOENT;
+
+ mutex_lock(&node->ggtt->lock);
+
+ ggtt = node->ggtt;
+ start = node->base.start;
+ end = start + node->base.size - 1;
+
+ if (node->base.size != size) {
+ mutex_unlock(&node->ggtt->lock);
+ return -EINVAL;
+ }
+
+ while (start < end) {
+ *buf++ = ggtt->pt_ops->ggtt_get_pte(ggtt, start) & ~GGTT_PTE_VFID;
+ start += XE_PAGE_SIZE;
+ }
+
+ mutex_unlock(&node->ggtt->lock);
+
+ return 0;
+}
+
+/**
+ * xe_ggtt_node_load - Load a &struct xe_ggtt_node from a buffer
+ * @node: the &struct xe_ggtt_node to be loaded
+ * @src: source buffer
+ * @size: size of the GGTT span to restore, in bytes
+ * @vfid: VF identifier
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_ggtt_node_load(struct xe_ggtt_node *node, const void *src, size_t size, u16 vfid)
+{
+ struct xe_ggtt *ggtt;
+ u64 start, end;
+ const u64 *buf = src;
+ u64 vfid_pte = xe_encode_vfid_pte(vfid);
+
+ if (!node || !node->ggtt)
+ return -ENOENT;
+
+ mutex_lock(&node->ggtt->lock);
+
+ ggtt = node->ggtt;
+ start = node->base.start;
+ end = start + size - 1;
+
+ if (node->base.size != size) {
+ mutex_unlock(&node->ggtt->lock);
+ return -EINVAL;
+ }
+
+ while (start < end) {
+ ggtt->pt_ops->ggtt_set_pte(ggtt, start, (*buf & ~GGTT_PTE_VFID) | vfid_pte);
+ start += XE_PAGE_SIZE;
+ buf++;
+ }
+ xe_ggtt_invalidate(ggtt);
+
+ mutex_unlock(&node->ggtt->lock);
+
+ return 0;
+}
+
#endif
/**
diff --git a/drivers/gpu/drm/xe/xe_ggtt.h b/drivers/gpu/drm/xe/xe_ggtt.h
index 75fc7a1efea76..469b3a6ca14b4 100644
--- a/drivers/gpu/drm/xe/xe_ggtt.h
+++ b/drivers/gpu/drm/xe/xe_ggtt.h
@@ -43,6 +43,8 @@ u64 xe_ggtt_print_holes(struct xe_ggtt *ggtt, u64 alignment, struct drm_printer
#ifdef CONFIG_PCI_IOV
void xe_ggtt_assign(const struct xe_ggtt_node *node, u16 vfid);
+int xe_ggtt_node_save(struct xe_ggtt_node *node, void *dst, size_t size);
+int xe_ggtt_node_load(struct xe_ggtt_node *node, const void *src, size_t size, u16 vfid);
#endif
#ifndef CONFIG_LOCKDEP
diff --git a/drivers/gpu/drm/xe/xe_ggtt_types.h b/drivers/gpu/drm/xe/xe_ggtt_types.h
index c5e999d58ff2a..dacd796f81844 100644
--- a/drivers/gpu/drm/xe/xe_ggtt_types.h
+++ b/drivers/gpu/drm/xe/xe_ggtt_types.h
@@ -78,6 +78,8 @@ struct xe_ggtt_pt_ops {
u64 (*pte_encode_flags)(struct xe_bo *bo, u16 pat_index);
/** @ggtt_set_pte: Directly write into GGTT's PTE */
void (*ggtt_set_pte)(struct xe_ggtt *ggtt, u64 addr, u64 pte);
+ /** @ggtt_get_pte: Directly read from GGTT's PTE */
+ u64 (*ggtt_get_pte)(struct xe_ggtt *ggtt, u64 addr);
};
#endif
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
index b2e5c52978e6a..51027921b2988 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
@@ -726,6 +726,70 @@ int xe_gt_sriov_pf_config_set_fair_ggtt(struct xe_gt *gt, unsigned int vfid,
return xe_gt_sriov_pf_config_bulk_set_ggtt(gt, vfid, num_vfs, fair);
}
+/**
+ * xe_gt_sriov_pf_config_ggtt_save - Save a VF provisioned GGTT data into a buffer.
+ * @gt: the &struct xe_gt
+ * @vfid: VF identifier
+ * @buf: the GGTT data destination buffer
+ * @size: the size of the buffer
+ *
+ * This function can only be called on PF.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_config_ggtt_save(struct xe_gt *gt, unsigned int vfid,
+ void *buf, size_t size)
+{
+ struct xe_gt_sriov_config *config;
+ ssize_t ret;
+
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid);
+ xe_gt_assert(gt, !(!buf ^ !size));
+
+ mutex_lock(xe_gt_sriov_pf_master_mutex(gt));
+ config = pf_pick_vf_config(gt, vfid);
+ size = size / sizeof(u64) * XE_PAGE_SIZE;
+
+ ret = xe_ggtt_node_save(config->ggtt_region, buf, size);
+
+ mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
+
+ return ret;
+}
+
+/**
+ * xe_gt_sriov_pf_config_ggtt_restore - Restore a VF provisioned GGTT data from a buffer.
+ * @gt: the &struct xe_gt
+ * @vfid: VF identifier
+ * @buf: the GGTT data source buffer
+ * @size: the size of the buffer
+ *
+ * This function can only be called on PF.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_config_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
+ const void *buf, size_t size)
+{
+ struct xe_gt_sriov_config *config;
+ ssize_t ret;
+
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid);
+ xe_gt_assert(gt, !(!buf ^ !size));
+
+ mutex_lock(xe_gt_sriov_pf_master_mutex(gt));
+ config = pf_pick_vf_config(gt, vfid);
+ size = size / sizeof(u64) * XE_PAGE_SIZE;
+
+ ret = xe_ggtt_node_load(config->ggtt_region, buf, size, vfid);
+
+ mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
+
+ return ret;
+}
+
static u32 pf_get_min_spare_ctxs(struct xe_gt *gt)
{
/* XXX: preliminary */
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
index 513e6512a575b..6916b8f58ebf2 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
@@ -61,6 +61,11 @@ ssize_t xe_gt_sriov_pf_config_save(struct xe_gt *gt, unsigned int vfid, void *bu
int xe_gt_sriov_pf_config_restore(struct xe_gt *gt, unsigned int vfid,
const void *buf, size_t size);
+int xe_gt_sriov_pf_config_ggtt_save(struct xe_gt *gt, unsigned int vfid,
+ void *buf, size_t size);
+int xe_gt_sriov_pf_config_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
+ const void *buf, size_t size);
+
bool xe_gt_sriov_pf_config_is_empty(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_config_init(struct xe_gt *gt);
--
2.50.1
* [PATCH 18/26] drm/xe/pf: Handle GGTT migration data as part of PF control
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (16 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 17/26] drm/xe/pf: Add helpers for VF GGTT migration data handling Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-13 12:36 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 19/26] drm/xe/pf: Add helpers for VF MMIO migration data handling Michał Winiarski
` (7 subsequent siblings)
25 siblings, 1 reply; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
Connect the helpers so that GGTT migration data is saved and restored
as part of the stop_copy / resume device states.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
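For reference, the stream size accounting pattern extended below: each
non-empty component contributes its payload plus one packet header, as in
xe_gt_sriov_pf_migration_size() (identifiers are from this series):
	size = xe_gt_sriov_pf_migration_ggtt_size(gt, vfid);
	if (size < 0)
		return size;	/* propagate the error */
	else if (size > 0)
		size += sizeof(struct xe_sriov_pf_migration_hdr);
	total += size;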
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 13 ++
.../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 1 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 119 ++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 4 +
4 files changed, 137 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
index f73a3bf40037c..a74f6feca4830 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
@@ -188,6 +188,7 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
CASE2STR(MIGRATION_DATA_WIP);
CASE2STR(SAVE_WIP);
CASE2STR(SAVE_DATA_GUC);
+ CASE2STR(SAVE_DATA_GGTT);
CASE2STR(SAVE_FAILED);
CASE2STR(SAVED);
CASE2STR(RESTORE_WIP);
@@ -803,6 +804,7 @@ void xe_gt_sriov_pf_control_vf_data_eof(struct xe_gt *gt, unsigned int vfid)
static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
{
+ pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
}
@@ -843,6 +845,13 @@ static bool pf_handle_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
return true;
}
+ if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT)) {
+ ret = xe_gt_sriov_pf_migration_ggtt_save(gt, vfid);
+ if (ret)
+ goto err;
+ return true;
+ }
+
xe_gt_sriov_pf_control_vf_data_eof(gt, vfid);
pf_exit_vf_save_wip(gt, vfid);
pf_enter_vf_saved(gt, vfid);
@@ -862,6 +871,8 @@ static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
pf_enter_vf_wip(gt, vfid);
if (xe_gt_sriov_pf_migration_guc_size(gt, vfid) > 0)
pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
+ if (xe_gt_sriov_pf_migration_ggtt_size(gt, vfid) > 0)
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT);
pf_queue_vf(gt, vfid);
return true;
}
@@ -970,6 +981,8 @@ static int pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid,
struct xe_sriov_pf_migration_data *data)
{
switch (data->type) {
+ case XE_SRIOV_MIG_DATA_GGTT:
+ return xe_gt_sriov_pf_migration_ggtt_restore(gt, vfid, data);
case XE_SRIOV_MIG_DATA_GUC:
return xe_gt_sriov_pf_migration_guc_restore(gt, vfid, data);
default:
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
index b9787c425d9f6..c94ff0258306a 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
@@ -72,6 +72,7 @@ enum xe_gt_sriov_control_bits {
XE_GT_SRIOV_STATE_SAVE_WIP,
XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
+ XE_GT_SRIOV_STATE_SAVE_DATA_GGTT,
XE_GT_SRIOV_STATE_SAVE_FAILED,
XE_GT_SRIOV_STATE_SAVED,
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index 0c10284f0b09a..92ecf47e71bc7 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -7,6 +7,7 @@
#include "abi/guc_actions_sriov_abi.h"
#include "xe_bo.h"
+#include "xe_gt_sriov_pf_config.h"
#include "xe_gt_sriov_pf_control.h"
#include "xe_gt_sriov_pf_helpers.h"
#include "xe_gt_sriov_pf_migration.h"
@@ -37,6 +38,117 @@ static void pf_dump_mig_data(struct xe_gt *gt, unsigned int vfid,
}
}
+static int pf_save_vf_ggtt_mig_data(struct xe_gt *gt, unsigned int vfid)
+{
+ struct xe_sriov_pf_migration_data *data;
+ size_t size;
+ int ret;
+
+ size = xe_gt_sriov_pf_config_get_ggtt(gt, vfid);
+ if (size == 0)
+ return 0;
+ size = size / XE_PAGE_SIZE * sizeof(u64);
+
+ data = xe_sriov_pf_migration_data_alloc(gt_to_xe(gt));
+ if (!data)
+ return -ENOMEM;
+
+ ret = xe_sriov_pf_migration_data_init(data, gt->tile->id, gt->info.id,
+ XE_SRIOV_MIG_DATA_GGTT, 0, size);
+ if (ret)
+ goto fail;
+
+ ret = xe_gt_sriov_pf_config_ggtt_save(gt, vfid, data->vaddr, size);
+ if (ret)
+ goto fail;
+
+ pf_dump_mig_data(gt, vfid, data);
+
+ ret = xe_gt_sriov_pf_migration_ring_produce(gt, vfid, data);
+ if (ret)
+ goto fail;
+
+ return 0;
+
+fail:
+ xe_sriov_pf_migration_data_free(data);
+ xe_gt_sriov_err(gt, "Unable to save VF%u GGTT data (%d)\n", vfid, ret);
+ return ret;
+}
+
+static int pf_restore_vf_ggtt_mig_data(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data)
+{
+ size_t size;
+ int ret;
+
+ size = xe_gt_sriov_pf_config_get_ggtt(gt, vfid) / XE_PAGE_SIZE * sizeof(u64);
+ if (size != data->hdr.size)
+ return -EINVAL;
+
+ pf_dump_mig_data(gt, vfid, data);
+
+ ret = xe_gt_sriov_pf_config_ggtt_restore(gt, vfid, data->vaddr, size);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+/**
+ * xe_gt_sriov_pf_migration_ggtt_size() - Get the size of VF GGTT migration data.
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: size in bytes or a negative error code on failure.
+ */
+ssize_t xe_gt_sriov_pf_migration_ggtt_size(struct xe_gt *gt, unsigned int vfid)
+{
+ if (gt != xe_root_mmio_gt(gt_to_xe(gt)))
+ return 0;
+
+ return xe_gt_sriov_pf_config_get_ggtt(gt, vfid) / XE_PAGE_SIZE * sizeof(u64);
+}
+
+/**
+ * xe_gt_sriov_pf_migration_ggtt_save() - Save VF GGTT migration data.
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_migration_ggtt_save(struct xe_gt *gt, unsigned int vfid)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid != PFID);
+ xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
+
+ return pf_save_vf_ggtt_mig_data(gt, vfid);
+}
+
+/**
+ * xe_gt_sriov_pf_migration_ggtt_restore() - Restore VF GGTT migration data.
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_migration_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid != PFID);
+ xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
+
+ return pf_restore_vf_ggtt_mig_data(gt, vfid, data);
+}
+
/* Return: number of dwords saved/restored/required or a negative error code on failure */
static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
u64 addr, u32 ndwords)
@@ -290,6 +402,13 @@ ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
size += sizeof(struct xe_sriov_pf_migration_hdr);
total += size;
+ size = xe_gt_sriov_pf_migration_ggtt_size(gt, vfid);
+ if (size < 0)
+ return size;
+ else if (size > 0)
+ size += sizeof(struct xe_sriov_pf_migration_hdr);
+ total += size;
+
return total;
}
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
index 5df64449232bc..5bb8cba2ea0cb 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
@@ -16,6 +16,10 @@ ssize_t xe_gt_sriov_pf_migration_guc_size(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_guc_save(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_guc_restore(struct xe_gt *gt, unsigned int vfid,
struct xe_sriov_pf_migration_data *data);
+ssize_t xe_gt_sriov_pf_migration_ggtt_size(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_migration_ggtt_save(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_migration_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data);
ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
--
2.50.1
* [PATCH 19/26] drm/xe/pf: Add helpers for VF MMIO migration data handling
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (17 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 18/26] drm/xe/pf: Handle GGTT migration data as part of PF control Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-13 13:28 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 20/26] drm/xe/pf: Handle MMIO migration data as part of PF control Michał Winiarski
` (6 subsequent siblings)
25 siblings, 1 reply; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
In an upcoming change, the VF MMIO migration data will be handled as
part of the VF control state machine. Add the necessary helpers to allow
migration data transfer to/from the VF MMIO registers.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
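The VF MMIO migration data is simply the per-VF scratch register file; for
reference, a condensed save loop from this patch (the media-GT variant only
differs in the register macro and count):
	count = VF_SW_FLAG_COUNT;
	for (n = 0; n < count; n++) {
		scratch = xe_reg_vf_to_pf(VF_SW_FLAG(n), vfid, stride);
		regs[n] = xe_mmio_read32(&gt->mmio, scratch);
	}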
drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 88 +++++++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf.h | 19 +++++++
2 files changed, 107 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
index c4dda87b47cc8..6ceb9e024e41e 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
@@ -194,6 +194,94 @@ static void pf_clear_vf_scratch_regs(struct xe_gt *gt, unsigned int vfid)
}
}
+/**
+ * xe_gt_sriov_pf_mmio_vf_size - Get the size of VF MMIO register data.
+ * @gt: the &struct xe_gt
+ * @vfid: VF identifier
+ *
+ * Return: size in bytes.
+ */
+size_t xe_gt_sriov_pf_mmio_vf_size(struct xe_gt *gt, unsigned int vfid)
+{
+ if (xe_gt_is_media_type(gt))
+ return MED_VF_SW_FLAG_COUNT * sizeof(u32);
+ else
+ return VF_SW_FLAG_COUNT * sizeof(u32);
+}
+
+/**
+ * xe_gt_sriov_pf_mmio_vf_save - Save VF MMIO register values to a buffer.
+ * @gt: the &struct xe_gt
+ * @vfid: VF identifier
+ * @buf: destination buffer
+ * @size: destination buffer size in bytes
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_mmio_vf_save(struct xe_gt *gt, unsigned int vfid, void *buf, size_t size)
+{
+ u32 stride = pf_get_vf_regs_stride(gt_to_xe(gt));
+ struct xe_reg scratch;
+ u32 *regs = buf;
+ int n, count;
+
+ if (size != xe_gt_sriov_pf_mmio_vf_size(gt, vfid))
+ return -EINVAL;
+
+ if (xe_gt_is_media_type(gt)) {
+ count = MED_VF_SW_FLAG_COUNT;
+ for (n = 0; n < count; n++) {
+ scratch = xe_reg_vf_to_pf(MED_VF_SW_FLAG(n), vfid, stride);
+ regs[n] = xe_mmio_read32(&gt->mmio, scratch);
+ }
+ } else {
+ count = VF_SW_FLAG_COUNT;
+ for (n = 0; n < count; n++) {
+ scratch = xe_reg_vf_to_pf(VF_SW_FLAG(n), vfid, stride);
+ regs[n] = xe_mmio_read32(&gt->mmio, scratch);
+ }
+ }
+
+ return 0;
+}
+
+/**
+ * xe_gt_sriov_pf_mmio_vf_restore - Restore VF MMIO register values from a buffer.
+ * @gt: the &struct xe_gt
+ * @vfid: VF identifier
+ * @buf: source buffer
+ * @size: source buffer size in bytes
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_mmio_vf_restore(struct xe_gt *gt, unsigned int vfid,
+ const void *buf, size_t size)
+{
+ u32 stride = pf_get_vf_regs_stride(gt_to_xe(gt));
+ const u32 *regs = buf;
+ struct xe_reg scratch;
+ int n, count;
+
+ if (size != xe_gt_sriov_pf_mmio_vf_size(gt, vfid))
+ return -EINVAL;
+
+ if (xe_gt_is_media_type(gt)) {
+ count = MED_VF_SW_FLAG_COUNT;
+ for (n = 0; n < count; n++) {
+ scratch = xe_reg_vf_to_pf(MED_VF_SW_FLAG(n), vfid, stride);
+ xe_mmio_write32(&gt->mmio, scratch, regs[n]);
+ }
+ } else {
+ count = VF_SW_FLAG_COUNT;
+ for (n = 0; n < count; n++) {
+ scratch = xe_reg_vf_to_pf(VF_SW_FLAG(n), vfid, stride);
+ xe_mmio_write32(&gt->mmio, scratch, regs[n]);
+ }
+ }
+
+ return 0;
+}
+
/**
* xe_gt_sriov_pf_sanitize_hw() - Reset hardware state related to a VF.
* @gt: the &xe_gt
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
index e7fde3f9937af..5e5f31d943d89 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
@@ -6,6 +6,8 @@
#ifndef _XE_GT_SRIOV_PF_H_
#define _XE_GT_SRIOV_PF_H_
+#include <linux/types.h>
+
struct xe_gt;
#ifdef CONFIG_PCI_IOV
@@ -16,6 +18,10 @@ void xe_gt_sriov_pf_init_hw(struct xe_gt *gt);
void xe_gt_sriov_pf_sanitize_hw(struct xe_gt *gt, unsigned int vfid);
void xe_gt_sriov_pf_stop_prepare(struct xe_gt *gt);
void xe_gt_sriov_pf_restart(struct xe_gt *gt);
+size_t xe_gt_sriov_pf_mmio_vf_size(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_mmio_vf_save(struct xe_gt *gt, unsigned int vfid, void *buf, size_t size);
+int xe_gt_sriov_pf_mmio_vf_restore(struct xe_gt *gt, unsigned int vfid,
+ const void *buf, size_t size);
#else
static inline int xe_gt_sriov_pf_init_early(struct xe_gt *gt)
{
@@ -38,6 +44,19 @@ static inline void xe_gt_sriov_pf_stop_prepare(struct xe_gt *gt)
static inline void xe_gt_sriov_pf_restart(struct xe_gt *gt)
{
}
+static inline size_t xe_gt_sriov_pf_mmio_vf_size(struct xe_gt *gt, unsigned int vfid)
+{
+ return 0;
+}
+static inline int xe_gt_sriov_pf_mmio_vf_save(struct xe_gt *gt, unsigned int vfid, void *buf, size_t size)
+{
+ return -ENODEV;
+}
+static inline int xe_gt_sriov_pf_mmio_vf_restore(struct xe_gt *gt, unsigned int vfid,
+ const void *buf, size_t size)
+{
+ return -ENODEV;
+}
#endif
#endif
--
2.50.1
* [PATCH 20/26] drm/xe/pf: Handle MMIO migration data as part of PF control
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (18 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 19/26] drm/xe/pf: Add helpers for VF MMIO migration data handling Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 21/26] drm/xe/pf: Add helper to retrieve VF's LMEM object Michał Winiarski
` (5 subsequent siblings)
25 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
Connect the helpers to allow saving and restoring MMIO migration data
as part of the stop_copy / resume device states.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 13 +++
.../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 1 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 104 ++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 4 +
4 files changed, 122 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
index a74f6feca4830..7f8f816c10f20 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
@@ -189,6 +189,7 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
CASE2STR(SAVE_WIP);
CASE2STR(SAVE_DATA_GUC);
CASE2STR(SAVE_DATA_GGTT);
+ CASE2STR(SAVE_DATA_MMIO);
CASE2STR(SAVE_FAILED);
CASE2STR(SAVED);
CASE2STR(RESTORE_WIP);
@@ -804,6 +805,7 @@ void xe_gt_sriov_pf_control_vf_data_eof(struct xe_gt *gt, unsigned int vfid)
static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
{
+ pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_MMIO);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
@@ -852,6 +854,13 @@ static bool pf_handle_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
return true;
}
+ if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_MMIO)) {
+ ret = xe_gt_sriov_pf_migration_mmio_save(gt, vfid);
+ if (ret)
+ goto err;
+ return true;
+ }
+
xe_gt_sriov_pf_control_vf_data_eof(gt, vfid);
pf_exit_vf_save_wip(gt, vfid);
pf_enter_vf_saved(gt, vfid);
@@ -873,6 +882,8 @@ static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
if (xe_gt_sriov_pf_migration_ggtt_size(gt, vfid) > 0)
pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT);
+ if (xe_gt_sriov_pf_migration_mmio_size(gt, vfid) > 0)
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_MMIO);
pf_queue_vf(gt, vfid);
return true;
}
@@ -983,6 +994,8 @@ static int pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid,
switch (data->type) {
case XE_SRIOV_MIG_DATA_GGTT:
return xe_gt_sriov_pf_migration_ggtt_restore(gt, vfid, data);
+ case XE_SRIOV_MIG_DATA_MMIO:
+ return xe_gt_sriov_pf_migration_mmio_restore(gt, vfid, data);
case XE_SRIOV_MIG_DATA_GUC:
return xe_gt_sriov_pf_migration_guc_restore(gt, vfid, data);
default:
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
index c94ff0258306a..f8647722bfb3c 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
@@ -73,6 +73,7 @@ enum xe_gt_sriov_control_bits {
XE_GT_SRIOV_STATE_SAVE_WIP,
XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
XE_GT_SRIOV_STATE_SAVE_DATA_GGTT,
+ XE_GT_SRIOV_STATE_SAVE_DATA_MMIO,
XE_GT_SRIOV_STATE_SAVE_FAILED,
XE_GT_SRIOV_STATE_SAVED,
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index 92ecf47e71bc7..43e6e1abb92f9 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -7,6 +7,7 @@
#include "abi/guc_actions_sriov_abi.h"
#include "xe_bo.h"
+#include "xe_gt_sriov_pf.h"
#include "xe_gt_sriov_pf_config.h"
#include "xe_gt_sriov_pf_control.h"
#include "xe_gt_sriov_pf_helpers.h"
@@ -379,6 +380,102 @@ int xe_gt_sriov_pf_migration_guc_restore(struct xe_gt *gt, unsigned int vfid,
return pf_restore_vf_guc_state(gt, vfid, data);
}
+/**
+ * xe_gt_sriov_pf_migration_mmio_size() - Get the size of VF MMIO migration data.
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: size in bytes or a negative error code on failure.
+ */
+ssize_t xe_gt_sriov_pf_migration_mmio_size(struct xe_gt *gt, unsigned int vfid)
+{
+ return xe_gt_sriov_pf_mmio_vf_size(gt, vfid);
+}
+
+static int pf_save_vf_mmio_mig_data(struct xe_gt *gt, unsigned int vfid)
+{
+ struct xe_sriov_pf_migration_data *data;
+ size_t size;
+ int ret;
+
+ size = xe_gt_sriov_pf_migration_mmio_size(gt, vfid);
+ if (size == 0)
+ return 0;
+
+ data = xe_sriov_pf_migration_data_alloc(gt_to_xe(gt));
+ if (!data)
+ return -ENOMEM;
+
+ ret = xe_sriov_pf_migration_data_init(data, gt->tile->id, gt->info.id,
+ XE_SRIOV_MIG_DATA_MMIO, 0, size);
+ if (ret)
+ goto fail;
+
+ ret = xe_gt_sriov_pf_mmio_vf_save(gt, vfid, data->vaddr, size);
+ if (ret)
+ goto fail;
+
+ pf_dump_mig_data(gt, vfid, data);
+
+ ret = xe_gt_sriov_pf_migration_ring_produce(gt, vfid, data);
+ if (ret)
+ goto fail;
+
+ return 0;
+
+fail:
+ xe_sriov_pf_migration_data_free(data);
+ xe_gt_sriov_err(gt, "Unable to save VF%u MMIO data (%d)\n", vfid, ret);
+ return ret;
+}
+
+static int pf_restore_vf_mmio_mig_data(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data)
+{
+ pf_dump_mig_data(gt, vfid, data);
+
+ return xe_gt_sriov_pf_mmio_vf_restore(gt, vfid, data->vaddr, data->size);
+}
+
+/**
+ * xe_gt_sriov_pf_migration_mmio_save() - Save VF MMIO migration data.
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_migration_mmio_save(struct xe_gt *gt, unsigned int vfid)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid != PFID);
+ xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
+
+ return pf_save_vf_mmio_mig_data(gt, vfid);
+}
+
+/**
+ * xe_gt_sriov_pf_migration_mmio_restore() - Restore VF MMIO migration data.
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ * @data: &struct xe_sriov_pf_migration_data with the migration data to restore
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_migration_mmio_restore(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid != PFID);
+ xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
+
+ return pf_restore_vf_mmio_mig_data(gt, vfid, data);
+}
+
/**
* xe_gt_sriov_pf_migration_size() - Total size of migration data from all components within a GT
* @gt: the &struct xe_gt
@@ -409,6 +506,13 @@ ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
size += sizeof(struct xe_sriov_pf_migration_hdr);
total += size;
+ size = xe_gt_sriov_pf_migration_mmio_size(gt, vfid);
+ if (size < 0)
+ return size;
+ else if (size > 0)
+ size += sizeof(struct xe_sriov_pf_migration_hdr);
+ total += size;
+
return total;
}
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
index 5bb8cba2ea0cb..66967da761254 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
@@ -20,6 +20,10 @@ ssize_t xe_gt_sriov_pf_migration_ggtt_size(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_ggtt_save(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
struct xe_sriov_pf_migration_data *data);
+ssize_t xe_gt_sriov_pf_migration_mmio_size(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_migration_mmio_save(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_migration_mmio_restore(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data);
ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
--
2.50.1
* [PATCH 21/26] drm/xe/pf: Add helper to retrieve VF's LMEM object
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (19 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 20/26] drm/xe/pf: Handle MMIO migration data as part of PF control Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 22/26] drm/xe/migrate: Add function for raw copy of VRAM and CCS Michał Winiarski
` (4 subsequent siblings)
25 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna
From: Lukasz Laguna <lukasz.laguna@intel.com>
Instead of accessing the VF's lmem_obj directly, introduce a helper
function that takes a reference to the object and makes the access more
convenient.
Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
---
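Not part of the patch, a sketch of the expected calling pattern (the
surrounding caller is hypothetical); the helper takes a reference that
the caller must drop:

	struct xe_bo *lmem_obj = xe_gt_sriov_pf_config_get_lmem_obj(gt, vfid);

	if (!lmem_obj)
		return -ENXIO;

	/* ... access the buffer object backing the VF's LMEM ... */

	xe_bo_put(lmem_obj);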
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 30 ++++++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h | 1 +
2 files changed, 31 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
index 51027921b2988..94e434cac5cda 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
@@ -1662,6 +1662,36 @@ int xe_gt_sriov_pf_config_bulk_set_lmem(struct xe_gt *gt, unsigned int vfid,
"LMEM", n, err);
}
+static struct xe_bo *pf_get_vf_config_lmem_obj(struct xe_gt *gt, unsigned int vfid)
+{
+ struct xe_gt_sriov_config *config = pf_pick_vf_config(gt, vfid);
+
+ return config->lmem_obj;
+}
+
+/**
+ * xe_gt_sriov_pf_config_get_lmem_obj - Take a reference to &struct xe_bo backing VF LMEM.
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function can only be called on PF.
+ *
+ * Return: pointer to &struct xe_bo backing VF LMEM.
+ */
+struct xe_bo *xe_gt_sriov_pf_config_get_lmem_obj(struct xe_gt *gt, unsigned int vfid)
+{
+ struct xe_bo *lmem_obj;
+
+ xe_gt_assert(gt, vfid);
+
+ mutex_lock(xe_gt_sriov_pf_master_mutex(gt));
+ lmem_obj = pf_get_vf_config_lmem_obj(gt, vfid);
+ xe_bo_get(lmem_obj);
+ mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
+
+ return lmem_obj;
+}
+
static u64 pf_query_free_lmem(struct xe_gt *gt)
{
struct xe_tile *tile = gt->tile;
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
index 6916b8f58ebf2..03c5dc0cd5fef 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
@@ -36,6 +36,7 @@ int xe_gt_sriov_pf_config_set_lmem(struct xe_gt *gt, unsigned int vfid, u64 size
int xe_gt_sriov_pf_config_set_fair_lmem(struct xe_gt *gt, unsigned int vfid, unsigned int num_vfs);
int xe_gt_sriov_pf_config_bulk_set_lmem(struct xe_gt *gt, unsigned int vfid, unsigned int num_vfs,
u64 size);
+struct xe_bo *xe_gt_sriov_pf_config_get_lmem_obj(struct xe_gt *gt, unsigned int vfid);
u32 xe_gt_sriov_pf_config_get_exec_quantum(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_config_set_exec_quantum(struct xe_gt *gt, unsigned int vfid, u32 exec_quantum);
--
2.50.1
* [PATCH 22/26] drm/xe/migrate: Add function for raw copy of VRAM and CCS
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (20 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 21/26] drm/xe/pf: Add helper to retrieve VF's LMEM object Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-12 18:54 ` Matthew Brost
2025-10-11 19:38 ` [PATCH 23/26] drm/xe/pf: Handle VRAM migration data as part of PF control Michał Winiarski
` (3 subsequent siblings)
25 siblings, 1 reply; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna
From: Lukasz Laguna <lukasz.laguna@intel.com>
Introduce a new function to copy data between VRAM and sysmem objects.
It's specifically designed for raw data copies, whereas the existing
xe_migrate_copy() is tailored for eviction and restore operations,
which involve additional logic. For instance, xe_migrate_copy() skips
CCS metadata copies on Xe2 dGPUs, as they are unnecessary in the
eviction scenario. However, in cases like VF migration, CCS metadata
has to be saved and restored in its raw form.
Additionally, xe_migrate_raw_vram_copy() allows copying not only entire
objects but also chunks of data, as well as copying the corresponding
CCS metadata to or from a dedicated buffer object, both of which are
essential for VF migration.
Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
---
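Not part of the patch, a hedged usage sketch (buffer object setup is
omitted; only xe_migrate_raw_vram_copy() is introduced here) that saves
a chunk of VRAM together with its CCS metadata and waits for completion:

	struct dma_fence *fence;
	long timeout;

	fence = xe_migrate_raw_vram_copy(vram_bo, 0, sysmem_bo, 0,
					 ccs_bo, 0, size, true);
	if (IS_ERR(fence))
		return PTR_ERR(fence);

	timeout = dma_fence_wait_timeout(fence, false, 5 * HZ);
	dma_fence_put(fence);
	if (!timeout)
		return -ETIME;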
drivers/gpu/drm/xe/xe_migrate.c | 214 +++++++++++++++++++++++++++++++-
drivers/gpu/drm/xe/xe_migrate.h | 4 +
2 files changed, 217 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index 7345a5b65169a..3f8804a2f4ee2 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -501,7 +501,7 @@ int xe_migrate_init(struct xe_migrate *m)
static u64 max_mem_transfer_per_pass(struct xe_device *xe)
{
- if (!IS_DGFX(xe) && xe_device_has_flat_ccs(xe))
+ if ((!IS_DGFX(xe) || IS_SRIOV_PF(xe)) && xe_device_has_flat_ccs(xe))
return MAX_CCS_LIMITED_TRANSFER;
return MAX_PREEMPTDISABLE_TRANSFER;
@@ -1142,6 +1142,218 @@ struct xe_exec_queue *xe_migrate_exec_queue(struct xe_migrate *migrate)
return migrate->q;
}
+/**
+ * xe_migrate_raw_vram_copy() - Raw copy of VRAM object and corresponding CCS.
+ * @vram_bo: The VRAM buffer object.
+ * @vram_offset: The VRAM offset.
+ * @sysmem_bo: The sysmem buffer object. If copying only CCS metadata set this
+ * to NULL.
+ * @sysmem_offset: The sysmem offset.
+ * @ccs_bo: The CCS buffer object located in sysmem. If copying of CCS metadata
+ * is not needed set this to NULL.
+ * @ccs_offset: The CCS offset.
+ * @size: The size of VRAM chunk to copy.
+ * @to_sysmem: True to copy from VRAM to sysmem, false for opposite direction.
+ *
+ * Copies the content of buffer object from or to VRAM. If supported and
+ * needed, it also copies corresponding CCS metadata.
+ *
+ * Return: Pointer to a dma_fence representing the last copy batch, or
+ * an error pointer on failure. If there is a failure, any copy operation
+ * started by the function call has been synced.
+ */
+struct dma_fence *xe_migrate_raw_vram_copy(struct xe_bo *vram_bo, u64 vram_offset,
+ struct xe_bo *sysmem_bo, u64 sysmem_offset,
+ struct xe_bo *ccs_bo, u64 ccs_offset,
+ u64 size, bool to_sysmem)
+{
+ struct xe_device *xe = xe_bo_device(vram_bo);
+ struct xe_tile *tile = vram_bo->tile;
+ struct xe_gt *gt = tile->primary_gt;
+ struct xe_migrate *m = tile->migrate;
+ struct dma_fence *fence = NULL;
+ struct ttm_resource *vram = vram_bo->ttm.resource, *sysmem, *ccs;
+ struct xe_res_cursor vram_it, sysmem_it, ccs_it;
+ u64 vram_L0_ofs, sysmem_L0_ofs;
+ u32 vram_L0_pt, sysmem_L0_pt;
+ u64 vram_L0, sysmem_L0;
+ bool copy_content = sysmem_bo ? true : false;
+ bool copy_ccs = ccs_bo ? true : false;
+ bool use_comp_pat = copy_content && to_sysmem &&
+ xe_device_has_flat_ccs(xe) && GRAPHICS_VER(xe) >= 20;
+ int pass = 0;
+ int err;
+
+ if (!copy_content && !copy_ccs)
+ return ERR_PTR(-EINVAL);
+
+ if (!IS_ALIGNED(vram_offset | sysmem_offset | ccs_offset | size, PAGE_SIZE))
+ return ERR_PTR(-EINVAL);
+
+ if (!xe_bo_is_vram(vram_bo))
+ return ERR_PTR(-EINVAL);
+
+ if (range_overflows(vram_offset, size, (u64)vram_bo->ttm.base.size))
+ return ERR_PTR(-EOVERFLOW);
+
+ if (copy_content) {
+ if (xe_bo_is_vram(sysmem_bo))
+ return ERR_PTR(-EINVAL);
+ if (range_overflows(sysmem_offset, size, (u64)sysmem_bo->ttm.base.size))
+ return ERR_PTR(-EOVERFLOW);
+ }
+
+ if (copy_ccs) {
+ if (xe_bo_is_vram(ccs_bo))
+ return ERR_PTR(-EINVAL);
+ if (!xe_device_has_flat_ccs(xe))
+ return ERR_PTR(-EOPNOTSUPP);
+ if (ccs_bo->ttm.base.size < xe_device_ccs_bytes(xe, size))
+ return ERR_PTR(-EINVAL);
+ if (range_overflows(ccs_offset, (u64)xe_device_ccs_bytes(xe, size),
+ (u64)ccs_bo->ttm.base.size))
+ return ERR_PTR(-EOVERFLOW);
+ }
+
+ xe_res_first(vram, vram_offset, size, &vram_it);
+
+ if (copy_content) {
+ sysmem = sysmem_bo->ttm.resource;
+ xe_res_first_sg(xe_bo_sg(sysmem_bo), sysmem_offset, size, &sysmem_it);
+ }
+
+ if (copy_ccs) {
+ ccs = ccs_bo->ttm.resource;
+ xe_res_first_sg(xe_bo_sg(ccs_bo), ccs_offset, xe_device_ccs_bytes(xe, size),
+ &ccs_it);
+ }
+
+ while (size) {
+ u32 pte_flags = PTE_UPDATE_FLAG_IS_VRAM;
+ u32 batch_size = 2; /* arb_clear() + MI_BATCH_BUFFER_END */
+ struct xe_sched_job *job;
+ struct xe_bb *bb;
+ u32 flush_flags = 0;
+ u32 update_idx;
+ u64 ccs_ofs, ccs_size;
+ u32 ccs_pt;
+
+ bool usm = xe->info.has_usm;
+ u32 avail_pts = max_mem_transfer_per_pass(xe) / LEVEL0_PAGE_TABLE_ENCODE_SIZE;
+
+ vram_L0 = xe_migrate_res_sizes(m, &vram_it);
+
+ if (copy_content) {
+ sysmem_L0 = xe_migrate_res_sizes(m, &sysmem_it);
+ vram_L0 = min(vram_L0, sysmem_L0);
+ }
+
+ drm_dbg(&xe->drm, "Pass %u, size: %llu\n", pass++, vram_L0);
+
+ pte_flags |= use_comp_pat ? PTE_UPDATE_FLAG_IS_COMP_PTE : 0;
+ batch_size += pte_update_size(m, pte_flags, vram, &vram_it, &vram_L0,
+ &vram_L0_ofs, &vram_L0_pt, 0, 0, avail_pts);
+ if (copy_content) {
+ batch_size += pte_update_size(m, 0, sysmem, &sysmem_it, &vram_L0,
+ &sysmem_L0_ofs, &sysmem_L0_pt, 0, avail_pts,
+ avail_pts);
+ }
+
+ if (copy_ccs) {
+ ccs_size = xe_device_ccs_bytes(xe, vram_L0);
+ batch_size += pte_update_size(m, 0, NULL, &ccs_it, &ccs_size, &ccs_ofs,
+ &ccs_pt, 0, copy_content ? 2 * avail_pts :
+ avail_pts, avail_pts);
+ xe_assert(xe, IS_ALIGNED(ccs_it.start, PAGE_SIZE));
+ }
+
+ batch_size += copy_content ? EMIT_COPY_DW : 0;
+ batch_size += copy_ccs ? EMIT_COPY_CCS_DW : 0;
+
+ bb = xe_bb_new(gt, batch_size, usm);
+ if (IS_ERR(bb)) {
+ err = PTR_ERR(bb);
+ goto err_sync;
+ }
+
+ if (xe_migrate_allow_identity(vram_L0, &vram_it))
+ xe_res_next(&vram_it, vram_L0);
+ else
+ emit_pte(m, bb, vram_L0_pt, true, use_comp_pat, &vram_it, vram_L0, vram);
+
+ if (copy_content)
+ emit_pte(m, bb, sysmem_L0_pt, false, false, &sysmem_it, vram_L0, sysmem);
+
+ if (copy_ccs)
+ emit_pte(m, bb, ccs_pt, false, false, &ccs_it, ccs_size, ccs);
+
+ bb->cs[bb->len++] = MI_BATCH_BUFFER_END;
+ update_idx = bb->len;
+
+ if (copy_content)
+ emit_copy(gt, bb, to_sysmem ? vram_L0_ofs : sysmem_L0_ofs, to_sysmem ?
+ sysmem_L0_ofs : vram_L0_ofs, vram_L0, XE_PAGE_SIZE);
+
+ if (copy_ccs) {
+ emit_copy_ccs(gt, bb, to_sysmem ? ccs_ofs : vram_L0_ofs, !to_sysmem,
+ to_sysmem ? vram_L0_ofs : ccs_ofs, to_sysmem, vram_L0);
+ flush_flags = to_sysmem ? 0 : MI_FLUSH_DW_CCS;
+ }
+
+ job = xe_bb_create_migration_job(m->q, bb, xe_migrate_batch_base(m, usm),
+ update_idx);
+ if (IS_ERR(job)) {
+ err = PTR_ERR(job);
+ goto err;
+ }
+
+ xe_sched_job_add_migrate_flush(job, flush_flags | MI_INVALIDATE_TLB);
+ if (!fence) {
+ err = xe_sched_job_add_deps(job, vram_bo->ttm.base.resv,
+ DMA_RESV_USAGE_BOOKKEEP);
+ if (!err && copy_content)
+ err = xe_sched_job_add_deps(job, sysmem_bo->ttm.base.resv,
+ DMA_RESV_USAGE_BOOKKEEP);
+ if (!err && copy_ccs)
+ err = xe_sched_job_add_deps(job, ccs_bo->ttm.base.resv,
+ DMA_RESV_USAGE_BOOKKEEP);
+ if (err)
+ goto err_job;
+ }
+
+ mutex_lock(&m->job_mutex);
+ xe_sched_job_arm(job);
+ dma_fence_put(fence);
+ fence = dma_fence_get(&job->drm.s_fence->finished);
+ xe_sched_job_push(job);
+
+ dma_fence_put(m->fence);
+ m->fence = dma_fence_get(fence);
+
+ mutex_unlock(&m->job_mutex);
+
+ xe_bb_free(bb, fence);
+ size -= vram_L0;
+ continue;
+
+err_job:
+ xe_sched_job_put(job);
+err:
+ xe_bb_free(bb, NULL);
+
+err_sync:
+ /* Sync partial copy if any. FIXME: under job_mutex? */
+ if (fence) {
+ dma_fence_wait(fence, false);
+ dma_fence_put(fence);
+ }
+
+ return ERR_PTR(err);
+ }
+
+ return fence;
+}
+
static void emit_clear_link_copy(struct xe_gt *gt, struct xe_bb *bb, u64 src_ofs,
u32 size, u32 pitch)
{
diff --git a/drivers/gpu/drm/xe/xe_migrate.h b/drivers/gpu/drm/xe/xe_migrate.h
index 4fad324b62535..0d8944b1cee61 100644
--- a/drivers/gpu/drm/xe/xe_migrate.h
+++ b/drivers/gpu/drm/xe/xe_migrate.h
@@ -131,6 +131,10 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q,
struct xe_lrc *xe_migrate_lrc(struct xe_migrate *migrate);
struct xe_exec_queue *xe_migrate_exec_queue(struct xe_migrate *migrate);
+struct dma_fence *xe_migrate_raw_vram_copy(struct xe_bo *vram_bo, u64 vram_offset,
+ struct xe_bo *sysmem_bo, u64 sysmem_offset,
+ struct xe_bo *ccs_bo, u64 ccs_offset,
+ u64 size, bool to_sysmem);
int xe_migrate_access_memory(struct xe_migrate *m, struct xe_bo *bo,
unsigned long offset, void *buf, int len,
int write);
--
2.50.1
* [PATCH 23/26] drm/xe/pf: Handle VRAM migration data as part of PF control
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (21 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 22/26] drm/xe/migrate: Add function for raw copy of VRAM and CCS Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 24/26] drm/xe/pf: Add wait helper for VF FLR Michał Winiarski
` (2 subsequent siblings)
25 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
Connect the helpers to allow saving and restoring VRAM migration data
as part of the stop_copy / resume device states.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
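Not part of the patch, a condensed view of the restore-side validation
added below: each VRAM packet is self-describing via hdr.offset /
hdr.size, so chunks can be written back independently once their bounds
are checked (including wrap-around):

	u64 end = data->hdr.offset + data->hdr.size;

	/* reject chunks outside the VF LMEM object, including overflow */
	if (end > xe_bo_size(vram) || end < data->hdr.size)
		return -EINVAL;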
drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 13 +
.../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 1 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 228 ++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 4 +
4 files changed, 246 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
index 7f8f816c10f20..646914a3f7121 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
@@ -190,6 +190,7 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
CASE2STR(SAVE_DATA_GUC);
CASE2STR(SAVE_DATA_GGTT);
CASE2STR(SAVE_DATA_MMIO);
+ CASE2STR(SAVE_DATA_VRAM);
CASE2STR(SAVE_FAILED);
CASE2STR(SAVED);
CASE2STR(RESTORE_WIP);
@@ -805,6 +806,7 @@ void xe_gt_sriov_pf_control_vf_data_eof(struct xe_gt *gt, unsigned int vfid)
static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
{
+ pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_VRAM);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_MMIO);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT);
pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
@@ -861,6 +863,13 @@ static bool pf_handle_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
return true;
}
+ if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_VRAM)) {
+ ret = xe_gt_sriov_pf_migration_vram_save(gt, vfid);
+ if (ret)
+ goto err;
+ return true;
+ }
+
xe_gt_sriov_pf_control_vf_data_eof(gt, vfid);
pf_exit_vf_save_wip(gt, vfid);
pf_enter_vf_saved(gt, vfid);
@@ -884,6 +893,8 @@ static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT);
if (xe_gt_sriov_pf_migration_mmio_size(gt, vfid) > 0)
pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_MMIO);
+ if (xe_gt_sriov_pf_migration_vram_size(gt, vfid) > 0)
+ pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_VRAM);
pf_queue_vf(gt, vfid);
return true;
}
@@ -998,6 +1009,8 @@ static int pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid,
return xe_gt_sriov_pf_migration_mmio_restore(gt, vfid, data);
case XE_SRIOV_MIG_DATA_GUC:
return xe_gt_sriov_pf_migration_guc_restore(gt, vfid, data);
+ case XE_SRIOV_MIG_DATA_VRAM:
+ return xe_gt_sriov_pf_migration_vram_restore(gt, vfid, data);
default:
xe_gt_sriov_notice(gt, "Skipping VF%u invalid data type: %d\n", vfid, data->type);
pf_enter_vf_restore_failed(gt, vfid);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
index f8647722bfb3c..d7efe4a3bab92 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
@@ -74,6 +74,7 @@ enum xe_gt_sriov_control_bits {
XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
XE_GT_SRIOV_STATE_SAVE_DATA_GGTT,
XE_GT_SRIOV_STATE_SAVE_DATA_MMIO,
+ XE_GT_SRIOV_STATE_SAVE_DATA_VRAM,
XE_GT_SRIOV_STATE_SAVE_FAILED,
XE_GT_SRIOV_STATE_SAVED,
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
index 43e6e1abb92f9..d0beec25bc86c 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
@@ -15,6 +15,7 @@
#include "xe_gt_sriov_printk.h"
#include "xe_guc_buf.h"
#include "xe_guc_ct.h"
+#include "xe_migrate.h"
#include "xe_sriov.h"
#include "xe_sriov_pf_migration.h"
#include "xe_sriov_pf_migration_data.h"
@@ -476,6 +477,226 @@ int xe_gt_sriov_pf_migration_mmio_restore(struct xe_gt *gt, unsigned int vfid,
return pf_restore_vf_mmio_mig_data(gt, vfid, data);
}
+/**
+ * xe_gt_sriov_pf_migration_vram_size() - Get the size of VF VRAM migration data.
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: size in bytes or a negative error code on failure.
+ */
+ssize_t xe_gt_sriov_pf_migration_vram_size(struct xe_gt *gt, unsigned int vfid)
+{
+ if (gt != xe_root_mmio_gt(gt_to_xe(gt)))
+ return 0;
+
+ return xe_gt_sriov_pf_config_get_lmem(gt, vfid);
+}
+
+static struct dma_fence *__pf_save_restore_vram(struct xe_gt *gt, unsigned int vfid,
+ struct xe_bo *vram, u64 vram_offset,
+ struct xe_bo *sysmem, u64 sysmem_offset,
+ struct xe_bo *ccs, u64 ccs_offset,
+ size_t size, bool save)
+{
+ struct dma_fence *ret = NULL;
+ struct drm_exec exec;
+ int err;
+
+ xe_gt_assert(gt, sysmem || ccs);
+
+ drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
+ drm_exec_until_all_locked(&exec) {
+ err = drm_exec_lock_obj(&exec, &vram->ttm.base);
+ drm_exec_retry_on_contention(&exec);
+ if (err) {
+ ret = ERR_PTR(err);
+ goto err;
+ }
+
+ if (sysmem) {
+ err = drm_exec_lock_obj(&exec, &sysmem->ttm.base);
+ drm_exec_retry_on_contention(&exec);
+ if (err) {
+ ret = ERR_PTR(err);
+ goto err;
+ }
+ }
+
+ if (ccs) {
+ err = drm_exec_lock_obj(&exec, &ccs->ttm.base);
+ drm_exec_retry_on_contention(&exec);
+ if (err) {
+ ret = ERR_PTR(err);
+ goto err;
+ }
+ }
+ }
+
+ ret = xe_migrate_raw_vram_copy(vram, vram_offset,
+ sysmem, sysmem_offset,
+ ccs, ccs_offset,
+ size, save);
+
+err:
+ drm_exec_fini(&exec);
+
+ return ret;
+}
+
+static int pf_save_vram_chunk(struct xe_gt *gt, unsigned int vfid,
+ struct xe_bo *src_vram, u64 src_vram_offset,
+ size_t size)
+{
+ struct xe_sriov_pf_migration_data *data;
+ struct dma_fence *fence;
+ int ret;
+
+ data = xe_sriov_pf_migration_data_alloc(gt_to_xe(gt));
+ if (!data)
+ return -ENOMEM;
+
+ ret = xe_sriov_pf_migration_data_init(data, gt->tile->id, gt->info.id,
+ XE_SRIOV_MIG_DATA_VRAM, src_vram_offset, size);
+ if (ret)
+ goto fail;
+
+ fence = __pf_save_restore_vram(gt, vfid,
+ src_vram, src_vram_offset,
+ data->bo, 0,
+ NULL, 0, size, true);
+ if (IS_ERR(fence)) {
+ ret = PTR_ERR(fence);
+ goto fail;
+ }
+
+ ret = dma_fence_wait_timeout(fence, false, 5 * HZ);
+ dma_fence_put(fence);
+ if (!ret) {
+ ret = -ETIME;
+ goto fail;
+ }
+
+ pf_dump_mig_data(gt, vfid, data);
+
+ ret = xe_gt_sriov_pf_migration_ring_produce(gt, vfid, data);
+ if (ret)
+ goto fail;
+
+ return 0;
+
+fail:
+ xe_sriov_pf_migration_data_free(data);
+ return ret;
+}
+
+#define VF_VRAM_STATE_CHUNK_MAX_SIZE SZ_512M
+static int pf_save_vf_vram_mig_data(struct xe_gt *gt, unsigned int vfid)
+{
+ struct xe_bo *vram;
+ loff_t offset = 0;
+ size_t size;
+ int ret;
+
+ vram = xe_gt_sriov_pf_config_get_lmem_obj(gt, vfid);
+ if (!vram)
+ return -ENXIO;
+
+ size = xe_bo_size(vram);
+
+ while (size > 0) {
+ size_t chunk_size = min(size, VF_VRAM_STATE_CHUNK_MAX_SIZE);
+
+ ret = pf_save_vram_chunk(gt, vfid, vram, offset, chunk_size);
+ if (ret)
+ goto fail;
+
+ offset += chunk_size;
+ size -= chunk_size;
+ }
+
+ xe_bo_put(vram);
+
+ return 0;
+
+fail:
+ xe_bo_put(vram);
+ xe_gt_sriov_err(gt, "Unable to save VF%u VRAM data (%d)\n", vfid, ret);
+ return ret;
+}
+
+static int pf_restore_vf_vram_mig_data(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data)
+{
+ u64 end = data->hdr.offset + data->hdr.size;
+ struct dma_fence *fence;
+ struct xe_bo *vram;
+ size_t size;
+ int ret = 0;
+
+ vram = xe_gt_sriov_pf_config_get_lmem_obj(gt, vfid);
+ if (!vram)
+ return -ENXIO;
+
+ size = xe_bo_size(vram);
+
+ if (end > size || end < data->hdr.size) {
+ ret = -EINVAL;
+ goto err;
+ }
+
+ pf_dump_mig_data(gt, vfid, data);
+
+ fence = __pf_save_restore_vram(gt, vfid, vram, data->hdr.offset,
+ data->bo, 0,
+ NULL, 0, data->hdr.size, false);
+ if (IS_ERR(fence)) {
+ ret = PTR_ERR(fence);
+ goto err;
+ }
+
+ ret = dma_fence_wait_timeout(fence, false, 5 * HZ);
+ dma_fence_put(fence);
+ if (!ret) {
+ ret = -ETIME;
+ goto err;
+ }
+
+ xe_bo_put(vram);
+
+ return 0;
+err:
+ xe_bo_put(vram);
+ return ret;
+}
+
+/**
+ * xe_gt_sriov_pf_migration_vram_save() - Save VF VRAM migration data.
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_migration_vram_save(struct xe_gt *gt, unsigned int vfid)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid != PFID);
+ xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
+
+ return pf_save_vf_vram_mig_data(gt, vfid);
+}
+
+/**
+ * xe_gt_sriov_pf_migration_vram_restore() - Restore VF VRAM migration data.
+ * @gt: the &struct xe_gt
+ * @vfid: the VF identifier
+ * @data: &struct xe_sriov_pf_migration_data with the migration data to restore
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_migration_vram_restore(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ xe_gt_assert(gt, vfid != PFID);
+ xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
+
+ return pf_restore_vf_vram_mig_data(gt, vfid, data);
+}
+
/**
* xe_gt_sriov_pf_migration_size() - Total size of migration data from all components within a GT
* @gt: the &struct xe_gt
@@ -513,6 +734,13 @@ ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
size += sizeof(struct xe_sriov_pf_migration_hdr);
total += size;
+ size = xe_gt_sriov_pf_migration_vram_size(gt, vfid);
+ if (size < 0)
+ return size;
+ else if (size > 0)
+ size += sizeof(struct xe_sriov_pf_migration_hdr);
+ total += size;
+
return total;
}
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
index 66967da761254..c6e6821042619 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
@@ -24,6 +24,10 @@ ssize_t xe_gt_sriov_pf_migration_mmio_size(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_mmio_save(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_migration_mmio_restore(struct xe_gt *gt, unsigned int vfid,
struct xe_sriov_pf_migration_data *data);
+ssize_t xe_gt_sriov_pf_migration_vram_size(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_migration_vram_save(struct xe_gt *gt, unsigned int vfid);
+int xe_gt_sriov_pf_migration_vram_restore(struct xe_gt *gt, unsigned int vfid,
+ struct xe_sriov_pf_migration_data *data);
ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
--
2.50.1
* [PATCH 24/26] drm/xe/pf: Add wait helper for VF FLR
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (22 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 23/26] drm/xe/pf: Handle VRAM migration data as part of PF control Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-13 13:49 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 25/26] drm/xe/pf: Export helpers for VFIO Michał Winiarski
2025-10-11 19:38 ` [PATCH 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics Michał Winiarski
25 siblings, 1 reply; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
VF FLR requires additional processing done by the PF driver.
Add a helper that can be used as part of the VF driver's .reset_done()
handling.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
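Not part of the patch, a sketch of the intended consumer, the variant
driver's .reset_done() handler; this matches what patch 26 wires up via
the VFIO export layer added in patch 25:

	static void xe_vfio_pci_reset_done(struct pci_dev *pdev)
	{
		struct xe_vfio_pci_core_device *xe_vdev = pci_get_drvdata(pdev);
		int ret;

		ret = xe_sriov_vfio_wait_flr_done(xe_vdev->pf, xe_vdev->vfid);
		if (ret)
			dev_err(&pdev->dev, "Failed to wait for FLR: %d\n", ret);
	}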
drivers/gpu/drm/xe/xe_sriov_pf_control.c | 24 ++++++++++++++++++++++++
drivers/gpu/drm/xe/xe_sriov_pf_control.h | 1 +
2 files changed, 25 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
index 10e1f18aa8b11..24845644f269e 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_control.c
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
@@ -122,6 +122,30 @@ int xe_sriov_pf_control_reset_vf(struct xe_device *xe, unsigned int vfid)
return result;
}
+/**
+ * xe_sriov_pf_control_wait_flr() - Wait for a VF reset (FLR) to complete.
+ * @xe: the &xe_device
+ * @vfid: the VF identifier
+ *
+ * This function is for PF only.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_pf_control_wait_flr(struct xe_device *xe, unsigned int vfid)
+{
+ struct xe_gt *gt;
+ unsigned int id;
+ int result = 0;
+ int err;
+
+ for_each_gt(gt, xe, id) {
+ err = xe_gt_sriov_pf_control_wait_flr(gt, vfid);
+ result = result ? -EUCLEAN : err;
+ }
+
+ return result;
+}
+
/**
* xe_sriov_pf_control_sync_flr() - Synchronize a VF FLR between all GTs.
* @xe: the &xe_device
diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_sriov_pf_control.h
index 512fd21d87c1e..c8ea54768cfaa 100644
--- a/drivers/gpu/drm/xe/xe_sriov_pf_control.h
+++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.h
@@ -12,6 +12,7 @@ int xe_sriov_pf_control_pause_vf(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_resume_vf(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_stop_vf(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_reset_vf(struct xe_device *xe, unsigned int vfid);
+int xe_sriov_pf_control_wait_flr(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_sync_flr(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_save_vf(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_control_wait_save_vf(struct xe_device *xe, unsigned int vfid);
--
2.50.1
* [PATCH 25/26] drm/xe/pf: Export helpers for VFIO
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (23 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 24/26] drm/xe/pf: Add wait helper for VF FLR Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-12 18:32 ` Matthew Brost
2025-10-13 14:02 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics Michał Winiarski
25 siblings, 2 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
A vendor-specific VFIO driver for Xe will implement VF migration.
Export everything that's needed to implement the migration ops.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
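Not part of the patch, a hedged sketch of how the variant driver (patch
26) is expected to drive a STOP -> STOP_COPY transition on top of these
exports (pf_pdev and vfid here come from the hypothetical caller):

	int ret;

	ret = xe_sriov_vfio_stop(pf_pdev, vfid);
	if (ret)
		return ret;

	ret = xe_sriov_vfio_stop_copy_enter(pf_pdev, vfid);
	if (ret)
		return ret;

	/* userspace then pulls the stream via xe_sriov_vfio_data_read() */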
drivers/gpu/drm/xe/Makefile | 2 +
drivers/gpu/drm/xe/xe_sriov_vfio.c | 252 +++++++++++++++++++++++++++++
include/drm/intel/xe_sriov_vfio.h | 28 ++++
3 files changed, 282 insertions(+)
create mode 100644 drivers/gpu/drm/xe/xe_sriov_vfio.c
create mode 100644 include/drm/intel/xe_sriov_vfio.h
diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
index e253d65366de4..a5c5afff42aa6 100644
--- a/drivers/gpu/drm/xe/Makefile
+++ b/drivers/gpu/drm/xe/Makefile
@@ -181,6 +181,8 @@ xe-$(CONFIG_PCI_IOV) += \
xe_sriov_pf_service.o \
xe_tile_sriov_pf_debugfs.o
+xe-$(CONFIG_XE_VFIO_PCI) += xe_sriov_vfio.o
+
# include helpers for tests even when XE is built-in
ifdef CONFIG_DRM_XE_KUNIT_TEST
xe-y += tests/xe_kunit_helpers.o
diff --git a/drivers/gpu/drm/xe/xe_sriov_vfio.c b/drivers/gpu/drm/xe/xe_sriov_vfio.c
new file mode 100644
index 0000000000000..a510d1bde93f0
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_sriov_vfio.c
@@ -0,0 +1,252 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include <drm/intel/xe_sriov_vfio.h>
+
+#include "xe_pm.h"
+#include "xe_sriov.h"
+#include "xe_sriov_pf_control.h"
+#include "xe_sriov_pf_migration.h"
+#include "xe_sriov_pf_migration_data.h"
+
+/**
+ * xe_sriov_vfio_migration_supported() - Check if migration is supported.
+ * @pdev: PF PCI device
+ *
+ * Return: true if migration is supported, false otherwise.
+ */
+bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+
+ if (!IS_SRIOV_PF(xe))
+ return false;
+
+ return xe_sriov_pf_migration_supported(xe);
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_migration_supported, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_wait_flr_done - Wait for VF FLR completion.
+ * @pdev: PF PCI device
+ * @vfid: VF identifier
+ *
+ * This function will wait until VF FLR is processed by PF on all tiles (or
+ * until timeout occurs).
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ return xe_sriov_pf_control_wait_flr(xe, vfid);
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_wait_flr_done, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_stop - Stop VF.
+ * @pdev: PF PCI device
+ * @vfid: VF identifier
+ *
+ * This function will pause VF on all tiles/GTs.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+ int ret;
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ xe_pm_runtime_get(xe);
+ ret = xe_sriov_pf_control_pause_vf(xe, vfid);
+ xe_pm_runtime_put(xe);
+
+ return ret;
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_run - Run VF.
+ * @pdev: PF PCI device
+ * @vfid: VF identifier
+ *
+ * This function will resume VF on all tiles.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+ int ret;
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ xe_pm_runtime_get(xe);
+ ret = xe_sriov_pf_control_resume_vf(xe, vfid);
+ xe_pm_runtime_put(xe);
+
+ return ret;
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_run, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_stop_copy_enter - Copy VF migration data from device (while stopped).
+ * @pdev: PF PCI device
+ * @vfid: VF identifier
+ *
+ * This function will save VF migration data on all tiles.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+ int ret;
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ xe_pm_runtime_get(xe);
+ ret = xe_sriov_pf_control_save_vf(xe, vfid);
+ xe_pm_runtime_put(xe);
+
+ return ret;
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_enter, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_stop_copy_exit - Wait until VF migration data save is done.
+ * @pdev: PF PCI device
+ * @vfid: VF identifier
+ *
+ * This function will wait until VF migration data is saved on all tiles.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_vfio_stop_copy_exit(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+ int ret;
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ xe_pm_runtime_get(xe);
+ ret = xe_sriov_pf_control_wait_save_vf(xe, vfid);
+ xe_pm_runtime_put(xe);
+
+ return ret;
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_exit, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_resume_enter - Copy VF migration data to device (while stopped).
+ * @pdev: PF PCI device
+ * @vfid: VF identifier
+ *
+ * This function will restore VF migration data on all tiles.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_vfio_resume_enter(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+ int ret;
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ xe_pm_runtime_get(xe);
+ ret = xe_sriov_pf_control_restore_vf(xe, vfid);
+ xe_pm_runtime_put(xe);
+
+ return ret;
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_resume_enter, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_resume_exit - Wait until VF migration data is copied to the device.
+ * @pdev: PF PCI device
+ * @vfid: VF identifier
+ *
+ * This function will wait until VF migration data is restored on all tiles.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_vfio_resume_exit(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+ int ret;
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ xe_pm_runtime_get(xe);
+ ret = xe_sriov_pf_control_wait_restore_vf(xe, vfid);
+ xe_pm_runtime_put(xe);
+
+ return ret;
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_resume_exit, "xe-vfio-pci");
+
+/**
+ * xe_sriov_vfio_error - Move VF to error state.
+ * @pdev: PF PCI device
+ * @vfid: VF identifier
+ *
+ * This function will stop VF on all tiles.
+ * Reset is needed to move it out of error state.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_sriov_vfio_error(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+ int ret;
+
+ if (!IS_SRIOV_PF(xe))
+ return -ENODEV;
+
+ xe_pm_runtime_get(xe);
+ ret = xe_sriov_pf_control_stop_vf(xe, vfid);
+ xe_pm_runtime_put(xe);
+
+ return ret;
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_error, "xe-vfio-pci");
+
+ssize_t xe_sriov_vfio_data_read(struct pci_dev *pdev, unsigned int vfid,
+ char __user *buf, size_t len)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+
+ return xe_sriov_pf_migration_data_read(xe, vfid, buf, len);
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_data_read, "xe-vfio-pci");
+
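+/**
+ * xe_sriov_vfio_data_write() - Write migration data to the VF.
+ * @pdev: PF PCI device
+ * @vfid: VF identifier
+ * @buf: userspace buffer with the migration data
+ * @len: size of the data to write
+ *
+ * Return: number of bytes written on success or a negative error code on failure.
+ */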
+ssize_t xe_sriov_vfio_data_write(struct pci_dev *pdev, unsigned int vfid,
+ const char __user *buf, size_t len)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+
+ return xe_sriov_pf_migration_data_write(xe, vfid, buf, len);
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_data_write, "xe-vfio-pci");
+
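+/**
+ * xe_sriov_vfio_stop_copy_size() - Get the total size of VF migration data.
+ * @pdev: PF PCI device
+ * @vfid: VF identifier
+ *
+ * Return: migration data size in bytes or a negative error code on failure.
+ */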
+ssize_t xe_sriov_vfio_stop_copy_size(struct pci_dev *pdev, unsigned int vfid)
+{
+ struct xe_device *xe = pci_get_drvdata(pdev);
+
+ return xe_sriov_pf_migration_size(xe, vfid);
+}
+EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_size, "xe-vfio-pci");
diff --git a/include/drm/intel/xe_sriov_vfio.h b/include/drm/intel/xe_sriov_vfio.h
new file mode 100644
index 0000000000000..24e272f84c0e6
--- /dev/null
+++ b/include/drm/intel/xe_sriov_vfio.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_SRIOV_VFIO_H_
+#define _XE_SRIOV_VFIO_H_
+
+#include <linux/types.h>
+
+struct pci_dev;
+
+bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev);
+int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid);
+int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid);
+int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid);
+int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid);
+int xe_sriov_vfio_stop_copy_exit(struct pci_dev *pdev, unsigned int vfid);
+int xe_sriov_vfio_resume_enter(struct pci_dev *pdev, unsigned int vfid);
+int xe_sriov_vfio_resume_exit(struct pci_dev *pdev, unsigned int vfid);
+int xe_sriov_vfio_error(struct pci_dev *pdev, unsigned int vfid);
+ssize_t xe_sriov_vfio_data_read(struct pci_dev *pdev, unsigned int vfid,
+ char __user *buf, size_t len);
+ssize_t xe_sriov_vfio_data_write(struct pci_dev *pdev, unsigned int vfid,
+ const char __user *buf, size_t len);
+ssize_t xe_sriov_vfio_stop_copy_size(struct pci_dev *pdev, unsigned int vfid);
+
+#endif /* _XE_SRIOV_VFIO_H_ */
--
2.50.1
* [PATCH 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
` (24 preceding siblings ...)
2025-10-11 19:38 ` [PATCH 25/26] drm/xe/pf: Export helpers for VFIO Michał Winiarski
@ 2025-10-11 19:38 ` Michał Winiarski
2025-10-13 19:00 ` Rodrigo Vivi
2025-10-21 23:03 ` Jason Gunthorpe
25 siblings, 2 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-11 19:38 UTC (permalink / raw)
To: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Michal Wajdeczko, Jani Nikula,
Joonas Lahtinen, Tvrtko Ursulin, David Airlie, Simona Vetter,
Lukasz Laguna, Michał Winiarski
In addition to generic VFIO PCI functionality, the driver implements
the VFIO migration uAPI, allowing userspace to enable migration for
Intel Graphics SR-IOV Virtual Functions.
The driver binds to the VF device and uses the API exposed by the Xe
driver bound to the PF device to control VF device state and to
transfer the migration data.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
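Not part of the patch, a hedged userspace sketch of entering STOP_COPY
through the generic VFIO migration uAPI (nothing xe-specific; the helper
name is hypothetical):

	#include <errno.h>
	#include <sys/ioctl.h>
	#include <linux/vfio.h>

	static int mig_enter_stop_copy(int device_fd)
	{
		/* u64 backing keeps the trailing feature payload aligned */
		__u64 buf[(sizeof(struct vfio_device_feature) +
			   sizeof(struct vfio_device_feature_mig_state) + 7) / 8] = {};
		struct vfio_device_feature *feature = (void *)buf;
		struct vfio_device_feature_mig_state *state = (void *)feature->data;

		feature->argsz = sizeof(buf);
		feature->flags = VFIO_DEVICE_FEATURE_SET |
				 VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE;
		state->device_state = VFIO_DEVICE_STATE_STOP_COPY;

		if (ioctl(device_fd, VFIO_DEVICE_FEATURE, feature))
			return -errno;

		/* the migration stream is then read() from this fd until EOF */
		return state->data_fd;
	}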
MAINTAINERS | 7 +
drivers/vfio/pci/Kconfig | 2 +
drivers/vfio/pci/Makefile | 2 +
drivers/vfio/pci/xe/Kconfig | 12 +
drivers/vfio/pci/xe/Makefile | 3 +
drivers/vfio/pci/xe/main.c | 470 +++++++++++++++++++++++++++++++++++
6 files changed, 496 insertions(+)
create mode 100644 drivers/vfio/pci/xe/Kconfig
create mode 100644 drivers/vfio/pci/xe/Makefile
create mode 100644 drivers/vfio/pci/xe/main.c
diff --git a/MAINTAINERS b/MAINTAINERS
index d46e9f2aaf2ad..ce84b021e6679 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -26567,6 +26567,13 @@ L: virtualization@lists.linux.dev
S: Maintained
F: drivers/vfio/pci/virtio
+VFIO XE PCI DRIVER
+M: Michał Winiarski <michal.winiarski@intel.com>
+L: kvm@vger.kernel.org
+L: intel-xe@lists.freedesktop.org
+S: Supported
+F: drivers/vfio/pci/xe
+
VGA_SWITCHEROO
R: Lukas Wunner <lukas@wunner.de>
S: Maintained
diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index 2b0172f546652..c100f0ab87f2d 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -67,4 +67,6 @@ source "drivers/vfio/pci/nvgrace-gpu/Kconfig"
source "drivers/vfio/pci/qat/Kconfig"
+source "drivers/vfio/pci/xe/Kconfig"
+
endmenu
diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index cf00c0a7e55c8..f5d46aa9347b9 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -19,3 +19,5 @@ obj-$(CONFIG_VIRTIO_VFIO_PCI) += virtio/
obj-$(CONFIG_NVGRACE_GPU_VFIO_PCI) += nvgrace-gpu/
obj-$(CONFIG_QAT_VFIO_PCI) += qat/
+
+obj-$(CONFIG_XE_VFIO_PCI) += xe/
diff --git a/drivers/vfio/pci/xe/Kconfig b/drivers/vfio/pci/xe/Kconfig
new file mode 100644
index 0000000000000..787be88268685
--- /dev/null
+++ b/drivers/vfio/pci/xe/Kconfig
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config XE_VFIO_PCI
+ tristate "VFIO support for Intel Graphics"
+ depends on DRM_XE
+ select VFIO_PCI_CORE
+ help
+ This option enables vendor-specific VFIO driver for Intel Graphics.
+ In addition to generic VFIO PCI functionality, it implements VFIO
+ migration uAPI allowing userspace to enable migration for
+ Intel Graphics SR-IOV Virtual Functions supported by the Xe driver.
+
+ If you don't know what to do here, say N.
diff --git a/drivers/vfio/pci/xe/Makefile b/drivers/vfio/pci/xe/Makefile
new file mode 100644
index 0000000000000..13aa0fd192cd4
--- /dev/null
+++ b/drivers/vfio/pci/xe/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0-only
+obj-$(CONFIG_XE_VFIO_PCI) += xe-vfio-pci.o
+xe-vfio-pci-y := main.o
diff --git a/drivers/vfio/pci/xe/main.c b/drivers/vfio/pci/xe/main.c
new file mode 100644
index 0000000000000..b9109b6812eb2
--- /dev/null
+++ b/drivers/vfio/pci/xe/main.c
@@ -0,0 +1,470 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#include <linux/anon_inodes.h>
+#include <linux/delay.h>
+#include <linux/file.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/sizes.h>
+#include <linux/types.h>
+#include <linux/vfio.h>
+#include <linux/vfio_pci_core.h>
+
+#include <drm/intel/xe_sriov_vfio.h>
+
+/**
+ * struct xe_vfio_pci_migration_file - file used for reading / writing migration data
+ */
+struct xe_vfio_pci_migration_file {
+ /** @filp: pointer to underlying &struct file */
+ struct file *filp;
+ /** @lock: serializes accesses to migration data */
+ struct mutex lock;
+ /** @xe_vdev: backpointer to &struct xe_vfio_pci_core_device */
+ struct xe_vfio_pci_core_device *xe_vdev;
+};
+
+/**
+ * struct xe_vfio_pci_core_device - xe-specific vfio_pci_core_device
+ *
+ * Top level structure of xe_vfio_pci.
+ */
+struct xe_vfio_pci_core_device {
+ /** @core_device: vendor-agnostic VFIO device */
+ struct vfio_pci_core_device core_device;
+
+ /** @mig_state: current device migration state */
+ enum vfio_device_mig_state mig_state;
+
+ /** @vfid: VF number used by PF, xe uses 1-based indexing for vfid */
+ unsigned int vfid;
+
+ /** @pf: pointer to driver_private of physical function */
+ struct pci_dev *pf;
+
+ /** @fd: &struct xe_vfio_pci_migration_file for userspace to read/write migration data */
+ struct xe_vfio_pci_migration_file *fd;
+};
+
+#define xe_vdev_to_dev(xe_vdev) (&(xe_vdev)->core_device.pdev->dev)
+#define xe_vdev_to_pdev(xe_vdev) ((xe_vdev)->core_device.pdev)
+
+static void xe_vfio_pci_disable_file(struct xe_vfio_pci_migration_file *migf)
+{
+ struct xe_vfio_pci_core_device *xe_vdev = migf->xe_vdev;
+
+ mutex_lock(&migf->lock);
+ xe_vdev->fd = NULL;
+ mutex_unlock(&migf->lock);
+}
+
+static void xe_vfio_pci_reset(struct xe_vfio_pci_core_device *xe_vdev)
+{
+ if (xe_vdev->fd)
+ xe_vfio_pci_disable_file(xe_vdev->fd);
+
+ xe_vdev->mig_state = VFIO_DEVICE_STATE_RUNNING;
+}
+
+static void xe_vfio_pci_reset_done(struct pci_dev *pdev)
+{
+ struct xe_vfio_pci_core_device *xe_vdev = pci_get_drvdata(pdev);
+ int ret;
+
+ ret = xe_sriov_vfio_wait_flr_done(xe_vdev->pf, xe_vdev->vfid);
+ if (ret)
+ dev_err(&pdev->dev, "Failed to wait for FLR: %d\n", ret);
+
+ xe_vfio_pci_reset(xe_vdev);
+}
+
+static const struct pci_error_handlers xe_vfio_pci_err_handlers = {
+ .reset_done = xe_vfio_pci_reset_done,
+};
+
+static int xe_vfio_pci_open_device(struct vfio_device *core_vdev)
+{
+ struct xe_vfio_pci_core_device *xe_vdev =
+ container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
+ struct vfio_pci_core_device *vdev = &xe_vdev->core_device;
+ int ret;
+
+ ret = vfio_pci_core_enable(vdev);
+ if (ret)
+ return ret;
+
+ vfio_pci_core_finish_enable(vdev);
+
+ return 0;
+}
+
+static int xe_vfio_pci_release_file(struct inode *inode, struct file *filp)
+{
+ struct xe_vfio_pci_migration_file *migf = filp->private_data;
+
+ xe_vfio_pci_disable_file(migf);
+ mutex_destroy(&migf->lock);
+ kfree(migf);
+
+ return 0;
+}
+
+static ssize_t xe_vfio_pci_save_read(struct file *filp, char __user *buf, size_t len, loff_t *pos)
+{
+ struct xe_vfio_pci_migration_file *migf = filp->private_data;
+ ssize_t ret;
+
+ if (pos)
+ return -ESPIPE;
+
+ mutex_lock(&migf->lock);
+ ret = xe_sriov_vfio_data_read(migf->xe_vdev->pf, migf->xe_vdev->vfid, buf, len);
+ mutex_unlock(&migf->lock);
+
+ return ret;
+}
+
+static const struct file_operations xe_vfio_pci_save_fops = {
+ .owner = THIS_MODULE,
+ .read = xe_vfio_pci_save_read,
+ .release = xe_vfio_pci_release_file,
+ .llseek = noop_llseek,
+};
+
+static ssize_t xe_vfio_pci_resume_write(struct file *filp, const char __user *buf,
+ size_t len, loff_t *pos)
+{
+ struct xe_vfio_pci_migration_file *migf = filp->private_data;
+ ssize_t ret;
+
+ if (pos)
+ return -ESPIPE;
+
+ mutex_lock(&migf->lock);
+ ret = xe_sriov_vfio_data_write(migf->xe_vdev->pf, migf->xe_vdev->vfid, buf, len);
+ mutex_unlock(&migf->lock);
+
+ return ret;
+}
+
+static const struct file_operations xe_vfio_pci_resume_fops = {
+ .owner = THIS_MODULE,
+ .write = xe_vfio_pci_resume_write,
+ .release = xe_vfio_pci_release_file,
+ .llseek = noop_llseek,
+};
+
+static const char *vfio_dev_state_str(u32 state)
+{
+ switch (state) {
+ case VFIO_DEVICE_STATE_RUNNING: return "running";
+ case VFIO_DEVICE_STATE_RUNNING_P2P: return "running_p2p";
+ case VFIO_DEVICE_STATE_STOP_COPY: return "stopcopy";
+ case VFIO_DEVICE_STATE_STOP: return "stop";
+ case VFIO_DEVICE_STATE_RESUMING: return "resuming";
+ case VFIO_DEVICE_STATE_ERROR: return "error";
+ default: return "";
+ }
+}
+
+enum xe_vfio_pci_file_type {
+ XE_VFIO_FILE_SAVE = 0,
+ XE_VFIO_FILE_RESUME,
+};
+
+static struct xe_vfio_pci_migration_file *
+xe_vfio_pci_alloc_file(struct xe_vfio_pci_core_device *xe_vdev,
+ enum xe_vfio_pci_file_type type)
+{
+ struct xe_vfio_pci_migration_file *migf;
+ const struct file_operations *fops;
+ int flags;
+
+ migf = kzalloc(sizeof(*migf), GFP_KERNEL);
+ if (!migf)
+ return ERR_PTR(-ENOMEM);
+
+ fops = type == XE_VFIO_FILE_SAVE ? &xe_vfio_pci_save_fops : &xe_vfio_pci_resume_fops;
+ flags = type == XE_VFIO_FILE_SAVE ? O_RDONLY : O_WRONLY;
+ migf->filp = anon_inode_getfile("xe_vfio_mig", fops, migf, flags);
+ if (IS_ERR(migf->filp)) {
+ struct file *filp = migf->filp;
+
+ kfree(migf);
+ return ERR_CAST(filp);
+ }
+
+ mutex_init(&migf->lock);
+ migf->xe_vdev = xe_vdev;
+ xe_vdev->fd = migf;
+
+ stream_open(migf->filp->f_inode, migf->filp);
+
+ return migf;
+}
+
+static struct file *
+xe_vfio_set_state(struct xe_vfio_pci_core_device *xe_vdev, u32 new)
+{
+ u32 cur = xe_vdev->mig_state;
+ int ret;
+
+ dev_dbg(xe_vdev_to_dev(xe_vdev),
+ "state: %s->%s\n", vfio_dev_state_str(cur), vfio_dev_state_str(new));
+
+ /*
+ * "STOP" handling is reused for "RUNNING_P2P", as the device doesn't have the capability to
+ * selectively block p2p DMA transfers.
+ * The device is not processing new workload requests when the VF is stopped, and both
+ * memory and MMIO communication channels are transferred to destination (where processing
+ * will be resumed).
+ */
+ if ((cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_STOP) ||
+ (cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_RUNNING_P2P)) {
+ ret = xe_sriov_vfio_stop(xe_vdev->pf, xe_vdev->vfid);
+ if (ret)
+ goto err;
+
+ return NULL;
+ }
+
+ if ((cur == VFIO_DEVICE_STATE_RUNNING_P2P && new == VFIO_DEVICE_STATE_STOP) ||
+ (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RUNNING_P2P))
+ return NULL;
+
+ if ((cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RUNNING) ||
+ (cur == VFIO_DEVICE_STATE_RUNNING_P2P && new == VFIO_DEVICE_STATE_RUNNING)) {
+ ret = xe_sriov_vfio_run(xe_vdev->pf, xe_vdev->vfid);
+ if (ret)
+ goto err;
+
+ return NULL;
+ }
+
+ if (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_STOP_COPY) {
+ struct xe_vfio_pci_migration_file *migf;
+
+ migf = xe_vfio_pci_alloc_file(xe_vdev, XE_VFIO_FILE_SAVE);
+ if (IS_ERR(migf)) {
+ ret = PTR_ERR(migf);
+ goto err;
+ }
+
+ ret = xe_sriov_vfio_stop_copy_enter(xe_vdev->pf, xe_vdev->vfid);
+ if (ret) {
+ fput(migf->filp);
+ goto err;
+ }
+
+ return migf->filp;
+ }
+
+ if (cur == VFIO_DEVICE_STATE_STOP_COPY && new == VFIO_DEVICE_STATE_STOP) {
+ if (xe_vdev->fd)
+ xe_vfio_pci_disable_file(xe_vdev->fd);
+
+ xe_sriov_vfio_stop_copy_exit(xe_vdev->pf, xe_vdev->vfid);
+
+ return NULL;
+ }
+
+ if (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RESUMING) {
+ struct xe_vfio_pci_migration_file *migf;
+
+ migf = xe_vfio_pci_alloc_file(xe_vdev, XE_VFIO_FILE_RESUME);
+ if (IS_ERR(migf)) {
+ ret = PTR_ERR(migf);
+ goto err;
+ }
+
+ ret = xe_sriov_vfio_resume_enter(xe_vdev->pf, xe_vdev->vfid);
+ if (ret) {
+ fput(migf->filp);
+ goto err;
+ }
+
+ return migf->filp;
+ }
+
+ if (cur == VFIO_DEVICE_STATE_RESUMING && new == VFIO_DEVICE_STATE_STOP) {
+ if (xe_vdev->fd)
+ xe_vfio_pci_disable_file(xe_vdev->fd);
+
+ xe_sriov_vfio_resume_exit(xe_vdev->pf, xe_vdev->vfid);
+
+ return NULL;
+ }
+
+ if (new == VFIO_DEVICE_STATE_ERROR)
+ xe_sriov_vfio_error(xe_vdev->pf, xe_vdev->vfid);
+
+ WARN(true, "Unknown state transition %d->%d\n", cur, new);
+ return ERR_PTR(-EINVAL);
+
+err:
+ dev_dbg(xe_vdev_to_dev(xe_vdev),
+ "Failed to transition state: %s->%s err=%d\n",
+ vfio_dev_state_str(cur), vfio_dev_state_str(new), ret);
+ return ERR_PTR(ret);
+}
+
+static struct file *
+xe_vfio_pci_set_device_state(struct vfio_device *core_vdev,
+ enum vfio_device_mig_state new_state)
+{
+ struct xe_vfio_pci_core_device *xe_vdev =
+ container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
+ enum vfio_device_mig_state next_state;
+ struct file *f = NULL;
+ int ret;
+
+ while (new_state != xe_vdev->mig_state) {
+ ret = vfio_mig_get_next_state(core_vdev, xe_vdev->mig_state,
+ new_state, &next_state);
+ if (ret) {
+ f = ERR_PTR(ret);
+ break;
+ }
+ f = xe_vfio_set_state(xe_vdev, next_state);
+ if (IS_ERR(f))
+ break;
+
+ xe_vdev->mig_state = next_state;
+
+ /* Multiple state transitions with non-NULL file in the middle */
+ if (f && new_state != xe_vdev->mig_state) {
+ fput(f);
+ f = ERR_PTR(-EINVAL);
+ break;
+ }
+ }
+
+ return f;
+}
+
+static int xe_vfio_pci_get_device_state(struct vfio_device *core_vdev,
+ enum vfio_device_mig_state *curr_state)
+{
+ struct xe_vfio_pci_core_device *xe_vdev =
+ container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
+
+ *curr_state = xe_vdev->mig_state;
+
+ return 0;
+}
+
+static int xe_vfio_pci_get_data_size(struct vfio_device *vdev,
+ unsigned long *stop_copy_length)
+{
+ struct xe_vfio_pci_core_device *xe_vdev =
+ container_of(vdev, struct xe_vfio_pci_core_device, core_device.vdev);
+ ssize_t size;
+
+ size = xe_sriov_vfio_stop_copy_size(xe_vdev->pf, xe_vdev->vfid);
+ if (size < 0)
+ return size;
+
+ *stop_copy_length = size;
+
+ return 0;
+}
+
+static const struct vfio_migration_ops xe_vfio_pci_migration_ops = {
+ .migration_set_state = xe_vfio_pci_set_device_state,
+ .migration_get_state = xe_vfio_pci_get_device_state,
+ .migration_get_data_size = xe_vfio_pci_get_data_size,
+};
+
+static void xe_vfio_pci_migration_init(struct vfio_device *core_vdev)
+{
+ struct xe_vfio_pci_core_device *xe_vdev =
+ container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
+ struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
+
+ if (!xe_sriov_vfio_migration_supported(pdev->physfn))
+ return;
+
+ /* vfid starts from 1 for xe */
+ xe_vdev->vfid = pci_iov_vf_id(pdev) + 1;
+ xe_vdev->pf = pdev->physfn;
+
+ core_vdev->migration_flags = VFIO_MIGRATION_STOP_COPY | VFIO_MIGRATION_P2P;
+ core_vdev->mig_ops = &xe_vfio_pci_migration_ops;
+}
+
+static int xe_vfio_pci_init_dev(struct vfio_device *core_vdev)
+{
+ struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
+
+ if (pdev->is_virtfn && strcmp(pdev->physfn->dev.driver->name, "xe") == 0)
+ xe_vfio_pci_migration_init(core_vdev);
+
+ return vfio_pci_core_init_dev(core_vdev);
+}
+
+static const struct vfio_device_ops xe_vfio_pci_ops = {
+ .name = "xe-vfio-pci",
+ .init = xe_vfio_pci_init_dev,
+ .release = vfio_pci_core_release_dev,
+ .open_device = xe_vfio_pci_open_device,
+ .close_device = vfio_pci_core_close_device,
+ .ioctl = vfio_pci_core_ioctl,
+ .device_feature = vfio_pci_core_ioctl_feature,
+ .read = vfio_pci_core_read,
+ .write = vfio_pci_core_write,
+ .mmap = vfio_pci_core_mmap,
+ .request = vfio_pci_core_request,
+ .match = vfio_pci_core_match,
+ .match_token_uuid = vfio_pci_core_match_token_uuid,
+ .bind_iommufd = vfio_iommufd_physical_bind,
+ .unbind_iommufd = vfio_iommufd_physical_unbind,
+ .attach_ioas = vfio_iommufd_physical_attach_ioas,
+ .detach_ioas = vfio_iommufd_physical_detach_ioas,
+};
+
+static int xe_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+ struct xe_vfio_pci_core_device *xe_vdev;
+ int ret;
+
+ xe_vdev = vfio_alloc_device(xe_vfio_pci_core_device, core_device.vdev, &pdev->dev,
+ &xe_vfio_pci_ops);
+ if (IS_ERR(xe_vdev))
+ return PTR_ERR(xe_vdev);
+
+ dev_set_drvdata(&pdev->dev, &xe_vdev->core_device);
+
+ ret = vfio_pci_core_register_device(&xe_vdev->core_device);
+ if (ret) {
+ vfio_put_device(&xe_vdev->core_device.vdev);
+ return ret;
+ }
+
+ return 0;
+}
+
+static void xe_vfio_pci_remove(struct pci_dev *pdev)
+{
+ struct xe_vfio_pci_core_device *xe_vdev = pci_get_drvdata(pdev);
+
+ vfio_pci_core_unregister_device(&xe_vdev->core_device);
+ vfio_put_device(&xe_vdev->core_device.vdev);
+}
+
+static const struct pci_device_id xe_vfio_pci_table[] = {
+ { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_ANY_ID),
+ .class = PCI_BASE_CLASS_DISPLAY << 8, .class_mask = 0xff << 16,
+ .override_only = PCI_ID_F_VFIO_DRIVER_OVERRIDE },
+ {}
+};
+MODULE_DEVICE_TABLE(pci, xe_vfio_pci_table);
+
+static struct pci_driver xe_vfio_pci_driver = {
+ .name = "xe-vfio-pci",
+ .id_table = xe_vfio_pci_table,
+ .probe = xe_vfio_pci_probe,
+ .remove = xe_vfio_pci_remove,
+ .err_handler = &xe_vfio_pci_err_handlers,
+ .driver_managed_dma = true,
+};
+module_pci_driver(xe_vfio_pci_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Intel Corporation");
+MODULE_DESCRIPTION("VFIO PCI driver with migration support for Intel Graphics");
--
2.50.1
^ permalink raw reply related [flat|nested] 82+ messages in thread
* Re: [PATCH 07/26] drm/xe/pf: Add support for encap/decap of bitstream to/from packet
2025-10-11 19:38 ` [PATCH 07/26] drm/xe/pf: Add support for encap/decap of bitstream to/from packet Michał Winiarski
@ 2025-10-11 22:28 ` kernel test robot
2025-10-13 10:46 ` Michal Wajdeczko
1 sibling, 0 replies; 82+ messages in thread
From: kernel test robot @ 2025-10-11 22:28 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: llvm, oe-kbuild-all, dri-devel, Matthew Brost, Michal Wajdeczko,
Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin, David Airlie,
Simona Vetter, Lukasz Laguna, Michał Winiarski
Hi Michał,
kernel test robot noticed the following build warnings:
[auto build test WARNING on drm-xe/drm-xe-next]
[also build test WARNING on next-20251010]
[cannot apply to awilliam-vfio/next drm-i915/for-linux-next drm-i915/for-linux-next-fixes linus/master awilliam-vfio/for-linus v6.17]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Micha-Winiarski/drm-xe-pf-Remove-GuC-version-check-for-migration-support/20251012-034301
base: https://gitlab.freedesktop.org/drm/xe/kernel.git drm-xe-next
patch link: https://lore.kernel.org/r/20251011193847.1836454-8-michal.winiarski%40intel.com
patch subject: [PATCH 07/26] drm/xe/pf: Add support for encap/decap of bitstream to/from packet
config: riscv-randconfig-002-20251012 (https://download.01.org/0day-ci/archive/20251012/202510120631.vW6dpp07-lkp@intel.com/config)
compiler: clang version 22.0.0git (https://github.com/llvm/llvm-project 39f292ffa13d7ca0d1edff27ac8fd55024bb4d19)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251012/202510120631.vW6dpp07-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510120631.vW6dpp07-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> Warning: drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c:272 function parameter 'xe' not described in 'xe_sriov_pf_migration_data_read'
>> Warning: drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c:382 function parameter 'xe' not described in 'xe_sriov_pf_migration_data_write'
>> Warning: drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c:467 function parameter 'xe' not described in 'xe_sriov_pf_migration_data_save_init'
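(Warnings of this kind are typically resolved by describing the missing
parameter in the function's kernel-doc block; a minimal sketch, with the
exact wording left up to the author:

	/**
	 * xe_sriov_pf_migration_data_read() - Read VF migration data.
	 * @xe: the &xe_device
	 * @vfid: VF identifier
	 * @buf: start address of the userspace buffer
	 * @len: requested read size
	 */

The same applies to the _write() and _save_init() variants.)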
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 08/26] drm/xe/pf: Add minimalistic migration descriptor
2025-10-11 19:38 ` [PATCH 08/26] drm/xe/pf: Add minimalistic migration descriptor Michał Winiarski
@ 2025-10-11 22:52 ` kernel test robot
2025-10-13 10:56 ` Michal Wajdeczko
1 sibling, 0 replies; 82+ messages in thread
From: kernel test robot @ 2025-10-11 22:52 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: llvm, oe-kbuild-all, dri-devel, Matthew Brost, Michal Wajdeczko,
Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin, David Airlie,
Simona Vetter, Lukasz Laguna, Michał Winiarski
Hi Michał,
kernel test robot noticed the following build warnings:
[auto build test WARNING on drm-xe/drm-xe-next]
[also build test WARNING on next-20251010]
[cannot apply to awilliam-vfio/next drm-i915/for-linux-next drm-i915/for-linux-next-fixes linus/master awilliam-vfio/for-linus v6.17]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Micha-Winiarski/drm-xe-pf-Remove-GuC-version-check-for-migration-support/20251012-034301
base: https://gitlab.freedesktop.org/drm/xe/kernel.git drm-xe-next
patch link: https://lore.kernel.org/r/20251011193847.1836454-9-michal.winiarski%40intel.com
patch subject: [PATCH 08/26] drm/xe/pf: Add minimalistic migration descriptor
config: riscv-randconfig-002-20251012 (https://download.01.org/0day-ci/archive/20251012/202510120634.LMJaAJ9S-lkp@intel.com/config)
compiler: clang version 22.0.0git (https://github.com/llvm/llvm-project 39f292ffa13d7ca0d1edff27ac8fd55024bb4d19)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251012/202510120634.LMJaAJ9S-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510120634.LMJaAJ9S-lkp@intel.com/
All warnings (new ones prefixed by >>):
Warning: drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c:273 function parameter 'xe' not described in 'xe_sriov_pf_migration_data_read'
Warning: drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c:383 function parameter 'xe' not described in 'xe_sriov_pf_migration_data_write'
>> Warning: drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c:457 function parameter 'xe' not described in 'xe_sriov_pf_migration_data_process_desc'
Warning: drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c:545 function parameter 'xe' not described in 'xe_sriov_pf_migration_data_save_init'
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 11/26] drm/xe: Allow the caller to pass guc_buf_cache size
2025-10-11 19:38 ` [PATCH 11/26] drm/xe: Allow the caller to pass guc_buf_cache size Michał Winiarski
@ 2025-10-11 23:35 ` kernel test robot
2025-10-13 11:08 ` Michal Wajdeczko
1 sibling, 0 replies; 82+ messages in thread
From: kernel test robot @ 2025-10-11 23:35 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: llvm, oe-kbuild-all, dri-devel, Matthew Brost, Michal Wajdeczko,
Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin, David Airlie,
Simona Vetter, Lukasz Laguna, Michał Winiarski
Hi Michał,
kernel test robot noticed the following build errors:
[auto build test ERROR on drm-xe/drm-xe-next]
[also build test ERROR on next-20251010]
[cannot apply to awilliam-vfio/next drm-i915/for-linux-next drm-i915/for-linux-next-fixes linus/master awilliam-vfio/for-linus v6.17]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Micha-Winiarski/drm-xe-pf-Remove-GuC-version-check-for-migration-support/20251012-034301
base: https://gitlab.freedesktop.org/drm/xe/kernel.git drm-xe-next
patch link: https://lore.kernel.org/r/20251011193847.1836454-12-michal.winiarski%40intel.com
patch subject: [PATCH 11/26] drm/xe: Allow the caller to pass guc_buf_cache size
config: riscv-randconfig-002-20251012 (https://download.01.org/0day-ci/archive/20251012/202510120724.osgbcJi5-lkp@intel.com/config)
compiler: clang version 22.0.0git (https://github.com/llvm/llvm-project 39f292ffa13d7ca0d1edff27ac8fd55024bb4d19)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251012/202510120724.osgbcJi5-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510120724.osgbcJi5-lkp@intel.com/
All error/warnings (new ones prefixed by >>):
In file included from drivers/gpu/drm/xe/xe_guc_buf.c:180:
>> drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c:75:61: error: too many arguments provided to function-like macro invocation
75 | KUNIT_ASSERT_EQ(test, 0, xe_guc_buf_cache_init(&guc->buf), SZ_8K);
| ^
include/kunit/test.h:1358:9: note: macro 'KUNIT_ASSERT_EQ' defined here
1358 | #define KUNIT_ASSERT_EQ(test, left, right) \
| ^
In file included from drivers/gpu/drm/xe/xe_guc_buf.c:180:
>> drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c:75:2: error: use of undeclared identifier 'KUNIT_ASSERT_EQ'; did you mean 'KUNIT_ASSERTION'?
75 | KUNIT_ASSERT_EQ(test, 0, xe_guc_buf_cache_init(&guc->buf), SZ_8K);
| ^~~~~~~~~~~~~~~
| KUNIT_ASSERTION
include/kunit/assert.h:27:2: note: 'KUNIT_ASSERTION' declared here
27 | KUNIT_ASSERTION,
| ^
In file included from drivers/gpu/drm/xe/xe_guc_buf.c:180:
>> drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c:75:2: warning: expression result unused [-Wunused-value]
75 | KUNIT_ASSERT_EQ(test, 0, xe_guc_buf_cache_init(&guc->buf), SZ_8K);
| ^~~~~~~~~~~~~~~
1 warning and 2 errors generated.
vim +75 drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c
51
52 static int guc_buf_test_init(struct kunit *test)
53 {
54 struct xe_pci_fake_data fake = {
55 .sriov_mode = XE_SRIOV_MODE_PF,
56 .platform = XE_TIGERLAKE, /* some random platform */
57 .subplatform = XE_SUBPLATFORM_NONE,
58 };
59 struct xe_ggtt *ggtt;
60 struct xe_guc *guc;
61
62 test->priv = &fake;
63 xe_kunit_helper_xe_device_test_init(test);
64
65 ggtt = xe_device_get_root_tile(test->priv)->mem.ggtt;
66 guc = &xe_device_get_gt(test->priv, 0)->uc.guc;
67
68 KUNIT_ASSERT_EQ(test, 0,
69 xe_ggtt_init_kunit(ggtt, DUT_GGTT_START,
70 DUT_GGTT_START + DUT_GGTT_SIZE));
71
72 kunit_activate_static_stub(test, xe_managed_bo_create_pin_map,
73 replacement_xe_managed_bo_create_pin_map);
74
> 75 KUNIT_ASSERT_EQ(test, 0, xe_guc_buf_cache_init(&guc->buf), SZ_8K);
76
77 test->priv = &guc->buf;
78 return 0;
79 }
80
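Judging from the patch subject ("Allow the caller to pass guc_buf_cache
size"), the closing parenthesis is presumably just misplaced and the
intended call is something like:

	KUNIT_ASSERT_EQ(test, 0, xe_guc_buf_cache_init(&guc->buf, SZ_8K));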
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 10/26] drm/xe: Add sa/guc_buf_cache sync interface
2025-10-11 19:38 ` [PATCH 10/26] drm/xe: Add sa/guc_buf_cache sync interface Michał Winiarski
@ 2025-10-12 18:06 ` Matthew Brost
2025-10-21 0:45 ` Michał Winiarski
2025-10-13 11:20 ` Michal Wajdeczko
1 sibling, 1 reply; 82+ messages in thread
From: Matthew Brost @ 2025-10-12 18:06 UTC (permalink / raw)
To: Michał Winiarski
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Sat, Oct 11, 2025 at 09:38:31PM +0200, Michał Winiarski wrote:
> In upcoming changes the cached buffers are going to be used to read data
> produced by the GuC. Add a counterpart to flush, which synchronizes the
> CPU side of the suballocation with the GPU data, and propagate the
> interface to the GuC Buffer Cache.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_guc_buf.c | 9 +++++++++
> drivers/gpu/drm/xe/xe_guc_buf.h | 1 +
> drivers/gpu/drm/xe/xe_sa.c | 21 +++++++++++++++++++++
> drivers/gpu/drm/xe/xe_sa.h | 1 +
> 4 files changed, 32 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_guc_buf.c b/drivers/gpu/drm/xe/xe_guc_buf.c
> index 502ca3a4ee606..1be26145f0b98 100644
> --- a/drivers/gpu/drm/xe/xe_guc_buf.c
> +++ b/drivers/gpu/drm/xe/xe_guc_buf.c
> @@ -127,6 +127,15 @@ u64 xe_guc_buf_flush(const struct xe_guc_buf buf)
> return xe_sa_bo_gpu_addr(buf.sa);
> }
>
> +/**
> + * xe_guc_buf_sync() - Copy the data from the GPU memory to the sub-allocation.
> + * @buf: the &xe_guc_buf to sync
> + */
> +void xe_guc_buf_sync(const struct xe_guc_buf buf)
s/sync/sync_read ?
or
s/sync/flush_read ?
Patch itself LGTM.
Matt
> +{
> + xe_sa_bo_sync(buf.sa);
> +}
> +
> /**
> * xe_guc_buf_cpu_ptr() - Obtain a CPU pointer to the sub-allocation.
> * @buf: the &xe_guc_buf to query
> diff --git a/drivers/gpu/drm/xe/xe_guc_buf.h b/drivers/gpu/drm/xe/xe_guc_buf.h
> index 0d67604d96bdd..fe6b5ffe0d6eb 100644
> --- a/drivers/gpu/drm/xe/xe_guc_buf.h
> +++ b/drivers/gpu/drm/xe/xe_guc_buf.h
> @@ -31,6 +31,7 @@ static inline bool xe_guc_buf_is_valid(const struct xe_guc_buf buf)
>
> void *xe_guc_buf_cpu_ptr(const struct xe_guc_buf buf);
> u64 xe_guc_buf_flush(const struct xe_guc_buf buf);
> +void xe_guc_buf_sync(const struct xe_guc_buf buf);
> u64 xe_guc_buf_gpu_addr(const struct xe_guc_buf buf);
> u64 xe_guc_cache_gpu_addr_from_ptr(struct xe_guc_buf_cache *cache, const void *ptr, u32 size);
>
> diff --git a/drivers/gpu/drm/xe/xe_sa.c b/drivers/gpu/drm/xe/xe_sa.c
> index fedd017d6dd36..2115789c2bfb7 100644
> --- a/drivers/gpu/drm/xe/xe_sa.c
> +++ b/drivers/gpu/drm/xe/xe_sa.c
> @@ -110,6 +110,10 @@ struct drm_suballoc *__xe_sa_bo_new(struct xe_sa_manager *sa_manager, u32 size,
> return drm_suballoc_new(&sa_manager->base, size, gfp, true, 0);
> }
>
> +/**
> + * xe_sa_bo_flush_write() - Copy the data from the sub-allocation to the GPU memory.
> + * @sa_bo: the &drm_suballoc to flush
> + */
> void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo)
> {
> struct xe_sa_manager *sa_manager = to_xe_sa_manager(sa_bo->manager);
> @@ -123,6 +127,23 @@ void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo)
> drm_suballoc_size(sa_bo));
> }
>
> +/**
> + * xe_sa_bo_sync() - Copy the data from GPU memory to the sub-allocation.
> + * @sa_bo: the &drm_suballoc to sync
> + */
> +void xe_sa_bo_sync(struct drm_suballoc *sa_bo)
> +{
> + struct xe_sa_manager *sa_manager = to_xe_sa_manager(sa_bo->manager);
> + struct xe_device *xe = tile_to_xe(sa_manager->bo->tile);
> +
> + if (!sa_manager->bo->vmap.is_iomem)
> + return;
> +
> + xe_map_memcpy_from(xe, xe_sa_bo_cpu_addr(sa_bo), &sa_manager->bo->vmap,
> + drm_suballoc_soffset(sa_bo),
> + drm_suballoc_size(sa_bo));
> +}
> +
> void xe_sa_bo_free(struct drm_suballoc *sa_bo,
> struct dma_fence *fence)
> {
> diff --git a/drivers/gpu/drm/xe/xe_sa.h b/drivers/gpu/drm/xe/xe_sa.h
> index 99dbf0eea5402..28fd8bb6450c2 100644
> --- a/drivers/gpu/drm/xe/xe_sa.h
> +++ b/drivers/gpu/drm/xe/xe_sa.h
> @@ -37,6 +37,7 @@ static inline struct drm_suballoc *xe_sa_bo_new(struct xe_sa_manager *sa_manager
> }
>
> void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo);
> +void xe_sa_bo_sync(struct drm_suballoc *sa_bo);
> void xe_sa_bo_free(struct drm_suballoc *sa_bo, struct dma_fence *fence);
>
> static inline struct xe_sa_manager *
> --
> 2.50.1
>
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 01/26] drm/xe/pf: Remove GuC version check for migration support
2025-10-11 19:38 ` [PATCH 01/26] drm/xe/pf: Remove GuC version check for migration support Michał Winiarski
@ 2025-10-12 18:31 ` Michal Wajdeczko
2025-10-20 14:46 ` Michał Winiarski
0 siblings, 1 reply; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-12 18:31 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> Since commit 4eb0aab6e4434 ("drm/xe/guc: Bump minimum required GuC
> version to v70.29.2"), the minimum GuC version required by the driver
> is v70.29.2, which should already include everything that we need for
> migration.
> Remove the version check.
>
> Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> index 44cc612b0a752..a5bf327ef8889 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> @@ -384,9 +384,6 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
>
> static bool pf_check_migration_support(struct xe_gt *gt)
> {
> - /* GuC 70.25 with save/restore v2 is required */
> - xe_gt_assert(gt, GUC_FIRMWARE_VER(&gt->uc.guc) >= MAKE_GUC_VER(70, 25, 0));
> -
alternatively we can move this assert to guc_action_vf_save_restore()
to double check we don't try that on older firmware, but either way,
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
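For reference, the relocated assert could look roughly like this at the
top of guc_action_vf_save_restore() (a sketch, assuming guc_to_gt() and
GUC_FIRMWARE_VER() are usable in that context):

	xe_gt_assert(guc_to_gt(guc),
		     GUC_FIRMWARE_VER(guc) >= MAKE_GUC_VER(70, 25, 0));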
> /* XXX: for now this is for feature enabling only */
> return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
> }
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 25/26] drm/xe/pf: Export helpers for VFIO
2025-10-11 19:38 ` [PATCH 25/26] drm/xe/pf: Export helpers for VFIO Michał Winiarski
@ 2025-10-12 18:32 ` Matthew Brost
2025-10-21 1:38 ` Michał Winiarski
2025-10-13 14:02 ` Michal Wajdeczko
1 sibling, 1 reply; 82+ messages in thread
From: Matthew Brost @ 2025-10-12 18:32 UTC (permalink / raw)
To: Michał Winiarski
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Sat, Oct 11, 2025 at 09:38:46PM +0200, Michał Winiarski wrote:
> Vendor-specific VFIO driver for Xe will implement VF migration.
> Export everything that's needed for migration ops.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/Makefile | 2 +
> drivers/gpu/drm/xe/xe_sriov_vfio.c | 252 +++++++++++++++++++++++++++++
> include/drm/intel/xe_sriov_vfio.h | 28 ++++
> 3 files changed, 282 insertions(+)
> create mode 100644 drivers/gpu/drm/xe/xe_sriov_vfio.c
> create mode 100644 include/drm/intel/xe_sriov_vfio.h
>
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index e253d65366de4..a5c5afff42aa6 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -181,6 +181,8 @@ xe-$(CONFIG_PCI_IOV) += \
> xe_sriov_pf_service.o \
> xe_tile_sriov_pf_debugfs.o
>
> +xe-$(CONFIG_XE_VFIO_PCI) += xe_sriov_vfio.o
> +
> # include helpers for tests even when XE is built-in
> ifdef CONFIG_DRM_XE_KUNIT_TEST
> xe-y += tests/xe_kunit_helpers.o
> diff --git a/drivers/gpu/drm/xe/xe_sriov_vfio.c b/drivers/gpu/drm/xe/xe_sriov_vfio.c
> new file mode 100644
> index 0000000000000..a510d1bde93f0
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_sriov_vfio.c
> @@ -0,0 +1,252 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#include <drm/intel/xe_sriov_vfio.h>
> +
> +#include "xe_pm.h"
> +#include "xe_sriov.h"
> +#include "xe_sriov_pf_control.h"
> +#include "xe_sriov_pf_migration.h"
> +#include "xe_sriov_pf_migration_data.h"
> +
> +/**
> + * xe_sriov_vfio_migration_supported() - Check if migration is supported.
> + * @pdev: PF PCI device
> + *
> + * Return: true if migration is supported, false otherwise.
> + */
> +bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> +
> + if (!IS_SRIOV_PF(xe))
> + return false;
> +
> + return xe_sriov_pf_migration_supported(xe);
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_migration_supported, "xe-vfio-pci");
> +
> +/**
> + * xe_sriov_vfio_wait_flr_done - Wait for VF FLR completion.
> + * @pdev: PF PCI device
> + * @vfid: VF identifier
> + *
> + * This function will wait until VF FLR is processed by PF on all tiles (or
> + * until timeout occurs).
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> +
> + if (!IS_SRIOV_PF(xe))
> + return -ENODEV;
> +
> + return xe_sriov_pf_control_wait_flr(xe, vfid);
Ideally I think you'd want the exported suffix to match on all these
functions.
i.e.,
xe_sriov_vfio_SUFFIX
xe_sriov_pf_control_SUFFIX
Maybe this doesn't make sense in all cases, so take it as a suggestion,
not a blocker.
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_wait_flr_done, "xe-vfio-pci");
> +
> +/**
> + * xe_sriov_vfio_stop - Stop VF.
> + * @pdev: PF PCI device
> + * @vfid: VF identifier
> + *
> + * This function will pause VF on all tiles/GTs.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> + int ret;
> +
> + if (!IS_SRIOV_PF(xe))
> + return -ENODEV;
> +
> + xe_pm_runtime_get(xe);
The PF must hold a PM ref on behalf of the VFs (right?), as VFs don't
have access to the runtime PM.
So either you can assert a PM ref is held here and drop the put / get or
use xe_pm_runtime_get_noresume here.
Exporting an interface that wakes runtime PM is IMO risky, as waking
runtime PM takes a bunch of locks which could create a problem at the
caller if it is holding locks; best to avoid this if possible.
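I.e. roughly (a sketch of the noresume variant, assuming the device is
guaranteed to be awake whenever VFIO calls in):

	xe_pm_runtime_get_noresume(xe);
	ret = xe_sriov_pf_control_pause_vf(xe, vfid);
	xe_pm_runtime_put(xe);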
> + ret = xe_sriov_pf_control_pause_vf(xe, vfid);
> + xe_pm_runtime_put(xe);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop, "xe-vfio-pci");
> +
> +/**
> + * xe_sriov_vfio_run - Run VF.
> + * @pdev: PF PCI device
> + * @vfid: VF identifier
> + *
> + * This function will resume VF on all tiles.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> + int ret;
> +
> + if (!IS_SRIOV_PF(xe))
> + return -ENODEV;
> +
> + xe_pm_runtime_get(xe);
> + ret = xe_sriov_pf_control_resume_vf(xe, vfid);
> + xe_pm_runtime_put(xe);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_run, "xe-vfio-pci");
> +
> +/**
> + * xe_sriov_vfio_stop_copy_enter - Copy VF migration data from device (while stopped).
> + * @pdev: PF PCI device
> + * @vfid: VF identifier
> + *
> + * This function will save VF migration data on all tiles.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> + int ret;
> +
> + if (!IS_SRIOV_PF(xe))
> + return -ENODEV;
> +
> + xe_pm_runtime_get(xe);
> + ret = xe_sriov_pf_control_save_vf(xe, vfid);
> + xe_pm_runtime_put(xe);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_enter, "xe-vfio-pci");
> +
> +/**
> + * xe_sriov_vfio_stop_copy_exit - Wait until VF migration data save is done.
> + * @pdev: PF PCI device
> + * @vfid: VF identifier
> + *
> + * This function will wait until VF migration data is saved on all tiles.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_vfio_stop_copy_exit(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> + int ret;
> +
> + if (!IS_SRIOV_PF(xe))
> + return -ENODEV;
> +
> + xe_pm_runtime_get(xe);
> + ret = xe_sriov_pf_control_wait_save_vf(xe, vfid);
> + xe_pm_runtime_put(xe);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_exit, "xe-vfio-pci");
> +
> +/**
> + * xe_sriov_vfio_resume_enter - Copy VF migration data to device (while stopped).
> + * @pdev: PF PCI device
> + * @vfid: VF identifier
> + *
> + * This function will restore VF migration data on all tiles.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_vfio_resume_enter(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> + int ret;
> +
> + if (!IS_SRIOV_PF(xe))
> + return -ENODEV;
> +
> + xe_pm_runtime_get(xe);
> + ret = xe_sriov_pf_control_restore_vf(xe, vfid);
> + xe_pm_runtime_put(xe);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_resume_enter, "xe-vfio-pci");
> +
> +/**
> + * xe_sriov_vfio_resume_exit - Wait until VF migration data is copied to the device.
> + * @pdev: PF PCI device
> + * @vfid: VF identifier
> + *
> + * This function will wait until VF migration data is restored on all tiles.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_vfio_resume_exit(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> + int ret;
> +
> + if (!IS_SRIOV_PF(xe))
> + return -ENODEV;
> +
> + xe_pm_runtime_get(xe);
> + ret = xe_sriov_pf_control_wait_restore_vf(xe, vfid);
> + xe_pm_runtime_put(xe);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_resume_exit, "xe-vfio-pci");
> +
> +/**
> + * xe_sriov_vfio_error - Move VF to error state.
> + * @pdev: PF PCI device
> + * @vfid: VF identifier
> + *
> + * This function will stop VF on all tiles.
> + * Reset is needed to move it out of error state.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_vfio_error(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> + int ret;
> +
> + if (!IS_SRIOV_PF(xe))
> + return -ENODEV;
> +
> + xe_pm_runtime_get(xe);
> + ret = xe_sriov_pf_control_stop_vf(xe, vfid);
> + xe_pm_runtime_put(xe);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_error, "xe-vfio-pci");
> +
Kernel doc for the below functions.
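E.g. for the read side, something like (a sketch; the Return: semantics
are assumed from the usual VFIO .read()/.write() model):

	/**
	 * xe_sriov_vfio_data_read() - Read migration data from the VF.
	 * @pdev: PF PCI device
	 * @vfid: VF identifier
	 * @buf: start address of the userspace buffer
	 * @len: requested read size
	 *
	 * Return: number of bytes read on success, 0 when no more migration
	 *         data is available, or a negative error code on failure.
	 */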
Matt
> +ssize_t xe_sriov_vfio_data_read(struct pci_dev *pdev, unsigned int vfid,
> + char __user *buf, size_t len)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> +
> + return xe_sriov_pf_migration_data_read(xe, vfid, buf, len);
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_data_read, "xe-vfio-pci");
> +
> +ssize_t xe_sriov_vfio_data_write(struct pci_dev *pdev, unsigned int vfid,
> + const char __user *buf, size_t len)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> +
> + return xe_sriov_pf_migration_data_write(xe, vfid, buf, len);
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_data_write, "xe-vfio-pci");
> +
> +ssize_t xe_sriov_vfio_stop_copy_size(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> +
> + return xe_sriov_pf_migration_size(xe, vfid);
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_size, "xe-vfio-pci");
> diff --git a/include/drm/intel/xe_sriov_vfio.h b/include/drm/intel/xe_sriov_vfio.h
> new file mode 100644
> index 0000000000000..24e272f84c0e6
> --- /dev/null
> +++ b/include/drm/intel/xe_sriov_vfio.h
> @@ -0,0 +1,28 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_SRIOV_VFIO_H_
> +#define _XE_SRIOV_VFIO_H_
> +
> +#include <linux/types.h>
> +
> +struct pci_dev;
> +
> +bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev);
> +int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_stop_copy_exit(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_resume_enter(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_resume_exit(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_error(struct pci_dev *pdev, unsigned int vfid);
> +ssize_t xe_sriov_vfio_data_read(struct pci_dev *pdev, unsigned int vfid,
> + char __user *buf, size_t len);
> +ssize_t xe_sriov_vfio_data_write(struct pci_dev *pdev, unsigned int vfid,
> + const char __user *buf, size_t len);
> +ssize_t xe_sriov_vfio_stop_copy_size(struct pci_dev *pdev, unsigned int vfid);
> +
> +#endif /* _XE_SRIOV_VFIO_H_ */
> --
> 2.50.1
>
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 22/26] drm/xe/migrate: Add function for raw copy of VRAM and CCS
2025-10-11 19:38 ` [PATCH 22/26] drm/xe/migrate: Add function for raw copy of VRAM and CCS Michał Winiarski
@ 2025-10-12 18:54 ` Matthew Brost
0 siblings, 0 replies; 82+ messages in thread
From: Matthew Brost @ 2025-10-12 18:54 UTC (permalink / raw)
To: Michał Winiarski
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Sat, Oct 11, 2025 at 09:38:43PM +0200, Michał Winiarski wrote:
> From: Lukasz Laguna <lukasz.laguna@intel.com>
>
> Introduce a new function to copy data between VRAM and sysmem objects.
> It's specifically designed for raw data copies, whereas the existing
> xe_migrate_copy() is tailored for eviction and restore operations,
> > which involve additional logic. For instance, xe_migrate_copy() skips
> > CCS metadata copies on Xe2 dGPUs, as it's unnecessary in the eviction
> > scenario. However, in cases like VF migration, CCS metadata has to be
> saved and restored in its raw form.
>
> Additionally, xe_migrate_raw_vram_copy() allows copying not only entire
> objects, but also chunks of data, as well as copying corresponding CCS
> metadata to or from a dedicated buffer object, which are essential in
> case of VF migration.
>
> Signed-off-by: Lukasz Laguna <lukasz.laguna@intel.com>
> ---
> drivers/gpu/drm/xe/xe_migrate.c | 214 +++++++++++++++++++++++++++++++-
> drivers/gpu/drm/xe/xe_migrate.h | 4 +
> 2 files changed, 217 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
> index 7345a5b65169a..3f8804a2f4ee2 100644
> --- a/drivers/gpu/drm/xe/xe_migrate.c
> +++ b/drivers/gpu/drm/xe/xe_migrate.c
> @@ -501,7 +501,7 @@ int xe_migrate_init(struct xe_migrate *m)
>
> static u64 max_mem_transfer_per_pass(struct xe_device *xe)
> {
> - if (!IS_DGFX(xe) && xe_device_has_flat_ccs(xe))
> + if ((!IS_DGFX(xe) || IS_SRIOV_PF(xe)) && xe_device_has_flat_ccs(xe))
> return MAX_CCS_LIMITED_TRANSFER;
>
> return MAX_PREEMPTDISABLE_TRANSFER;
> @@ -1142,6 +1142,218 @@ struct xe_exec_queue *xe_migrate_exec_queue(struct xe_migrate *migrate)
> return migrate->q;
> }
>
> +/**
> + * xe_migrate_raw_vram_copy() - Raw copy of VRAM object and corresponding CCS.
> + * @vram_bo: The VRAM buffer object.
> + * @vram_offset: The VRAM offset.
> + * @sysmem_bo: The sysmem buffer object. If copying only CCS metadata set this
> + * to NULL.
> + * @sysmem_offset: The sysmem offset.
> + * @ccs_bo: The CCS buffer object located in sysmem. If copying of CCS metadata
> + * is not needed set this to NULL.
> + * @ccs_offset: The CCS offset.
> + * @size: The size of VRAM chunk to copy.
> + * @to_sysmem: True to copy from VRAM to sysmem, false for opposite direction.
> + *
> + * Copies the content of buffer object from or to VRAM. If supported and
> + * needed, it also copies corresponding CCS metadata.
> + *
> + * Return: Pointer to a dma_fence representing the last copy batch, or
> + * an error pointer on failure. If there is a failure, any copy operation
> + * started by the function call has been synced.
> + */
> +struct dma_fence *xe_migrate_raw_vram_copy(struct xe_bo *vram_bo, u64 vram_offset,
> + struct xe_bo *sysmem_bo, u64 sysmem_offset,
> + struct xe_bo *ccs_bo, u64 ccs_offset,
I’d drop the CCS implementation from this function. As far as I know, it
isn’t functional—hence the reason we’re using comp_pat to decompress
VRAM during the system memory copy.
> + u64 size, bool to_sysmem)
I'd lean towards an enum for direction. We already have one defined,
xe_migrate_copy_dir.
Maybe time to move that to a header file.
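IIRC that one is currently:

	enum xe_migrate_copy_dir {
		XE_MIGRATE_COPY_TO_VRAM,
		XE_MIGRATE_COPY_TO_SRAM,
	};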
> +{
> + struct xe_device *xe = xe_bo_device(vram_bo);
> + struct xe_tile *tile = vram_bo->tile;
> + struct xe_gt *gt = tile->primary_gt;
> + struct xe_migrate *m = tile->migrate;
> + struct dma_fence *fence = NULL;
> + struct ttm_resource *vram = vram_bo->ttm.resource, *sysmem, *ccs;
> + struct xe_res_cursor vram_it, sysmem_it, ccs_it;
> + u64 vram_L0_ofs, sysmem_L0_ofs;
> + u32 vram_L0_pt, sysmem_L0_pt;
> + u64 vram_L0, sysmem_L0;
> + bool copy_content = sysmem_bo ? true : false;
bool copy_content = sysmem_bo;
Or just drop this bool, since if CCS is removed this will always just be
true.
> + bool copy_ccs = ccs_bo ? true : false;
> + bool use_comp_pat = copy_content && to_sysmem &&
> + xe_device_has_flat_ccs(xe) && GRAPHICS_VER(xe) >= 20;
> + int pass = 0;
> + int err;
> +
> + if (!copy_content && !copy_ccs)
> + return ERR_PTR(-EINVAL);
> +
> + if (!IS_ALIGNED(vram_offset | sysmem_offset | ccs_offset | size, PAGE_SIZE))
> + return ERR_PTR(-EINVAL);
> +
> + if (!xe_bo_is_vram(vram_bo))
> + return ERR_PTR(-EINVAL);
> +
> + if (range_overflows(vram_offset, size, (u64)vram_bo->ttm.base.size))
> + return ERR_PTR(-EOVERFLOW);
> +
> + if (copy_content) {
> + if (xe_bo_is_vram(sysmem_bo))
> + return ERR_PTR(-EINVAL);
> + if (range_overflows(sysmem_offset, size, (u64)sysmem_bo->ttm.base.size))
> + return ERR_PTR(-EOVERFLOW);
> + }
> +
> + if (copy_ccs) {
> + if (xe_bo_is_vram(ccs_bo))
> + return ERR_PTR(-EINVAL);
> + if (!xe_device_has_flat_ccs(xe))
> + return ERR_PTR(-EOPNOTSUPP);
> + if (ccs_bo->ttm.base.size < xe_device_ccs_bytes(xe, size))
> + return ERR_PTR(-EINVAL);
> + if (range_overflows(ccs_offset, (u64)xe_device_ccs_bytes(xe, size),
> + (u64)ccs_bo->ttm.base.size))
> + return ERR_PTR(-EOVERFLOW);
> + }
This function performs extensive argument sanitization. It's called
purely internally, correct? That is, the Xe module fully controls the
arguments—nothing is exposed to user space, debugfs, or any other
module. If this is purely internal, I’d recommend sanitizing the
arguments via assertions to catch bugs, since internal callers should
know what they’re doing and invoke this correctly.
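E.g. something like (sketch):

	xe_assert(xe, IS_ALIGNED(vram_offset | sysmem_offset | ccs_offset | size, PAGE_SIZE));
	xe_assert(xe, xe_bo_is_vram(vram_bo));
	xe_assert(xe, !range_overflows(vram_offset, size, (u64)vram_bo->ttm.base.size));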
> +
> + xe_res_first(vram, vram_offset, size, &vram_it);
> +
> + if (copy_content) {
> + sysmem = sysmem_bo->ttm.resource;
> + xe_res_first_sg(xe_bo_sg(sysmem_bo), sysmem_offset, size, &sysmem_it);
> + }
> +
> + if (copy_ccs) {
else if
^^^ If for whatever reason the CCS isn't dropped, this would make it
clear copy_content / copy_ccs are mutually exclusive.
> + ccs = ccs_bo->ttm.resource;
> + xe_res_first_sg(xe_bo_sg(ccs_bo), ccs_offset, xe_device_ccs_bytes(xe, size),
> + &ccs_it);
> + }
> +
> + while (size) {
> + u32 pte_flags = PTE_UPDATE_FLAG_IS_VRAM;
> + u32 batch_size = 2; /* arb_clear() + MI_BATCH_BUFFER_END */
> + struct xe_sched_job *job;
> + struct xe_bb *bb;
> + u32 flush_flags = 0;
> + u32 update_idx;
> + u64 ccs_ofs, ccs_size;
> + u32 ccs_pt;
> +
Extra newline.
> + bool usm = xe->info.has_usm;
> + u32 avail_pts = max_mem_transfer_per_pass(xe) / LEVEL0_PAGE_TABLE_ENCODE_SIZE;
> +
> + vram_L0 = xe_migrate_res_sizes(m, &vram_it);
> +
> + if (copy_content) {
> + sysmem_L0 = xe_migrate_res_sizes(m, &sysmem_it);
> + vram_L0 = min(vram_L0, sysmem_L0);
> + }
> +
> + drm_dbg(&xe->drm, "Pass %u, size: %llu\n", pass++, vram_L0);
> +
> + pte_flags |= use_comp_pat ? PTE_UPDATE_FLAG_IS_COMP_PTE : 0;
> + batch_size += pte_update_size(m, pte_flags, vram, &vram_it, &vram_L0,
> + &vram_L0_ofs, &vram_L0_pt, 0, 0, avail_pts);
> + if (copy_content) {
> + batch_size += pte_update_size(m, 0, sysmem, &sysmem_it, &vram_L0,
> + &sysmem_L0_ofs, &sysmem_L0_pt, 0, avail_pts,
> + avail_pts);
> + }
> +
> + if (copy_ccs) {
> + ccs_size = xe_device_ccs_bytes(xe, vram_L0);
> + batch_size += pte_update_size(m, 0, NULL, &ccs_it, &ccs_size, &ccs_ofs,
> + &ccs_pt, 0, copy_content ? 2 * avail_pts :
> + avail_pts, avail_pts);
> + xe_assert(xe, IS_ALIGNED(ccs_it.start, PAGE_SIZE));
> + }
> +
> + batch_size += copy_content ? EMIT_COPY_DW : 0;
> + batch_size += copy_ccs ? EMIT_COPY_CCS_DW : 0;
> +
> + bb = xe_bb_new(gt, batch_size, usm);
> + if (IS_ERR(bb)) {
> + err = PTR_ERR(bb);
> + goto err_sync;
> + }
> +
> + if (xe_migrate_allow_identity(vram_L0, &vram_it))
> + xe_res_next(&vram_it, vram_L0);
> + else
> + emit_pte(m, bb, vram_L0_pt, true, use_comp_pat, &vram_it, vram_L0, vram);
> +
> + if (copy_content)
> + emit_pte(m, bb, sysmem_L0_pt, false, false, &sysmem_it, vram_L0, sysmem);
> +
> + if (copy_ccs)
> + emit_pte(m, bb, ccs_pt, false, false, &ccs_it, ccs_size, ccs);
> +
> + bb->cs[bb->len++] = MI_BATCH_BUFFER_END;
> + update_idx = bb->len;
> +
> + if (copy_content)
> + emit_copy(gt, bb, to_sysmem ? vram_L0_ofs : sysmem_L0_ofs, to_sysmem ?
> + sysmem_L0_ofs : vram_L0_ofs, vram_L0, XE_PAGE_SIZE);
> +
> + if (copy_ccs) {
> + emit_copy_ccs(gt, bb, to_sysmem ? ccs_ofs : vram_L0_ofs, !to_sysmem,
> + to_sysmem ? vram_L0_ofs : ccs_ofs, to_sysmem, vram_L0);
> + flush_flags = to_sysmem ? 0 : MI_FLUSH_DW_CCS;
> + }
> +
> + job = xe_bb_create_migration_job(m->q, bb, xe_migrate_batch_base(m, usm),
> + update_idx);
> + if (IS_ERR(job)) {
> + err = PTR_ERR(job);
> + goto err;
> + }
> +
> + xe_sched_job_add_migrate_flush(job, flush_flags | MI_INVALIDATE_TLB);
> + if (!fence) {
> + err = xe_sched_job_add_deps(job, vram_bo->ttm.base.resv,
> + DMA_RESV_USAGE_BOOKKEEP);
> + if (!err && copy_content)
> + err = xe_sched_job_add_deps(job, sysmem_bo->ttm.base.resv,
> + DMA_RESV_USAGE_BOOKKEEP);
> + if (!err && copy_ccs)
> + err = xe_sched_job_add_deps(job, ccs_bo->ttm.base.resv,
> + DMA_RESV_USAGE_BOOKKEEP);
> + if (err)
> + goto err_job;
I’d think you do not need dma-resv dependencies here. Do we ever install
any dma-resv fences into vram_bo, sysmem_bo, or ccs_bo? I believe the answer
is no. If that’s the case, maybe just assert that the
DMA_RESV_USAGE_BOOKKEEP slots of each object being used are idle to
ensure this assumption is correct.
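E.g. (sketch):

	xe_assert(xe, dma_resv_test_signaled(vram_bo->ttm.base.resv,
					     DMA_RESV_USAGE_BOOKKEEP));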
Matt
> + }
> +
> + mutex_lock(&m->job_mutex);
> + xe_sched_job_arm(job);
> + dma_fence_put(fence);
> + fence = dma_fence_get(&job->drm.s_fence->finished);
> + xe_sched_job_push(job);
> +
> + dma_fence_put(m->fence);
> + m->fence = dma_fence_get(fence);
> +
> + mutex_unlock(&m->job_mutex);
> +
> + xe_bb_free(bb, fence);
> + size -= vram_L0;
> + continue;
> +
> +err_job:
> + xe_sched_job_put(job);
> +err:
> + xe_bb_free(bb, NULL);
> +
> +err_sync:
> + /* Sync partial copy if any. FIXME: under job_mutex? */
> + if (fence) {
> + dma_fence_wait(fence, false);
> + dma_fence_put(fence);
> + }
> +
> + return ERR_PTR(err);
> + }
> +
> + return fence;
> +}
> +
> static void emit_clear_link_copy(struct xe_gt *gt, struct xe_bb *bb, u64 src_ofs,
> u32 size, u32 pitch)
> {
> diff --git a/drivers/gpu/drm/xe/xe_migrate.h b/drivers/gpu/drm/xe/xe_migrate.h
> index 4fad324b62535..0d8944b1cee61 100644
> --- a/drivers/gpu/drm/xe/xe_migrate.h
> +++ b/drivers/gpu/drm/xe/xe_migrate.h
> @@ -131,6 +131,10 @@ int xe_migrate_ccs_rw_copy(struct xe_tile *tile, struct xe_exec_queue *q,
>
> struct xe_lrc *xe_migrate_lrc(struct xe_migrate *migrate);
> struct xe_exec_queue *xe_migrate_exec_queue(struct xe_migrate *migrate);
> +struct dma_fence *xe_migrate_raw_vram_copy(struct xe_bo *vram_bo, u64 vram_offset,
> + struct xe_bo *sysmem_bo, u64 sysmem_offset,
> + struct xe_bo *ccs_bo, u64 ccs_offset,
> + u64 size, bool to_sysmem);
> int xe_migrate_access_memory(struct xe_migrate *m, struct xe_bo *bo,
> unsigned long offset, void *buf, int len,
> int write);
> --
> 2.50.1
>
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 02/26] drm/xe: Move migration support to device-level struct
2025-10-11 19:38 ` [PATCH 02/26] drm/xe: Move migration support to device-level struct Michał Winiarski
@ 2025-10-12 18:58 ` Michal Wajdeczko
2025-10-20 14:48 ` Michał Winiarski
0 siblings, 1 reply; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-12 18:58 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> Upcoming changes will allow users to control VF state and obtain its
> migration data with a device-level granularity (not tile/gt).
> Change the data structures to reflect that and move the GT-level
> migration init to happen after device-level init.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/Makefile | 1 +
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 12 +-----
> .../drm/xe/xe_gt_sriov_pf_migration_types.h | 3 --
> drivers/gpu/drm/xe/xe_sriov_pf.c | 5 +++
> drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 43 +++++++++++++++++++
> drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 27 ++++++++++++
> .../gpu/drm/xe/xe_sriov_pf_migration_types.h | 0
> drivers/gpu/drm/xe/xe_sriov_pf_types.h | 5 +++
> 8 files changed, 83 insertions(+), 13 deletions(-)
> create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
>
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index 84321fad32658..71f685a315dca 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -176,6 +176,7 @@ xe-$(CONFIG_PCI_IOV) += \
> xe_sriov_pf.o \
> xe_sriov_pf_control.o \
> xe_sriov_pf_debugfs.o \
> + xe_sriov_pf_migration.o \
> xe_sriov_pf_service.o \
> xe_tile_sriov_pf_debugfs.o
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> index a5bf327ef8889..ca28f45aaf481 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> @@ -13,6 +13,7 @@
> #include "xe_guc.h"
> #include "xe_guc_ct.h"
> #include "xe_sriov.h"
> +#include "xe_sriov_pf_migration.h"
>
> /* Return: number of dwords saved/restored/required or a negative error code on failure */
> static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
> @@ -115,8 +116,7 @@ static int pf_send_guc_restore_vf_state(struct xe_gt *gt, unsigned int vfid,
>
> static bool pf_migration_supported(struct xe_gt *gt)
> {
> - xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> - return gt->sriov.pf.migration.supported;
> + return xe_sriov_pf_migration_supported(gt_to_xe(gt));
> }
>
> static struct mutex *pf_migration_mutex(struct xe_gt *gt)
> @@ -382,12 +382,6 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
> }
> #endif /* CONFIG_DEBUG_FS */
>
> -static bool pf_check_migration_support(struct xe_gt *gt)
> -{
> - /* XXX: for now this is for feature enabling only */
> - return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
> -}
> -
> /**
> * xe_gt_sriov_pf_migration_init() - Initialize support for VF migration.
> * @gt: the &xe_gt
> @@ -403,8 +397,6 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
>
> xe_gt_assert(gt, IS_SRIOV_PF(xe));
>
> - gt->sriov.pf.migration.supported = pf_check_migration_support(gt);
> -
> if (!pf_migration_supported(gt))
> return 0;
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> index 1f3110b6d44fa..9d672feac5f04 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> @@ -30,9 +30,6 @@ struct xe_gt_sriov_state_snapshot {
> * Used by the PF driver to maintain non-VF specific per-GT data.
> */
> struct xe_gt_sriov_pf_migration {
> - /** @supported: indicates whether the feature is supported */
> - bool supported;
> -
> /** @snapshot_lock: protects all VFs snapshots */
> struct mutex snapshot_lock;
> };
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf.c b/drivers/gpu/drm/xe/xe_sriov_pf.c
> index bc1ab9ee31d92..95743c7af8050 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf.c
> @@ -15,6 +15,7 @@
> #include "xe_sriov.h"
> #include "xe_sriov_pf.h"
> #include "xe_sriov_pf_helpers.h"
> +#include "xe_sriov_pf_migration.h"
> #include "xe_sriov_pf_service.h"
> #include "xe_sriov_printk.h"
>
> @@ -101,6 +102,10 @@ int xe_sriov_pf_init_early(struct xe_device *xe)
> if (err)
> return err;
>
> + err = xe_sriov_pf_migration_init(xe);
> + if (err)
> + return err;
> +
> xe_sriov_pf_service_init(xe);
>
> return 0;
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> new file mode 100644
> index 0000000000000..cf6a210d5597a
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> @@ -0,0 +1,43 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#include "xe_sriov.h"
> +#include "xe_sriov_pf_migration.h"
> +
> +/**
> + * xe_sriov_pf_migration_supported() - Check if SR-IOV VF migration is supported by the device
> + * @xe: the &struct xe_device
nit: this will render better:
@xe: the struct &xe_device
but in other places we just use:
@xe: the &xe_device
> + *
> + * Return: true if migration is supported, false otherwise
> + */
> +bool xe_sriov_pf_migration_supported(struct xe_device *xe)
> +{
> + xe_assert(xe, IS_SRIOV_PF(xe));
> +
> + return xe->sriov.pf.migration.supported;
> +}
> +
> +static bool pf_check_migration_support(struct xe_device *xe)
> +{
> + xe_assert(xe, IS_SRIOV_PF(xe));
we don't need this here for now
> +
> + /* XXX: for now this is for feature enabling only */
> + return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
> +}
> +
> +/**
> + * xe_sriov_pf_migration_init() - Initialize support for SR-IOV VF migration.
> + * @xe: the &struct xe_device
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_pf_migration_init(struct xe_device *xe)
> +{
> + xe_assert(xe, IS_SRIOV_PF(xe));
> +
> + xe->sriov.pf.migration.supported = pf_check_migration_support(xe);
> +
> + return 0;
> +}
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> new file mode 100644
> index 0000000000000..d3058b6682192
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> @@ -0,0 +1,27 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_SRIOV_PF_MIGRATION_H_
> +#define _XE_SRIOV_PF_MIGRATION_H_
> +
> +#include <linux/types.h>
> +
> +struct xe_device;
> +
> +#ifdef CONFIG_PCI_IOV
> +int xe_sriov_pf_migration_init(struct xe_device *xe);
> +bool xe_sriov_pf_migration_supported(struct xe_device *xe);
> +#else
> +static inline int xe_sriov_pf_migration_init(struct xe_device *xe)
> +{
> + return 0;
> +}
> +static inline bool xe_sriov_pf_migration_supported(struct xe_device *xe)
> +{
> + return false;
> +}
> +#endif
> +
> +#endif
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> new file mode 100644
> index 0000000000000..e69de29bb2d1d
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> index 956a88f9f213d..2d2fcc0a2f258 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> @@ -32,6 +32,11 @@ struct xe_device_pf {
> /** @driver_max_vfs: Maximum number of VFs supported by the driver. */
> u16 driver_max_vfs;
>
I guess you need to document @migration too to make it work
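e.g. (sketch):

	/** @migration: VF migration related data */
	struct {
		/** @migration.supported: indicates whether VF migration feature is supported */
		bool supported;
	} migration;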
> + struct {
> + /** @migration.supported: indicates whether VF migration feature is supported */
> + bool supported;
> + } migration;
also, can you move that closer to the other sub-component "service" below?
> +
> /** @master_lock: protects all VFs configurations across GTs */
> struct mutex master_lock;
>
but otherwise LGTM, so with above fixed,
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 04/26] drm/xe/pf: Extract migration mutex out of its struct
2025-10-11 19:38 ` [PATCH 04/26] drm/xe/pf: Extract migration mutex out of its struct Michał Winiarski
@ 2025-10-12 19:08 ` Matthew Brost
2025-10-20 14:50 ` Michał Winiarski
0 siblings, 1 reply; 82+ messages in thread
From: Matthew Brost @ 2025-10-12 19:08 UTC (permalink / raw)
To: Michał Winiarski
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Sat, Oct 11, 2025 at 09:38:25PM +0200, Michał Winiarski wrote:
> As part of upcoming changes, the struct xe_gt_sriov_pf_migration will be
> used as a per-VF data structure.
> The mutex (which is currently the only member of this structure) will
> have slightly different semantics.
> Extract the mutex to free up the struct name and simplify the future
> changes.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 4 ++--
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h | 2 --
> drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h | 2 +-
> 3 files changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> index ca28f45aaf481..f8604b172963e 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> @@ -122,7 +122,7 @@ static bool pf_migration_supported(struct xe_gt *gt)
> static struct mutex *pf_migration_mutex(struct xe_gt *gt)
> {
> xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> - return &gt->sriov.pf.migration.snapshot_lock;
> + return &gt->sriov.pf.snapshot_lock;
By the end of series this function looks like:
14 static struct mutex *pf_migration_mutex(struct xe_device *xe, unsigned int vfid)
15 {
16 xe_assert(xe, IS_SRIOV_PF(xe));
17 xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
18 return &xe->sriov.pf.vfs[vfid].migration.lock;
19 }
And...
grep snapshot_lock *.c *.h
xe_gt_sriov_pf_migration.c: err = drmm_mutex_init(&xe->drm, &gt->sriov.pf.snapshot_lock);
xe_gt_sriov_pf_types.h: struct mutex snapshot_lock;
So 'snapshot_lock' isn't used at the end of the series. Maybe drop this
patch and instead delete snapshot_lock in the patch which restructures the
above code / removes the snapshot_lock usage.
Matt
> }
>
> static struct xe_gt_sriov_state_snapshot *pf_pick_vf_snapshot(struct xe_gt *gt,
> @@ -400,7 +400,7 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
> if (!pf_migration_supported(gt))
> return 0;
>
> - err = drmm_mutex_init(&xe->drm, &gt->sriov.pf.migration.snapshot_lock);
> + err = drmm_mutex_init(&xe->drm, &gt->sriov.pf.snapshot_lock);
> if (err)
> return err;
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> index 9d672feac5f04..fdc5a31dd8989 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> @@ -30,8 +30,6 @@ struct xe_gt_sriov_state_snapshot {
> * Used by the PF driver to maintain non-VF specific per-GT data.
> */
> struct xe_gt_sriov_pf_migration {
> - /** @snapshot_lock: protects all VFs snapshots */
> - struct mutex snapshot_lock;
> };
>
> #endif
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> index a64a6835ad656..9a856da379d39 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> @@ -58,7 +58,7 @@ struct xe_gt_sriov_pf {
> struct xe_gt_sriov_pf_service service;
> struct xe_gt_sriov_pf_control control;
> struct xe_gt_sriov_pf_policy policy;
> - struct xe_gt_sriov_pf_migration migration;
> + struct mutex snapshot_lock;
> struct xe_gt_sriov_spare_config spare;
> struct xe_gt_sriov_metadata *vfs;
> };
> --
> 2.50.1
>
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 06/26] drm/xe/pf: Add helpers for migration data allocation / free
2025-10-11 19:38 ` [PATCH 06/26] drm/xe/pf: Add helpers for migration data allocation / free Michał Winiarski
@ 2025-10-12 19:12 ` Matthew Brost
2025-10-21 0:26 ` Michał Winiarski
2025-10-13 10:15 ` Michal Wajdeczko
1 sibling, 1 reply; 82+ messages in thread
From: Matthew Brost @ 2025-10-12 19:12 UTC (permalink / raw)
To: Michał Winiarski
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Sat, Oct 11, 2025 at 09:38:27PM +0200, Michał Winiarski wrote:
> Now that it's possible to free the packets - connect the restore
> handling logic with the ring.
> The helpers will also be used in upcoming changes that will start producing
> migration data packets.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/Makefile | 1 +
> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 48 ++++++-
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 10 +-
> drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 1 +
> .../gpu/drm/xe/xe_sriov_pf_migration_data.c | 135 ++++++++++++++++++
> .../gpu/drm/xe/xe_sriov_pf_migration_data.h | 32 +++++
> 6 files changed, 224 insertions(+), 3 deletions(-)
> create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
>
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index 71f685a315dca..e253d65366de4 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -177,6 +177,7 @@ xe-$(CONFIG_PCI_IOV) += \
> xe_sriov_pf_control.o \
> xe_sriov_pf_debugfs.o \
> xe_sriov_pf_migration.o \
> + xe_sriov_pf_migration_data.o \
> xe_sriov_pf_service.o \
> xe_tile_sriov_pf_debugfs.o
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> index 16a88e7599f6d..04a4e92133c2e 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> @@ -20,6 +20,7 @@
> #include "xe_sriov.h"
> #include "xe_sriov_pf_control.h"
> #include "xe_sriov_pf_migration.h"
> +#include "xe_sriov_pf_migration_data.h"
> #include "xe_sriov_pf_service.h"
> #include "xe_tile.h"
>
> @@ -949,14 +950,57 @@ static void pf_exit_vf_restored(struct xe_gt *gt, unsigned int vfid)
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED);
> }
>
> +static void pf_enter_vf_restore_failed(struct xe_gt *gt, unsigned int vfid)
> +{
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED);
> + pf_exit_vf_wip(gt, vfid);
> +}
> +
> +static int pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid,
> + struct xe_sriov_pf_migration_data *data)
> +{
> + switch (data->type) {
> + default:
> + xe_gt_sriov_notice(gt, "Skipping VF%u invalid data type: %d\n", vfid, data->type);
> + pf_enter_vf_restore_failed(gt, vfid);
> + }
> +
> + return -EINVAL;
> +}
> +
> static bool pf_handle_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> {
> + struct xe_sriov_pf_migration_data *data;
> + int ret;
> +
> if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
> return false;
>
> - pf_exit_vf_restore_wip(gt, vfid);
> - pf_enter_vf_restored(gt, vfid);
> + data = xe_gt_sriov_pf_migration_ring_consume(gt, vfid);
> + if (IS_ERR(data)) {
> + if (PTR_ERR(data) == -ENODATA &&
> + !xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid)) {
> + pf_exit_vf_restore_wip(gt, vfid);
> + pf_enter_vf_restored(gt, vfid);
> + } else {
> + pf_enter_vf_restore_failed(gt, vfid);
> + }
> + return false;
> + }
> +
> + xe_gt_assert(gt, gt->info.id == data->gt);
> + xe_gt_assert(gt, gt->tile->id == data->tile);
> +
> + ret = pf_handle_vf_restore_data(gt, vfid, data);
> + if (ret) {
> + xe_gt_sriov_err(gt, "VF%u failed to restore data type: %d (%d)\n",
> + vfid, data->type, ret);
> + xe_sriov_pf_migration_data_free(data);
> + pf_enter_vf_restore_failed(gt, vfid);
> + return false;
> + }
>
> + xe_sriov_pf_migration_data_free(data);
> return true;
> }
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> index af5952f42fff1..582aaf062cbd4 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> @@ -15,6 +15,7 @@
> #include "xe_guc_ct.h"
> #include "xe_sriov.h"
> #include "xe_sriov_pf_migration.h"
> +#include "xe_sriov_pf_migration_data.h"
>
> #define XE_GT_SRIOV_PF_MIGRATION_RING_TIMEOUT (HZ * 20)
> #define XE_GT_SRIOV_PF_MIGRATION_RING_SIZE 5
> @@ -523,11 +524,18 @@ xe_gt_sriov_pf_migration_ring_consume_nowait(struct xe_gt *gt, unsigned int vfid
> return ERR_PTR(-EAGAIN);
> }
>
> +static void pf_mig_data_destroy(void *ptr)
> +{
> + struct xe_sriov_pf_migration_data *data = ptr;
> +
> + xe_sriov_pf_migration_data_free(data);
> +}
> +
> static void pf_gt_migration_cleanup(struct drm_device *dev, void *arg)
> {
> struct xe_gt_sriov_pf_migration *migration = arg;
>
> - ptr_ring_cleanup(&migration->ring, NULL);
> + ptr_ring_cleanup(&migration->ring, pf_mig_data_destroy);
> }
>
> /**
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> index 347682f29a03c..d39cee66589b5 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> @@ -12,6 +12,7 @@
> #include "xe_pm.h"
> #include "xe_sriov_pf_helpers.h"
> #include "xe_sriov_pf_migration.h"
> +#include "xe_sriov_pf_migration_data.h"
> #include "xe_sriov_printk.h"
>
> static struct xe_sriov_pf_migration *pf_pick_migration(struct xe_device *xe, unsigned int vfid)
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> new file mode 100644
> index 0000000000000..cfc6b512c6674
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> @@ -0,0 +1,135 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#include "xe_bo.h"
> +#include "xe_device.h"
> +#include "xe_sriov_pf_migration_data.h"
> +
> +static bool data_needs_bo(struct xe_sriov_pf_migration_data *data)
> +{
> + unsigned int type = data->type;
> +
> + return type == XE_SRIOV_MIG_DATA_CCS ||
> + type == XE_SRIOV_MIG_DATA_VRAM;
> +}
> +
> +/**
> + * xe_sriov_pf_migration_data_alloc() - Allocate migration data packet
> + * @xe: the &struct xe_device
> + *
> + * Only allocates the "outer" structure, without initializing the migration
> + * data backing storage.
> + *
> + * Return: Pointer to &struct xe_sriov_pf_migration_data on success,
> + * NULL in case of error.
> + */
> +struct xe_sriov_pf_migration_data *
> +xe_sriov_pf_migration_data_alloc(struct xe_device *xe)
> +{
> + struct xe_sriov_pf_migration_data *data;
> +
> + data = kzalloc(sizeof(*data), GFP_KERNEL);
> + if (!data)
> + return NULL;
> +
> + data->xe = xe;
> + data->hdr_remaining = sizeof(data->hdr);
> +
> + return data;
> +}
> +
> +/**
> + * xe_sriov_pf_migration_data_free() - Free migration data packet
> + * @data: the &struct xe_sriov_pf_migration_data packet
> + */
> +void xe_sriov_pf_migration_data_free(struct xe_sriov_pf_migration_data *data)
> +{
> + if (data_needs_bo(data)) {
> + if (data->bo)
> + xe_bo_unpin_map_no_vm(data->bo);
> + } else {
> + if (data->buff)
> + kvfree(data->buff);
> + }
> +
> + kfree(data);
> +}
> +
> +static int mig_data_init(struct xe_sriov_pf_migration_data *data)
> +{
> + struct xe_gt *gt = xe_device_get_gt(data->xe, data->gt);
> +
> + if (!gt || data->tile != gt->tile->id)
> + return -EINVAL;
> +
> + if (data->size == 0)
> + return 0;
> +
> + if (data_needs_bo(data)) {
> + struct xe_bo *bo = xe_bo_create_pin_map_novm(data->xe, gt->tile,
> + PAGE_ALIGN(data->size),
> + ttm_bo_type_kernel,
> + XE_BO_FLAG_SYSTEM | XE_BO_FLAG_PINNED,
> + false);
> + if (IS_ERR(bo))
> + return PTR_ERR(bo);
> +
> + data->bo = bo;
> + data->vaddr = bo->vmap.vaddr;
> + } else {
> + void *buff = kvzalloc(data->size, GFP_KERNEL);
> + if (!buff)
> + return -ENOMEM;
> +
> + data->buff = buff;
> + data->vaddr = buff;
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * xe_sriov_pf_migration_data_init() - Initialize the migration data header and backing storage
> + * @data: the &struct xe_sriov_pf_migration_data packet
> + * @tile_id: tile identifier
> + * @gt_id: GT identifier
> + * @type: &enum xe_sriov_pf_migration_data_type
> + * @offset: offset of data packet payload (within wider resource)
> + * @size: size of data packet payload
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_pf_migration_data_init(struct xe_sriov_pf_migration_data *data, u8 tile_id, u8 gt_id,
> + unsigned int type, loff_t offset, size_t size)
> +{
> + xe_assert(data->xe, type < XE_SRIOV_MIG_DATA_MAX);
> + data->version = 1;
> + data->type = type;
> + data->tile = tile_id;
> + data->gt = gt_id;
> + data->offset = offset;
> + data->size = size;
> + data->remaining = size;
> +
> + return mig_data_init(data);
> +}
> +
> +/**
> + * xe_sriov_pf_migration_data_init() - Initialize the migration data backing storage based on header
> + * @data: the &struct xe_sriov_pf_migration_data packet
> + *
> + * Header data is expected to be filled prior to calling this function
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_pf_migration_data_init_from_hdr(struct xe_sriov_pf_migration_data *data)
> +{
> + if (WARN_ON(data->hdr_remaining))
> + return -EINVAL;
> +
> + data->remaining = data->size;
> +
> + return mig_data_init(data);
> +}
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> new file mode 100644
> index 0000000000000..1dde4cfcdbc47
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> @@ -0,0 +1,32 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_SRIOV_PF_MIGRATION_DATA_H_
> +#define _XE_SRIOV_PF_MIGRATION_DATA_H_
> +
> +#include <linux/types.h>
> +
> +struct xe_device;
> +
> +enum xe_sriov_pf_migration_data_type {
> + XE_SRIOV_MIG_DATA_DESCRIPTOR = 1,
> + XE_SRIOV_MIG_DATA_TRAILER,
> + XE_SRIOV_MIG_DATA_GGTT,
> + XE_SRIOV_MIG_DATA_MMIO,
> + XE_SRIOV_MIG_DATA_GUC,
> + XE_SRIOV_MIG_DATA_CCS,
grep XE_SRIOV_MIG_DATA_CCS *.c *.h
xe_sriov_pf_migration_data.c: return type == XE_SRIOV_MIG_DATA_CCS ||
xe_sriov_pf_migration_data.h: XE_SRIOV_MIG_DATA_CCS,
XE_SRIOV_MIG_DATA_CCS appears to be unused right now, I'd remove this
data type for now.
Matt
> + XE_SRIOV_MIG_DATA_VRAM,
> + XE_SRIOV_MIG_DATA_MAX,
> +};
> +
> +struct xe_sriov_pf_migration_data *
> +xe_sriov_pf_migration_data_alloc(struct xe_device *xe);
> +void xe_sriov_pf_migration_data_free(struct xe_sriov_pf_migration_data *snapshot);
> +
> +int xe_sriov_pf_migration_data_init(struct xe_sriov_pf_migration_data *data, u8 tile_id, u8 gt_id,
> + unsigned int type, loff_t offset, size_t size);
> +int xe_sriov_pf_migration_data_init_from_hdr(struct xe_sriov_pf_migration_data *snapshot);
> +
> +#endif
> --
> 2.50.1
>
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 09/26] drm/xe/pf: Expose VF migration data size over debugfs
2025-10-11 19:38 ` [PATCH 09/26] drm/xe/pf: Expose VF migration data size over debugfs Michał Winiarski
@ 2025-10-12 19:15 ` Matthew Brost
2025-10-21 0:37 ` Michał Winiarski
2025-10-13 11:04 ` Michal Wajdeczko
1 sibling, 1 reply; 82+ messages in thread
From: Matthew Brost @ 2025-10-12 19:15 UTC (permalink / raw)
To: Michał Winiarski
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Sat, Oct 11, 2025 at 09:38:30PM +0200, Michał Winiarski wrote:
> The size is normally used to make a decision on when to stop the device
> (mainly when it's in a pre_copy state).
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 18 ++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 2 ++
> drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 34 +++++++++++++++++++
> drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 30 ++++++++++++++++
> drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 1 +
> 5 files changed, 85 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> index 582aaf062cbd4..50f09994e2854 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> @@ -395,6 +395,24 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
> }
> #endif /* CONFIG_DEBUG_FS */
>
> +/**
> + * xe_gt_sriov_pf_migration_size() - Total size of migration data from all components within a GT
> + * @gt: the &struct xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: total migration data size in bytes or a negative error code on failure.
> + */
> +ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
> +{
> + ssize_t total = 0;
> +
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> +
> + return total;
> +}
> +
> /**
> * xe_gt_sriov_pf_migration_ring_empty() - Check if a migration ring is empty
> * @gt: the &struct xe_gt
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> index 1e4dc46413823..e5298d35d7d7e 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> @@ -15,6 +15,8 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
> int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
>
> +ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
> +
> bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_migration_ring_produce(struct xe_gt *gt, unsigned int vfid,
> struct xe_sriov_pf_migration_data *data);
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> index ce780719760a6..b06e893fe54cf 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> @@ -13,6 +13,7 @@
> #include "xe_sriov_pf_control.h"
> #include "xe_sriov_pf_debugfs.h"
> #include "xe_sriov_pf_helpers.h"
> +#include "xe_sriov_pf_migration.h"
> #include "xe_sriov_pf_migration_data.h"
> #include "xe_sriov_pf_service.h"
> #include "xe_sriov_printk.h"
> @@ -203,6 +204,38 @@ static const struct file_operations data_vf_fops = {
> .llseek = default_llseek,
> };
>
> +static ssize_t size_read(struct file *file, char __user *ubuf, size_t count, loff_t *ppos)
> +{
> + struct dentry *dent = file_dentry(file);
> + struct dentry *vfdentry = dent->d_parent;
> + struct dentry *migration_dentry = vfdentry->d_parent;
> + unsigned int vfid = (uintptr_t)vfdentry->d_inode->i_private;
> + struct xe_device *xe = migration_dentry->d_inode->i_private;
> + char buf[21];
> + ssize_t ret;
> + int len;
> +
> + xe_assert(xe, vfid);
> + xe_sriov_pf_assert_vfid(xe, vfid);
> +
> + xe_pm_runtime_get(xe);
You don't need a PM ref here as this is purely software (i.e., the
hardware is not touched).
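i.e. just (untested):

	ret = xe_sriov_pf_migration_size(xe, vfid);
	if (ret < 0)
		return ret;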
Matt
> + ret = xe_sriov_pf_migration_size(xe, vfid);
> + xe_pm_runtime_put(xe);
> + if (ret < 0)
> + return ret;
> +
> + len = scnprintf(buf, sizeof(buf), "%zd\n", ret);
> +
> + return simple_read_from_buffer(ubuf, count, ppos, buf, len);
> +}
> +
> +static const struct file_operations size_vf_fops = {
> + .owner = THIS_MODULE,
> + .open = simple_open,
> + .read = size_read,
> + .llseek = default_llseek,
> +};
> +
> static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> {
> debugfs_create_file("pause", 0200, vfdent, xe, &pause_vf_fops);
> @@ -212,6 +245,7 @@ static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> debugfs_create_file("save", 0600, vfdent, xe, &save_vf_fops);
> debugfs_create_file("restore", 0600, vfdent, xe, &restore_vf_fops);
> debugfs_create_file("migration_data", 0600, vfdent, xe, &data_vf_fops);
> + debugfs_create_file("migration_size", 0400, vfdent, xe, &size_vf_fops);
> }
>
> static void pf_populate_with_tiles(struct xe_device *xe, struct dentry *dent, unsigned int vfid)
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> index a0cfac456ba0b..6b247581dec65 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> @@ -249,3 +249,33 @@ int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
>
> return xe_gt_sriov_pf_migration_ring_produce(gt, vfid, data);
> }
> +
> +/**
> + * xe_sriov_pf_migration_size() - Total size of migration data from all components within a device
> + * @xe: the &struct xe_device
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: total migration data size in bytes or a negative error code on failure.
> + */
> +ssize_t xe_sriov_pf_migration_size(struct xe_device *xe, unsigned int vfid)
> +{
> + size_t size = 0;
> + struct xe_gt *gt;
> + ssize_t ret;
> + u8 gt_id;
> +
> + xe_assert(xe, IS_SRIOV_PF(xe));
> +
> + for_each_gt(gt, xe, gt_id) {
> + ret = xe_gt_sriov_pf_migration_size(gt, vfid);
> + if (ret < 0) {
> + size = ret;
> + break;
> + }
> + size += ret;
> + }
> +
> + return size;
> +}
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> index f2020ba19c2da..887ea3e9632bd 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> @@ -14,6 +14,7 @@ struct xe_device;
> #ifdef CONFIG_PCI_IOV
> int xe_sriov_pf_migration_init(struct xe_device *xe);
> bool xe_sriov_pf_migration_supported(struct xe_device *xe);
> +ssize_t xe_sriov_pf_migration_size(struct xe_device *xe, unsigned int vfid);
> struct xe_sriov_pf_migration_data *
> xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid);
> int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
> --
> 2.50.1
>
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 03/26] drm/xe/pf: Add save/restore control state stubs and connect to debugfs
2025-10-11 19:38 ` [PATCH 03/26] drm/xe/pf: Add save/restore control state stubs and connect to debugfs Michał Winiarski
@ 2025-10-12 20:09 ` Michal Wajdeczko
0 siblings, 0 replies; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-12 20:09 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> The states will be used by upcoming changes to produce (in case of save)
> or consume (in case of resume) the VF migration data.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 270 ++++++++++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h | 6 +
> .../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 17 ++
> drivers/gpu/drm/xe/xe_sriov_pf_control.c | 96 +++++++
> drivers/gpu/drm/xe/xe_sriov_pf_control.h | 4 +
> drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 38 +++
> 6 files changed, 431 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> index 2e6bd3d1fe1da..44df984278548 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> @@ -184,6 +184,13 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
> CASE2STR(PAUSE_SAVE_GUC);
> CASE2STR(PAUSE_FAILED);
> CASE2STR(PAUSED);
> + CASE2STR(MIGRATION_DATA_WIP);
> + CASE2STR(SAVE_WIP);
> + CASE2STR(SAVE_FAILED);
> + CASE2STR(SAVED);
> + CASE2STR(RESTORE_WIP);
> + CASE2STR(RESTORE_FAILED);
> + CASE2STR(RESTORED);
> CASE2STR(RESUME_WIP);
> CASE2STR(RESUME_SEND_RESUME);
> CASE2STR(RESUME_FAILED);
> @@ -207,6 +214,8 @@ static unsigned long pf_get_default_timeout(enum xe_gt_sriov_control_bits bit)
> return HZ / 2;
> case XE_GT_SRIOV_STATE_FLR_WIP:
> case XE_GT_SRIOV_STATE_FLR_RESET_CONFIG:
> + case XE_GT_SRIOV_STATE_SAVE_WIP:
> + case XE_GT_SRIOV_STATE_RESTORE_WIP:
> return 5 * HZ;
> default:
> return HZ;
> @@ -359,6 +368,10 @@ static void pf_queue_vf(struct xe_gt *gt, unsigned int vfid)
>
> static void pf_exit_vf_flr_wip(struct xe_gt *gt, unsigned int vfid);
> static void pf_exit_vf_stop_wip(struct xe_gt *gt, unsigned int vfid);
> +static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid);
> +static void pf_exit_vf_saved(struct xe_gt *gt, unsigned int vfid);
> +static void pf_exit_vf_restore_wip(struct xe_gt *gt, unsigned int vfid);
> +static void pf_exit_vf_restored(struct xe_gt *gt, unsigned int vfid);
> static void pf_exit_vf_pause_wip(struct xe_gt *gt, unsigned int vfid);
> static void pf_exit_vf_resume_wip(struct xe_gt *gt, unsigned int vfid);
>
> @@ -380,6 +393,8 @@ static void pf_exit_vf_wip(struct xe_gt *gt, unsigned int vfid)
>
> pf_exit_vf_flr_wip(gt, vfid);
> pf_exit_vf_stop_wip(gt, vfid);
> + pf_exit_vf_save_wip(gt, vfid);
> + pf_exit_vf_restore_wip(gt, vfid);
> pf_exit_vf_pause_wip(gt, vfid);
> pf_exit_vf_resume_wip(gt, vfid);
>
> @@ -399,6 +414,8 @@ static void pf_enter_vf_ready(struct xe_gt *gt, unsigned int vfid)
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOPPED);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUMED);
> + pf_exit_vf_saved(gt, vfid);
> + pf_exit_vf_restored(gt, vfid);
> pf_exit_vf_mismatch(gt, vfid);
> pf_exit_vf_wip(gt, vfid);
> }
> @@ -675,6 +692,8 @@ static void pf_enter_vf_resumed(struct xe_gt *gt, unsigned int vfid)
> {
> pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUMED);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> + pf_exit_vf_saved(gt, vfid);
> + pf_exit_vf_restored(gt, vfid);
> pf_exit_vf_mismatch(gt, vfid);
> pf_exit_vf_wip(gt, vfid);
> }
> @@ -776,6 +795,249 @@ int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid)
> return -ECANCELED;
> }
>
> +/**
> + * xe_gt_sriov_pf_control_check_vf_data_wip - check if new SR-IOV VF migration data is expected
nit: add () to function name:
xe_gt_sriov_pf_control_check_vf_data_wip() - Check ...
> + * @gt: the &struct xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: true when new migration data is expected to be produced, false otherwise
> + */
> +bool xe_gt_sriov_pf_control_check_vf_data_wip(struct xe_gt *gt, unsigned int vfid)
> +{
> + return pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP);
> +}
> +
> +static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> +{
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
> +}
> +
> +static void pf_enter_vf_saved(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (!pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED))
> + pf_enter_vf_state_machine_bug(gt, vfid);
> +
> + xe_gt_sriov_info(gt, "VF%u saved!\n", vfid);
> +
> + pf_exit_vf_mismatch(gt, vfid);
> + pf_exit_vf_wip(gt, vfid);
> +}
> +
> +static void pf_exit_vf_saved(struct xe_gt *gt, unsigned int vfid)
> +{
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED);
> +}
> +
> +static bool pf_handle_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP))
> + return false;
> +
> + pf_exit_vf_save_wip(gt, vfid);
> + pf_enter_vf_saved(gt, vfid);
> +
> + return true;
> +}
> +
> +static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP);
> + pf_exit_vf_restored(gt, vfid);
> + pf_enter_vf_wip(gt, vfid);
> + pf_queue_vf(gt, vfid);
> + return true;
> + }
> +
> + return false;
> +}
> +
please add a diagram for the inner state machines of both new SAVE/RESTORE states,
and also update the master diagram to show how SAVE/RESTORE interacts with the
existing states - with diagrams that include the expected flows it will be easier
to review the implementation
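a rough sketch of the kind of DOC block I have in mind for SAVE (state names are
taken from this patch, transitions are just my reading of the code, so double-check):

 * DOC: The VF SAVE state machine
 *
 *	PAUSED --------> SAVE_WIP --------> SAVED
 *	                    |
 *	                    '-------------> SAVE_FAILED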
> +/**
> + * xe_gt_sriov_pf_control_save_vf - Save SR-IOV VF migration data.
> + * @gt: the &struct xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_control_save_vf(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOPPED)) {
> + xe_gt_sriov_dbg(gt, "VF%u is stopped!\n", vfid);
> + return -EPERM;
> + }
> +
> + if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED)) {
> + xe_gt_sriov_dbg(gt, "VF%u is not paused!\n", vfid);
> + return -EPERM;
> + }
> +
> + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> + xe_gt_sriov_dbg(gt, "VF%u restore is in progress!\n", vfid);
> + return -EPERM;
> + }
> +
> + if (!pf_enter_vf_save_wip(gt, vfid)) {
> + xe_gt_sriov_dbg(gt, "VF%u save already in progress!\n", vfid);
> + return -EALREADY;
> + }
> +
> + return 0;
> +}
> +
> +static int pf_wait_vf_save_done(struct xe_gt *gt, unsigned int vfid)
> +{
> + unsigned long timeout = pf_get_default_timeout(XE_GT_SRIOV_STATE_SAVE_WIP);
> + int err;
> +
> + err = pf_wait_vf_wip_done(gt, vfid, timeout);
> + if (err) {
> + xe_gt_sriov_notice(gt, "VF%u SAVE didn't finish in %u ms (%pe)\n",
> + vfid, jiffies_to_msecs(timeout), ERR_PTR(err));
> + return err;
> + }
> +
> + if (!pf_expect_vf_not_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED))
> + return -EIO;
> +
> + return 0;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_control_wait_save_done() - Wait for a VF Save to complete
> + * @gt: the &struct xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_control_wait_save_done(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP))
> + return 0;
> +
> + return pf_wait_vf_save_done(gt, vfid);
> +}
> +
> +static void pf_exit_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> +{
> + pf_expect_vf_not_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP);
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP);
> +}
> +
> +static void pf_enter_vf_restored(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (!pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED))
> + pf_enter_vf_state_machine_bug(gt, vfid);
> +
> + xe_gt_sriov_info(gt, "VF%u restored!\n", vfid);
since commit ac43294e8ec2 we use dbg() messages for GT-level state changes
> +
> + pf_exit_vf_mismatch(gt, vfid);
> + pf_exit_vf_wip(gt, vfid);
> +}
> +
> +static void pf_exit_vf_restored(struct xe_gt *gt, unsigned int vfid)
> +{
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED);
> +}
> +
> +static bool pf_handle_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
> + return false;
> +
> + pf_exit_vf_restore_wip(gt, vfid);
> + pf_enter_vf_restored(gt, vfid);
> +
> + return true;
> +}
> +
> +static bool pf_enter_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP)) {
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP);
> + pf_exit_vf_saved(gt, vfid);
> + pf_enter_vf_wip(gt, vfid);
> + pf_enter_vf_restored(gt, vfid);
> + return true;
> + }
> +
> + return false;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_control_restore_vf - Restore SR-IOV VF migration data.
> + * @gt: the &struct xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_control_restore_vf(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOPPED)) {
> + xe_gt_sriov_dbg(gt, "VF%u is stopped!\n", vfid);
> + return -EPERM;
> + }
> +
> + if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED)) {
> + xe_gt_sriov_dbg(gt, "VF%u is not paused!\n", vfid);
> + return -EPERM;
> + }
> +
> + if (pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP)) {
> + xe_gt_sriov_dbg(gt, "VF%u save is in progress!\n", vfid);
> + return -EPERM;
> + }
> +
> + if (!pf_enter_vf_restore_wip(gt, vfid)) {
> + xe_gt_sriov_dbg(gt, "VF%u restore already in progress!\n", vfid);
> + return -EALREADY;
> + }
> +
> + return 0;
> +}
> +
> +static int pf_wait_vf_restore_done(struct xe_gt *gt, unsigned int vfid)
> +{
> + unsigned long timeout = pf_get_default_timeout(XE_GT_SRIOV_STATE_RESTORE_WIP);
> + int err;
> +
> + err = pf_wait_vf_wip_done(gt, vfid, timeout);
> + if (err) {
> + xe_gt_sriov_notice(gt, "VF%u RESTORE didn't finish in %u ms (%pe)\n",
> + vfid, jiffies_to_msecs(timeout), ERR_PTR(err));
> + return err;
> + }
> +
> + if (!pf_expect_vf_not_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED))
> + return -EIO;
> +
> + return 0;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_control_wait_restore_done() - Wait for a VF Restore to complete
> + * @gt: the &struct xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_control_wait_restore_done(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
> + return 0;
> +
> + return pf_wait_vf_restore_done(gt, vfid);
> +}
> +
> /**
> * DOC: The VF STOP state machine
> *
> @@ -817,6 +1079,8 @@ static void pf_enter_vf_stopped(struct xe_gt *gt, unsigned int vfid)
>
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUMED);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSED);
> + pf_exit_vf_saved(gt, vfid);
> + pf_exit_vf_restored(gt, vfid);
> pf_exit_vf_mismatch(gt, vfid);
> pf_exit_vf_wip(gt, vfid);
> }
> @@ -1461,6 +1725,12 @@ static bool pf_process_vf_state_machine(struct xe_gt *gt, unsigned int vfid)
> if (pf_exit_vf_pause_save_guc(gt, vfid))
> return true;
>
> + if (pf_handle_vf_save_wip(gt, vfid))
> + return true;
> +
> + if (pf_handle_vf_restore_wip(gt, vfid))
> + return true;
> +
> if (pf_exit_vf_resume_send_resume(gt, vfid))
> return true;
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> index 8a72ef3778d47..2e121e8132dcf 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> @@ -14,8 +14,14 @@ struct xe_gt;
> int xe_gt_sriov_pf_control_init(struct xe_gt *gt);
> void xe_gt_sriov_pf_control_restart(struct xe_gt *gt);
>
> +bool xe_gt_sriov_pf_control_check_vf_data_wip(struct xe_gt *gt, unsigned int vfid);
> +
> int xe_gt_sriov_pf_control_pause_vf(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid);
> +int xe_gt_sriov_pf_control_save_vf(struct xe_gt *gt, unsigned int vfid);
> +int xe_gt_sriov_pf_control_wait_save_done(struct xe_gt *gt, unsigned int vfid);
> +int xe_gt_sriov_pf_control_restore_vf(struct xe_gt *gt, unsigned int vfid);
> +int xe_gt_sriov_pf_control_wait_restore_done(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_control_stop_vf(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_control_trigger_flr(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_control_sync_flr(struct xe_gt *gt, unsigned int vfid, bool sync);
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> index c80b7e77f1ad2..02b517533ee8a 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> @@ -31,6 +31,13 @@
> * @XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC: indicates that the PF needs to save the VF GuC state.
> * @XE_GT_SRIOV_STATE_PAUSE_FAILED: indicates that a VF pause operation has failed.
> * @XE_GT_SRIOV_STATE_PAUSED: indicates that the VF is paused.
> + * @XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP: indicates that the new data is expected in migration ring.
hmm, that looks like a wrong optimization, likely we should have separate states:
> + * @XE_GT_SRIOV_STATE_SAVE_WIP: indicates that VF save operation is in progress.
XE_GT_SRIOV_STATE_SAVE_WAIT_DATA - indicates that the new SAVE data is expected
> + * @XE_GT_SRIOV_STATE_SAVE_FAILED: indicates that VF save operation has failed.
> + * @XE_GT_SRIOV_STATE_SAVED: indicates that VF is saved.
> + * @XE_GT_SRIOV_STATE_RESTORE_WIP: indicates that VF restore operation is in progress.
and
XE_GT_SRIOV_STATE_RESTORE_WAIT_DATA - indicates that the new RESTORE data is expected
> + * @XE_GT_SRIOV_STATE_SAVE_FAILED: indicates that VF restore operation has failed.
> + * @XE_GT_SRIOV_STATE_SAVED: indicates that VF is restored.
2x copy/paste typo
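presumably these were meant to read:

 * @XE_GT_SRIOV_STATE_RESTORE_FAILED: indicates that VF restore operation has failed.
 * @XE_GT_SRIOV_STATE_RESTORED: indicates that VF is restored.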
> * @XE_GT_SRIOV_STATE_RESUME_WIP: indicates the a VF resume operation is in progress.
> * @XE_GT_SRIOV_STATE_RESUME_SEND_RESUME: indicates that the PF is about to send RESUME command.
> * @XE_GT_SRIOV_STATE_RESUME_FAILED: indicates that a VF resume operation has failed.
> @@ -63,6 +70,16 @@ enum xe_gt_sriov_control_bits {
> XE_GT_SRIOV_STATE_PAUSE_FAILED,
> XE_GT_SRIOV_STATE_PAUSED,
>
> + XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP,
> +
> + XE_GT_SRIOV_STATE_SAVE_WIP,
> + XE_GT_SRIOV_STATE_SAVE_FAILED,
> + XE_GT_SRIOV_STATE_SAVED,
> +
> + XE_GT_SRIOV_STATE_RESTORE_WIP,
> + XE_GT_SRIOV_STATE_RESTORE_FAILED,
> + XE_GT_SRIOV_STATE_RESTORED,
> +
> XE_GT_SRIOV_STATE_RESUME_WIP,
> XE_GT_SRIOV_STATE_RESUME_SEND_RESUME,
> XE_GT_SRIOV_STATE_RESUME_FAILED,
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> index 416d00a03fbb7..e64c7b56172c6 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> @@ -149,3 +149,99 @@ int xe_sriov_pf_control_sync_flr(struct xe_device *xe, unsigned int vfid)
>
> return 0;
> }
> +
> +/**
> + * xe_sriov_pf_control_save_vf - Save VF migration data on all GTs.
> + * @xe: the &xe_device
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_pf_control_save_vf(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_gt *gt;
> + unsigned int id;
> + int ret;
> +
> + for_each_gt(gt, xe, id) {
> + ret = xe_gt_sriov_pf_control_save_vf(gt, vfid);
> + if (ret)
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * xe_sriov_pf_control_wait_save_vf - Wait until VF migration data was saved on all GTs
> + * @xe: the &xe_device
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_pf_control_wait_save_vf(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_gt *gt;
> + u8 id;
"unsigned int" like above?
> + int ret;
> +
> + for_each_gt(gt, xe, id) {
> + ret = xe_gt_sriov_pf_control_wait_save_done(gt, vfid);
> + if (ret)
> + break;
> + }
> +
> + return ret;
> +}
> +
> +/**
> + * xe_sriov_pf_control_restore_vf - Restore VF migration data on all GTs.
> + * @xe: the &xe_device
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_pf_control_restore_vf(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_gt *gt;
> + u8 id;
> + int ret;
> +
> + for_each_gt(gt, xe, id) {
> + ret = xe_gt_sriov_pf_control_restore_vf(gt, vfid);
> + if (ret)
> + return ret;
> + }
> +
> + return ret;
> +}
> +
> +/**
> + * xe_sriov_pf_control_wait_save_vf - Wait until VF migration data was restored on all GTs
> + * @xe: the &xe_device
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_pf_control_wait_restore_vf(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_gt *gt;
> + u8 id;
> + int ret;
> +
> + for_each_gt(gt, xe, id) {
> + ret = xe_gt_sriov_pf_control_wait_restore_done(gt, vfid);
> + if (ret)
> + break;
> + }
> +
> + return ret;
> +}
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_sriov_pf_control.h
> index 2d52d0ac1b28f..512fd21d87c1e 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_control.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.h
> @@ -13,5 +13,9 @@ int xe_sriov_pf_control_resume_vf(struct xe_device *xe, unsigned int vfid);
> int xe_sriov_pf_control_stop_vf(struct xe_device *xe, unsigned int vfid);
> int xe_sriov_pf_control_reset_vf(struct xe_device *xe, unsigned int vfid);
> int xe_sriov_pf_control_sync_flr(struct xe_device *xe, unsigned int vfid);
> +int xe_sriov_pf_control_save_vf(struct xe_device *xe, unsigned int vfid);
> +int xe_sriov_pf_control_wait_save_vf(struct xe_device *xe, unsigned int vfid);
> +int xe_sriov_pf_control_restore_vf(struct xe_device *xe, unsigned int vfid);
> +int xe_sriov_pf_control_wait_restore_vf(struct xe_device *xe, unsigned int vfid);
>
> #endif
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> index 97636ed86fb8b..74eeabef91c57 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> @@ -75,11 +75,31 @@ static void pf_populate_pf(struct xe_device *xe, struct dentry *pfdent)
> * │ │ ├── reset
> * │ │ ├── resume
> * │ │ ├── stop
> + * │ │ ├── save
> + * │ │ ├── restore
> * │ │ :
> * │ ├── vf2
> * │ │ ├── ...
> */
>
> +static int from_file_read_to_vf_call(struct seq_file *s,
> + int (*call)(struct xe_device *, unsigned int))
> +{
> + struct dentry *dent = file_dentry(s->file)->d_parent;
> + struct xe_device *xe = extract_xe(dent);
> + unsigned int vfid = extract_vfid(dent);
> + int ret;
> +
> + xe_pm_runtime_get(xe);
> + ret = call(xe, vfid);
> + xe_pm_runtime_put(xe);
> +
> + if (ret < 0)
> + return ret;
> +
> + return s->count;
since we don't expect to put anything into "s", maybe we can explicitly return 0 here?
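i.e. something like:

	if (ret < 0)
		return ret;

	return 0;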
> +}
> +
> static ssize_t from_file_write_to_vf_call(struct file *file, const char __user *userbuf,
> size_t count, loff_t *ppos,
> int (*call)(struct xe_device *, unsigned int))
> @@ -118,10 +138,26 @@ static ssize_t OP##_write(struct file *file, const char __user *userbuf, \
> } \
> DEFINE_SHOW_STORE_ATTRIBUTE(OP)
>
> +#define DEFINE_VF_RW_CONTROL_ATTRIBUTE(OP) \
rename it to have RW as suffix (like other ATTR macros):
DEFINE_VF_CONTROL_ATTRIBUTE_RW
> +static int OP##_show(struct seq_file *s, void *unused) \
> +{ \
> + return from_file_read_to_vf_call(s, \
> + xe_sriov_pf_control_wait_##OP); \
> +} \
> +static ssize_t OP##_write(struct file *file, const char __user *userbuf, \
> + size_t count, loff_t *ppos) \
> +{ \
> + return from_file_write_to_vf_call(file, userbuf, count, ppos, \
> + xe_sriov_pf_control_##OP); \
> +} \
> +DEFINE_SHOW_STORE_ATTRIBUTE(OP)
> +
> DEFINE_VF_CONTROL_ATTRIBUTE(pause_vf);
> DEFINE_VF_CONTROL_ATTRIBUTE(resume_vf);
> DEFINE_VF_CONTROL_ATTRIBUTE(stop_vf);
> DEFINE_VF_CONTROL_ATTRIBUTE(reset_vf);
> +DEFINE_VF_RW_CONTROL_ATTRIBUTE(save_vf);
> +DEFINE_VF_RW_CONTROL_ATTRIBUTE(restore_vf);
>
> static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> {
> @@ -129,6 +165,8 @@ static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> debugfs_create_file("resume", 0200, vfdent, xe, &resume_vf_fops);
> debugfs_create_file("stop", 0200, vfdent, xe, &stop_vf_fops);
> debugfs_create_file("reset", 0200, vfdent, xe, &reset_vf_fops);
> + debugfs_create_file("save", 0600, vfdent, xe, &save_vf_fops);
> + debugfs_create_file("restore", 0600, vfdent, xe, &restore_vf_fops);
> }
>
> static void pf_populate_with_tiles(struct xe_device *xe, struct dentry *dent, unsigned int vfid)
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 05/26] drm/xe/pf: Add data structures and handlers for migration rings
2025-10-11 19:38 ` [PATCH 05/26] drm/xe/pf: Add data structures and handlers for migration rings Michał Winiarski
@ 2025-10-12 21:06 ` Michal Wajdeczko
2025-10-20 14:56 ` Michał Winiarski
0 siblings, 1 reply; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-12 21:06 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> Migration data is queued in a per-GT ptr_ring to decouple the worker
> responsible for handling the data transfer from the .read()/.write()
> syscalls.
... from the .read() and .write() syscalls.
> Add the data structures and handlers that will be used in future
> commits.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 4 +
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 163 ++++++++++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 9 +
> .../drm/xe/xe_gt_sriov_pf_migration_types.h | 5 +-
> drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h | 3 +
> drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 147 ++++++++++++++++
> drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 20 +++
> .../gpu/drm/xe/xe_sriov_pf_migration_types.h | 37 ++++
> drivers/gpu/drm/xe/xe_sriov_pf_types.h | 3 +
> 9 files changed, 390 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> index 44df984278548..16a88e7599f6d 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> @@ -19,6 +19,7 @@
> #include "xe_guc_ct.h"
> #include "xe_sriov.h"
> #include "xe_sriov_pf_control.h"
> +#include "xe_sriov_pf_migration.h"
> #include "xe_sriov_pf_service.h"
> #include "xe_tile.h"
>
> @@ -388,6 +389,8 @@ static bool pf_enter_vf_wip(struct xe_gt *gt, unsigned int vfid)
>
> static void pf_exit_vf_wip(struct xe_gt *gt, unsigned int vfid)
> {
> + struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
> +
> if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_WIP)) {
> struct xe_gt_sriov_control_state *cs = pf_pick_vf_control(gt, vfid);
we can declare wq here
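i.e. (untested, rest of the function body unchanged):

	if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_WIP)) {
		struct xe_gt_sriov_control_state *cs = pf_pick_vf_control(gt, vfid);
		struct wait_queue_head *wq =
			xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
		...
		complete_all(&cs->done);
		wake_up_all(wq);
	}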
>
> @@ -399,6 +402,7 @@ static void pf_exit_vf_wip(struct xe_gt *gt, unsigned int vfid)
> pf_exit_vf_resume_wip(gt, vfid);
>
> complete_all(&cs->done);
> + wake_up_all(wq);
> }
> }
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> index f8604b172963e..af5952f42fff1 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> @@ -7,6 +7,7 @@
>
> #include "abi/guc_actions_sriov_abi.h"
> #include "xe_bo.h"
> +#include "xe_gt_sriov_pf_control.h"
> #include "xe_gt_sriov_pf_helpers.h"
> #include "xe_gt_sriov_pf_migration.h"
> #include "xe_gt_sriov_printk.h"
> @@ -15,6 +16,17 @@
> #include "xe_sriov.h"
> #include "xe_sriov_pf_migration.h"
>
> +#define XE_GT_SRIOV_PF_MIGRATION_RING_TIMEOUT (HZ * 20)
> +#define XE_GT_SRIOV_PF_MIGRATION_RING_SIZE 5
> +
> +static struct xe_gt_sriov_pf_migration *pf_pick_gt_migration(struct xe_gt *gt, unsigned int vfid)
> +{
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> + xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
> +
> + return &gt->sriov.pf.vfs[vfid].migration;
> +}
> +
> /* Return: number of dwords saved/restored/required or a negative error code on failure */
> static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
> u64 addr, u32 ndwords)
> @@ -382,6 +394,142 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
> }
> #endif /* CONFIG_DEBUG_FS */
>
> +/**
> + * xe_gt_sriov_pf_migration_ring_empty() - Check if a migration ring is empty
> + * @gt: the &struct xe_gt
> + * @vfid: the VF identifier
> + *
> + * Return: true if the ring is empty, otherwise false.
> + */
> +bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid)
> +{
> + return ptr_ring_empty(&pf_pick_gt_migration(gt, vfid)->ring);
> +}
> +
> +/**
> + * xe_gt_sriov_pf_migration_produce() - Add migration data packet to migration ring
> + * @gt: the &struct xe_gt
> + * @vfid: the VF identifier
> + * @data: &struct xe_sriov_pf_migration_data packet
> + *
> + * If the ring is full, wait until there is space in the ring.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_migration_ring_produce(struct xe_gt *gt, unsigned int vfid,
> + struct xe_sriov_pf_migration_data *data)
> +{
> + struct xe_gt_sriov_pf_migration *migration = pf_pick_gt_migration(gt, vfid);
> + struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
> + unsigned long timeout = XE_GT_SRIOV_PF_MIGRATION_RING_TIMEOUT;
> + int ret;
> +
> + xe_gt_assert(gt, data->tile == gt->tile->id);
> + xe_gt_assert(gt, data->gt == gt->info.id);
> +
> + while (1) {
> + ret = ptr_ring_produce(&migration->ring, data);
> + if (ret == 0) {
if (!ret)
break;
> + wake_up_all(wq);
> + break;
> + }
> +
> + if (!xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid))
> + return -EINVAL;
> +
> + ret = wait_event_interruptible_timeout(*wq,
> + !ptr_ring_full(&migration->ring),
> + timeout);
> + if (ret == 0)
> + return -ETIMEDOUT;
> +
> + timeout = ret;
> + }
> +
wake_up_all(wq);
return 0;
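so the whole thing would look like (untested):

	while (1) {
		ret = ptr_ring_produce(&migration->ring, data);
		if (!ret)
			break;

		if (!xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid))
			return -EINVAL;

		ret = wait_event_interruptible_timeout(*wq,
						       !ptr_ring_full(&migration->ring),
						       timeout);
		if (ret == 0)
			return -ETIMEDOUT;

		timeout = ret;
	}

	wake_up_all(wq);
	return 0;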
> + return ret;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_migration_consume() - Get migration data packet from migration ring
> + * @gt: the &struct xe_gt
> + * @vfid: the VF identifier
> + *
> + * If the ring is empty, wait until there are new migration data packets to process.
> + *
> + * Return: Pointer to &struct xe_sriov_pf_migration_data on success,
> + * ERR_PTR(-ENODATA) if ring is empty and no more migration data is expected,
> + * ERR_PTR value in case of error.
> + */
> +struct xe_sriov_pf_migration_data *
> +xe_gt_sriov_pf_migration_ring_consume(struct xe_gt *gt, unsigned int vfid)
> +{
> + struct xe_gt_sriov_pf_migration *migration = pf_pick_gt_migration(gt, vfid);
> + struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
> + unsigned long timeout = XE_GT_SRIOV_PF_MIGRATION_RING_TIMEOUT;
> + struct xe_sriov_pf_migration_data *data;
> + int ret;
> +
> + while (1) {
> + data = ptr_ring_consume(&migration->ring);
> + if (data) {
> + wake_up_all(wq);
> + break;
> + }
> +
> + if (!xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid))
> + return ERR_PTR(-ENODATA);
> +
> + ret = wait_event_interruptible_timeout(*wq,
> + !ptr_ring_empty(&migration->ring) ||
> + !xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid),
> + timeout);
> + if (ret == 0)
> + return ERR_PTR(-ETIMEDOUT);
> +
> + timeout = ret;
> + }
> +
> + return data;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_migration_consume_nowait() - Get migration data packet from migration ring
> + * @gt: the &struct xe_gt
> + * @vfid: the VF identifier
> + *
> + * Similar to xe_gt_sriov_pf_migration_consume(), but doesn't wait until more data is available.
> + *
> + * Return: Pointer to &struct xe_sriov_pf_migration_data on success,
> + * ERR_PTR(-EAGAIN) if ring is empty but migration data is expected,
> + * ERR_PTR(-ENODATA) if ring is empty and no more migration data is expected,
> + * ERR_PTR value in case of error.
> + */
> +struct xe_sriov_pf_migration_data *
> +xe_gt_sriov_pf_migration_ring_consume_nowait(struct xe_gt *gt, unsigned int vfid)
> +{
> + struct xe_gt_sriov_pf_migration *migration = pf_pick_gt_migration(gt, vfid);
> + struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
> + struct xe_sriov_pf_migration_data *data;
> +
> + data = ptr_ring_consume(&migration->ring);
> + if (data) {
> + wake_up_all(wq);
> + return data;
> + }
> +
> + if (!xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid))
> + return ERR_PTR(-ENODATA);
> +
> + return ERR_PTR(-EAGAIN);
> +}
> +
> +static void pf_gt_migration_cleanup(struct drm_device *dev, void *arg)
no need for the "pf" prefix
and if this is only about ptr_ring, then it could be:
static void action_ring_cleanup(...)
{
struct ptr_ring *r = arg;
ptr_ring_cleanup(r, NULL);
}
> +{
> + struct xe_gt_sriov_pf_migration *migration = arg;
> +
> + ptr_ring_cleanup(&migration->ring, NULL);
> +}
> +
> /**
> * xe_gt_sriov_pf_migration_init() - Initialize support for VF migration.
> * @gt: the &xe_gt
> @@ -393,6 +541,7 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
> int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
> {
> struct xe_device *xe = gt_to_xe(gt);
> + unsigned int n, totalvfs;
> int err;
>
> xe_gt_assert(gt, IS_SRIOV_PF(xe));
> @@ -404,5 +553,19 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
> if (err)
> return err;
>
> + totalvfs = xe_sriov_pf_get_totalvfs(xe);
> + for (n = 0; n <= totalvfs; n++) {
> + struct xe_gt_sriov_pf_migration *migration = pf_pick_gt_migration(gt, n);
> +
> + err = ptr_ring_init(&migration->ring,
> + XE_GT_SRIOV_PF_MIGRATION_RING_SIZE, GFP_KERNEL);
> + if (err)
> + return err;
> +
> + err = drmm_add_action_or_reset(&xe->drm, pf_gt_migration_cleanup, migration);
> + if (err)
> + return err;
> + }
> +
> return 0;
> }
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> index 09faeae00ddbb..1e4dc46413823 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> @@ -9,11 +9,20 @@
> #include <linux/types.h>
>
> struct xe_gt;
> +struct xe_sriov_pf_migration_data;
>
> int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
> int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
>
> +bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid);
> +int xe_gt_sriov_pf_migration_ring_produce(struct xe_gt *gt, unsigned int vfid,
> + struct xe_sriov_pf_migration_data *data);
> +struct xe_sriov_pf_migration_data *
> +xe_gt_sriov_pf_migration_ring_consume(struct xe_gt *gt, unsigned int vfid);
> +struct xe_sriov_pf_migration_data *
> +xe_gt_sriov_pf_migration_ring_consume_nowait(struct xe_gt *gt, unsigned int vfid);
> +
> #ifdef CONFIG_DEBUG_FS
> ssize_t xe_gt_sriov_pf_migration_read_guc_state(struct xe_gt *gt, unsigned int vfid,
> char __user *buf, size_t count, loff_t *pos);
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> index fdc5a31dd8989..8434689372082 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> @@ -7,6 +7,7 @@
> #define _XE_GT_SRIOV_PF_MIGRATION_TYPES_H_
>
> #include <linux/mutex.h>
> +#include <linux/ptr_ring.h>
> #include <linux/types.h>
>
> /**
> @@ -27,9 +28,11 @@ struct xe_gt_sriov_state_snapshot {
> /**
> * struct xe_gt_sriov_pf_migration - GT-level data.
> *
> - * Used by the PF driver to maintain non-VF specific per-GT data.
> + * Used by the PF driver to maintain per-VF migration data.
we try to match the struct name with the sub-component name, not use it as a per-VF name
if you want a struct for the per-VF data, pick a different name, maybe:
struct xe_gt_sriov_pf_migration_state
or just reuse one that you plan to remove later:
struct xe_gt_sriov_state_snapshot
> */
> struct xe_gt_sriov_pf_migration {
> + /** @ring: queue containing VF save / restore migration data */
> + struct ptr_ring ring;
> };
>
> #endif
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> index 9a856da379d39..fbb08f8030f7f 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> @@ -33,6 +33,9 @@ struct xe_gt_sriov_metadata {
>
> /** @snapshot: snapshot of the VF state data */
> struct xe_gt_sriov_state_snapshot snapshot;
> +
> + /** @migration: */
missing description
> + struct xe_gt_sriov_pf_migration migration;
> };
>
> /**
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> index cf6a210d5597a..347682f29a03c 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> @@ -4,7 +4,35 @@
> */
>
> #include "xe_sriov.h"
> +#include <drm/drm_managed.h>
> +
> +#include "xe_device.h"
> +#include "xe_gt_sriov_pf_control.h"
> +#include "xe_gt_sriov_pf_migration.h"
> +#include "xe_pm.h"
> +#include "xe_sriov_pf_helpers.h"
> #include "xe_sriov_pf_migration.h"
> +#include "xe_sriov_printk.h"
> +
> +static struct xe_sriov_pf_migration *pf_pick_migration(struct xe_device *xe, unsigned int vfid)
> +{
> + xe_assert(xe, IS_SRIOV_PF(xe));
> + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> +
> + return &xe->sriov.pf.vfs[vfid].migration;
> +}
> +
> +/**
> + * xe_sriov_pf_migration_waitqueue - Get waitqueue for migration
> + * @xe: the &struct xe_device
> + * @vfid: the VF identifier
> + *
> + * Return: pointer to the migration waitqueue.
> + */
> +wait_queue_head_t *xe_sriov_pf_migration_waitqueue(struct xe_device *xe, unsigned int vfid)
> +{
> + return &pf_pick_migration(xe, vfid)->wq;
> +}
>
> /**
> * xe_sriov_pf_migration_supported() - Check if SR-IOV VF migration is supported by the device
> @@ -35,9 +63,128 @@ static bool pf_check_migration_support(struct xe_device *xe)
> */
> int xe_sriov_pf_migration_init(struct xe_device *xe)
> {
> + unsigned int n, totalvfs;
> +
> xe_assert(xe, IS_SRIOV_PF(xe));
>
> xe->sriov.pf.migration.supported = pf_check_migration_support(xe);
> + if (!xe_sriov_pf_migration_supported(xe))
> + return 0;
> +
> + totalvfs = xe_sriov_pf_get_totalvfs(xe);
> + for (n = 1; n <= totalvfs; n++) {
> + struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, n);
> +
> + init_waitqueue_head(&migration->wq);
> + }
>
> return 0;
> }
> +
> +static bool pf_migration_empty(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_gt *gt;
> + u8 gt_id;
> +
> + for_each_gt(gt, xe, gt_id) {
> + if (!xe_gt_sriov_pf_migration_ring_empty(gt, vfid))
> + return false;
> + }
> +
> + return true;
> +}
> +
> +static struct xe_sriov_pf_migration_data *
> +pf_migration_consume(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_sriov_pf_migration_data *data;
> + struct xe_gt *gt;
> + u8 gt_id;
> + bool no_data = true;
> +
> + for_each_gt(gt, xe, gt_id) {
> + data = xe_gt_sriov_pf_migration_ring_consume_nowait(gt, vfid);
> +
> + if (!IS_ERR(data))
> + return data;
> + else if (PTR_ERR(data) == -EAGAIN)
> + no_data = false;
> + }
> +
> + if (no_data)
> + return ERR_PTR(-ENODATA);
> +
> + return ERR_PTR(-EAGAIN);
> +}
> +
> +/**
> + * xe_sriov_pf_migration_consume() - Consume a SR-IOV VF migration data packet from the device
> + * @xe: the &struct xe_device
> + * @vfid: the VF identifier
> + *
> + * If there is no migration data to process, wait until more data is available.
> + *
> + * Return: Pointer to &struct xe_sriov_pf_migration_data on success,
> + * ERR_PTR(-ENODATA) if ring is empty and no more migration data is expected,
can we use NULL as an indication of no data? then all ERR_PTR values would be real errors
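e.g. a caller-side sketch (untested, just to illustrate the calling convention):

	data = xe_sriov_pf_migration_consume(xe, vfid);
	if (IS_ERR(data))
		return PTR_ERR(data);	/* real error */
	if (!data)
		break;			/* clean end of stream */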
> + * ERR_PTR value in case of error.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +struct xe_sriov_pf_migration_data *
> +xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, vfid);
> + unsigned long timeout = HZ * 5;
> + struct xe_sriov_pf_migration_data *data;
> + int ret;
> +
> + if (!IS_SRIOV_PF(xe))
> + return ERR_PTR(-ENODEV);
this is "PF" function, we shouldn't get here if we are not a PF
use assert here, and make sure the caller verifies the PF mode
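something like this (untested sketch):

	/* in xe_sriov_pf_migration_consume(), a PF-only path */
	xe_assert(xe, IS_SRIOV_PF(xe));

and in the (hypothetical) caller that may run in any mode:

	if (!IS_SRIOV_PF(xe))
		return -ENODEV;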
> +
> + while (1) {
> + data = pf_migration_consume(xe, vfid);
> + if (!IS_ERR(data) || PTR_ERR(data) != -EAGAIN)
> + goto out;
> +
> + ret = wait_event_interruptible_timeout(migration->wq,
> + !pf_migration_empty(xe, vfid),
> + timeout);
> + if (ret == 0) {
> + xe_sriov_warn(xe, "VF%d Timed out waiting for migration data\n", vfid);
> + return ERR_PTR(-ETIMEDOUT);
> + }
> +
> + timeout = ret;
> + }
> +
> +out:
> + return data;
> +}
> +
> +/**
> + * xe_sriov_pf_migration_produce() - Produce a SR-IOV VF migration data packet for device to process
> + * @xe: the &struct xe_device
> + * @vfid: the VF identifier
> + * @data: VF migration data
> + *
> + * If the underlying data structure is full, wait until there is space.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
> + struct xe_sriov_pf_migration_data *data)
> +{
> + struct xe_gt *gt;
> +
> + if (!IS_SRIOV_PF(xe))
> + return -ENODEV;
> +
> + gt = xe_device_get_gt(xe, data->gt);
> + if (!gt || data->tile != gt->tile->id) {
> + xe_sriov_err_ratelimited(xe, "VF%d Unknown GT - tile_id:%d, gt_id:%d\n",
> + vfid, data->tile, data->gt);
> + return -EINVAL;
> + }
> +
> + return xe_gt_sriov_pf_migration_ring_produce(gt, vfid, data);
> +}
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> index d3058b6682192..f2020ba19c2da 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> @@ -7,12 +7,18 @@
> #define _XE_SRIOV_PF_MIGRATION_H_
>
> #include <linux/types.h>
> +#include <linux/wait.h>
>
> struct xe_device;
>
> #ifdef CONFIG_PCI_IOV
> int xe_sriov_pf_migration_init(struct xe_device *xe);
> bool xe_sriov_pf_migration_supported(struct xe_device *xe);
> +struct xe_sriov_pf_migration_data *
> +xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid);
> +int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
> + struct xe_sriov_pf_migration_data *data);
> +wait_queue_head_t *xe_sriov_pf_migration_waitqueue(struct xe_device *xe, unsigned int vfid);
> #else
> static inline int xe_sriov_pf_migration_init(struct xe_device *xe)
> {
> @@ -22,6 +28,20 @@ static inline bool xe_sriov_pf_migration_supported(struct xe_device *xe)
> {
> return false;
> }
> +static inline struct xe_sriov_pf_migration_data *
> +xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid)
> +{
> + return ERR_PTR(-ENODEV);
> +}
> +static inline int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
> + struct xe_sriov_pf_migration_data *data)
> +{
> + return -ENODEV;
> +}
> +wait_queue_head_t *xe_sriov_pf_migration_waitqueue(struct xe_device *xe, unsigned int vfid)
> +{
> + return ERR_PTR(-ENODEV);
> +}
didn't fully check, but do we really need all these stubs?
likely these functions will only be called from other real PF-only functions
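if so, the new declarations could stay without stubs, since their PF-only callers are compiled only with CONFIG_PCI_IOV anyway (sketch, assuming no non-PF callers exist):

/* no stubs needed - all callers live in CONFIG_PCI_IOV-only files */
struct xe_sriov_pf_migration_data *
xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid);
int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
				  struct xe_sriov_pf_migration_data *data);
wait_queue_head_t *xe_sriov_pf_migration_waitqueue(struct xe_device *xe, unsigned int vfid);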
> #endif
>
> #endif
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> index e69de29bb2d1d..80fdea32b884a 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> @@ -0,0 +1,37 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_SRIOV_PF_MIGRATION_TYPES_H_
> +#define _XE_SRIOV_PF_MIGRATION_TYPES_H_
> +
> +#include <linux/types.h>
> +#include <linux/wait.h>
> +
add kernel-doc
> +struct xe_sriov_pf_migration_data {
> + struct xe_device *xe;
> + void *vaddr;
> + size_t remaining;
> + size_t hdr_remaining;
> + union {
> + struct xe_bo *bo;
> + void *buff;
> + };
> + __struct_group(xe_sriov_pf_migration_hdr, hdr, __packed,
> + u8 version;
> + u8 type;
> + u8 tile;
> + u8 gt;
> + u32 flags;
> + u64 offset;
> + u64 size;
> + );
> +};
> +
> +struct xe_sriov_pf_migration {
> + /** @wq: waitqueue used to avoid busy-waiting for snapshot production/consumption */
> + wait_queue_head_t wq;
> +};
> +
> +#endif
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> index 2d2fcc0a2f258..b3ae21a5a0490 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> @@ -9,6 +9,7 @@
> #include <linux/mutex.h>
> #include <linux/types.h>
>
> +#include "xe_sriov_pf_migration_types.h"
> #include "xe_sriov_pf_service_types.h"
>
> /**
> @@ -17,6 +18,8 @@
> struct xe_sriov_metadata {
> /** @version: negotiated VF/PF ABI version */
> struct xe_sriov_pf_service_version version;
> + /** @migration: migration data */
> + struct xe_sriov_pf_migration migration;
> };
>
> /**
* Re: [PATCH 06/26] drm/xe/pf: Add helpers for migration data allocation / free
2025-10-11 19:38 ` [PATCH 06/26] drm/xe/pf: Add helpers for migration data allocation / free Michał Winiarski
2025-10-12 19:12 ` Matthew Brost
@ 2025-10-13 10:15 ` Michal Wajdeczko
2025-10-21 0:01 ` Michał Winiarski
1 sibling, 1 reply; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-13 10:15 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> Now that it's possible to free the packets - connect the restore
> handling logic with the ring.
> The helpers will also be used in upcoming changes that will start producing
> migration data packets.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/Makefile | 1 +
> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 48 ++++++-
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 10 +-
> drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 1 +
> .../gpu/drm/xe/xe_sriov_pf_migration_data.c | 135 ++++++++++++++++++
> .../gpu/drm/xe/xe_sriov_pf_migration_data.h | 32 +++++
while this is used by the PF only, maybe these files don't have to include the _pf_ tag (like xe_pci_sriov.c or xe_sriov_vfio.c)?
.../gpu/drm/xe/xe_sriov_migration_data.c | 135 ++++++++++++++++++
.../gpu/drm/xe/xe_sriov_migration_data.h | 32 +++++
or
.../gpu/drm/xe/xe_sriov_vfio_data.c | 135 ++++++++++++++++++
.../gpu/drm/xe/xe_sriov_vfio_data.h | 32 +++++
> 6 files changed, 224 insertions(+), 3 deletions(-)
> create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
>
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index 71f685a315dca..e253d65366de4 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -177,6 +177,7 @@ xe-$(CONFIG_PCI_IOV) += \
> xe_sriov_pf_control.o \
> xe_sriov_pf_debugfs.o \
> xe_sriov_pf_migration.o \
> + xe_sriov_pf_migration_data.o \
> xe_sriov_pf_service.o \
> xe_tile_sriov_pf_debugfs.o
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> index 16a88e7599f6d..04a4e92133c2e 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> @@ -20,6 +20,7 @@
> #include "xe_sriov.h"
> #include "xe_sriov_pf_control.h"
> #include "xe_sriov_pf_migration.h"
> +#include "xe_sriov_pf_migration_data.h"
> #include "xe_sriov_pf_service.h"
> #include "xe_tile.h"
>
> @@ -949,14 +950,57 @@ static void pf_exit_vf_restored(struct xe_gt *gt, unsigned int vfid)
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED);
> }
>
> +static void pf_enter_vf_restore_failed(struct xe_gt *gt, unsigned int vfid)
> +{
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED);
> + pf_exit_vf_wip(gt, vfid);
> +}
> +
> +static int pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid,
> + struct xe_sriov_pf_migration_data *data)
> +{
> + switch (data->type) {
> + default:
> + xe_gt_sriov_notice(gt, "Skipping VF%u invalid data type: %d\n", vfid, data->type);
> + pf_enter_vf_restore_failed(gt, vfid);
shouldn't this be done in pf_handle_vf_restore_wip() where all other state transitions are done?
> + }
> +
> + return -EINVAL;
> +}
> +
> static bool pf_handle_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> {
> + struct xe_sriov_pf_migration_data *data;
> + int ret;
> +
> if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
in other places in the VF control state machine we use a slightly different pattern:
// can we exit state AAA?
if (!pf_exit_vf_state(AAA))
return false; // no, we are not in this state
// try to process other state
// yes, we _were_ in AAA, process this state
ret = handle_state_aaa();
// now decide where to go next
if (ret == -EAGAIN)
pf_enter_vf_state(AAA); // back
else if (ret < 0)
pf_enter_vf_state(AAA_FAILED) // failed
else
pf_enter_vf_state(AAA_DONE) // next
// state was processed, start next iteration
return true;
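applied here it could look roughly like this (untested; pf_handle_vf_restore_data_once() is a hypothetical helper wrapping the consume + restore-one-packet steps):

static bool pf_handle_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
{
	int ret;

	/* can we exit RESTORE_WIP? if not, we were not in this state */
	if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
		return false;

	ret = pf_handle_vf_restore_data_once(gt, vfid);
	if (ret == -EAGAIN)
		pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP);
	else if (ret < 0)
		pf_enter_vf_restore_failed(gt, vfid);
	else
		pf_enter_vf_restored(gt, vfid);

	return true;
}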
> return false;
>
> - pf_exit_vf_restore_wip(gt, vfid);
> - pf_enter_vf_restored(gt, vfid);
> + data = xe_gt_sriov_pf_migration_ring_consume(gt, vfid);
> + if (IS_ERR(data)) {
> + if (PTR_ERR(data) == -ENODATA &&
> + !xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid)) {
> + pf_exit_vf_restore_wip(gt, vfid);
> + pf_enter_vf_restored(gt, vfid);
> + } else {
> + pf_enter_vf_restore_failed(gt, vfid);
> + }
> + return false;
this should be 'true' as we completed this state processing
> + }
> +
> + xe_gt_assert(gt, gt->info.id == data->gt);
> + xe_gt_assert(gt, gt->tile->id == data->tile);
> +
> + ret = pf_handle_vf_restore_data(gt, vfid, data);
> + if (ret) {
> + xe_gt_sriov_err(gt, "VF%u failed to restore data type: %d (%d)\n",
use %pe for error
> + vfid, data->type, ret);
maybe for debugging try to dump more details about the failing data packet here
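e.g. (sketch):

	xe_gt_sriov_err(gt, "VF%u failed to restore data type: %d (%pe)\n",
			vfid, data->type, ERR_PTR(ret));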
> + xe_sriov_pf_migration_data_free(data);
> + pf_enter_vf_restore_failed(gt, vfid);
> + return false;
> + }
>
> + xe_sriov_pf_migration_data_free(data);
> return true;
> }
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> index af5952f42fff1..582aaf062cbd4 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> @@ -15,6 +15,7 @@
> #include "xe_guc_ct.h"
> #include "xe_sriov.h"
> #include "xe_sriov_pf_migration.h"
> +#include "xe_sriov_pf_migration_data.h"
>
> #define XE_GT_SRIOV_PF_MIGRATION_RING_TIMEOUT (HZ * 20)
> #define XE_GT_SRIOV_PF_MIGRATION_RING_SIZE 5
> @@ -523,11 +524,18 @@ xe_gt_sriov_pf_migration_ring_consume_nowait(struct xe_gt *gt, unsigned int vfid
> return ERR_PTR(-EAGAIN);
> }
>
> +static void pf_mig_data_destroy(void *ptr)
> +{
> + struct xe_sriov_pf_migration_data *data = ptr;
> +
> + xe_sriov_pf_migration_data_free(data);
> +}
> +
> static void pf_gt_migration_cleanup(struct drm_device *dev, void *arg)
> {
> struct xe_gt_sriov_pf_migration *migration = arg;
>
> - ptr_ring_cleanup(&migration->ring, NULL);
> + ptr_ring_cleanup(&migration->ring, pf_mig_data_destroy);
> }
>
> /**
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> index 347682f29a03c..d39cee66589b5 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> @@ -12,6 +12,7 @@
> #include "xe_pm.h"
> #include "xe_sriov_pf_helpers.h"
> #include "xe_sriov_pf_migration.h"
> +#include "xe_sriov_pf_migration_data.h"
> #include "xe_sriov_printk.h"
>
> static struct xe_sriov_pf_migration *pf_pick_migration(struct xe_device *xe, unsigned int vfid)
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> new file mode 100644
> index 0000000000000..cfc6b512c6674
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> @@ -0,0 +1,135 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#include "xe_bo.h"
> +#include "xe_device.h"
> +#include "xe_sriov_pf_migration_data.h"
> +
> +static bool data_needs_bo(struct xe_sriov_pf_migration_data *data)
> +{
> + unsigned int type = data->type;
> +
> + return type == XE_SRIOV_MIG_DATA_CCS ||
> + type == XE_SRIOV_MIG_DATA_VRAM;
> +}
> +
> +/**
> + * xe_sriov_pf_migration_data_alloc() - Allocate migration data packet
> + * @xe: the &struct xe_device
> + *
> + * Only allocates the "outer" structure, without initializing the migration
> + * data backing storage.
> + *
> + * Return: Pointer to &struct xe_sriov_pf_migration_data on success,
> + * NULL in case of error.
> + */
> +struct xe_sriov_pf_migration_data *
> +xe_sriov_pf_migration_data_alloc(struct xe_device *xe)
> +{
> + struct xe_sriov_pf_migration_data *data;
> +
> + data = kzalloc(sizeof(*data), GFP_KERNEL);
> + if (!data)
> + return NULL;
> +
> + data->xe = xe;
> + data->hdr_remaining = sizeof(data->hdr);
> +
> + return data;
> +}
> +
> +/**
> + * xe_sriov_pf_migration_data_free() - Free migration data packet
> + * @data: the &struct xe_sriov_pf_migration_data packet
> + */
> +void xe_sriov_pf_migration_data_free(struct xe_sriov_pf_migration_data *data)
> +{
> + if (data_needs_bo(data)) {
> + if (data->bo)
not needed, xe_bo_unpin_map_no_vm() checks for NULL
> + xe_bo_unpin_map_no_vm(data->bo);
> + } else {
> + if (data->buff)
not needed, kvfree() also checks for NULL
> + kvfree(data->buff);
> + }
> +
> + kfree(data);
> +}
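with both NULL checks gone the whole function collapses to (untested; assumes, as noted above, that xe_bo_unpin_map_no_vm() tolerates NULL):

void xe_sriov_pf_migration_data_free(struct xe_sriov_pf_migration_data *data)
{
	/* both helpers below are no-ops for a NULL pointer */
	if (data_needs_bo(data))
		xe_bo_unpin_map_no_vm(data->bo);
	else
		kvfree(data->buff);

	kfree(data);
}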
> +
> +static int mig_data_init(struct xe_sriov_pf_migration_data *data)
> +{
> + struct xe_gt *gt = xe_device_get_gt(data->xe, data->gt);
> +
> + if (!gt || data->tile != gt->tile->id)
> + return -EINVAL;
didn't we check that already in xe_sriov_pf_migration_produce()?
in other places we call xe_sriov_pf_migration_data_init() using ids taken from the real tile and gt
> +
> + if (data->size == 0)
> + return 0;
> +
> + if (data_needs_bo(data)) {
> + struct xe_bo *bo = xe_bo_create_pin_map_novm(data->xe, gt->tile,
> + PAGE_ALIGN(data->size),
> + ttm_bo_type_kernel,
> + XE_BO_FLAG_SYSTEM | XE_BO_FLAG_PINNED,
> + false);
> + if (IS_ERR(bo))
> + return PTR_ERR(bo);
> +
> + data->bo = bo;
> + data->vaddr = bo->vmap.vaddr;
> + } else {
> + void *buff = kvzalloc(data->size, GFP_KERNEL);
> + if (!buff)
> + return -ENOMEM;
> +
> + data->buff = buff;
> + data->vaddr = buff;
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * xe_sriov_pf_migration_data_init() - Initialize the migration data header and backing storage
> + * @data: the &struct xe_sriov_pf_migration_data packet
> + * @tile_id: tile identifier
> + * @gt_id: GT identifier
> + * @type: &enum xe_sriov_pf_migration_data_type
here the type is an enum
> + * @offset: offset of data packet payload (within wider resource)
> + * @size: size of data packet payload
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_pf_migration_data_init(struct xe_sriov_pf_migration_data *data, u8 tile_id, u8 gt_id,
> + unsigned int type, loff_t offset, size_t size)
but here it is a plain int?
> +{
> + xe_assert(data->xe, type < XE_SRIOV_MIG_DATA_MAX);
if it's "enum" - no need to check
if it's "int" and type is coming from outside of our code, then assert is not sufficient anyway
nit: if assert stays, add sep line here
> + data->version = 1;
magic "1" needs its own #define
> + data->type = type;
> + data->tile = tile_id;
> + data->gt = gt_id;
> + data->offset = offset;
> + data->size = size;
> + data->remaining = size;
> +
> + return mig_data_init(data);
> +}
> +
> +/**
> + * xe_sriov_pf_migration_data_init() - Initialize the migration data backing storage based on header
> + * @data: the &struct xe_sriov_pf_migration_data packet
> + *
> + * Header data is expected to be filled prior to calling this function
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_pf_migration_data_init_from_hdr(struct xe_sriov_pf_migration_data *data)
> +{
> + if (WARN_ON(data->hdr_remaining))
better: xe_WARN_ON(xe, ....)
but does it really deserve a WARN here?
we already know who the caller is
> + return -EINVAL;
> +
> + data->remaining = data->size;
> +
> + return mig_data_init(data);
> +}
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> new file mode 100644
> index 0000000000000..1dde4cfcdbc47
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> @@ -0,0 +1,32 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_SRIOV_PF_MIGRATION_DATA_H_
> +#define _XE_SRIOV_PF_MIGRATION_DATA_H_
> +
> +#include <linux/types.h>
> +
> +struct xe_device;
> +
> +enum xe_sriov_pf_migration_data_type {
maybe add a note that the default 0 was skipped on purpose to catch uninitialized/invalid data
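e.g.:

enum xe_sriov_pf_migration_data_type {
	/* 0 is deliberately unused to catch uninitialized/invalid data */
	XE_SRIOV_MIG_DATA_DESCRIPTOR = 1,
	...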
> + XE_SRIOV_MIG_DATA_DESCRIPTOR = 1,
shouldn't we try to match the enumerator names with the enum name?
XE_SRIOV_PF_MIGRATION_DATA_TYPE_DESCRIPTOR = 1,
XE_SRIOV_PF_MIGRATION_DATA_TYPE_TRAILER,
XE_SRIOV_PF_MIGRATION_DATA_TYPE_...,
or change the enum (and file) name:
xe_sriov_migration_data.c
XE_SRIOV_MIGRATION_DATA_TYPE_DESCRIPTOR = 1,
XE_SRIOV_MIGRATION_DATA_TYPE_TRAILER,
XE_SRIOV_MIGRATION_DATA_TYPE_...,
> + XE_SRIOV_MIG_DATA_TRAILER,
> + XE_SRIOV_MIG_DATA_GGTT,
> + XE_SRIOV_MIG_DATA_MMIO,
> + XE_SRIOV_MIG_DATA_GUC,
> + XE_SRIOV_MIG_DATA_CCS,
> + XE_SRIOV_MIG_DATA_VRAM,
> + XE_SRIOV_MIG_DATA_MAX,
please drop it
> +};
> +
> +struct xe_sriov_pf_migration_data *
> +xe_sriov_pf_migration_data_alloc(struct xe_device *xe);
> +void xe_sriov_pf_migration_data_free(struct xe_sriov_pf_migration_data *snapshot);
> +
> +int xe_sriov_pf_migration_data_init(struct xe_sriov_pf_migration_data *data, u8 tile_id, u8 gt_id,
> + unsigned int type, loff_t offset, size_t size);
> +int xe_sriov_pf_migration_data_init_from_hdr(struct xe_sriov_pf_migration_data *snapshot);
> +
> +#endif
* Re: [PATCH 07/26] drm/xe/pf: Add support for encap/decap of bitstream to/from packet
2025-10-11 19:38 ` [PATCH 07/26] drm/xe/pf: Add support for encap/decap of bitstream to/from packet Michał Winiarski
2025-10-11 22:28 ` kernel test robot
@ 2025-10-13 10:46 ` Michal Wajdeczko
2025-10-21 0:25 ` Michał Winiarski
1 sibling, 1 reply; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-13 10:46 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> Add debugfs handlers for migration state and handle bitstream
> .read()/.write() to convert from bitstream to/from migration data
> packets.
> As descriptor/trailer are handled at this layer - add handling for both
> save and restore side.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 18 +
> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h | 1 +
> drivers/gpu/drm/xe/xe_sriov_pf.c | 1 +
> drivers/gpu/drm/xe/xe_sriov_pf_control.c | 5 +
> drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 45 +++
> drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 56 +++
> .../gpu/drm/xe/xe_sriov_pf_migration_data.c | 353 ++++++++++++++++++
> .../gpu/drm/xe/xe_sriov_pf_migration_data.h | 5 +
> .../gpu/drm/xe/xe_sriov_pf_migration_types.h | 9 +
> 9 files changed, 493 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> index 04a4e92133c2e..092d3d710bca1 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> @@ -814,6 +814,23 @@ bool xe_gt_sriov_pf_control_check_vf_data_wip(struct xe_gt *gt, unsigned int vfi
> return pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP);
> }
>
> +/**
> + * xe_gt_sriov_pf_control_vf_data_eof() - indicate the end of SR-IOV VF migration data production
> + * @gt: the &struct xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + */
> +void xe_gt_sriov_pf_control_vf_data_eof(struct xe_gt *gt, unsigned int vfid)
> +{
> + struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
> +
> + if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP))
> + pf_enter_vf_state_machine_bug(gt, vfid);
> +
> + wake_up_all(wq);
> +}
> +
> static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> {
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
> @@ -840,6 +857,7 @@ static bool pf_handle_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP))
> return false;
>
> + xe_gt_sriov_pf_control_vf_data_eof(gt, vfid);
the above call can lead to a state_machine_bug, but here you just continue as if nothing happened and move to the SAVED state
maybe the logic of that function should be moved to a helper that at least returns a bool, so you can make the right decision here?
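something like this (untested sketch):

static bool pf_exit_vf_data_wip(struct xe_gt *gt, unsigned int vfid)
{
	if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP))
		return false;

	wake_up_all(xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid));
	return true;
}

so that pf_handle_vf_save_wip() can do:

	if (!pf_exit_vf_data_wip(gt, vfid)) {
		pf_enter_vf_state_machine_bug(gt, vfid);
		return false;
	}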
> pf_exit_vf_save_wip(gt, vfid);
> pf_enter_vf_saved(gt, vfid);
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> index 2e121e8132dcf..caf20dd063b1b 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> @@ -15,6 +15,7 @@ int xe_gt_sriov_pf_control_init(struct xe_gt *gt);
> void xe_gt_sriov_pf_control_restart(struct xe_gt *gt);
>
> bool xe_gt_sriov_pf_control_check_vf_data_wip(struct xe_gt *gt, unsigned int vfid);
> +void xe_gt_sriov_pf_control_vf_data_eof(struct xe_gt *gt, unsigned int vfid);
>
> int xe_gt_sriov_pf_control_pause_vf(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid);
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf.c b/drivers/gpu/drm/xe/xe_sriov_pf.c
> index 95743c7af8050..5d115627f3f2f 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf.c
> @@ -16,6 +16,7 @@
> #include "xe_sriov_pf.h"
> #include "xe_sriov_pf_helpers.h"
> #include "xe_sriov_pf_migration.h"
> +#include "xe_sriov_pf_migration_data.h"
> #include "xe_sriov_pf_service.h"
> #include "xe_sriov_printk.h"
>
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> index e64c7b56172c6..10e1f18aa8b11 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> @@ -6,6 +6,7 @@
> #include "xe_device.h"
> #include "xe_gt_sriov_pf_control.h"
> #include "xe_sriov_pf_control.h"
> +#include "xe_sriov_pf_migration_data.h"
> #include "xe_sriov_printk.h"
>
> /**
> @@ -165,6 +166,10 @@ int xe_sriov_pf_control_save_vf(struct xe_device *xe, unsigned int vfid)
> unsigned int id;
> int ret;
>
> + ret = xe_sriov_pf_migration_data_save_init(xe, vfid);
> + if (ret)
> + return ret;
> +
> for_each_gt(gt, xe, id) {
> ret = xe_gt_sriov_pf_control_save_vf(gt, vfid);
> if (ret)
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> index 74eeabef91c57..ce780719760a6 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> @@ -13,6 +13,7 @@
> #include "xe_sriov_pf_control.h"
> #include "xe_sriov_pf_debugfs.h"
> #include "xe_sriov_pf_helpers.h"
> +#include "xe_sriov_pf_migration_data.h"
> #include "xe_sriov_pf_service.h"
> #include "xe_sriov_printk.h"
> #include "xe_tile_sriov_pf_debugfs.h"
> @@ -71,6 +72,7 @@ static void pf_populate_pf(struct xe_device *xe, struct dentry *pfdent)
> * /sys/kernel/debug/dri/BDF/
> * ├── sriov
> * │ ├── vf1
> + * │ │ ├── migration_data
> * │ │ ├── pause
> * │ │ ├── reset
> * │ │ ├── resume
> @@ -159,6 +161,48 @@ DEFINE_VF_CONTROL_ATTRIBUTE(reset_vf);
> DEFINE_VF_RW_CONTROL_ATTRIBUTE(save_vf);
> DEFINE_VF_RW_CONTROL_ATTRIBUTE(restore_vf);
>
> +static ssize_t data_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
> +{
> + struct dentry *dent = file_dentry(file);
> + struct dentry *vfdentry = dent->d_parent;
> + struct dentry *migration_dentry = vfdentry->d_parent;
> + unsigned int vfid = (uintptr_t)vfdentry->d_inode->i_private;
> + struct xe_device *xe = migration_dentry->d_inode->i_private;
we have extract_xe() / extract_vfid() helpers for that
> +
> + xe_assert(xe, vfid);
> + xe_sriov_pf_assert_vfid(xe, vfid);
> +
> + if (*pos)
> + return -ESPIPE;
> +
> + return xe_sriov_pf_migration_data_write(xe, vfid, buf, count);
> +}
> +
> +static ssize_t data_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
> +{
> + struct dentry *dent = file_dentry(file);
> + struct dentry *vfdentry = dent->d_parent;
> + struct dentry *migration_dentry = vfdentry->d_parent;
> + unsigned int vfid = (uintptr_t)vfdentry->d_inode->i_private;
> + struct xe_device *xe = migration_dentry->d_inode->i_private;
> +
> + xe_assert(xe, vfid);
> + xe_sriov_pf_assert_vfid(xe, vfid);
> +
> + if (*ppos)
> + return -ESPIPE;
> +
> + return xe_sriov_pf_migration_data_read(xe, vfid, buf, count);
> +}
> +
> +static const struct file_operations data_vf_fops = {
> + .owner = THIS_MODULE,
> + .open = simple_open,
> + .write = data_write,
> + .read = data_read,
> + .llseek = default_llseek,
> +};
> +
> static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> {
> debugfs_create_file("pause", 0200, vfdent, xe, &pause_vf_fops);
> @@ -167,6 +211,7 @@ static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> debugfs_create_file("reset", 0200, vfdent, xe, &reset_vf_fops);
> debugfs_create_file("save", 0600, vfdent, xe, &save_vf_fops);
> debugfs_create_file("restore", 0600, vfdent, xe, &restore_vf_fops);
> + debugfs_create_file("migration_data", 0600, vfdent, xe, &data_vf_fops);
> }
>
> static void pf_populate_with_tiles(struct xe_device *xe, struct dentry *dent, unsigned int vfid)
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> index d39cee66589b5..9cc178126cbdc 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> @@ -56,6 +56,18 @@ static bool pf_check_migration_support(struct xe_device *xe)
> return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
> }
>
> +static void pf_migration_cleanup(struct drm_device *dev, void *arg)
> +{
> + struct xe_sriov_pf_migration *migration = arg;
> +
> + if (!IS_ERR_OR_NULL(migration->pending))
> + xe_sriov_pf_migration_data_free(migration->pending);
> + if (!IS_ERR_OR_NULL(migration->trailer))
> + xe_sriov_pf_migration_data_free(migration->trailer);
> + if (!IS_ERR_OR_NULL(migration->descriptor))
> + xe_sriov_pf_migration_data_free(migration->descriptor);
maybe instead of checking IS_ERR_OR_NULL here, move the check into data_free()?
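e.g. (sketch) - then this cleanup callback can call it unconditionally:

void xe_sriov_pf_migration_data_free(struct xe_sriov_pf_migration_data *data)
{
	if (IS_ERR_OR_NULL(data))
		return;
	...
}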
> +}
> +
> /**
> * xe_sriov_pf_migration_init() - Initialize support for SR-IOV VF migration.
> * @xe: the &struct xe_device
> @@ -65,6 +77,7 @@ static bool pf_check_migration_support(struct xe_device *xe)
> int xe_sriov_pf_migration_init(struct xe_device *xe)
> {
> unsigned int n, totalvfs;
> + int err;
>
> xe_assert(xe, IS_SRIOV_PF(xe));
>
> @@ -76,7 +89,15 @@ int xe_sriov_pf_migration_init(struct xe_device *xe)
> for (n = 1; n <= totalvfs; n++) {
> struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, n);
>
> + err = drmm_mutex_init(&xe->drm, &migration->lock);
> + if (err)
> + return err;
> +
> init_waitqueue_head(&migration->wq);
> +
> + err = drmm_add_action_or_reset(&xe->drm, pf_migration_cleanup, migration);
> + if (err)
> + return err;
> }
>
> return 0;
> @@ -162,6 +183,36 @@ xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid)
> return data;
> }
>
> +static int pf_handle_descriptor(struct xe_device *xe, unsigned int vfid,
> + struct xe_sriov_pf_migration_data *data)
> +{
> + if (data->tile != 0 || data->gt != 0)
> + return -EINVAL;
> +
> + xe_sriov_pf_migration_data_free(data);
> +
> + return 0;
> +}
> +
> +static int pf_handle_trailer(struct xe_device *xe, unsigned int vfid,
> + struct xe_sriov_pf_migration_data *data)
> +{
> + struct xe_gt *gt;
> + u8 gt_id;
> +
> + if (data->tile != 0 || data->gt != 0)
> + return -EINVAL;
> + if (data->offset != 0 || data->size != 0 || data->buff || data->bo)
> + return -EINVAL;
who will free the data packet if we return errors here?
> +
> + xe_sriov_pf_migration_data_free(data);
> +
> + for_each_gt(gt, xe, gt_id)
> + xe_gt_sriov_pf_control_vf_data_eof(gt, vfid);
> +
> + return 0;
> +}
> +
> /**
> * xe_sriov_pf_migration_produce() - Produce a SR-IOV VF migration data packet for device to process
> * @xe: the &struct xe_device
> @@ -180,6 +231,11 @@ int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
> if (!IS_SRIOV_PF(xe))
> return -ENODEV;
>
> + if (data->type == XE_SRIOV_MIG_DATA_DESCRIPTOR)
> + return pf_handle_descriptor(xe, vfid, data);
> + else if (data->type == XE_SRIOV_MIG_DATA_TRAILER)
> + return pf_handle_trailer(xe, vfid, data);
> +
> gt = xe_device_get_gt(xe, data->gt);
> if (!gt || data->tile != gt->tile->id) {
> xe_sriov_err_ratelimited(xe, "VF%d Unknown GT - tile_id:%d, gt_id:%d\n",
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> index cfc6b512c6674..9a2777dcf9a6b 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> @@ -5,7 +5,45 @@
>
> #include "xe_bo.h"
> #include "xe_device.h"
> +#include "xe_sriov_pf_helpers.h"
> +#include "xe_sriov_pf_migration.h"
> #include "xe_sriov_pf_migration_data.h"
> +#include "xe_sriov_printk.h"
> +
> +static struct mutex *pf_migration_mutex(struct xe_device *xe, unsigned int vfid)
> +{
> + xe_assert(xe, IS_SRIOV_PF(xe));
> + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> + return &xe->sriov.pf.vfs[vfid].migration.lock;
> +}
> +
> +static struct xe_sriov_pf_migration_data **pf_pick_pending(struct xe_device *xe, unsigned int vfid)
> +{
> + xe_assert(xe, IS_SRIOV_PF(xe));
> + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> + lockdep_assert_held(pf_migration_mutex(xe, vfid));
> +
> + return &xe->sriov.pf.vfs[vfid].migration.pending;
> +}
> +
> +static struct xe_sriov_pf_migration_data **
> +pf_pick_descriptor(struct xe_device *xe, unsigned int vfid)
> +{
> + xe_assert(xe, IS_SRIOV_PF(xe));
> + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> + lockdep_assert_held(pf_migration_mutex(xe, vfid));
> +
> + return &xe->sriov.pf.vfs[vfid].migration.descriptor;
> +}
> +
> +static struct xe_sriov_pf_migration_data **pf_pick_trailer(struct xe_device *xe, unsigned int vfid)
> +{
> + xe_assert(xe, IS_SRIOV_PF(xe));
> + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> + lockdep_assert_held(pf_migration_mutex(xe, vfid));
> +
> + return &xe->sriov.pf.vfs[vfid].migration.trailer;
> +}
>
> static bool data_needs_bo(struct xe_sriov_pf_migration_data *data)
> {
> @@ -133,3 +171,318 @@ int xe_sriov_pf_migration_data_init_from_hdr(struct xe_sriov_pf_migration_data *
>
> return mig_data_init(data);
> }
> +
> +static ssize_t vf_mig_data_hdr_read(struct xe_sriov_pf_migration_data *data,
> + char __user *buf, size_t len)
> +{
> + loff_t offset = sizeof(data->hdr) - data->hdr_remaining;
> +
> + if (!data->hdr_remaining)
> + return -EINVAL;
> +
> + if (len > data->hdr_remaining)
> + len = data->hdr_remaining;
> +
> + if (copy_to_user(buf, (void *)&data->hdr + offset, len))
> + return -EFAULT;
> +
> + data->hdr_remaining -= len;
> +
> + return len;
> +}
> +
> +static ssize_t vf_mig_data_read(struct xe_sriov_pf_migration_data *data,
> + char __user *buf, size_t len)
> +{
> + if (len > data->remaining)
> + len = data->remaining;
> +
> + if (copy_to_user(buf, data->vaddr + (data->size - data->remaining), len))
> + return -EFAULT;
> +
> + data->remaining -= len;
> +
> + return len;
> +}
> +
> +static ssize_t __vf_mig_data_read_single(struct xe_sriov_pf_migration_data **data,
> + unsigned int vfid, char __user *buf, size_t len)
> +{
> + ssize_t copied = 0;
> +
> + if ((*data)->hdr_remaining)
> + copied = vf_mig_data_hdr_read(*data, buf, len);
> + else
> + copied = vf_mig_data_read(*data, buf, len);
> +
> + if ((*data)->remaining == 0 && (*data)->hdr_remaining == 0) {
> + xe_sriov_pf_migration_data_free(*data);
> + *data = NULL;
> + }
> +
> + return copied;
> +}
> +
> +static struct xe_sriov_pf_migration_data **vf_mig_pick_data(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_sriov_pf_migration_data **data;
> +
> + data = pf_pick_descriptor(xe, vfid);
> + if (*data)
> + return data;
> +
> + data = pf_pick_pending(xe, vfid);
> + if (*data == NULL)
> + *data = xe_sriov_pf_migration_consume(xe, vfid);
> + if (!IS_ERR_OR_NULL(*data))
> + return data;
> + else if (IS_ERR(*data) && PTR_ERR(*data) != -ENODATA)
> + return data;
> +
> + data = pf_pick_trailer(xe, vfid);
> + if (*data)
> + return data;
> +
> + return ERR_PTR(-ENODATA);
> +}
> +
> +static ssize_t vf_mig_data_read_single(struct xe_device *xe, unsigned int vfid,
> + char __user *buf, size_t len)
> +{
> + struct xe_sriov_pf_migration_data **data = vf_mig_pick_data(xe, vfid);
> +
> + if (IS_ERR_OR_NULL(data))
> + return PTR_ERR(data);
> +
> + return __vf_mig_data_read_single(data, vfid, buf, len);
> +}
> +
> +/**
> + * xe_sriov_pf_migration_data_read() - Read migration data from the device
> + * @gt: the &struct xe_device
@xe
> + * @vfid: the VF identifier
> + * @buf: start address of userspace buffer
> + * @len: requested read size from userspace
> + *
> + * Return: number of bytes that has been successfully read
> + * 0 if no more migration data is available
> + * -errno on failure
you likely need to add some punctuation here to properly render the doc
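e.g.:

 * Return: number of bytes that have been successfully read,
 *         0 if no more migration data is available,
 *         -errno on failure.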
> + */
> +ssize_t xe_sriov_pf_migration_data_read(struct xe_device *xe, unsigned int vfid,
> + char __user *buf, size_t len)
> +{
> + ssize_t ret, consumed = 0;
> +
> + xe_assert(xe, IS_SRIOV_PF(xe));
> +
> + ret = mutex_lock_interruptible(pf_migration_mutex(xe, vfid));
> + if (ret)
> + return ret;
> +
> + while (consumed < len) {
> + ret = vf_mig_data_read_single(xe, vfid, buf, len - consumed);
> + if (ret == -ENODATA)
> + goto out;
> + if (ret < 0) {
> + mutex_unlock(pf_migration_mutex(xe, vfid));
> + return ret;
> + }
> +
> + consumed += ret;
> + buf += ret;
> + }
> +
> +out:
> + mutex_unlock(pf_migration_mutex(xe, vfid));
> + return consumed;
> +}
> +
> +static ssize_t vf_mig_hdr_write(struct xe_sriov_pf_migration_data *data,
> + const char __user *buf, size_t len)
> +{
> + loff_t offset = sizeof(data->hdr) - data->hdr_remaining;
> + int ret;
> +
> + if (WARN_ON(!data->hdr_remaining))
xe_WARN_ON(xe, ... ) if having full WARN is really important
> + return -EINVAL;
> +
> + if (len > data->hdr_remaining)
> + len = data->hdr_remaining;
> +
> + if (copy_from_user((void *)&data->hdr + offset, buf, len))
> + return -EFAULT;
> +
> + data->hdr_remaining -= len;
> +
> + if (!data->hdr_remaining) {
> + ret = xe_sriov_pf_migration_data_init_from_hdr(data);
> + if (ret)
> + return ret;
> + }
> +
> + return len;
> +}
> +
> +static ssize_t vf_mig_data_write(struct xe_sriov_pf_migration_data *data,
> + const char __user *buf, size_t len)
> +{
> + if (len > data->remaining)
> + len = data->remaining;
> +
> + if (copy_from_user(data->vaddr + (data->size - data->remaining), buf, len))
> + return -EFAULT;
> +
> + data->remaining -= len;
> +
> + return len;
> +}
> +
> +static ssize_t vf_mig_data_write_single(struct xe_device *xe, unsigned int vfid,
> + const char __user *buf, size_t len)
> +{
> + struct xe_sriov_pf_migration_data **data = pf_pick_pending(xe, vfid);
> + int ret;
> + ssize_t copied;
> +
> + if (IS_ERR_OR_NULL(*data)) {
> + *data = xe_sriov_pf_migration_data_alloc(xe);
> + if (*data == NULL)
> + return -ENOMEM;
> + }
> +
> + if ((*data)->hdr_remaining)
> + copied = vf_mig_hdr_write(*data, buf, len);
> + else
> + copied = vf_mig_data_write(*data, buf, len);
> +
> + if ((*data)->hdr_remaining == 0 && (*data)->remaining == 0) {
> + ret = xe_sriov_pf_migration_produce(xe, vfid, *data);
> + if (ret) {
> + xe_sriov_pf_migration_data_free(*data);
> + return ret;
> + }
> +
> + *data = NULL;
> + }
> +
> + return copied;
> +}
> +
> +/**
> + * xe_sriov_pf_migration_data_write() - Write migration data to the device
> + * @gt: the &struct xe_device
@xe
> + * @vfid: the VF identifier
> + * @buf: start address of userspace buffer
> + * @len: requested write size from userspace
> + *
> + * Return: number of bytes that has been successfully written
> + * -errno on failure
> + */
> +ssize_t xe_sriov_pf_migration_data_write(struct xe_device *xe, unsigned int vfid,
> + const char __user *buf, size_t len)
> +{
> + ssize_t ret, produced = 0;
> +
> + xe_assert(xe, IS_SRIOV_PF(xe));
> +
> + ret = mutex_lock_interruptible(pf_migration_mutex(xe, vfid));
> + if (ret)
> + return ret;
scoped_cond_guard(mutex_intr, return -EINTR, pf_migration_mutex(xe, vfid))?
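i.e. something like (untested; scoped_cond_guard() comes from linux/cleanup.h):

	scoped_cond_guard(mutex_intr, return -EINTR, pf_migration_mutex(xe, vfid)) {
		while (produced < len) {
			ret = vf_mig_data_write_single(xe, vfid, buf, len - produced);
			if (ret < 0)
				return ret;

			produced += ret;
			buf += ret;
		}
	}

	return produced;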
> +
> + while (produced < len) {
> + ret = vf_mig_data_write_single(xe, vfid, buf, len - produced);
> + if (ret < 0) {
> + mutex_unlock(pf_migration_mutex(xe, vfid));
> + return ret;
> + }
> +
> + produced += ret;
> + buf += ret;
> + }
> +
> + mutex_unlock(pf_migration_mutex(xe, vfid));
> + return produced;
> +}
> +
> +#define MIGRATION_DESC_SIZE 4
> +static size_t pf_desc_init(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_sriov_pf_migration_data **desc = pf_pick_descriptor(xe, vfid);
> + struct xe_sriov_pf_migration_data *data;
> + int ret;
> +
> + data = xe_sriov_pf_migration_data_alloc(xe);
> + if (!data)
> + return -ENOMEM;
> +
> + ret = xe_sriov_pf_migration_data_init(data, 0, 0, XE_SRIOV_MIG_DATA_DESCRIPTOR,
> + 0, MIGRATION_DESC_SIZE);
> + if (ret) {
> + xe_sriov_pf_migration_data_free(data);
> + return ret;
> + }
> +
> + *desc = data;
> +
> + return 0;
> +}
> +
> +static void pf_pending_init(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_sriov_pf_migration_data **data = pf_pick_pending(xe, vfid);
> +
> + *data = NULL;
> +}
> +
> +#define MIGRATION_TRAILER_SIZE 0
> +static int pf_trailer_init(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_sriov_pf_migration_data **trailer = pf_pick_trailer(xe, vfid);
> + struct xe_sriov_pf_migration_data *data;
> + int ret;
> +
> + data = xe_sriov_pf_migration_data_alloc(xe);
> + if (!data)
> + return -ENOMEM;
> +
> + ret = xe_sriov_pf_migration_data_init(data, 0, 0, XE_SRIOV_MIG_DATA_TRAILER,
> + 0, MIGRATION_TRAILER_SIZE);
> + if (ret) {
> + xe_sriov_pf_migration_data_free(data);
> + return ret;
> + }
> +
> + *trailer = data;
> +
> + return 0;
> +}
> +
> +/**
> + * xe_sriov_pf_migration_data_save_init() - Initialize the pending save migration data.
> + * @gt: the &struct xe_device
> + * @vfid: the VF identifier
> + *
> + * Return: 0 on success, -errno on failure
> + */
> +int xe_sriov_pf_migration_data_save_init(struct xe_device *xe, unsigned int vfid)
> +{
> + int ret;
> +
> + ret = mutex_lock_interruptible(pf_migration_mutex(xe, vfid));
> + if (ret)
> + return ret;
> +
> + ret = pf_desc_init(xe, vfid);
> + if (ret)
> + goto out;
> +
> + ret = pf_trailer_init(xe, vfid);
> + if (ret)
> + goto out;
> +
> + pf_pending_init(xe, vfid);
> +
> +out:
> + mutex_unlock(pf_migration_mutex(xe, vfid));
> + return ret;
> +}
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> index 1dde4cfcdbc47..5b96c7f224002 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> @@ -28,5 +28,10 @@ void xe_sriov_pf_migration_data_free(struct xe_sriov_pf_migration_data *snapshot
> int xe_sriov_pf_migration_data_init(struct xe_sriov_pf_migration_data *data, u8 tile_id, u8 gt_id,
> unsigned int type, loff_t offset, size_t size);
> int xe_sriov_pf_migration_data_init_from_hdr(struct xe_sriov_pf_migration_data *snapshot);
> +ssize_t xe_sriov_pf_migration_data_read(struct xe_device *xe, unsigned int vfid,
> + char __user *buf, size_t len);
> +ssize_t xe_sriov_pf_migration_data_write(struct xe_device *xe, unsigned int vfid,
> + const char __user *buf, size_t len);
> +int xe_sriov_pf_migration_data_save_init(struct xe_device *xe, unsigned int vfid);
>
> #endif
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> index 80fdea32b884a..c5d75bb7f39c0 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> @@ -7,6 +7,7 @@
> #define _XE_SRIOV_PF_MIGRATION_TYPES_H_
>
> #include <linux/types.h>
> +#include <linux/mutex_types.h>
> #include <linux/wait.h>
>
> struct xe_sriov_pf_migration_data {
> @@ -32,6 +33,14 @@ struct xe_sriov_pf_migration_data {
> struct xe_sriov_pf_migration {
> /** @wq: waitqueue used to avoid busy-waiting for snapshot production/consumption */
> wait_queue_head_t wq;
> + /** @lock: Mutex protecting the migration data */
> + struct mutex lock;
> + /** @pending: currently processed data packet of VF resource */
> + struct xe_sriov_pf_migration_data *pending;
> + /** @trailer: data packet used to indicate the end of stream */
> + struct xe_sriov_pf_migration_data *trailer;
> + /** @descriptor: data packet containing the metadata describing the device */
> + struct xe_sriov_pf_migration_data *descriptor;
> };
>
> #endif
* Re: [PATCH 08/26] drm/xe/pf: Add minimalistic migration descriptor
2025-10-11 19:38 ` [PATCH 08/26] drm/xe/pf: Add minimalistic migration descriptor Michał Winiarski
2025-10-11 22:52 ` kernel test robot
@ 2025-10-13 10:56 ` Michal Wajdeczko
2025-10-21 0:31 ` Michał Winiarski
1 sibling, 1 reply; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-13 10:56 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> The descriptor reuses the KLV format used by GuC and contains metadata
> that can be used to quickly fail migration when source is incompatible
> with destination.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 6 +-
> .../gpu/drm/xe/xe_sriov_pf_migration_data.c | 82 ++++++++++++++++++-
> .../gpu/drm/xe/xe_sriov_pf_migration_data.h | 2 +
> 3 files changed, 87 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> index 9cc178126cbdc..a0cfac456ba0b 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> @@ -186,10 +186,14 @@ xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid)
> static int pf_handle_descriptor(struct xe_device *xe, unsigned int vfid,
> struct xe_sriov_pf_migration_data *data)
> {
> + int ret;
> +
> if (data->tile != 0 || data->gt != 0)
> return -EINVAL;
>
> - xe_sriov_pf_migration_data_free(data);
> + ret = xe_sriov_pf_migration_data_process_desc(xe, vfid, data);
> + if (ret)
> + return ret;
>
> return 0;
> }
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> index 9a2777dcf9a6b..307b16b027a5e 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> @@ -5,6 +5,7 @@
>
> #include "xe_bo.h"
> #include "xe_device.h"
> +#include "xe_guc_klv_helpers.h"
> #include "xe_sriov_pf_helpers.h"
> #include "xe_sriov_pf_migration.h"
> #include "xe_sriov_pf_migration_data.h"
> @@ -404,11 +405,17 @@ ssize_t xe_sriov_pf_migration_data_write(struct xe_device *xe, unsigned int vfid
> return produced;
> }
>
> -#define MIGRATION_DESC_SIZE 4
> +#define MIGRATION_KLV_DEVICE_DEVID_KEY 0xf001u
> +#define MIGRATION_KLV_DEVICE_DEVID_LEN 1u
> +#define MIGRATION_KLV_DEVICE_REVID_KEY 0xf002u
> +#define MIGRATION_KLV_DEVICE_REVID_LEN 1u
as we aim to have unique KLVs across the GuC ABI, maybe we should ask the GuC team to reserve some KLV range (0xf000-0xffff) for our (driver) use?
> +
> +#define MIGRATION_DESC_DWORDS 4
maybe:
(GUC_KLV_LEN_MIN + MIGRATION_KLV_DEVICE_DEVID_LEN +
GUC_KLV_LEN_MIN + MIGRATION_KLV_DEVICE_REVID_LEN)
> static size_t pf_desc_init(struct xe_device *xe, unsigned int vfid)
> {
> struct xe_sriov_pf_migration_data **desc = pf_pick_descriptor(xe, vfid);
> struct xe_sriov_pf_migration_data *data;
> + u32 *klvs;
> int ret;
>
> data = xe_sriov_pf_migration_data_alloc(xe);
> @@ -416,17 +423,88 @@ static size_t pf_desc_init(struct xe_device *xe, unsigned int vfid)
> return -ENOMEM;
>
> ret = xe_sriov_pf_migration_data_init(data, 0, 0, XE_SRIOV_MIG_DATA_DESCRIPTOR,
> - 0, MIGRATION_DESC_SIZE);
> + 0, MIGRATION_DESC_DWORDS * sizeof(u32));
> if (ret) {
> xe_sriov_pf_migration_data_free(data);
> return ret;
> }
>
> + klvs = data->vaddr;
> + *klvs++ = PREP_GUC_KLV_CONST(MIGRATION_KLV_DEVICE_DEVID_KEY,
> + MIGRATION_KLV_DEVICE_DEVID_LEN);
> + *klvs++ = xe->info.devid;
> + *klvs++ = PREP_GUC_KLV_CONST(MIGRATION_KLV_DEVICE_REVID_KEY,
> + MIGRATION_KLV_DEVICE_REVID_LEN);
> + *klvs++ = xe->info.revid;
> +
> *desc = data;
>
> return 0;
> }
>
> +/**
> + * xe_sriov_pf_migration_data_process_desc() - Process migration data descriptor.
> + * @gt: the &struct xe_device
@xe
> + * @vfid: the VF identifier
> + * @data: the &struct xe_sriov_pf_migration_data containing the descriptor
> + *
> + * The descriptor uses the same KLV format as GuC, and contains metadata used for
> + * checking migration data compatibility.
> + *
> + * Return: 0 on success, -errno on failure
> + */
> +int xe_sriov_pf_migration_data_process_desc(struct xe_device *xe, unsigned int vfid,
> + struct xe_sriov_pf_migration_data *data)
> +{
> + u32 num_dwords = data->size / sizeof(u32);
> + u32 *klvs = data->vaddr;
> +
> + xe_assert(xe, data->type == XE_SRIOV_MIG_DATA_DESCRIPTOR);
> + if (data->size % sizeof(u32) != 0)
> + return -EINVAL;
> + if (data->size != num_dwords * sizeof(u32))
> + return -EINVAL;
isn't that redundant?
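the first check alone already covers it:

	if (data->size % sizeof(u32))
		return -EINVAL;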
> +
> + while (num_dwords >= GUC_KLV_LEN_MIN) {
> + u32 key = FIELD_GET(GUC_KLV_0_KEY, klvs[0]);
> + u32 len = FIELD_GET(GUC_KLV_0_LEN, klvs[0]);
> +
> + klvs += GUC_KLV_LEN_MIN;
> + num_dwords -= GUC_KLV_LEN_MIN;
> +
> + switch (key) {
> + case MIGRATION_KLV_DEVICE_DEVID_KEY:
> + if (*klvs != xe->info.devid) {
> + xe_sriov_info(xe,
maybe it should be more than info()?
> + "Aborting migration, devid mismatch %#04x!=%#04x\n",
> + *klvs, xe->info.devid);
> + return -ENODEV;
> + }
> + break;
> + case MIGRATION_KLV_DEVICE_REVID_KEY:
> + if (*klvs != xe->info.revid) {
> + xe_sriov_info(xe,
> + "Aborting migration, revid mismatch %#04x!=%#04x\n",
> + *klvs, xe->info.revid);
> + return -ENODEV;
> + }
> + break;
> + default:
> + xe_sriov_dbg(xe,
> + "Unknown migration descriptor key %#06x - skipping\n", key);
> + break;
> + }
> +
> + if (len > num_dwords)
> + return -EINVAL;
> +
> + klvs += len;
> + num_dwords -= len;
> + }
> +
> + return 0;
> +}
> +
> static void pf_pending_init(struct xe_device *xe, unsigned int vfid)
> {
> struct xe_sriov_pf_migration_data **data = pf_pick_pending(xe, vfid);
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> index 5b96c7f224002..7cfd61005c00f 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> @@ -32,6 +32,8 @@ ssize_t xe_sriov_pf_migration_data_read(struct xe_device *xe, unsigned int vfid,
> char __user *buf, size_t len);
> ssize_t xe_sriov_pf_migration_data_write(struct xe_device *xe, unsigned int vfid,
> const char __user *buf, size_t len);
> +int xe_sriov_pf_migration_data_process_desc(struct xe_device *xe, unsigned int vfid,
> + struct xe_sriov_pf_migration_data *data);
> int xe_sriov_pf_migration_data_save_init(struct xe_device *xe, unsigned int vfid);
>
> #endif
* Re: [PATCH 09/26] drm/xe/pf: Expose VF migration data size over debugfs
2025-10-11 19:38 ` [PATCH 09/26] drm/xe/pf: Expose VF migration data size over debugfs Michał Winiarski
2025-10-12 19:15 ` Matthew Brost
@ 2025-10-13 11:04 ` Michal Wajdeczko
2025-10-21 0:42 ` Michał Winiarski
1 sibling, 1 reply; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-13 11:04 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> The size is normally used to make a decision on when to stop the device
> (mainly when it's in a pre_copy state).
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 18 ++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 2 ++
> drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 34 +++++++++++++++++++
> drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 30 ++++++++++++++++
> drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 1 +
> 5 files changed, 85 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> index 582aaf062cbd4..50f09994e2854 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> @@ -395,6 +395,24 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
> }
> #endif /* CONFIG_DEBUG_FS */
>
> +/**
> + * xe_gt_sriov_pf_migration_size() - Total size of migration data from all components within a GT
> + * @gt: the &struct xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: total migration data size in bytes or a negative error code on failure.
> + */
> +ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
> +{
> + ssize_t total = 0;
> +
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> +
as this is so trivial now, maybe add a note explaining why it is like that for now
> + return total;
> +}
> +
> /**
> * xe_gt_sriov_pf_migration_ring_empty() - Check if a migration ring is empty
> * @gt: the &struct xe_gt
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> index 1e4dc46413823..e5298d35d7d7e 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> @@ -15,6 +15,8 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
> int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
>
> +ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
> +
> bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_migration_ring_produce(struct xe_gt *gt, unsigned int vfid,
> struct xe_sriov_pf_migration_data *data);
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> index ce780719760a6..b06e893fe54cf 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> @@ -13,6 +13,7 @@
> #include "xe_sriov_pf_control.h"
> #include "xe_sriov_pf_debugfs.h"
> #include "xe_sriov_pf_helpers.h"
> +#include "xe_sriov_pf_migration.h"
> #include "xe_sriov_pf_migration_data.h"
> #include "xe_sriov_pf_service.h"
> #include "xe_sriov_printk.h"
> @@ -203,6 +204,38 @@ static const struct file_operations data_vf_fops = {
> .llseek = default_llseek,
> };
>
> +static ssize_t size_read(struct file *file, char __user *ubuf, size_t count, loff_t *ppos)
> +{
> + struct dentry *dent = file_dentry(file);
> + struct dentry *vfdentry = dent->d_parent;
> + struct dentry *migration_dentry = vfdentry->d_parent;
> + unsigned int vfid = (uintptr_t)vfdentry->d_inode->i_private;
> + struct xe_device *xe = migration_dentry->d_inode->i_private;
use helpers
> + char buf[21];
> + ssize_t ret;
> + int len;
> +
> + xe_assert(xe, vfid);
> + xe_sriov_pf_assert_vfid(xe, vfid);
it doesn't matter for this function, so why assert here?
> +
> + xe_pm_runtime_get(xe);
> + ret = xe_sriov_pf_migration_size(xe, vfid);
> + xe_pm_runtime_put(xe);
> + if (ret < 0)
> + return ret;
> +
> + len = scnprintf(buf, sizeof(buf), "%zd\n", ret);
> +
> + return simple_read_from_buffer(ubuf, count, ppos, buf, len);
> +}
> +
> +static const struct file_operations size_vf_fops = {
> + .owner = THIS_MODULE,
> + .open = simple_open,
> + .read = size_read,
> + .llseek = default_llseek,
> +};
> +
> static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> {
> debugfs_create_file("pause", 0200, vfdent, xe, &pause_vf_fops);
> @@ -212,6 +245,7 @@ static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> debugfs_create_file("save", 0600, vfdent, xe, &save_vf_fops);
> debugfs_create_file("restore", 0600, vfdent, xe, &restore_vf_fops);
> debugfs_create_file("migration_data", 0600, vfdent, xe, &data_vf_fops);
> + debugfs_create_file("migration_size", 0400, vfdent, xe, &size_vf_fops);
> }
>
> static void pf_populate_with_tiles(struct xe_device *xe, struct dentry *dent, unsigned int vfid)
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> index a0cfac456ba0b..6b247581dec65 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> @@ -249,3 +249,33 @@ int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
>
> return xe_gt_sriov_pf_migration_ring_produce(gt, vfid, data);
> }
> +
> +/**
> + * xe_sriov_pf_migration_size() - Total size of migration data from all components within a device
> + * @xe: the &struct xe_device
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: total migration data size in bytes or a negative error code on failure.
> + */
> +ssize_t xe_sriov_pf_migration_size(struct xe_device *xe, unsigned int vfid)
> +{
> + size_t size = 0;
> + struct xe_gt *gt;
> + ssize_t ret;
> + u8 gt_id;
> +
> + xe_assert(xe, IS_SRIOV_PF(xe));
> +
> + for_each_gt(gt, xe, gt_id) {
> + ret = xe_gt_sriov_pf_migration_size(gt, vfid);
> + if (ret < 0) {
> + size = ret;
> + break;
just:
return ret;
> + }
> + size += ret;
> + }
> +
> + return size;
> +}
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> index f2020ba19c2da..887ea3e9632bd 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> @@ -14,6 +14,7 @@ struct xe_device;
> #ifdef CONFIG_PCI_IOV
> int xe_sriov_pf_migration_init(struct xe_device *xe);
> bool xe_sriov_pf_migration_supported(struct xe_device *xe);
> +ssize_t xe_sriov_pf_migration_size(struct xe_device *xe, unsigned int vfid);
> struct xe_sriov_pf_migration_data *
> xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid);
> int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
* Re: [PATCH 11/26] drm/xe: Allow the caller to pass guc_buf_cache size
2025-10-11 19:38 ` [PATCH 11/26] drm/xe: Allow the caller to pass guc_buf_cache size Michał Winiarski
2025-10-11 23:35 ` kernel test robot
@ 2025-10-13 11:08 ` Michal Wajdeczko
2025-10-21 0:47 ` Michał Winiarski
1 sibling, 1 reply; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-13 11:08 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> An upcoming change will use GuC buffer cache as a place where GuC
> migration data will be stored, and the memory requirement for that is
> larger than indirect data.
> Allow the caller to pass the size based on the intended usecase.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c | 2 +-
> drivers/gpu/drm/xe/xe_guc.c | 4 ++--
> drivers/gpu/drm/xe/xe_guc_buf.c | 6 +++---
> drivers/gpu/drm/xe/xe_guc_buf.h | 2 +-
> 4 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c b/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c
> index d266882adc0e0..c273ce8087f56 100644
> --- a/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c
> +++ b/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c
> @@ -72,7 +72,7 @@ static int guc_buf_test_init(struct kunit *test)
> kunit_activate_static_stub(test, xe_managed_bo_create_pin_map,
> replacement_xe_managed_bo_create_pin_map);
>
> - KUNIT_ASSERT_EQ(test, 0, xe_guc_buf_cache_init(&guc->buf));
> + KUNIT_ASSERT_EQ(test, 0, xe_guc_buf_cache_init(&guc->buf), SZ_8K);
SZ_8K added to wrong place ;)
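i.e. presumably the intended call is:

KUNIT_ASSERT_EQ(test, 0, xe_guc_buf_cache_init(&guc->buf, SZ_8K));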
>
> test->priv = &guc->buf;
> return 0;
> diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
> index d94490979adc0..ccc7c60ae9b77 100644
> --- a/drivers/gpu/drm/xe/xe_guc.c
> +++ b/drivers/gpu/drm/xe/xe_guc.c
> @@ -809,7 +809,7 @@ static int vf_guc_init_post_hwconfig(struct xe_guc *guc)
> if (err)
> return err;
>
> - err = xe_guc_buf_cache_init(&guc->buf);
> + err = xe_guc_buf_cache_init(&guc->buf, SZ_8K);
> if (err)
> return err;
>
> @@ -857,7 +857,7 @@ int xe_guc_init_post_hwconfig(struct xe_guc *guc)
> if (ret)
> return ret;
>
> - ret = xe_guc_buf_cache_init(&guc->buf);
> + ret = xe_guc_buf_cache_init(&guc->buf, SZ_8K);
> if (ret)
> return ret;
>
> diff --git a/drivers/gpu/drm/xe/xe_guc_buf.c b/drivers/gpu/drm/xe/xe_guc_buf.c
> index 1be26145f0b98..418ada00b99e3 100644
> --- a/drivers/gpu/drm/xe/xe_guc_buf.c
> +++ b/drivers/gpu/drm/xe/xe_guc_buf.c
> @@ -28,16 +28,16 @@ static struct xe_gt *cache_to_gt(struct xe_guc_buf_cache *cache)
> * @cache: the &xe_guc_buf_cache to initialize
> *
> * The Buffer Cache allows to obtain a reusable buffer that can be used to pass
> - * indirect H2G data to GuC without a need to create a ad-hoc allocation.
> + * data to GuC or read data from GuC without a need to create a ad-hoc allocation.
> *
> * Return: 0 on success or a negative error code on failure.
> */
> -int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache)
> +int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache, u32 size)
> {
> struct xe_gt *gt = cache_to_gt(cache);
> struct xe_sa_manager *sam;
>
> - sam = __xe_sa_bo_manager_init(gt_to_tile(gt), SZ_8K, 0, sizeof(u32));
maybe we should promote this magic SZ_8K as
#define XE_GUC_BUF_CACHE_DEFAULT_SIZE SZ_8K
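callers would then read, e.g.:

	err = xe_guc_buf_cache_init(&guc->buf, XE_GUC_BUF_CACHE_DEFAULT_SIZE);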
> + sam = __xe_sa_bo_manager_init(gt_to_tile(gt), size, 0, sizeof(u32));
> if (IS_ERR(sam))
> return PTR_ERR(sam);
> cache->sam = sam;
> diff --git a/drivers/gpu/drm/xe/xe_guc_buf.h b/drivers/gpu/drm/xe/xe_guc_buf.h
> index fe6b5ffe0d6eb..fe5cf3b183497 100644
> --- a/drivers/gpu/drm/xe/xe_guc_buf.h
> +++ b/drivers/gpu/drm/xe/xe_guc_buf.h
> @@ -11,7 +11,7 @@
>
> #include "xe_guc_buf_types.h"
>
> -int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache);
> +int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache, u32 size);
> u32 xe_guc_buf_cache_dwords(struct xe_guc_buf_cache *cache);
> struct xe_guc_buf xe_guc_buf_reserve(struct xe_guc_buf_cache *cache, u32 dwords);
> struct xe_guc_buf xe_guc_buf_from_data(struct xe_guc_buf_cache *cache,
* Re: [PATCH 10/26] drm/xe: Add sa/guc_buf_cache sync interface
2025-10-11 19:38 ` [PATCH 10/26] drm/xe: Add sa/guc_buf_cache sync interface Michał Winiarski
2025-10-12 18:06 ` Matthew Brost
@ 2025-10-13 11:20 ` Michal Wajdeczko
2025-10-21 0:44 ` Michał Winiarski
1 sibling, 1 reply; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-13 11:20 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> In upcoming changes the cached buffers are going to be used to read data
> produced by the GuC. Add a counterpart to flush, which synchronizes the
> CPU-side of suballocation with the GPU data and propagate the interface
> to GuC Buffer Cache.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_guc_buf.c | 9 +++++++++
> drivers/gpu/drm/xe/xe_guc_buf.h | 1 +
> drivers/gpu/drm/xe/xe_sa.c | 21 +++++++++++++++++++++
> drivers/gpu/drm/xe/xe_sa.h | 1 +
> 4 files changed, 32 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_guc_buf.c b/drivers/gpu/drm/xe/xe_guc_buf.c
> index 502ca3a4ee606..1be26145f0b98 100644
> --- a/drivers/gpu/drm/xe/xe_guc_buf.c
> +++ b/drivers/gpu/drm/xe/xe_guc_buf.c
> @@ -127,6 +127,15 @@ u64 xe_guc_buf_flush(const struct xe_guc_buf buf)
> return xe_sa_bo_gpu_addr(buf.sa);
> }
>
> +/**
> + * xe_guc_buf_sync() - Copy the data from the GPU memory to the sub-allocation.
> + * @buf: the &xe_guc_buf to sync
for convenience, can we return the buf CPU pointer here?
something that I already had in my initial impl [1]
[1] https://patchwork.freedesktop.org/patch/619024/?series=139801&rev=1
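for illustration, a minimal sketch of that variant (assuming xe_sa_bo_cpu_addr()
stays the CPU-address accessor for the sub-allocation):

	void *xe_guc_buf_sync(const struct xe_guc_buf buf)
	{
		xe_sa_bo_sync(buf.sa);
		return xe_sa_bo_cpu_addr(buf.sa);
	}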
> + */
> +void xe_guc_buf_sync(const struct xe_guc_buf buf)
> +{
> + xe_sa_bo_sync(buf.sa);
> +}
> +
> /**
> * xe_guc_buf_cpu_ptr() - Obtain a CPU pointer to the sub-allocation.
> * @buf: the &xe_guc_buf to query
> diff --git a/drivers/gpu/drm/xe/xe_guc_buf.h b/drivers/gpu/drm/xe/xe_guc_buf.h
> index 0d67604d96bdd..fe6b5ffe0d6eb 100644
> --- a/drivers/gpu/drm/xe/xe_guc_buf.h
> +++ b/drivers/gpu/drm/xe/xe_guc_buf.h
> @@ -31,6 +31,7 @@ static inline bool xe_guc_buf_is_valid(const struct xe_guc_buf buf)
>
> void *xe_guc_buf_cpu_ptr(const struct xe_guc_buf buf);
> u64 xe_guc_buf_flush(const struct xe_guc_buf buf);
> +void xe_guc_buf_sync(const struct xe_guc_buf buf);
> u64 xe_guc_buf_gpu_addr(const struct xe_guc_buf buf);
> u64 xe_guc_cache_gpu_addr_from_ptr(struct xe_guc_buf_cache *cache, const void *ptr, u32 size);
>
> diff --git a/drivers/gpu/drm/xe/xe_sa.c b/drivers/gpu/drm/xe/xe_sa.c
> index fedd017d6dd36..2115789c2bfb7 100644
> --- a/drivers/gpu/drm/xe/xe_sa.c
> +++ b/drivers/gpu/drm/xe/xe_sa.c
> @@ -110,6 +110,10 @@ struct drm_suballoc *__xe_sa_bo_new(struct xe_sa_manager *sa_manager, u32 size,
> return drm_suballoc_new(&sa_manager->base, size, gfp, true, 0);
> }
>
> +/**
> + * xe_sa_bo_flush_write() - Copy the data from the sub-allocation to the GPU memory.
> + * @sa_bo: the &drm_suballoc to flush
> + */
> void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo)
> {
> struct xe_sa_manager *sa_manager = to_xe_sa_manager(sa_bo->manager);
> @@ -123,6 +127,23 @@ void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo)
> drm_suballoc_size(sa_bo));
> }
>
> +/**
> + * xe_sa_bo_sync() - Copy the data from GPU memory to the sub-allocation.
> + * @sa_bo: the &drm_suballoc to sync
> + */
> +void xe_sa_bo_sync(struct drm_suballoc *sa_bo)
> +{
> + struct xe_sa_manager *sa_manager = to_xe_sa_manager(sa_bo->manager);
> + struct xe_device *xe = tile_to_xe(sa_manager->bo->tile);
> +
> + if (!sa_manager->bo->vmap.is_iomem)
> + return;
> +
> + xe_map_memcpy_from(xe, xe_sa_bo_cpu_addr(sa_bo), &sa_manager->bo->vmap,
> + drm_suballoc_soffset(sa_bo),
> + drm_suballoc_size(sa_bo));
> +}
> +
> void xe_sa_bo_free(struct drm_suballoc *sa_bo,
> struct dma_fence *fence)
> {
> diff --git a/drivers/gpu/drm/xe/xe_sa.h b/drivers/gpu/drm/xe/xe_sa.h
> index 99dbf0eea5402..28fd8bb6450c2 100644
> --- a/drivers/gpu/drm/xe/xe_sa.h
> +++ b/drivers/gpu/drm/xe/xe_sa.h
> @@ -37,6 +37,7 @@ static inline struct drm_suballoc *xe_sa_bo_new(struct xe_sa_manager *sa_manager
> }
>
> void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo);
> +void xe_sa_bo_sync(struct drm_suballoc *sa_bo);
> void xe_sa_bo_free(struct drm_suballoc *sa_bo, struct dma_fence *fence);
>
> static inline struct xe_sa_manager *
* Re: [PATCH 12/26] drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF migration
2025-10-11 19:38 ` [PATCH 12/26] drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF migration Michał Winiarski
@ 2025-10-13 11:27 ` Michal Wajdeczko
2025-10-21 0:50 ` Michał Winiarski
0 siblings, 1 reply; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-13 11:27 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> Contiguous PF GGTT VMAs can be scarce after creating VFs.
> Increase the GuC buffer cache size to 8M for PF so that we can fit GuC
> migration data (which currently maxes out at just over 4M) and use the
> cache instead of allocating fresh BOs.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 54 +++++++------------
> drivers/gpu/drm/xe/xe_guc.c | 2 +-
> 2 files changed, 20 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> index 50f09994e2854..8b96eff8df93b 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> @@ -11,7 +11,7 @@
> #include "xe_gt_sriov_pf_helpers.h"
> #include "xe_gt_sriov_pf_migration.h"
> #include "xe_gt_sriov_printk.h"
> -#include "xe_guc.h"
> +#include "xe_guc_buf.h"
> #include "xe_guc_ct.h"
> #include "xe_sriov.h"
> #include "xe_sriov_pf_migration.h"
> @@ -57,73 +57,57 @@ static int pf_send_guc_query_vf_state_size(struct xe_gt *gt, unsigned int vfid)
>
> /* Return: number of state dwords saved or a negative error code on failure */
> static int pf_send_guc_save_vf_state(struct xe_gt *gt, unsigned int vfid,
> - void *buff, size_t size)
> + void *dst, size_t size)
> {
> const int ndwords = size / sizeof(u32);
> - struct xe_tile *tile = gt_to_tile(gt);
> - struct xe_device *xe = tile_to_xe(tile);
> struct xe_guc *guc = &gt->uc.guc;
> - struct xe_bo *bo;
> + CLASS(xe_guc_buf, buf)(&guc->buf, ndwords);
> int ret;
>
> xe_gt_assert(gt, size % sizeof(u32) == 0);
> xe_gt_assert(gt, size == ndwords * sizeof(u32));
>
> - bo = xe_bo_create_pin_map_novm(xe, tile,
> - ALIGN(size, PAGE_SIZE),
> - ttm_bo_type_kernel,
> - XE_BO_FLAG_SYSTEM |
> - XE_BO_FLAG_GGTT |
> - XE_BO_FLAG_GGTT_INVALIDATE, false);
> - if (IS_ERR(bo))
> - return PTR_ERR(bo);
> + if (!xe_guc_buf_is_valid(buf))
> + return -ENOBUFS;
> +
> + memset(xe_guc_buf_cpu_ptr(buf), 0, size);
is that necessary? GuC will overwrite that anyway
>
> ret = guc_action_vf_save_restore(guc, vfid, GUC_PF_OPCODE_VF_SAVE,
> - xe_bo_ggtt_addr(bo), ndwords);
> - if (!ret)
> + xe_guc_buf_flush(buf), ndwords);
> + if (!ret) {
> ret = -ENODATA;
> - else if (ret > ndwords)
> + } else if (ret > ndwords) {
> ret = -EPROTO;
> - else if (ret > 0)
> - xe_map_memcpy_from(xe, buff, &bo->vmap, 0, ret * sizeof(u32));
> + } else if (ret > 0) {
> + xe_guc_buf_sync(buf);
> + memcpy(dst, xe_guc_buf_cpu_ptr(buf), ret * sizeof(u32));
with a small change suggested earlier, this could be just:
memcpy(dst, xe_guc_buf_sync(buf), ret * sizeof(u32));
> + }
>
> - xe_bo_unpin_map_no_vm(bo);
> return ret;
> }
>
> /* Return: number of state dwords restored or a negative error code on failure */
> static int pf_send_guc_restore_vf_state(struct xe_gt *gt, unsigned int vfid,
> - const void *buff, size_t size)
> + const void *src, size_t size)
> {
> const int ndwords = size / sizeof(u32);
> - struct xe_tile *tile = gt_to_tile(gt);
> - struct xe_device *xe = tile_to_xe(tile);
> struct xe_guc *guc = &gt->uc.guc;
> - struct xe_bo *bo;
> + CLASS(xe_guc_buf_from_data, buf)(&guc->buf, src, size);
> int ret;
>
> xe_gt_assert(gt, size % sizeof(u32) == 0);
> xe_gt_assert(gt, size == ndwords * sizeof(u32));
>
> - bo = xe_bo_create_pin_map_novm(xe, tile,
> - ALIGN(size, PAGE_SIZE),
> - ttm_bo_type_kernel,
> - XE_BO_FLAG_SYSTEM |
> - XE_BO_FLAG_GGTT |
> - XE_BO_FLAG_GGTT_INVALIDATE, false);
> - if (IS_ERR(bo))
> - return PTR_ERR(bo);
> -
> - xe_map_memcpy_to(xe, &bo->vmap, 0, buff, size);
> + if (!xe_guc_buf_is_valid(buf))
> + return -ENOBUFS;
>
> ret = guc_action_vf_save_restore(guc, vfid, GUC_PF_OPCODE_VF_RESTORE,
> - xe_bo_ggtt_addr(bo), ndwords);
> + xe_guc_buf_flush(buf), ndwords);
> if (!ret)
> ret = -ENODATA;
> else if (ret > ndwords)
> ret = -EPROTO;
>
> - xe_bo_unpin_map_no_vm(bo);
> return ret;
> }
>
> diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
> index ccc7c60ae9b77..71ca06d1af62b 100644
> --- a/drivers/gpu/drm/xe/xe_guc.c
> +++ b/drivers/gpu/drm/xe/xe_guc.c
> @@ -857,7 +857,7 @@ int xe_guc_init_post_hwconfig(struct xe_guc *guc)
> if (ret)
> return ret;
>
> - ret = xe_guc_buf_cache_init(&guc->buf, SZ_8K);
> + ret = xe_guc_buf_cache_init(&guc->buf, IS_SRIOV_PF(guc_to_xe(guc)) ? SZ_8M : SZ_8K);
shouldn't we also check for xe_sriov_pf_migration_supported() ?
also, shouldn't we get this SZ_8M somewhere from the PF code?
and maybe PF could (one day) query that somehow from the GuC?
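e.g. a sketch with both conditions combined (keeping the size constants local for now):

	struct xe_device *xe = guc_to_xe(guc);

	ret = xe_guc_buf_cache_init(&guc->buf,
				    IS_SRIOV_PF(xe) && xe_sriov_pf_migration_supported(xe) ?
				    SZ_8M : SZ_8K);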
> if (ret)
> return ret;
>
* Re: [PATCH 13/26] drm/xe/pf: Remove GuC migration data save/restore from GT debugfs
2025-10-11 19:38 ` [PATCH 13/26] drm/xe/pf: Remove GuC migration data save/restore from GT debugfs Michał Winiarski
@ 2025-10-13 11:36 ` Michal Wajdeczko
0 siblings, 0 replies; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-13 11:36 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> In upcoming changes, SR-IOV VF migration data will be extended beyond
> GuC data and exported to userspace using VFIO interface (with a
> vendor-specific variant driver) and a device-level debugfs interface.
> Remove the GT-level debugfs.
this was already under CONFIG_DRM_XE_DEBUG_SRIOV for early bring-up only,
so if it's now hard to keep it exposed at the GT level, then
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 47 ---------------------
> 1 file changed, 47 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
> index c026a3910e7e3..c2b27dab13aa8 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
> @@ -320,9 +320,6 @@ static const struct {
> { "stop", xe_gt_sriov_pf_control_stop_vf },
> { "pause", xe_gt_sriov_pf_control_pause_vf },
> { "resume", xe_gt_sriov_pf_control_resume_vf },
> -#ifdef CONFIG_DRM_XE_DEBUG_SRIOV
> - { "restore!", xe_gt_sriov_pf_migration_restore_guc_state },
> -#endif
> };
>
> static ssize_t control_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
> @@ -386,47 +383,6 @@ static const struct file_operations control_ops = {
> .llseek = default_llseek,
> };
>
> -/*
> - * /sys/kernel/debug/dri/BDF/
> - * ├── sriov
> - * : ├── vf1
> - * : ├── tile0
> - * : ├── gt0
> - * : ├── guc_state
> - */
> -
> -static ssize_t guc_state_read(struct file *file, char __user *buf,
> - size_t count, loff_t *pos)
> -{
> - struct dentry *dent = file_dentry(file);
> - struct dentry *parent = dent->d_parent;
> - struct xe_gt *gt = extract_gt(parent);
> - unsigned int vfid = extract_vfid(parent);
> -
> - return xe_gt_sriov_pf_migration_read_guc_state(gt, vfid, buf, count, pos);
> -}
> -
> -static ssize_t guc_state_write(struct file *file, const char __user *buf,
> - size_t count, loff_t *pos)
> -{
> - struct dentry *dent = file_dentry(file);
> - struct dentry *parent = dent->d_parent;
> - struct xe_gt *gt = extract_gt(parent);
> - unsigned int vfid = extract_vfid(parent);
> -
> - if (*pos)
> - return -EINVAL;
> -
> - return xe_gt_sriov_pf_migration_write_guc_state(gt, vfid, buf, count);
> -}
> -
> -static const struct file_operations guc_state_ops = {
> - .owner = THIS_MODULE,
> - .read = guc_state_read,
> - .write = guc_state_write,
> - .llseek = default_llseek,
> -};
> -
> /*
> * /sys/kernel/debug/dri/BDF/
> * ├── sriov
> @@ -561,9 +517,6 @@ static void pf_populate_gt(struct xe_gt *gt, struct dentry *dent, unsigned int v
>
> /* for testing/debugging purposes only! */
> if (IS_ENABLED(CONFIG_DRM_XE_DEBUG)) {
> - debugfs_create_file("guc_state",
> - IS_ENABLED(CONFIG_DRM_XE_DEBUG_SRIOV) ? 0600 : 0400,
> - dent, NULL, &guc_state_ops);
> debugfs_create_file("config_blob",
> IS_ENABLED(CONFIG_DRM_XE_DEBUG_SRIOV) ? 0600 : 0400,
> dent, NULL, &config_blob_ops);
* Re: [PATCH 14/26] drm/xe/pf: Don't save GuC VF migration data on pause
2025-10-11 19:38 ` [PATCH 14/26] drm/xe/pf: Don't save GuC VF migration data on pause Michał Winiarski
@ 2025-10-13 11:42 ` Michal Wajdeczko
0 siblings, 0 replies; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-13 11:42 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> In upcoming changes, the GuC VF migration data will be handled as part
> of separate SAVE/RESTORE states in VF control state machine.
> Remove it from PAUSE state.
still waiting for the full SAVE/RESTORE state diagrams,
but that split makes sense, as this extra SAVE_GUC step was for early debugging only, so:
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 39 +------------------
> .../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 2 -
> 2 files changed, 2 insertions(+), 39 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> index 092d3d710bca1..6ece775b2e80e 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> @@ -183,7 +183,6 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
> CASE2STR(PAUSE_SEND_PAUSE);
> CASE2STR(PAUSE_WAIT_GUC);
> CASE2STR(PAUSE_GUC_DONE);
> - CASE2STR(PAUSE_SAVE_GUC);
> CASE2STR(PAUSE_FAILED);
> CASE2STR(PAUSED);
> CASE2STR(MIGRATION_DATA_WIP);
> @@ -451,8 +450,7 @@ static void pf_enter_vf_ready(struct xe_gt *gt, unsigned int vfid)
> * : PAUSE_GUC_DONE o-----restart
> * : | :
> * : | o---<--busy :
> - * : v / / :
> - * : PAUSE_SAVE_GUC :
> + * : / :
> * : / :
> * : / :
> * :....o..............o...............o...........:
> @@ -472,7 +470,6 @@ static void pf_exit_vf_pause_wip(struct xe_gt *gt, unsigned int vfid)
> pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_SEND_PAUSE);
> pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_WAIT_GUC);
> pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_GUC_DONE);
> - pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC);
> }
> }
>
> @@ -503,41 +500,12 @@ static void pf_enter_vf_pause_rejected(struct xe_gt *gt, unsigned int vfid)
> pf_enter_vf_pause_failed(gt, vfid);
> }
>
> -static void pf_enter_vf_pause_save_guc(struct xe_gt *gt, unsigned int vfid)
> -{
> - if (!pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC))
> - pf_enter_vf_state_machine_bug(gt, vfid);
> -}
> -
> -static bool pf_exit_vf_pause_save_guc(struct xe_gt *gt, unsigned int vfid)
> -{
> - int err;
> -
> - if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC))
> - return false;
> -
> - err = xe_gt_sriov_pf_migration_save_guc_state(gt, vfid);
> - if (err) {
> - /* retry if busy */
> - if (err == -EBUSY) {
> - pf_enter_vf_pause_save_guc(gt, vfid);
> - return true;
> - }
> - /* give up on error */
> - if (err == -EIO)
> - pf_enter_vf_mismatch(gt, vfid);
> - }
> -
> - pf_enter_vf_pause_completed(gt, vfid);
> - return true;
> -}
> -
> static bool pf_exit_vf_pause_guc_done(struct xe_gt *gt, unsigned int vfid)
> {
> if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_GUC_DONE))
> return false;
>
> - pf_enter_vf_pause_save_guc(gt, vfid);
> + pf_enter_vf_pause_completed(gt, vfid);
> return true;
> }
>
> @@ -1788,9 +1756,6 @@ static bool pf_process_vf_state_machine(struct xe_gt *gt, unsigned int vfid)
> if (pf_exit_vf_pause_guc_done(gt, vfid))
> return true;
>
> - if (pf_exit_vf_pause_save_guc(gt, vfid))
> - return true;
> -
> if (pf_handle_vf_save_wip(gt, vfid))
> return true;
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> index 02b517533ee8a..68ec9d1fc3daf 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> @@ -28,7 +28,6 @@
> * @XE_GT_SRIOV_STATE_PAUSE_SEND_PAUSE: indicates that the PF is about to send a PAUSE command.
> * @XE_GT_SRIOV_STATE_PAUSE_WAIT_GUC: indicates that the PF awaits for a response from the GuC.
> * @XE_GT_SRIOV_STATE_PAUSE_GUC_DONE: indicates that the PF has received a response from the GuC.
> - * @XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC: indicates that the PF needs to save the VF GuC state.
> * @XE_GT_SRIOV_STATE_PAUSE_FAILED: indicates that a VF pause operation has failed.
> * @XE_GT_SRIOV_STATE_PAUSED: indicates that the VF is paused.
> * @XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP: indicates that the new data is expected in migration ring.
> @@ -66,7 +65,6 @@ enum xe_gt_sriov_control_bits {
> XE_GT_SRIOV_STATE_PAUSE_SEND_PAUSE,
> XE_GT_SRIOV_STATE_PAUSE_WAIT_GUC,
> XE_GT_SRIOV_STATE_PAUSE_GUC_DONE,
> - XE_GT_SRIOV_STATE_PAUSE_SAVE_GUC,
> XE_GT_SRIOV_STATE_PAUSE_FAILED,
> XE_GT_SRIOV_STATE_PAUSED,
>
* Re: [PATCH 16/26] drm/xe/pf: Handle GuC migration data as part of PF control
2025-10-11 19:38 ` [PATCH 16/26] drm/xe/pf: Handle GuC migration data as part of PF control Michał Winiarski
@ 2025-10-13 11:56 ` Michal Wajdeczko
2025-10-21 0:52 ` Michał Winiarski
0 siblings, 1 reply; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-13 11:56 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> Connect the helpers to allow save and restore of GuC migration data in
> stop_copy / resume device state.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 28 ++++++++++++++++++-
> .../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 1 +
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 8 ++++++
> 3 files changed, 36 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> index 6ece775b2e80e..f73a3bf40037c 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> @@ -187,6 +187,7 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
> CASE2STR(PAUSED);
> CASE2STR(MIGRATION_DATA_WIP);
> CASE2STR(SAVE_WIP);
> + CASE2STR(SAVE_DATA_GUC);
> CASE2STR(SAVE_FAILED);
> CASE2STR(SAVED);
> CASE2STR(RESTORE_WIP);
> @@ -338,6 +339,7 @@ static void pf_exit_vf_mismatch(struct xe_gt *gt, unsigned int vfid)
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOP_FAILED);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_FAILED);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUME_FAILED);
> + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_FLR_FAILED);
> }
>
> @@ -801,6 +803,7 @@ void xe_gt_sriov_pf_control_vf_data_eof(struct xe_gt *gt, unsigned int vfid)
>
> static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> {
> + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
> }
>
> @@ -820,16 +823,35 @@ static void pf_exit_vf_saved(struct xe_gt *gt, unsigned int vfid)
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED);
> }
>
> +static void pf_enter_vf_save_failed(struct xe_gt *gt, unsigned int vfid)
> +{
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
> + pf_exit_vf_wip(gt, vfid);
> +}
> +
> static bool pf_handle_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> {
> + int ret;
> +
> if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP))
> return false;
>
> + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC)) {
> + ret = xe_gt_sriov_pf_migration_guc_save(gt, vfid);
> + if (ret)
> + goto err;
> + return true;
> + }
> +
> xe_gt_sriov_pf_control_vf_data_eof(gt, vfid);
> pf_exit_vf_save_wip(gt, vfid);
> pf_enter_vf_saved(gt, vfid);
>
> return true;
> +
> +err:
> + pf_enter_vf_save_failed(gt, vfid);
> + return false;
return true - as this is an indication that the state was processed (not whether it was successful)
> }
>
> static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> @@ -838,6 +860,8 @@ static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP);
> pf_exit_vf_restored(gt, vfid);
> pf_enter_vf_wip(gt, vfid);
> + if (xe_gt_sriov_pf_migration_guc_size(gt, vfid) > 0)
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
> pf_queue_vf(gt, vfid);
> return true;
> }
> @@ -946,6 +970,8 @@ static int pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid,
> struct xe_sriov_pf_migration_data *data)
> {
> switch (data->type) {
> + case XE_SRIOV_MIG_DATA_GUC:
> + return xe_gt_sriov_pf_migration_guc_restore(gt, vfid, data);
> default:
> xe_gt_sriov_notice(gt, "Skipping VF%u invalid data type: %d\n", vfid, data->type);
> pf_enter_vf_restore_failed(gt, vfid);
> @@ -996,7 +1022,7 @@ static bool pf_enter_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP);
> pf_exit_vf_saved(gt, vfid);
> pf_enter_vf_wip(gt, vfid);
> - pf_enter_vf_restored(gt, vfid);
> + pf_queue_vf(gt, vfid);
> return true;
> }
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> index 68ec9d1fc3daf..b9787c425d9f6 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> @@ -71,6 +71,7 @@ enum xe_gt_sriov_control_bits {
> XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP,
>
> XE_GT_SRIOV_STATE_SAVE_WIP,
> + XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
> XE_GT_SRIOV_STATE_SAVE_FAILED,
> XE_GT_SRIOV_STATE_SAVED,
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> index e1031465e65c4..0c10284f0b09a 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> @@ -279,9 +279,17 @@ int xe_gt_sriov_pf_migration_guc_restore(struct xe_gt *gt, unsigned int vfid,
> ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
> {
> ssize_t total = 0;
> + ssize_t size;
>
> xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
>
> + size = xe_gt_sriov_pf_migration_guc_size(gt, vfid);
> + if (size < 0)
> + return size;
> + else if (size > 0)
no need for "else"
and isn't a zero GuC state size an error anyway?
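i.e. a sketch without the "else" (leaving the zero-size question open):

	size = xe_gt_sriov_pf_migration_guc_size(gt, vfid);
	if (size < 0)
		return size;
	if (size > 0)
		size += sizeof(struct xe_sriov_pf_migration_hdr);
	total += size;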
> + size += sizeof(struct xe_sriov_pf_migration_hdr);
> + total += size;
> +
> return total;
> }
>
* Re: [PATCH 17/26] drm/xe/pf: Add helpers for VF GGTT migration data handling
2025-10-11 19:38 ` [PATCH 17/26] drm/xe/pf: Add helpers for VF GGTT migration data handling Michał Winiarski
@ 2025-10-13 12:17 ` Michal Wajdeczko
2025-10-21 1:00 ` Michał Winiarski
0 siblings, 1 reply; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-13 12:17 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> In an upcoming change, the VF GGTT migration data will be handled as
> part of VF control state machine. Add the necessary helpers to allow the
> migration data transfer to/from the HW GGTT resource.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_ggtt.c | 92 ++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_ggtt.h | 2 +
> drivers/gpu/drm/xe/xe_ggtt_types.h | 2 +
> drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 64 +++++++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h | 5 ++
> 5 files changed, 165 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
> index aca7ae5489b91..89c0ad56c6a8a 100644
> --- a/drivers/gpu/drm/xe/xe_ggtt.c
> +++ b/drivers/gpu/drm/xe/xe_ggtt.c
> @@ -138,6 +138,14 @@ static void xe_ggtt_set_pte_and_flush(struct xe_ggtt *ggtt, u64 addr, u64 pte)
> ggtt_update_access_counter(ggtt);
> }
>
> +static u64 xe_ggtt_get_pte(struct xe_ggtt *ggtt, u64 addr)
> +{
> + xe_tile_assert(ggtt->tile, !(addr & XE_PTE_MASK));
> + xe_tile_assert(ggtt->tile, addr < ggtt->size);
> +
> + return readq(&ggtt->gsm[addr >> XE_PTE_SHIFT]);
> +}
> +
> static void xe_ggtt_clear(struct xe_ggtt *ggtt, u64 start, u64 size)
> {
> u16 pat_index = tile_to_xe(ggtt->tile)->pat.idx[XE_CACHE_WB];
> @@ -220,16 +228,19 @@ void xe_ggtt_might_lock(struct xe_ggtt *ggtt)
> static const struct xe_ggtt_pt_ops xelp_pt_ops = {
> .pte_encode_flags = xelp_ggtt_pte_flags,
> .ggtt_set_pte = xe_ggtt_set_pte,
> + .ggtt_get_pte = xe_ggtt_get_pte,
> };
>
> static const struct xe_ggtt_pt_ops xelpg_pt_ops = {
> .pte_encode_flags = xelpg_ggtt_pte_flags,
> .ggtt_set_pte = xe_ggtt_set_pte,
> + .ggtt_get_pte = xe_ggtt_get_pte,
> };
>
> static const struct xe_ggtt_pt_ops xelpg_pt_wa_ops = {
> .pte_encode_flags = xelpg_ggtt_pte_flags,
> .ggtt_set_pte = xe_ggtt_set_pte_and_flush,
> + .ggtt_get_pte = xe_ggtt_get_pte,
> };
>
> static void __xe_ggtt_init_early(struct xe_ggtt *ggtt, u32 reserved)
> @@ -914,6 +925,87 @@ void xe_ggtt_assign(const struct xe_ggtt_node *node, u16 vfid)
> xe_ggtt_assign_locked(node->ggtt, &node->base, vfid);
> mutex_unlock(&node->ggtt->lock);
> }
> +
> +/**
> + * xe_ggtt_node_save - Save a &struct xe_ggtt_node to a buffer
> + * @node: the &struct xe_ggtt_node to be saved
> + * @dst: destination buffer
correct me if I'm wrong: this is the buffer for the PTEs
> + * @size: destination buffer size in bytes
and this is the size of the above buffer
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_ggtt_node_save(struct xe_ggtt_node *node, void *dst, size_t size)
> +{
> + struct xe_ggtt *ggtt;
> + u64 start, end;
> + u64 *buf = dst;
> +
> + if (!node || !node->ggtt)
> + return -ENOENT;
hmm, non-NULL node must be initialized by xe_ggtt_node_init() which sets the .ggtt so this second check is redundant
> +
> + mutex_lock(&node->ggtt->lock);
guard(mutex)(&node->ggtt->lock);
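with guard() the early-exit paths below won't need an explicit mutex_unlock(), e.g.:

	guard(mutex)(&node->ggtt->lock);

	if (node->base.size < size)	/* or whatever check survives review */
		return -EINVAL;		/* lock is dropped automatically */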
> +
> + ggtt = node->ggtt;
> + start = node->base.start;
> + end = start + node->base.size - 1;
> +
> + if (node->base.size < size) {
so that looks wrong - we are about to save the 64-bit PTEs of that node,
so we should compare the size of all PTEs, not the size of the address space allocated by this node
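e.g. something like (sketch, with @size being the PTE buffer size in bytes):

	if (size < node->base.size / XE_PAGE_SIZE * sizeof(u64))
		return -EINVAL;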
> + mutex_unlock(&node->ggtt->lock);
> + return -EINVAL;
> + }
> +
> + while (start < end) {
> + *buf++ = ggtt->pt_ops->ggtt_get_pte(ggtt, start) & ~GGTT_PTE_VFID;
> + start += XE_PAGE_SIZE;
> + }
> +
> + mutex_unlock(&node->ggtt->lock);
> +
> + return 0;
> +}
> +
> +/**
> + * xe_ggtt_node_load - Load a &struct xe_ggtt_node from a buffer
> + * @node: the &struct xe_ggtt_node to be loaded
> + * @src: source buffer
> + * @size: source buffer size in bytes
> + * @vfid: VF identifier
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_ggtt_node_load(struct xe_ggtt_node *node, const void *src, size_t size, u16 vfid)
> +{
> + struct xe_ggtt *ggtt;
> + u64 start, end;
> + const u64 *buf = src;
> + u64 vfid_pte = xe_encode_vfid_pte(vfid);
try to define vars in reverse xmas tree order
> +
> + if (!node || !node->ggtt)
> + return -ENOENT;
> +
> + mutex_lock(&node->ggtt->lock);
use guard(mutex)
> +
> + ggtt = node->ggtt;
> + start = node->base.start;
> + end = start + size - 1;
> +
> + if (node->base.size != size) {
> + mutex_unlock(&node->ggtt->lock);
> + return -EINVAL;
> + }
> +
> + while (start < end) {
> + ggtt->pt_ops->ggtt_set_pte(ggtt, start, (*buf & ~GGTT_PTE_VFID) | vfid_pte);
> + start += XE_PAGE_SIZE;
> + buf++;
> + }
> + xe_ggtt_invalidate(ggtt);
> +
> + mutex_unlock(&node->ggtt->lock);
> +
> + return 0;
> +}
> +
> #endif
>
> /**
> diff --git a/drivers/gpu/drm/xe/xe_ggtt.h b/drivers/gpu/drm/xe/xe_ggtt.h
> index 75fc7a1efea76..469b3a6ca14b4 100644
> --- a/drivers/gpu/drm/xe/xe_ggtt.h
> +++ b/drivers/gpu/drm/xe/xe_ggtt.h
> @@ -43,6 +43,8 @@ u64 xe_ggtt_print_holes(struct xe_ggtt *ggtt, u64 alignment, struct drm_printer
>
> #ifdef CONFIG_PCI_IOV
> void xe_ggtt_assign(const struct xe_ggtt_node *node, u16 vfid);
> +int xe_ggtt_node_save(struct xe_ggtt_node *node, void *dst, size_t size);
> +int xe_ggtt_node_load(struct xe_ggtt_node *node, const void *src, size_t size, u16 vfid);
> #endif
>
> #ifndef CONFIG_LOCKDEP
> diff --git a/drivers/gpu/drm/xe/xe_ggtt_types.h b/drivers/gpu/drm/xe/xe_ggtt_types.h
> index c5e999d58ff2a..dacd796f81844 100644
> --- a/drivers/gpu/drm/xe/xe_ggtt_types.h
> +++ b/drivers/gpu/drm/xe/xe_ggtt_types.h
> @@ -78,6 +78,8 @@ struct xe_ggtt_pt_ops {
> u64 (*pte_encode_flags)(struct xe_bo *bo, u16 pat_index);
> /** @ggtt_set_pte: Directly write into GGTT's PTE */
> void (*ggtt_set_pte)(struct xe_ggtt *ggtt, u64 addr, u64 pte);
> + /** @ggtt_get_pte: Directly read from GGTT's PTE */
> + u64 (*ggtt_get_pte)(struct xe_ggtt *ggtt, u64 addr);
> };
>
> #endif
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> index b2e5c52978e6a..51027921b2988 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> @@ -726,6 +726,70 @@ int xe_gt_sriov_pf_config_set_fair_ggtt(struct xe_gt *gt, unsigned int vfid,
> return xe_gt_sriov_pf_config_bulk_set_ggtt(gt, vfid, num_vfs, fair);
> }
>
> +/**
> + * xe_gt_sriov_pf_config_ggtt_save - Save a VF provisioned GGTT data into a buffer.
> + * @gt: the &struct xe_gt
> + * @vfid: VF identifier
> + * @buf: the GGTT data destination buffer
> + * @size: the size of the buffer
> + *
> + * This function can only be called on PF.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_config_ggtt_save(struct xe_gt *gt, unsigned int vfid,
> + void *buf, size_t size)
> +{
> + struct xe_gt_sriov_config *config;
> + ssize_t ret;
int
> +
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> + xe_gt_assert(gt, vfid);
> + xe_gt_assert(gt, !(!buf ^ !size));
there seems to be no "query" option for this call, so both buf & size must be valid
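so the asserts could simply be:

	xe_gt_assert(gt, buf);
	xe_gt_assert(gt, size);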
> +
> + mutex_lock(xe_gt_sriov_pf_master_mutex(gt));
> + config = pf_pick_vf_config(gt, vfid);
> + size = size / sizeof(u64) * XE_PAGE_SIZE;
?? something is wrong here - why do we have to change the size of the buf?
> +
> + ret = xe_ggtt_node_save(config->ggtt_region, buf, size);
> +
> + mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
> +
> + return ret;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_config_ggtt_restore - Restore a VF provisioned GGTT data from a buffer.
> + * @gt: the &struct xe_gt
> + * @vfid: VF identifier
> + * @buf: the GGTT data source buffer
> + * @size: the size of the buffer
> + *
> + * This function can only be called on PF.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_config_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
> + const void *buf, size_t size)
> +{
> + struct xe_gt_sriov_config *config;
> + ssize_t ret;
> +
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> + xe_gt_assert(gt, vfid);
> + xe_gt_assert(gt, !(!buf ^ !size));
> +
> + mutex_lock(xe_gt_sriov_pf_master_mutex(gt));
> + config = pf_pick_vf_config(gt, vfid);
> + size = size / sizeof(u64) * XE_PAGE_SIZE;
> +
> + ret = xe_ggtt_node_load(config->ggtt_region, buf, size, vfid);
> +
> + mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
> +
> + return ret;
> +}
ditto
> +
> static u32 pf_get_min_spare_ctxs(struct xe_gt *gt)
> {
> /* XXX: preliminary */
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> index 513e6512a575b..6916b8f58ebf2 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> @@ -61,6 +61,11 @@ ssize_t xe_gt_sriov_pf_config_save(struct xe_gt *gt, unsigned int vfid, void *bu
> int xe_gt_sriov_pf_config_restore(struct xe_gt *gt, unsigned int vfid,
> const void *buf, size_t size);
>
> +int xe_gt_sriov_pf_config_ggtt_save(struct xe_gt *gt, unsigned int vfid,
> + void *buf, size_t size);
> +int xe_gt_sriov_pf_config_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
> + const void *buf, size_t size);
> +
> bool xe_gt_sriov_pf_config_is_empty(struct xe_gt *gt, unsigned int vfid);
>
> int xe_gt_sriov_pf_config_init(struct xe_gt *gt);
* Re: [PATCH 18/26] drm/xe/pf: Handle GGTT migration data as part of PF control
2025-10-11 19:38 ` [PATCH 18/26] drm/xe/pf: Handle GGTT migration data as part of PF control Michał Winiarski
@ 2025-10-13 12:36 ` Michal Wajdeczko
2025-10-21 1:16 ` Michał Winiarski
0 siblings, 1 reply; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-13 12:36 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> Connect the helpers to allow save and restore of GGTT migration data in
> stop_copy / resume device state.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 13 ++
> .../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 1 +
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 119 ++++++++++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 4 +
> 4 files changed, 137 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> index f73a3bf40037c..a74f6feca4830 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> @@ -188,6 +188,7 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
> CASE2STR(MIGRATION_DATA_WIP);
> CASE2STR(SAVE_WIP);
> CASE2STR(SAVE_DATA_GUC);
> + CASE2STR(SAVE_DATA_GGTT);
> CASE2STR(SAVE_FAILED);
> CASE2STR(SAVED);
> CASE2STR(RESTORE_WIP);
> @@ -803,6 +804,7 @@ void xe_gt_sriov_pf_control_vf_data_eof(struct xe_gt *gt, unsigned int vfid)
>
> static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> {
> + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT);
> pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
> pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
> }
> @@ -843,6 +845,13 @@ static bool pf_handle_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> return true;
> }
>
> + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT)) {
> + ret = xe_gt_sriov_pf_migration_ggtt_save(gt, vfid);
> + if (ret)
> + goto err;
> + return true;
> + }
> +
> xe_gt_sriov_pf_control_vf_data_eof(gt, vfid);
> pf_exit_vf_save_wip(gt, vfid);
> pf_enter_vf_saved(gt, vfid);
> @@ -862,6 +871,8 @@ static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> pf_enter_vf_wip(gt, vfid);
> if (xe_gt_sriov_pf_migration_guc_size(gt, vfid) > 0)
> pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
> + if (xe_gt_sriov_pf_migration_ggtt_size(gt, vfid) > 0)
> + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT);
> pf_queue_vf(gt, vfid);
> return true;
> }
> @@ -970,6 +981,8 @@ static int pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid,
> struct xe_sriov_pf_migration_data *data)
> {
> switch (data->type) {
> + case XE_SRIOV_MIG_DATA_GGTT:
> + return xe_gt_sriov_pf_migration_ggtt_restore(gt, vfid, data);
> case XE_SRIOV_MIG_DATA_GUC:
> return xe_gt_sriov_pf_migration_guc_restore(gt, vfid, data);
> default:
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> index b9787c425d9f6..c94ff0258306a 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> @@ -72,6 +72,7 @@ enum xe_gt_sriov_control_bits {
>
> XE_GT_SRIOV_STATE_SAVE_WIP,
> XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
> + XE_GT_SRIOV_STATE_SAVE_DATA_GGTT,
> XE_GT_SRIOV_STATE_SAVE_FAILED,
> XE_GT_SRIOV_STATE_SAVED,
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> index 0c10284f0b09a..92ecf47e71bc7 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> @@ -7,6 +7,7 @@
>
> #include "abi/guc_actions_sriov_abi.h"
> #include "xe_bo.h"
> +#include "xe_gt_sriov_pf_config.h"
> #include "xe_gt_sriov_pf_control.h"
> #include "xe_gt_sriov_pf_helpers.h"
> #include "xe_gt_sriov_pf_migration.h"
> @@ -37,6 +38,117 @@ static void pf_dump_mig_data(struct xe_gt *gt, unsigned int vfid,
> }
> }
>
> +static int pf_save_vf_ggtt_mig_data(struct xe_gt *gt, unsigned int vfid)
> +{
> + struct xe_sriov_pf_migration_data *data;
> + size_t size;
> + int ret;
> +
> + size = xe_gt_sriov_pf_config_get_ggtt(gt, vfid);
> + if (size == 0)
> + return 0;
> + size = size / XE_PAGE_SIZE * sizeof(u64);
maybe it would be better to avoid reusing the var and have two:
u64 alloc_size = xe_gt_sriov_pf_config_get_ggtt(...);
u64 pte_size = xe_ggtt_pte_size(alloc_size);
> +
> + data = xe_sriov_pf_migration_data_alloc(gt_to_xe(gt));
> + if (!data)
> + return -ENOMEM;
> +
> + ret = xe_sriov_pf_migration_data_init(data, gt->tile->id, gt->info.id,
> + XE_SRIOV_MIG_DATA_GGTT, 0, size);
> + if (ret)
> + goto fail;
> +
> + ret = xe_gt_sriov_pf_config_ggtt_save(gt, vfid, data->vaddr, size);
> + if (ret)
> + goto fail;
> +
> + pf_dump_mig_data(gt, vfid, data);
> +
> + ret = xe_gt_sriov_pf_migration_ring_produce(gt, vfid, data);
> + if (ret)
> + goto fail;
> +
> + return 0;
> +
> +fail:
> + xe_sriov_pf_migration_data_free(data);
> + xe_gt_sriov_err(gt, "Unable to save VF%u GGTT data (%d)\n", vfid, ret);
use %pe for errors
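i.e.:

	xe_gt_sriov_err(gt, "Unable to save VF%u GGTT data (%pe)\n",
			vfid, ERR_PTR(ret));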
> + return ret;
> +}
> +
> +static int pf_restore_vf_ggtt_mig_data(struct xe_gt *gt, unsigned int vfid,
> + struct xe_sriov_pf_migration_data *data)
> +{
> + size_t size;
> + int ret;
> +
> + size = xe_gt_sriov_pf_config_get_ggtt(gt, vfid) / XE_PAGE_SIZE * sizeof(u64);
> + if (size != data->hdr.size)
> + return -EINVAL;
do we need this?
there seems to be similar check in xe_ggtt_node_load() called by restore() below
> +
> + pf_dump_mig_data(gt, vfid, data);
> +
> + ret = xe_gt_sriov_pf_config_ggtt_restore(gt, vfid, data->vaddr, size);
> + if (ret)
> + return ret;
> +
> + return 0;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_migration_ggtt_size() - Get the size of VF GGTT migration data.
> + * @gt: the &struct xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: size in bytes or a negative error code on failure.
> + */
> +ssize_t xe_gt_sriov_pf_migration_ggtt_size(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (gt != xe_root_mmio_gt(gt_to_xe(gt)))
> + return 0;
> +
> + return xe_gt_sriov_pf_config_get_ggtt(gt, vfid) / XE_PAGE_SIZE * sizeof(u64);
this conversion logic should be done by an xe_ggtt layer helper
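e.g. a hypothetical helper on the xe_ggtt side (name is just a suggestion):

	/* bytes of PTE data needed to cover @alloc_size bytes of GGTT space */
	static inline u64 xe_ggtt_pte_size(u64 alloc_size)
	{
		return alloc_size / XE_PAGE_SIZE * sizeof(u64);
	}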
> +}
> +
> +/**
> + * xe_gt_sriov_pf_migration_ggtt_save() - Save VF GGTT migration data.
> + * @gt: the &struct xe_gt
> + * @vfid: the VF identifier
since there is an assert, you should probably also say: "(can't be 0)"
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_migration_ggtt_save(struct xe_gt *gt, unsigned int vfid)
> +{
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> + xe_gt_assert(gt, vfid != PFID);
> + xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
> +
> + return pf_save_vf_ggtt_mig_data(gt, vfid);
> +}
> +
> +/**
> + * xe_gt_sriov_pf_migration_ggtt_restore() - Restore VF GGTT migration data.
> + * @gt: the &struct xe_gt
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_migration_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
> + struct xe_sriov_pf_migration_data *data)
> +{
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> + xe_gt_assert(gt, vfid != PFID);
> + xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
> +
> + return pf_restore_vf_ggtt_mig_data(gt, vfid, data);
> +}
> +
> /* Return: number of dwords saved/restored/required or a negative error code on failure */
> static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
> u64 addr, u32 ndwords)
> @@ -290,6 +402,13 @@ ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
> size += sizeof(struct xe_sriov_pf_migration_hdr);
> total += size;
>
> + size = xe_gt_sriov_pf_migration_ggtt_size(gt, vfid);
> + if (size < 0)
> + return size;
> + else if (size > 0)
> + size += sizeof(struct xe_sriov_pf_migration_hdr);
> + total += size;
> +
> return total;
> }
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> index 5df64449232bc..5bb8cba2ea0cb 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> @@ -16,6 +16,10 @@ ssize_t xe_gt_sriov_pf_migration_guc_size(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_migration_guc_save(struct xe_gt *gt, unsigned int vfid);
> int xe_gt_sriov_pf_migration_guc_restore(struct xe_gt *gt, unsigned int vfid,
> struct xe_sriov_pf_migration_data *data);
> +ssize_t xe_gt_sriov_pf_migration_ggtt_size(struct xe_gt *gt, unsigned int vfid);
> +int xe_gt_sriov_pf_migration_ggtt_save(struct xe_gt *gt, unsigned int vfid);
> +int xe_gt_sriov_pf_migration_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
> + struct xe_sriov_pf_migration_data *data);
>
> ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
>
* Re: [PATCH 19/26] drm/xe/pf: Add helpers for VF MMIO migration data handling
2025-10-11 19:38 ` [PATCH 19/26] drm/xe/pf: Add helpers for VF MMIO migration data handling Michał Winiarski
@ 2025-10-13 13:28 ` Michal Wajdeczko
0 siblings, 0 replies; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-13 13:28 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> In an upcoming change, the VF MMIO migration data will be handled as
> part of VF control state machine. Add the necessary helpers to allow the
> migration data transfer to/from the VF MMIO registers.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 88 +++++++++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf.h | 19 +++++++
> 2 files changed, 107 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> index c4dda87b47cc8..6ceb9e024e41e 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> @@ -194,6 +194,94 @@ static void pf_clear_vf_scratch_regs(struct xe_gt *gt, unsigned int vfid)
> }
> }
>
> +/**
> + * xe_gt_sriov_pf_mmio_vf_size - Get the size of VF MMIO register data.
> + * @gt: the &struct xe_gt
> + * @vfid: VF identifier
> + *
> + * Return: size in bytes.
> + */
> +size_t xe_gt_sriov_pf_mmio_vf_size(struct xe_gt *gt, unsigned int vfid)
> +{
> + if (xe_gt_is_media_type(gt))
> + return MED_VF_SW_FLAG_COUNT * sizeof(u32);
> + else
> + return VF_SW_FLAG_COUNT * sizeof(u32);
> +}
> +
> +/**
> + * xe_gt_sriov_pf_mmio_vf_save - Save VF MMIO register values to a buffer.
> + * @gt: the &struct xe_gt
> + * @vfid: VF identifier
> + * @buf: destination buffer
> + * @size: destination buffer size in bytes
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_mmio_vf_save(struct xe_gt *gt, unsigned int vfid, void *buf, size_t size)
> +{
> + u32 stride = pf_get_vf_regs_stride(gt_to_xe(gt));
> + struct xe_reg scratch;
> + u32 *regs = buf;
> + int n, count;
> +
> + if (size != xe_gt_sriov_pf_mmio_vf_size(gt, vfid))
> + return -EINVAL;
> +
> + if (xe_gt_is_media_type(gt)) {
> + count = MED_VF_SW_FLAG_COUNT;
> + for (n = 0; n < count; n++) {
> + scratch = xe_reg_vf_to_pf(MED_VF_SW_FLAG(n), vfid, stride);
> + regs[n] = xe_mmio_read32(&gt->mmio, scratch);
> + }
> + } else {
> + count = VF_SW_FLAG_COUNT;
> + for (n = 0; n < count; n++) {
> + scratch = xe_reg_vf_to_pf(VF_SW_FLAG(n), vfid, stride);
> + regs[n] = xe_mmio_read32(&gt->mmio, scratch);
> + }
> + }
> +
> + return 0;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_mmio_vf_restore - Restore VF MMIO register values from a buffer.
> + * @gt: the &struct xe_gt
> + * @vfid: VF identifier
> + * @buf: source buffer
> + * @size: source buffer size in bytes
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_mmio_vf_restore(struct xe_gt *gt, unsigned int vfid,
> + const void *buf, size_t size)
> +{
> + u32 stride = pf_get_vf_regs_stride(gt_to_xe(gt));
> + const u32 *regs = buf;
> + struct xe_reg scratch;
> + int n, count;
> +
> + if (size != xe_gt_sriov_pf_mmio_vf_size(gt, vfid))
> + return -EINVAL;
> +
> + if (xe_gt_is_media_type(gt)) {
> + count = MED_VF_SW_FLAG_COUNT;
> + for (n = 0; n < count; n++) {
> + scratch = xe_reg_vf_to_pf(MED_VF_SW_FLAG(n), vfid, stride);
> + xe_mmio_write32(&gt->mmio, scratch, regs[n]);
> + }
> + } else {
> + count = VF_SW_FLAG_COUNT;
> + for (n = 0; n < count; n++) {
> + scratch = xe_reg_vf_to_pf(VF_SW_FLAG(n), vfid, stride);
> + xe_mmio_write32(&gt->mmio, scratch, regs[n]);
> + }
> + }
> +
> + return 0;
> +}
> +
> /**
> * xe_gt_sriov_pf_sanitize_hw() - Reset hardware state related to a VF.
> * @gt: the &xe_gt
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
> index e7fde3f9937af..5e5f31d943d89 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
> @@ -6,6 +6,8 @@
> #ifndef _XE_GT_SRIOV_PF_H_
> #define _XE_GT_SRIOV_PF_H_
>
> +#include <linux/types.h>
likely also <linux/errno.h> if you want to keep stubs (but double check if those are really needed)
> +
> struct xe_gt;
>
> #ifdef CONFIG_PCI_IOV
> @@ -16,6 +18,10 @@ void xe_gt_sriov_pf_init_hw(struct xe_gt *gt);
> void xe_gt_sriov_pf_sanitize_hw(struct xe_gt *gt, unsigned int vfid);
> void xe_gt_sriov_pf_stop_prepare(struct xe_gt *gt);
> void xe_gt_sriov_pf_restart(struct xe_gt *gt);
> +size_t xe_gt_sriov_pf_mmio_vf_size(struct xe_gt *gt, unsigned int vfid);
> +int xe_gt_sriov_pf_mmio_vf_save(struct xe_gt *gt, unsigned int vfid, void *buf, size_t size);
> +int xe_gt_sriov_pf_mmio_vf_restore(struct xe_gt *gt, unsigned int vfid,
> + const void *buf, size_t size);
> #else
> static inline int xe_gt_sriov_pf_init_early(struct xe_gt *gt)
> {
> @@ -38,6 +44,19 @@ static inline void xe_gt_sriov_pf_stop_prepare(struct xe_gt *gt)
> static inline void xe_gt_sriov_pf_restart(struct xe_gt *gt)
> {
> }
> +size_t xe_gt_sriov_pf_mmio_vf_size(struct xe_gt *gt, unsigned int vfid)
> +{
> + return 0;
> +}
> +int xe_gt_sriov_pf_mmio_vf_save(struct xe_gt *gt, unsigned int vfid, void *buf, size_t size)
> +{
> + return -ENODEV;
> +}
> +int xe_gt_sriov_pf_mmio_vf_restore(struct xe_gt *gt, unsigned int vfid,
> + const void *buf, size_t size)
> +{
> + return -ENODEV;
> +}
> #endif
>
> #endif
* Re: [PATCH 24/26] drm/xe/pf: Add wait helper for VF FLR
2025-10-11 19:38 ` [PATCH 24/26] drm/xe/pf: Add wait helper for VF FLR Michał Winiarski
@ 2025-10-13 13:49 ` Michal Wajdeczko
0 siblings, 0 replies; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-13 13:49 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> VF FLR requires additional processing done by PF driver.
> Add a helper to be used as part of VF driver .reset_done().
this ".reset_done" part might require some explanation/update
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/xe_sriov_pf_control.c | 24 ++++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_sriov_pf_control.h | 1 +
> 2 files changed, 25 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> index 10e1f18aa8b11..24845644f269e 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> @@ -122,6 +122,30 @@ int xe_sriov_pf_control_reset_vf(struct xe_device *xe, unsigned int vfid)
> return result;
> }
>
> +/**
> + * xe_sriov_pf_control_wait_flr() - Wait for a VF reset (FLR) to complete.
> + * @xe: the &xe_device
> + * @vfid: the VF identifier
> + *
> + * This function is for PF only.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_pf_control_wait_flr(struct xe_device *xe, unsigned int vfid)
> +{
> + struct xe_gt *gt;
> + unsigned int id;
> + int result = 0;
> + int err;
> +
> + for_each_gt(gt, xe, id) {
> + err = xe_gt_sriov_pf_control_wait_flr(gt, vfid);
> + result = result ? -EUCLEAN : err;
> + }
> +
> + return result;
> +}
one might want to call this new wait function from within xe_sriov_pf_control_reset_vf(), which does both trigger and wait
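e.g. a rough sketch (assuming the per-GT trigger step is
xe_gt_sriov_pf_control_trigger_flr(), mirroring how the wait variant
above is structured - just a sketch, not tested):

int xe_sriov_pf_control_reset_vf(struct xe_device *xe, unsigned int vfid)
{
	struct xe_gt *gt;
	unsigned int id;
	int result = 0;
	int err;

	/* trigger FLR on all GTs first, then wait for all of them */
	for_each_gt(gt, xe, id) {
		err = xe_gt_sriov_pf_control_trigger_flr(gt, vfid);
		result = result ? -EUCLEAN : err;
	}

	return result ?: xe_sriov_pf_control_wait_flr(xe, vfid);
}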
but for me it works as is, so with commit message update
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> +
> /**
> * xe_sriov_pf_control_sync_flr() - Synchronize a VF FLR between all GTs.
> * @xe: the &xe_device
> diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_sriov_pf_control.h
> index 512fd21d87c1e..c8ea54768cfaa 100644
> --- a/drivers/gpu/drm/xe/xe_sriov_pf_control.h
> +++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.h
> @@ -12,6 +12,7 @@ int xe_sriov_pf_control_pause_vf(struct xe_device *xe, unsigned int vfid);
> int xe_sriov_pf_control_resume_vf(struct xe_device *xe, unsigned int vfid);
> int xe_sriov_pf_control_stop_vf(struct xe_device *xe, unsigned int vfid);
> int xe_sriov_pf_control_reset_vf(struct xe_device *xe, unsigned int vfid);
> +int xe_sriov_pf_control_wait_flr(struct xe_device *xe, unsigned int vfid);
> int xe_sriov_pf_control_sync_flr(struct xe_device *xe, unsigned int vfid);
> int xe_sriov_pf_control_save_vf(struct xe_device *xe, unsigned int vfid);
> int xe_sriov_pf_control_wait_save_vf(struct xe_device *xe, unsigned int vfid);
* Re: [PATCH 25/26] drm/xe/pf: Export helpers for VFIO
2025-10-11 19:38 ` [PATCH 25/26] drm/xe/pf: Export helpers for VFIO Michał Winiarski
2025-10-12 18:32 ` Matthew Brost
@ 2025-10-13 14:02 ` Michal Wajdeczko
2025-10-21 1:49 ` Michał Winiarski
1 sibling, 1 reply; 82+ messages in thread
From: Michal Wajdeczko @ 2025-10-13 14:02 UTC (permalink / raw)
To: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Jason Gunthorpe,
Yishai Hadas, Kevin Tian, Shameer Kolothum, intel-xe,
linux-kernel, kvm
Cc: dri-devel, Matthew Brost, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Lukasz Laguna
On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> Vendor-specific VFIO driver for Xe will implement VF migration.
> Export everything that's needed for migration ops.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> drivers/gpu/drm/xe/Makefile | 2 +
> drivers/gpu/drm/xe/xe_sriov_vfio.c | 252 +++++++++++++++++++++++++++++
> include/drm/intel/xe_sriov_vfio.h | 28 ++++
> 3 files changed, 282 insertions(+)
> create mode 100644 drivers/gpu/drm/xe/xe_sriov_vfio.c
> create mode 100644 include/drm/intel/xe_sriov_vfio.h
>
> diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> index e253d65366de4..a5c5afff42aa6 100644
> --- a/drivers/gpu/drm/xe/Makefile
> +++ b/drivers/gpu/drm/xe/Makefile
> @@ -181,6 +181,8 @@ xe-$(CONFIG_PCI_IOV) += \
> xe_sriov_pf_service.o \
> xe_tile_sriov_pf_debugfs.o
>
> +xe-$(CONFIG_XE_VFIO_PCI) += xe_sriov_vfio.o
> +
> # include helpers for tests even when XE is built-in
> ifdef CONFIG_DRM_XE_KUNIT_TEST
> xe-y += tests/xe_kunit_helpers.o
> diff --git a/drivers/gpu/drm/xe/xe_sriov_vfio.c b/drivers/gpu/drm/xe/xe_sriov_vfio.c
> new file mode 100644
> index 0000000000000..a510d1bde93f0
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_sriov_vfio.c
> @@ -0,0 +1,252 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#include <drm/intel/xe_sriov_vfio.h>
> +
> +#include "xe_pm.h"
> +#include "xe_sriov.h"
> +#include "xe_sriov_pf_control.h"
> +#include "xe_sriov_pf_migration.h"
> +#include "xe_sriov_pf_migration_data.h"
> +
> +/**
> + * xe_sriov_vfio_migration_supported() - Check if migration is supported.
> + * @pdev: PF PCI device
> + *
> + * Return: true if migration is supported, false otherwise.
> + */
> +bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> +
> + if (!IS_SRIOV_PF(xe))
> + return false;
> +
> + return xe_sriov_pf_migration_supported(xe);
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_migration_supported, "xe-vfio-pci");
> +
> +/**
> + * xe_sriov_vfio_wait_flr_done - Wait for VF FLR completion.
> + * @pdev: PF PCI device
> + * @vfid: VF identifier
or
* @pdev: the PF struct &pci_dev device
* @vfid: the VF identifier (can't be 0)
> + *
> + * This function will wait until VF FLR is processed by PF on all tiles (or
> + * until timeout occurs).
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> +
> + if (!IS_SRIOV_PF(xe))
> + return -ENODEV;
you also need to validate:
vfid != PFID
and
vfid <= xe_sriov_pf_get_totalvfs()
this applies to all exported functions below
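something along these lines maybe (just a sketch, the helper name is
made up):

static struct xe_device *xe_sriov_vfio_check(struct pci_dev *pdev, unsigned int vfid)
{
	struct xe_device *xe = pci_get_drvdata(pdev);

	if (!IS_SRIOV_PF(xe))
		return ERR_PTR(-ENODEV);

	/* vfid is 1-based, 0 is the PF itself */
	if (vfid == PFID || vfid > xe_sriov_pf_get_totalvfs(xe))
		return ERR_PTR(-EINVAL);

	return xe;
}

and then every export would start with:

	struct xe_device *xe = xe_sriov_vfio_check(pdev, vfid);

	if (IS_ERR(xe))
		return PTR_ERR(xe);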
> +
> + return xe_sriov_pf_control_wait_flr(xe, vfid);
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_wait_flr_done, "xe-vfio-pci");
> +
> +/**
> + * xe_sriov_vfio_stop - Stop VF.
> + * @pdev: PF PCI device
> + * @vfid: VF identifier
> + *
> + * This function will pause VF on all tiles/GTs.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> + int ret;
> +
> + if (!IS_SRIOV_PF(xe))
> + return -ENODEV;
> +
> + xe_pm_runtime_get(xe);
maybe we should use xe_pm_runtime_get_if_active() to avoid waking the PF if there are no VFs?
when VFs are enabled, xe_pm_runtime_get_if_active() will always return true
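i.e. something like (sketch, assuming xe_pm_runtime_get_if_active()
returns false only when the device is runtime suspended):

	if (!xe_pm_runtime_get_if_active(xe))
		return -ENODEV;

	ret = xe_sriov_pf_control_pause_vf(xe, vfid);
	xe_pm_runtime_put(xe);

	return ret;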
> + ret = xe_sriov_pf_control_pause_vf(xe, vfid);
> + xe_pm_runtime_put(xe);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop, "xe-vfio-pci");
> +
> +/**
> + * xe_sriov_vfio_run - Run VF.
> + * @pdev: PF PCI device
> + * @vfid: VF identifier
> + *
> + * This function will resume VF on all tiles.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> + int ret;
> +
> + if (!IS_SRIOV_PF(xe))
> + return -ENODEV;
> +
> + xe_pm_runtime_get(xe);
> + ret = xe_sriov_pf_control_resume_vf(xe, vfid);
> + xe_pm_runtime_put(xe);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_run, "xe-vfio-pci");
> +
> +/**
> + * xe_sriov_vfio_stop_copy_enter - Copy VF migration data from device (while stopped).
> + * @pdev: PF PCI device
> + * @vfid: VF identifier
> + *
> + * This function will save VF migration data on all tiles.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> + int ret;
> +
> + if (!IS_SRIOV_PF(xe))
> + return -ENODEV;
> +
> + xe_pm_runtime_get(xe);
> + ret = xe_sriov_pf_control_save_vf(xe, vfid);
> + xe_pm_runtime_put(xe);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_enter, "xe-vfio-pci");
> +
> +/**
> + * xe_sriov_vfio_stop_copy_exit - Wait until VF migration data save is done.
> + * @pdev: PF PCI device
> + * @vfid: VF identifier
> + *
> + * This function will wait until VF migration data is saved on all tiles.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_vfio_stop_copy_exit(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> + int ret;
> +
> + if (!IS_SRIOV_PF(xe))
> + return -ENODEV;
> +
> + xe_pm_runtime_get(xe);
> + ret = xe_sriov_pf_control_wait_save_vf(xe, vfid);
> + xe_pm_runtime_put(xe);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_exit, "xe-vfio-pci");
> +
> +/**
> + * xe_sriov_vfio_resume_enter - Copy VF migration data to device (while stopped).
> + * @pdev: PF PCI device
> + * @vfid: VF identifier
> + *
> + * This function will restore VF migration data on all tiles.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_vfio_resume_enter(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> + int ret;
> +
> + if (!IS_SRIOV_PF(xe))
> + return -ENODEV;
> +
> + xe_pm_runtime_get(xe);
> + ret = xe_sriov_pf_control_restore_vf(xe, vfid);
> + xe_pm_runtime_put(xe);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_resume_enter, "xe-vfio-pci");
> +
> +/**
> + * xe_sriov_vfio_resume_exit - Wait until VF migration data is copied to the device.
> + * @pdev: PF PCI device
> + * @vfid: VF identifier
> + *
> + * This function will wait until VF migration data is restored on all tiles.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_vfio_resume_exit(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> + int ret;
> +
> + if (!IS_SRIOV_PF(xe))
> + return -ENODEV;
> +
> + xe_pm_runtime_get(xe);
> + ret = xe_sriov_pf_control_wait_restore_vf(xe, vfid);
> + xe_pm_runtime_put(xe);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_resume_exit, "xe-vfio-pci");
> +
> +/**
> + * xe_sriov_vfio_error - Move VF to error state.
> + * @pdev: PF PCI device
> + * @vfid: VF identifier
> + *
> + * This function will stop VF on all tiles.
> + * Reset is needed to move it out of error state.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_sriov_vfio_error(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> + int ret;
> +
> + if (!IS_SRIOV_PF(xe))
> + return -ENODEV;
> +
> + xe_pm_runtime_get(xe);
> + ret = xe_sriov_pf_control_stop_vf(xe, vfid);
> + xe_pm_runtime_put(xe);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_error, "xe-vfio-pci");
> +
add kernel-doc
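e.g. something along these lines (just a sketch):

/**
 * xe_sriov_vfio_data_read() - Read migration data from the device.
 * @pdev: the PF struct &pci_dev
 * @vfid: the VF identifier (can't be 0)
 * @buf: start address of the userspace buffer
 * @len: requested read size
 *
 * This function is for PF only.
 *
 * Return: number of bytes read, 0 if no more migration data is
 *         available, or a negative error code on failure.
 */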
> +ssize_t xe_sriov_vfio_data_read(struct pci_dev *pdev, unsigned int vfid,
> + char __user *buf, size_t len)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
missing param validation:
 - is PF
 - is valid VFID
and no RPM?
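i.e. roughly (reusing the hypothetical xe_sriov_vfio_check() helper
sketched earlier):

ssize_t xe_sriov_vfio_data_read(struct pci_dev *pdev, unsigned int vfid,
				char __user *buf, size_t len)
{
	struct xe_device *xe = xe_sriov_vfio_check(pdev, vfid);
	ssize_t ret;

	if (IS_ERR(xe))
		return PTR_ERR(xe);

	xe_pm_runtime_get(xe);
	ret = xe_sriov_pf_migration_data_read(xe, vfid, buf, len);
	xe_pm_runtime_put(xe);

	return ret;
}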
> +
> + return xe_sriov_pf_migration_data_read(xe, vfid, buf, len);
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_data_read, "xe-vfio-pci");
> +
> +ssize_t xe_sriov_vfio_data_write(struct pci_dev *pdev, unsigned int vfid,
> + const char __user *buf, size_t len)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> +
> + return xe_sriov_pf_migration_data_write(xe, vfid, buf, len);
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_data_write, "xe-vfio-pci");
> +
> +ssize_t xe_sriov_vfio_stop_copy_size(struct pci_dev *pdev, unsigned int vfid)
> +{
> + struct xe_device *xe = pci_get_drvdata(pdev);
> +
> + return xe_sriov_pf_migration_size(xe, vfid);
> +}
> +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_size, "xe-vfio-pci");
> diff --git a/include/drm/intel/xe_sriov_vfio.h b/include/drm/intel/xe_sriov_vfio.h
> new file mode 100644
> index 0000000000000..24e272f84c0e6
> --- /dev/null
> +++ b/include/drm/intel/xe_sriov_vfio.h
> @@ -0,0 +1,28 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_SRIOV_VFIO_H_
> +#define _XE_SRIOV_VFIO_H_
> +
> +#include <linux/types.h>
> +
> +struct pci_dev;
> +
> +bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev);
> +int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_stop_copy_exit(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_resume_enter(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_resume_exit(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_error(struct pci_dev *pdev, unsigned int vfid);
> +ssize_t xe_sriov_vfio_data_read(struct pci_dev *pdev, unsigned int vfid,
> + char __user *buf, size_t len);
> +ssize_t xe_sriov_vfio_data_write(struct pci_dev *pdev, unsigned int vfid,
> + const char __user *buf, size_t len);
> +ssize_t xe_sriov_vfio_stop_copy_size(struct pci_dev *pdev, unsigned int vfid);
> +
> +#endif /* _XE_SRIOV_VFIO_H_ */
this is a very simple header, no need to repeat include guard name here
* Re: [PATCH 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-11 19:38 ` [PATCH 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics Michał Winiarski
@ 2025-10-13 19:00 ` Rodrigo Vivi
2025-10-21 23:03 ` Jason Gunthorpe
1 sibling, 0 replies; 82+ messages in thread
From: Rodrigo Vivi @ 2025-10-13 19:00 UTC (permalink / raw)
To: Michał Winiarski, Lucas De Marchi, Thomas Hellström
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Jason Gunthorpe, Yishai Hadas, Kevin Tian, Shameer Kolothum,
intel-xe, linux-kernel, kvm, dri-devel, Matthew Brost,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Sat, Oct 11, 2025 at 09:38:47PM +0200, Michał Winiarski wrote:
> In addition to generic VFIO PCI functionality, the driver implements
> VFIO migration uAPI, allowing userspace to enable migration for Intel
> Graphics SR-IOV Virtual Functions.
> The driver binds to VF device, and uses API exposed by Xe driver bound
> to PF device to control VF device state and transfer the migration data.
>
> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> ---
> MAINTAINERS | 7 +
> drivers/vfio/pci/Kconfig | 2 +
> drivers/vfio/pci/Makefile | 2 +
> drivers/vfio/pci/xe/Kconfig | 12 +
> drivers/vfio/pci/xe/Makefile | 3 +
> drivers/vfio/pci/xe/main.c | 470 +++++++++++++++++++++++++++++++++++
> 6 files changed, 496 insertions(+)
> create mode 100644 drivers/vfio/pci/xe/Kconfig
> create mode 100644 drivers/vfio/pci/xe/Makefile
> create mode 100644 drivers/vfio/pci/xe/main.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index d46e9f2aaf2ad..ce84b021e6679 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -26567,6 +26567,13 @@ L: virtualization@lists.linux.dev
> S: Maintained
> F: drivers/vfio/pci/virtio
>
> +VFIO XE PCI DRIVER
> +M: Michał Winiarski <michal.winiarski@intel.com>
> +L: kvm@vger.kernel.org
> +L: intel-xe@lists.freedesktop.org
> +S: Supported
> +F: drivers/vfio/pci/xe
Just to confirm:
the patches flow towards drm-xe-next, right? Or do you plan to send
future changes that don't necessarily depend on xe through some kvm
tree?
Either way,
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> +
> VGA_SWITCHEROO
> R: Lukas Wunner <lukas@wunner.de>
> S: Maintained
> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
> index 2b0172f546652..c100f0ab87f2d 100644
> --- a/drivers/vfio/pci/Kconfig
> +++ b/drivers/vfio/pci/Kconfig
> @@ -67,4 +67,6 @@ source "drivers/vfio/pci/nvgrace-gpu/Kconfig"
>
> source "drivers/vfio/pci/qat/Kconfig"
>
> +source "drivers/vfio/pci/xe/Kconfig"
> +
> endmenu
> diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
> index cf00c0a7e55c8..f5d46aa9347b9 100644
> --- a/drivers/vfio/pci/Makefile
> +++ b/drivers/vfio/pci/Makefile
> @@ -19,3 +19,5 @@ obj-$(CONFIG_VIRTIO_VFIO_PCI) += virtio/
> obj-$(CONFIG_NVGRACE_GPU_VFIO_PCI) += nvgrace-gpu/
>
> obj-$(CONFIG_QAT_VFIO_PCI) += qat/
> +
> +obj-$(CONFIG_XE_VFIO_PCI) += xe/
> diff --git a/drivers/vfio/pci/xe/Kconfig b/drivers/vfio/pci/xe/Kconfig
> new file mode 100644
> index 0000000000000..787be88268685
> --- /dev/null
> +++ b/drivers/vfio/pci/xe/Kconfig
> @@ -0,0 +1,12 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +config XE_VFIO_PCI
> + tristate "VFIO support for Intel Graphics"
> + depends on DRM_XE
> + select VFIO_PCI_CORE
> + help
> + This option enables vendor-specific VFIO driver for Intel Graphics.
> + In addition to generic VFIO PCI functionality, it implements VFIO
> + migration uAPI allowing userspace to enable migration for
> + Intel Graphics SR-IOV Virtual Functions supported by the Xe driver.
> +
> + If you don't know what to do here, say N.
> diff --git a/drivers/vfio/pci/xe/Makefile b/drivers/vfio/pci/xe/Makefile
> new file mode 100644
> index 0000000000000..13aa0fd192cd4
> --- /dev/null
> +++ b/drivers/vfio/pci/xe/Makefile
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +obj-$(CONFIG_XE_VFIO_PCI) += xe-vfio-pci.o
> +xe-vfio-pci-y := main.o
> diff --git a/drivers/vfio/pci/xe/main.c b/drivers/vfio/pci/xe/main.c
> new file mode 100644
> index 0000000000000..b9109b6812eb2
> --- /dev/null
> +++ b/drivers/vfio/pci/xe/main.c
> @@ -0,0 +1,470 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#include <linux/anon_inodes.h>
> +#include <linux/delay.h>
> +#include <linux/file.h>
> +#include <linux/module.h>
> +#include <linux/pci.h>
> +#include <linux/sizes.h>
> +#include <linux/types.h>
> +#include <linux/vfio.h>
> +#include <linux/vfio_pci_core.h>
> +
> +#include <drm/intel/xe_sriov_vfio.h>
> +
> +/**
> + * struct xe_vfio_pci_migration_file - file used for reading / writing migration data
> + */
> +struct xe_vfio_pci_migration_file {
> + /** @filp: pointer to underlying &struct file */
> + struct file *filp;
> + /** @lock: serializes accesses to migration data */
> + struct mutex lock;
> + /** @xe_vdev: backpointer to &struct xe_vfio_pci_core_device */
> + struct xe_vfio_pci_core_device *xe_vdev;
> +};
> +
> +/**
> + * struct xe_vfio_pci_core_device - xe-specific vfio_pci_core_device
> + *
> + * Top level structure of xe_vfio_pci.
> + */
> +struct xe_vfio_pci_core_device {
> + /** @core_device: vendor-agnostic VFIO device */
> + struct vfio_pci_core_device core_device;
> +
> + /** @mig_state: current device migration state */
> + enum vfio_device_mig_state mig_state;
> +
> + /** @vfid: VF number used by PF, xe uses 1-based indexing for vfid */
> + unsigned int vfid;
> +
> + /** @pf: pointer to driver_private of physical function */
> + struct pci_dev *pf;
> +
> + /** @fd: &struct xe_vfio_pci_migration_file for userspace to read/write migration data */
> + struct xe_vfio_pci_migration_file *fd;
> +};
> +
> +#define xe_vdev_to_dev(xe_vdev) (&(xe_vdev)->core_device.pdev->dev)
> +#define xe_vdev_to_pdev(xe_vdev) ((xe_vdev)->core_device.pdev)
> +
> +static void xe_vfio_pci_disable_file(struct xe_vfio_pci_migration_file *migf)
> +{
> + struct xe_vfio_pci_core_device *xe_vdev = migf->xe_vdev;
> +
> + mutex_lock(&migf->lock);
> + xe_vdev->fd = NULL;
> + mutex_unlock(&migf->lock);
> +}
> +
> +static void xe_vfio_pci_reset(struct xe_vfio_pci_core_device *xe_vdev)
> +{
> + if (xe_vdev->fd)
> + xe_vfio_pci_disable_file(xe_vdev->fd);
> +
> + xe_vdev->mig_state = VFIO_DEVICE_STATE_RUNNING;
> +}
> +
> +static void xe_vfio_pci_reset_done(struct pci_dev *pdev)
> +{
> + struct xe_vfio_pci_core_device *xe_vdev = pci_get_drvdata(pdev);
> + int ret;
> +
> + ret = xe_sriov_vfio_wait_flr_done(xe_vdev->pf, xe_vdev->vfid);
> + if (ret)
> + dev_err(&pdev->dev, "Failed to wait for FLR: %d\n", ret);
> +
> + xe_vfio_pci_reset(xe_vdev);
> +}
> +
> +static const struct pci_error_handlers xe_vfio_pci_err_handlers = {
> + .reset_done = xe_vfio_pci_reset_done,
> +};
> +
> +static int xe_vfio_pci_open_device(struct vfio_device *core_vdev)
> +{
> + struct xe_vfio_pci_core_device *xe_vdev =
> + container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
> + struct vfio_pci_core_device *vdev = &xe_vdev->core_device;
> + int ret;
> +
> + ret = vfio_pci_core_enable(vdev);
> + if (ret)
> + return ret;
> +
> + vfio_pci_core_finish_enable(vdev);
> +
> + return 0;
> +}
> +
> +static int xe_vfio_pci_release_file(struct inode *inode, struct file *filp)
> +{
> + struct xe_vfio_pci_migration_file *migf = filp->private_data;
> +
> + xe_vfio_pci_disable_file(migf);
> + mutex_destroy(&migf->lock);
> + kfree(migf);
> +
> + return 0;
> +}
> +
> +static ssize_t xe_vfio_pci_save_read(struct file *filp, char __user *buf, size_t len, loff_t *pos)
> +{
> + struct xe_vfio_pci_migration_file *migf = filp->private_data;
> + ssize_t ret;
> +
> + if (pos)
> + return -ESPIPE;
> +
> + mutex_lock(&migf->lock);
> + ret = xe_sriov_vfio_data_read(migf->xe_vdev->pf, migf->xe_vdev->vfid, buf, len);
> + mutex_unlock(&migf->lock);
> +
> + return ret;
> +}
> +
> +static const struct file_operations xe_vfio_pci_save_fops = {
> + .owner = THIS_MODULE,
> + .read = xe_vfio_pci_save_read,
> + .release = xe_vfio_pci_release_file,
> + .llseek = noop_llseek,
> +};
> +
> +static ssize_t xe_vfio_pci_resume_write(struct file *filp, const char __user *buf,
> + size_t len, loff_t *pos)
> +{
> + struct xe_vfio_pci_migration_file *migf = filp->private_data;
> + ssize_t ret;
> +
> + if (pos)
> + return -ESPIPE;
> +
> + mutex_lock(&migf->lock);
> + ret = xe_sriov_vfio_data_write(migf->xe_vdev->pf, migf->xe_vdev->vfid, buf, len);
> + mutex_unlock(&migf->lock);
> +
> + return ret;
> +}
> +
> +static const struct file_operations xe_vfio_pci_resume_fops = {
> + .owner = THIS_MODULE,
> + .write = xe_vfio_pci_resume_write,
> + .release = xe_vfio_pci_release_file,
> + .llseek = noop_llseek,
> +};
> +
> +static const char *vfio_dev_state_str(u32 state)
> +{
> + switch (state) {
> + case VFIO_DEVICE_STATE_RUNNING: return "running";
> + case VFIO_DEVICE_STATE_RUNNING_P2P: return "running_p2p";
> + case VFIO_DEVICE_STATE_STOP_COPY: return "stopcopy";
> + case VFIO_DEVICE_STATE_STOP: return "stop";
> + case VFIO_DEVICE_STATE_RESUMING: return "resuming";
> + case VFIO_DEVICE_STATE_ERROR: return "error";
> + default: return "";
> + }
> +}
> +
> +enum xe_vfio_pci_file_type {
> + XE_VFIO_FILE_SAVE = 0,
> + XE_VFIO_FILE_RESUME,
> +};
> +
> +static struct xe_vfio_pci_migration_file *
> +xe_vfio_pci_alloc_file(struct xe_vfio_pci_core_device *xe_vdev,
> + enum xe_vfio_pci_file_type type)
> +{
> + struct xe_vfio_pci_migration_file *migf;
> + const struct file_operations *fops;
> + int flags;
> +
> + migf = kzalloc(sizeof(*migf), GFP_KERNEL);
> + if (!migf)
> + return ERR_PTR(-ENOMEM);
> +
> + fops = type == XE_VFIO_FILE_SAVE ? &xe_vfio_pci_save_fops : &xe_vfio_pci_resume_fops;
> + flags = type == XE_VFIO_FILE_SAVE ? O_RDONLY : O_WRONLY;
> + migf->filp = anon_inode_getfile("xe_vfio_mig", fops, migf, flags);
> + if (IS_ERR(migf->filp)) {
> + struct file *filp = migf->filp;
> +
> + kfree(migf);
> + return ERR_CAST(filp);
> + }
> +
> + mutex_init(&migf->lock);
> + migf->xe_vdev = xe_vdev;
> + xe_vdev->fd = migf;
> +
> + stream_open(migf->filp->f_inode, migf->filp);
> +
> + return migf;
> +}
> +
> +static struct file *
> +xe_vfio_set_state(struct xe_vfio_pci_core_device *xe_vdev, u32 new)
> +{
> + u32 cur = xe_vdev->mig_state;
> + int ret;
> +
> + dev_dbg(xe_vdev_to_dev(xe_vdev),
> + "state: %s->%s\n", vfio_dev_state_str(cur), vfio_dev_state_str(new));
> +
> + /*
> + * "STOP" handling is reused for "RUNNING_P2P", as the device doesn't have the capability to
> + * selectively block p2p DMA transfers.
> + * The device is not processing new workload requests when the VF is stopped, and both
> + * memory and MMIO communication channels are transferred to destination (where processing
> + * will be resumed).
> + */
> + if ((cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_STOP) ||
> + (cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_RUNNING_P2P)) {
> + ret = xe_sriov_vfio_stop(xe_vdev->pf, xe_vdev->vfid);
> + if (ret)
> + goto err;
> +
> + return NULL;
> + }
> +
> + if ((cur == VFIO_DEVICE_STATE_RUNNING_P2P && new == VFIO_DEVICE_STATE_STOP) ||
> + (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RUNNING_P2P))
> + return NULL;
> +
> + if ((cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RUNNING) ||
> + (cur == VFIO_DEVICE_STATE_RUNNING_P2P && new == VFIO_DEVICE_STATE_RUNNING)) {
> + ret = xe_sriov_vfio_run(xe_vdev->pf, xe_vdev->vfid);
> + if (ret)
> + goto err;
> +
> + return NULL;
> + }
> +
> + if (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_STOP_COPY) {
> + struct xe_vfio_pci_migration_file *migf;
> +
> + migf = xe_vfio_pci_alloc_file(xe_vdev, XE_VFIO_FILE_SAVE);
> + if (IS_ERR(migf)) {
> + ret = PTR_ERR(migf);
> + goto err;
> + }
> +
> + ret = xe_sriov_vfio_stop_copy_enter(xe_vdev->pf, xe_vdev->vfid);
> + if (ret) {
> + fput(migf->filp);
> + goto err;
> + }
> +
> + return migf->filp;
> + }
> +
> + if ((cur == VFIO_DEVICE_STATE_STOP_COPY && new == VFIO_DEVICE_STATE_STOP)) {
> + if (xe_vdev->fd)
> + xe_vfio_pci_disable_file(xe_vdev->fd);
> +
> + xe_sriov_vfio_stop_copy_exit(xe_vdev->pf, xe_vdev->vfid);
> +
> + return NULL;
> + }
> +
> + if (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RESUMING) {
> + struct xe_vfio_pci_migration_file *migf;
> +
> + migf = xe_vfio_pci_alloc_file(xe_vdev, XE_VFIO_FILE_RESUME);
> + if (IS_ERR(migf)) {
> + ret = PTR_ERR(migf);
> + goto err;
> + }
> +
> + ret = xe_sriov_vfio_resume_enter(xe_vdev->pf, xe_vdev->vfid);
> + if (ret) {
> + fput(migf->filp);
> + goto err;
> + }
> +
> + return migf->filp;
> + }
> +
> + if (cur == VFIO_DEVICE_STATE_RESUMING && new == VFIO_DEVICE_STATE_STOP) {
> + if (xe_vdev->fd)
> + xe_vfio_pci_disable_file(xe_vdev->fd);
> +
> + xe_sriov_vfio_resume_exit(xe_vdev->pf, xe_vdev->vfid);
> +
> + return NULL;
> + }
> +
> + if (new == VFIO_DEVICE_STATE_ERROR)
> + xe_sriov_vfio_error(xe_vdev->pf, xe_vdev->vfid);
> +
> + WARN(true, "Unknown state transition %d->%d", cur, new);
> + return ERR_PTR(-EINVAL);
> +
> +err:
> + dev_dbg(xe_vdev_to_dev(xe_vdev),
> + "Failed to transition state: %s->%s err=%d\n",
> + vfio_dev_state_str(cur), vfio_dev_state_str(new), ret);
> + return ERR_PTR(ret);
> +}
> +
> +static struct file *
> +xe_vfio_pci_set_device_state(struct vfio_device *core_vdev,
> + enum vfio_device_mig_state new_state)
> +{
> + struct xe_vfio_pci_core_device *xe_vdev =
> + container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
> + enum vfio_device_mig_state next_state;
> + struct file *f = NULL;
> + int ret;
> +
> + while (new_state != xe_vdev->mig_state) {
> + ret = vfio_mig_get_next_state(core_vdev, xe_vdev->mig_state,
> + new_state, &next_state);
> + if (ret) {
> + f = ERR_PTR(ret);
> + break;
> + }
> + f = xe_vfio_set_state(xe_vdev, next_state);
> + if (IS_ERR(f))
> + break;
> +
> + xe_vdev->mig_state = next_state;
> +
> + /* Multiple state transitions with non-NULL file in the middle */
> + if (f && new_state != xe_vdev->mig_state) {
> + fput(f);
> + f = ERR_PTR(-EINVAL);
> + break;
> + }
> + }
> +
> + return f;
> +}
> +
> +static int xe_vfio_pci_get_device_state(struct vfio_device *core_vdev,
> + enum vfio_device_mig_state *curr_state)
> +{
> + struct xe_vfio_pci_core_device *xe_vdev =
> + container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
> +
> + *curr_state = xe_vdev->mig_state;
> +
> + return 0;
> +}
> +
> +static int xe_vfio_pci_get_data_size(struct vfio_device *vdev,
> + unsigned long *stop_copy_length)
> +{
> + struct xe_vfio_pci_core_device *xe_vdev =
> + container_of(vdev, struct xe_vfio_pci_core_device, core_device.vdev);
> +
> + *stop_copy_length = xe_sriov_vfio_stop_copy_size(xe_vdev->pf, xe_vdev->vfid);
> +
> + return 0;
> +}
> +
> +static const struct vfio_migration_ops xe_vfio_pci_migration_ops = {
> + .migration_set_state = xe_vfio_pci_set_device_state,
> + .migration_get_state = xe_vfio_pci_get_device_state,
> + .migration_get_data_size = xe_vfio_pci_get_data_size,
> +};
> +
> +static void xe_vfio_pci_migration_init(struct vfio_device *core_vdev)
> +{
> + struct xe_vfio_pci_core_device *xe_vdev =
> + container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
> + struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
> +
> + if (!xe_sriov_vfio_migration_supported(pdev->physfn))
> + return;
> +
> + /* vfid starts from 1 for xe */
> + xe_vdev->vfid = pci_iov_vf_id(pdev) + 1;
> + xe_vdev->pf = pdev->physfn;
> +
> + core_vdev->migration_flags = VFIO_MIGRATION_STOP_COPY | VFIO_MIGRATION_P2P;
> + core_vdev->mig_ops = &xe_vfio_pci_migration_ops;
> +}
> +
> +static int xe_vfio_pci_init_dev(struct vfio_device *core_vdev)
> +{
> + struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
> +
> + if (pdev->is_virtfn && strcmp(pdev->physfn->dev.driver->name, "xe") == 0)
> + xe_vfio_pci_migration_init(core_vdev);
> +
> + return vfio_pci_core_init_dev(core_vdev);
> +}
> +
> +static const struct vfio_device_ops xe_vfio_pci_ops = {
> + .name = "xe-vfio-pci",
> + .init = xe_vfio_pci_init_dev,
> + .release = vfio_pci_core_release_dev,
> + .open_device = xe_vfio_pci_open_device,
> + .close_device = vfio_pci_core_close_device,
> + .ioctl = vfio_pci_core_ioctl,
> + .device_feature = vfio_pci_core_ioctl_feature,
> + .read = vfio_pci_core_read,
> + .write = vfio_pci_core_write,
> + .mmap = vfio_pci_core_mmap,
> + .request = vfio_pci_core_request,
> + .match = vfio_pci_core_match,
> + .match_token_uuid = vfio_pci_core_match_token_uuid,
> + .bind_iommufd = vfio_iommufd_physical_bind,
> + .unbind_iommufd = vfio_iommufd_physical_unbind,
> + .attach_ioas = vfio_iommufd_physical_attach_ioas,
> + .detach_ioas = vfio_iommufd_physical_detach_ioas,
> +};
> +
> +static int xe_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> +{
> + struct xe_vfio_pci_core_device *xe_vdev;
> + int ret;
> +
> + xe_vdev = vfio_alloc_device(xe_vfio_pci_core_device, core_device.vdev, &pdev->dev,
> + &xe_vfio_pci_ops);
> + if (IS_ERR(xe_vdev))
> + return PTR_ERR(xe_vdev);
> +
> + dev_set_drvdata(&pdev->dev, &xe_vdev->core_device);
> +
> + ret = vfio_pci_core_register_device(&xe_vdev->core_device);
> + if (ret) {
> + vfio_put_device(&xe_vdev->core_device.vdev);
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +static void xe_vfio_pci_remove(struct pci_dev *pdev)
> +{
> + struct xe_vfio_pci_core_device *xe_vdev = pci_get_drvdata(pdev);
> +
> + vfio_pci_core_unregister_device(&xe_vdev->core_device);
> + vfio_put_device(&xe_vdev->core_device.vdev);
> +}
> +
> +static const struct pci_device_id xe_vfio_pci_table[] = {
> + { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_ANY_ID),
> + .class = PCI_BASE_CLASS_DISPLAY << 8, .class_mask = 0xff << 16,
> + .override_only = PCI_ID_F_VFIO_DRIVER_OVERRIDE },
> + {}
> +};
> +MODULE_DEVICE_TABLE(pci, xe_vfio_pci_table);
> +
> +static struct pci_driver xe_vfio_pci_driver = {
> + .name = "xe-vfio-pci",
> + .id_table = xe_vfio_pci_table,
> + .probe = xe_vfio_pci_probe,
> + .remove = xe_vfio_pci_remove,
> + .err_handler = &xe_vfio_pci_err_handlers,
> + .driver_managed_dma = true,
> +};
> +module_pci_driver(xe_vfio_pci_driver);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Intel Corporation");
> +MODULE_DESCRIPTION("VFIO PCI driver with migration support for Intel Graphics");
> --
> 2.50.1
>
* Re: [PATCH 01/26] drm/xe/pf: Remove GuC version check for migration support
2025-10-12 18:31 ` Michal Wajdeczko
@ 2025-10-20 14:46 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-20 14:46 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Matthew Brost, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Sun, Oct 12, 2025 at 08:31:48PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> > Since commit 4eb0aab6e4434 ("drm/xe/guc: Bump minimum required GuC
> > version to v70.29.2"), the minimum GuC version required by the driver
> > is v70.29.2, which should already include everything that we need for
> > migration.
> > Remove the version check.
> >
> > Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 3 ---
> > 1 file changed, 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > index 44cc612b0a752..a5bf327ef8889 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > @@ -384,9 +384,6 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
> >
> > static bool pf_check_migration_support(struct xe_gt *gt)
> > {
> > - /* GuC 70.25 with save/restore v2 is required */
> > - xe_gt_assert(gt, GUC_FIRMWARE_VER(&gt->uc.guc) >= MAKE_GUC_VER(70, 25, 0));
> > -
>
> alternatively we can move this assert to guc_action_vf_save_restore()
> to double-check that we don't try that on older firmware, but either way,
>
> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Left it as is for now.
Thanks,
-Michał
>
> > /* XXX: for now this is for feature enabling only */
> > return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
> > }
>
* Re: [PATCH 02/26] drm/xe: Move migration support to device-level struct
2025-10-12 18:58 ` Michal Wajdeczko
@ 2025-10-20 14:48 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-20 14:48 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Matthew Brost, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Sun, Oct 12, 2025 at 08:58:42PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> > Upcoming changes will allow users to control VF state and obtain its
> > migration data with a device-level granularity (not tile/gt).
> > Change the data structures to reflect that and move the GT-level
> > migration init to happen after device-level init.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/Makefile | 1 +
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 12 +-----
> > .../drm/xe/xe_gt_sriov_pf_migration_types.h | 3 --
> > drivers/gpu/drm/xe/xe_sriov_pf.c | 5 +++
> > drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 43 +++++++++++++++++++
> > drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 27 ++++++++++++
> > .../gpu/drm/xe/xe_sriov_pf_migration_types.h | 0
> > drivers/gpu/drm/xe/xe_sriov_pf_types.h | 5 +++
> > 8 files changed, 83 insertions(+), 13 deletions(-)
> > create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> > create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> >
> > diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> > index 84321fad32658..71f685a315dca 100644
> > --- a/drivers/gpu/drm/xe/Makefile
> > +++ b/drivers/gpu/drm/xe/Makefile
> > @@ -176,6 +176,7 @@ xe-$(CONFIG_PCI_IOV) += \
> > xe_sriov_pf.o \
> > xe_sriov_pf_control.o \
> > xe_sriov_pf_debugfs.o \
> > + xe_sriov_pf_migration.o \
> > xe_sriov_pf_service.o \
> > xe_tile_sriov_pf_debugfs.o
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > index a5bf327ef8889..ca28f45aaf481 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > @@ -13,6 +13,7 @@
> > #include "xe_guc.h"
> > #include "xe_guc_ct.h"
> > #include "xe_sriov.h"
> > +#include "xe_sriov_pf_migration.h"
> >
> > /* Return: number of dwords saved/restored/required or a negative error code on failure */
> > static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
> > @@ -115,8 +116,7 @@ static int pf_send_guc_restore_vf_state(struct xe_gt *gt, unsigned int vfid,
> >
> > static bool pf_migration_supported(struct xe_gt *gt)
> > {
> > - xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> > - return gt->sriov.pf.migration.supported;
> > + return xe_sriov_pf_migration_supported(gt_to_xe(gt));
> > }
> >
> > static struct mutex *pf_migration_mutex(struct xe_gt *gt)
> > @@ -382,12 +382,6 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
> > }
> > #endif /* CONFIG_DEBUG_FS */
> >
> > -static bool pf_check_migration_support(struct xe_gt *gt)
> > -{
> > - /* XXX: for now this is for feature enabling only */
> > - return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
> > -}
> > -
> > /**
> > * xe_gt_sriov_pf_migration_init() - Initialize support for VF migration.
> > * @gt: the &xe_gt
> > @@ -403,8 +397,6 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
> >
> > xe_gt_assert(gt, IS_SRIOV_PF(xe));
> >
> > - gt->sriov.pf.migration.supported = pf_check_migration_support(gt);
> > -
> > if (!pf_migration_supported(gt))
> > return 0;
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> > index 1f3110b6d44fa..9d672feac5f04 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> > @@ -30,9 +30,6 @@ struct xe_gt_sriov_state_snapshot {
> > * Used by the PF driver to maintain non-VF specific per-GT data.
> > */
> > struct xe_gt_sriov_pf_migration {
> > - /** @supported: indicates whether the feature is supported */
> > - bool supported;
> > -
> > /** @snapshot_lock: protects all VFs snapshots */
> > struct mutex snapshot_lock;
> > };
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf.c b/drivers/gpu/drm/xe/xe_sriov_pf.c
> > index bc1ab9ee31d92..95743c7af8050 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf.c
> > @@ -15,6 +15,7 @@
> > #include "xe_sriov.h"
> > #include "xe_sriov_pf.h"
> > #include "xe_sriov_pf_helpers.h"
> > +#include "xe_sriov_pf_migration.h"
> > #include "xe_sriov_pf_service.h"
> > #include "xe_sriov_printk.h"
> >
> > @@ -101,6 +102,10 @@ int xe_sriov_pf_init_early(struct xe_device *xe)
> > if (err)
> > return err;
> >
> > + err = xe_sriov_pf_migration_init(xe);
> > + if (err)
> > + return err;
> > +
> > xe_sriov_pf_service_init(xe);
> >
> > return 0;
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > new file mode 100644
> > index 0000000000000..cf6a210d5597a
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > @@ -0,0 +1,43 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#include "xe_sriov.h"
> > +#include "xe_sriov_pf_migration.h"
> > +
> > +/**
> > + * xe_sriov_pf_migration_supported() - Check if SR-IOV VF migration is supported by the device
> > + * @xe: the &struct xe_device
>
> nit: this will render better:
>
> @xe: the struct &xe_device
>
> but in other places we just use:
>
> @xe: the &xe_device
Indeed - I'll change it (here and in other instances throughout the series).
>
> > + *
> > + * Return: true if migration is supported, false otherwise
> > + */
> > +bool xe_sriov_pf_migration_supported(struct xe_device *xe)
> > +{
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > +
> > + return xe->sriov.pf.migration.supported;
> > +}
> > +
> > +static bool pf_check_migration_support(struct xe_device *xe)
> > +{
> > + xe_assert(xe, IS_SRIOV_PF(xe));
>
> we don't need this here for now
Ok.
>
> > +
> > + /* XXX: for now this is for feature enabling only */
> > + return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
> > +}
> > +
> > +/**
> > + * xe_sriov_pf_migration_init() - Initialize support for SR-IOV VF migration.
> > + * @xe: the &struct xe_device
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_pf_migration_init(struct xe_device *xe)
> > +{
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > +
> > + xe->sriov.pf.migration.supported = pf_check_migration_support(xe);
> > +
> > + return 0;
> > +}
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> > new file mode 100644
> > index 0000000000000..d3058b6682192
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> > @@ -0,0 +1,27 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#ifndef _XE_SRIOV_PF_MIGRATION_H_
> > +#define _XE_SRIOV_PF_MIGRATION_H_
> > +
> > +#include <linux/types.h>
> > +
> > +struct xe_device;
> > +
> > +#ifdef CONFIG_PCI_IOV
> > +int xe_sriov_pf_migration_init(struct xe_device *xe);
> > +bool xe_sriov_pf_migration_supported(struct xe_device *xe);
> > +#else
> > +static inline int xe_sriov_pf_migration_init(struct xe_device *xe)
> > +{
> > + return 0;
> > +}
> > +static inline bool xe_sriov_pf_migration_supported(struct xe_device *xe)
> > +{
> > + return false;
> > +}
> > +#endif
> > +
> > +#endif
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> > new file mode 100644
> > index 0000000000000..e69de29bb2d1d
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> > index 956a88f9f213d..2d2fcc0a2f258 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> > @@ -32,6 +32,11 @@ struct xe_device_pf {
> > /** @driver_max_vfs: Maximum number of VFs supported by the driver. */
> > u16 driver_max_vfs;
> >
>
> I guess you need to document @migration too to make it work
Ok.
>
> > + struct {
> > + /** @migration.supported: indicates whether VF migration feature is supported */
> > + bool supported;
> > + } migration;
>
> also can you move that closer to other sub-component "service" below ?
Will do.
>
> > +
> > /** @master_lock: protects all VFs configurations across GTs */
> > struct mutex master_lock;
> >
>
> but otherwise LGTM, so with above fixed,
>
> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
>
Thanks,
-Michał
* Re: [PATCH 04/26] drm/xe/pf: Extract migration mutex out of its struct
2025-10-12 19:08 ` Matthew Brost
@ 2025-10-20 14:50 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-20 14:50 UTC (permalink / raw)
To: Matthew Brost
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Sun, Oct 12, 2025 at 12:08:34PM -0700, Matthew Brost wrote:
> On Sat, Oct 11, 2025 at 09:38:25PM +0200, Michał Winiarski wrote:
> > As part of upcoming changes, the struct xe_gt_sriov_pf_migration will be
> > used as a per-VF data structure.
> > The mutex (which is currently the only member of this structure) will
> > have slightly different semantics.
> > Extract the mutex to free up the struct name and simplify the future
> > changes.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 4 ++--
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h | 2 --
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h | 2 +-
> > 3 files changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > index ca28f45aaf481..f8604b172963e 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > @@ -122,7 +122,7 @@ static bool pf_migration_supported(struct xe_gt *gt)
> > static struct mutex *pf_migration_mutex(struct xe_gt *gt)
> > {
> > xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> > - return &gt->sriov.pf.migration.snapshot_lock;
> > + return &gt->sriov.pf.snapshot_lock;
>
> By the end of series this function looks like:
>
> 14 static struct mutex *pf_migration_mutex(struct xe_device *xe, unsigned int vfid)
> 15 {
> 16 xe_assert(xe, IS_SRIOV_PF(xe));
> 17 xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> 18 return &xe->sriov.pf.vfs[vfid].migration.lock;
> 19 }
>
> And...
>
> grep snapshot_lock *.c *.h
> xe_gt_sriov_pf_migration.c: err = drmm_mutex_init(&xe->drm, &gt->sriov.pf.snapshot_lock);
> xe_gt_sriov_pf_types.h: struct mutex snapshot_lock;
>
> So 'snapshot_lock' isn't used at the end of the series. Maybe drop this
> patch, delete the snapshot_lock in the patch which restructures the
> above code / remove the snapshot_lock usage.
>
> Matt
It should actually be removed later in the series, but looks like it got
left over by mistake.
I'll squash it with:
drm/xe/pf: Switch VF migration GuC save/restore to struct migration data
Thanks,
-Michał
>
> > }
> >
> > static struct xe_gt_sriov_state_snapshot *pf_pick_vf_snapshot(struct xe_gt *gt,
> > @@ -400,7 +400,7 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
> > if (!pf_migration_supported(gt))
> > return 0;
> >
> > + err = drmm_mutex_init(&xe->drm, &gt->sriov.pf.snapshot_lock);
> > + err = drmm_mutex_init(&xe->drm, >->sriov.pf.snapshot_lock);
> > if (err)
> > return err;
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> > index 9d672feac5f04..fdc5a31dd8989 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> > @@ -30,8 +30,6 @@ struct xe_gt_sriov_state_snapshot {
> > * Used by the PF driver to maintain non-VF specific per-GT data.
> > */
> > struct xe_gt_sriov_pf_migration {
> > - /** @snapshot_lock: protects all VFs snapshots */
> > - struct mutex snapshot_lock;
> > };
> >
> > #endif
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> > index a64a6835ad656..9a856da379d39 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> > @@ -58,7 +58,7 @@ struct xe_gt_sriov_pf {
> > struct xe_gt_sriov_pf_service service;
> > struct xe_gt_sriov_pf_control control;
> > struct xe_gt_sriov_pf_policy policy;
> > - struct xe_gt_sriov_pf_migration migration;
> > + struct mutex snapshot_lock;
> > struct xe_gt_sriov_spare_config spare;
> > struct xe_gt_sriov_metadata *vfs;
> > };
> > --
> > 2.50.1
> >
* Re: [PATCH 05/26] drm/xe/pf: Add data structures and handlers for migration rings
2025-10-12 21:06 ` Michal Wajdeczko
@ 2025-10-20 14:56 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-20 14:56 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Matthew Brost, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Sun, Oct 12, 2025 at 11:06:59PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> > Migration data is queued in a per-GT ptr_ring to decouple the worker
> > responsible for handling the data transfer from the .read()/.write()
> > syscalls.
>
> ... from the .read() and .write() syscalls.
Ok.
>
> > Add the data structures and handlers that will be used in future
> > commits.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 4 +
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 163 ++++++++++++++++++
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 9 +
> > .../drm/xe/xe_gt_sriov_pf_migration_types.h | 5 +-
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h | 3 +
> > drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 147 ++++++++++++++++
> > drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 20 +++
> > .../gpu/drm/xe/xe_sriov_pf_migration_types.h | 37 ++++
> > drivers/gpu/drm/xe/xe_sriov_pf_types.h | 3 +
> > 9 files changed, 390 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > index 44df984278548..16a88e7599f6d 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > @@ -19,6 +19,7 @@
> > #include "xe_guc_ct.h"
> > #include "xe_sriov.h"
> > #include "xe_sriov_pf_control.h"
> > +#include "xe_sriov_pf_migration.h"
> > #include "xe_sriov_pf_service.h"
> > #include "xe_tile.h"
> >
> > @@ -388,6 +389,8 @@ static bool pf_enter_vf_wip(struct xe_gt *gt, unsigned int vfid)
> >
> > static void pf_exit_vf_wip(struct xe_gt *gt, unsigned int vfid)
> > {
> > + struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
> > +
> > if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_WIP)) {
> > struct xe_gt_sriov_control_state *cs = pf_pick_vf_control(gt, vfid);
>
> we can declare wq here
It actually needs to be removed from here - we can't use WQ from
non-migration related PF control flows.
The state machine will be modified to match your suggestions from
later points in the series.
>
> >
> > @@ -399,6 +402,7 @@ static void pf_exit_vf_wip(struct xe_gt *gt, unsigned int vfid)
> > pf_exit_vf_resume_wip(gt, vfid);
> >
> > complete_all(&cs->done);
> > + wake_up_all(wq);
> > }
> > }
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > index f8604b172963e..af5952f42fff1 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > @@ -7,6 +7,7 @@
> >
> > #include "abi/guc_actions_sriov_abi.h"
> > #include "xe_bo.h"
> > +#include "xe_gt_sriov_pf_control.h"
> > #include "xe_gt_sriov_pf_helpers.h"
> > #include "xe_gt_sriov_pf_migration.h"
> > #include "xe_gt_sriov_printk.h"
> > @@ -15,6 +16,17 @@
> > #include "xe_sriov.h"
> > #include "xe_sriov_pf_migration.h"
> >
> > +#define XE_GT_SRIOV_PF_MIGRATION_RING_TIMEOUT (HZ * 20)
> > +#define XE_GT_SRIOV_PF_MIGRATION_RING_SIZE 5
> > +
> > +static struct xe_gt_sriov_pf_migration *pf_pick_gt_migration(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> > + xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
> > +
> > + return &gt->sriov.pf.vfs[vfid].migration;
> > +}
> > +
> > /* Return: number of dwords saved/restored/required or a negative error code on failure */
> > static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
> > u64 addr, u32 ndwords)
> > @@ -382,6 +394,142 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
> > }
> > #endif /* CONFIG_DEBUG_FS */
> >
> > +/**
> > + * xe_gt_sriov_pf_migration_ring_empty() - Check if a migration ring is empty
> > + * @gt: the &struct xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * Return: true if the ring is empty, otherwise false.
> > + */
> > +bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + return ptr_ring_empty(&pf_pick_gt_migration(gt, vfid)->ring);
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_migration_ring_produce() - Add migration data packet to migration ring
> > + * @gt: the &struct xe_gt
> > + * @vfid: the VF identifier
> > + * @data: &struct xe_sriov_pf_migration_data packet
> > + *
> > + * If the ring is full, wait until there is space in the ring.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_migration_ring_produce(struct xe_gt *gt, unsigned int vfid,
> > + struct xe_sriov_pf_migration_data *data)
> > +{
> > + struct xe_gt_sriov_pf_migration *migration = pf_pick_gt_migration(gt, vfid);
> > + struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
> > + unsigned long timeout = XE_GT_SRIOV_PF_MIGRATION_RING_TIMEOUT;
> > + int ret;
> > +
> > + xe_gt_assert(gt, data->tile == gt->tile->id);
> > + xe_gt_assert(gt, data->gt == gt->info.id);
> > +
> > + while (1) {
> > + ret = ptr_ring_produce(&migration->ring, data);
> > + if (ret == 0) {
>
> if (!ret)
> break;
> > + wake_up_all(wq);
> > + break;
> > + }
> > +
> > + if (!xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid))
> > + return -EINVAL;
> > +
> > + ret = wait_event_interruptible_timeout(*wq,
> > + !ptr_ring_full(&migration->ring),
> > + timeout);
> > + if (ret == 0)
> > + return -ETIMEDOUT;
> > +
> > + timeout = ret;
> > + }
> > +
>
> wake_up_all(wq);
> return 0;
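> i.e. the whole loop would then read roughly (just a sketch; this also
> bails out on -ERESTARTSYS instead of re-waiting with a negative
> timeout):
>
> 	while (1) {
> 		ret = ptr_ring_produce(&migration->ring, data);
> 		if (!ret)
> 			break;
>
> 		if (!xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid))
> 			return -EINVAL;
>
> 		ret = wait_event_interruptible_timeout(*wq,
> 						       !ptr_ring_full(&migration->ring),
> 						       timeout);
> 		if (!ret)
> 			return -ETIMEDOUT;
> 		if (ret < 0)
> 			return ret;
>
> 		timeout = ret;
> 	}
>
> 	wake_up_all(wq);
> 	return 0;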
>
> > + return ret;
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_migration_ring_consume() - Get migration data packet from migration ring
> > + * @gt: the &struct xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * If the ring is empty, wait until there are new migration data packets to process.
> > + *
> > + * Return: Pointer to &struct xe_sriov_pf_migration_data on success,
> > + * ERR_PTR(-ENODATA) if ring is empty and no more migration data is expected,
> > + * ERR_PTR value in case of error.
> > + */
> > +struct xe_sriov_pf_migration_data *
> > +xe_gt_sriov_pf_migration_ring_consume(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + struct xe_gt_sriov_pf_migration *migration = pf_pick_gt_migration(gt, vfid);
> > + struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
> > + unsigned long timeout = XE_GT_SRIOV_PF_MIGRATION_RING_TIMEOUT;
> > + struct xe_sriov_pf_migration_data *data;
> > + int ret;
> > +
> > + while (1) {
> > + data = ptr_ring_consume(&migration->ring);
> > + if (data) {
> > + wake_up_all(wq);
> > + break;
> > + }
> > +
> > + if (!xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid))
> > + return ERR_PTR(-ENODATA);
> > +
> > + ret = wait_event_interruptible_timeout(*wq,
> > + !ptr_ring_empty(&migration->ring) ||
> > + !xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid),
> > + timeout);
> > + if (ret == 0)
> > + return ERR_PTR(-ETIMEDOUT);
> > +
> > + timeout = ret;
> > + }
> > +
> > + return data;
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_migration_ring_consume_nowait() - Get migration data packet from migration ring
> > + * @gt: the &struct xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * Similar to xe_gt_sriov_pf_migration_ring_consume(), but doesn't wait until more data is available.
> > + *
> > + * Return: Pointer to &struct xe_sriov_pf_migration_data on success,
> > + * ERR_PTR(-EAGAIN) if ring is empty but migration data is expected,
> > + * ERR_PTR(-ENODATA) if ring is empty and no more migration data is expected,
> > + * ERR_PTR value in case of error.
> > + */
> > +struct xe_sriov_pf_migration_data *
> > +xe_gt_sriov_pf_migration_ring_consume_nowait(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + struct xe_gt_sriov_pf_migration *migration = pf_pick_gt_migration(gt, vfid);
> > + struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
> > + struct xe_sriov_pf_migration_data *data;
> > +
> > + data = ptr_ring_consume(&migration->ring);
> > + if (data) {
> > + wake_up_all(wq);
> > + return data;
> > + }
> > +
> > + if (!xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid))
> > + return ERR_PTR(-ENODATA);
> > +
> > + return ERR_PTR(-EAGAIN);
> > +}
> > +
> > +static void pf_gt_migration_cleanup(struct drm_device *dev, void *arg)
>
> no need for the "pf" prefix
>
> and if this is only about ptr_ring, then it could be:
>
> static void action_ring_cleanup(...)
> {
> struct ptr_ring *r = arg;
>
> ptr_ring_cleanup(r, NULL);
> }
Ok.
>
> > +{
> > + struct xe_gt_sriov_pf_migration *migration = arg;
> > +
> > + ptr_ring_cleanup(&migration->ring, NULL);
> > +}
> > +
> > /**
> > * xe_gt_sriov_pf_migration_init() - Initialize support for VF migration.
> > * @gt: the &xe_gt
> > @@ -393,6 +541,7 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
> > int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
> > {
> > struct xe_device *xe = gt_to_xe(gt);
> > + unsigned int n, totalvfs;
> > int err;
> >
> > xe_gt_assert(gt, IS_SRIOV_PF(xe));
> > @@ -404,5 +553,19 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt)
> > if (err)
> > return err;
> >
> > + totalvfs = xe_sriov_pf_get_totalvfs(xe);
> > + for (n = 0; n <= totalvfs; n++) {
> > + struct xe_gt_sriov_pf_migration *migration = pf_pick_gt_migration(gt, n);
> > +
> > + err = ptr_ring_init(&migration->ring,
> > + XE_GT_SRIOV_PF_MIGRATION_RING_SIZE, GFP_KERNEL);
> > + if (err)
> > + return err;
> > +
> > + err = drmm_add_action_or_reset(&xe->drm, pf_gt_migration_cleanup, migration);
> > + if (err)
> > + return err;
> > + }
> > +
> > return 0;
> > }
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > index 09faeae00ddbb..1e4dc46413823 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > @@ -9,11 +9,20 @@
> > #include <linux/types.h>
> >
> > struct xe_gt;
> > +struct xe_sriov_pf_migration_data;
> >
> > int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
> > int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
> >
> > +bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid);
> > +int xe_gt_sriov_pf_migration_ring_produce(struct xe_gt *gt, unsigned int vfid,
> > + struct xe_sriov_pf_migration_data *data);
> > +struct xe_sriov_pf_migration_data *
> > +xe_gt_sriov_pf_migration_ring_consume(struct xe_gt *gt, unsigned int vfid);
> > +struct xe_sriov_pf_migration_data *
> > +xe_gt_sriov_pf_migration_ring_consume_nowait(struct xe_gt *gt, unsigned int vfid);
> > +
> > #ifdef CONFIG_DEBUG_FS
> > ssize_t xe_gt_sriov_pf_migration_read_guc_state(struct xe_gt *gt, unsigned int vfid,
> > char __user *buf, size_t count, loff_t *pos);
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> > index fdc5a31dd8989..8434689372082 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration_types.h
> > @@ -7,6 +7,7 @@
> > #define _XE_GT_SRIOV_PF_MIGRATION_TYPES_H_
> >
> > #include <linux/mutex.h>
> > +#include <linux/ptr_ring.h>
> > #include <linux/types.h>
> >
> > /**
> > @@ -27,9 +28,11 @@ struct xe_gt_sriov_state_snapshot {
> > /**
> > * struct xe_gt_sriov_pf_migration - GT-level data.
> > *
> > - * Used by the PF driver to maintain non-VF specific per-GT data.
> > + * Used by the PF driver to maintain per-VF migration data.
>
> we try to match the struct name with the sub-component name, not use it as a per-VF name
>
> if you want to have struct for the per-VF data, pick a different name, maybe:
>
> struct xe_gt_sriov_pf_migration_state
>
> or just reuse one that you plan to remove later:
>
> struct xe_gt_sriov_state_snapshot
I'm trying to avoid using the word "state" anywhere in the migration
data related codepaths, as it's used to denote device state on the VFIO
side (which translates to Xe PF control state).
I'll rename it to xe_gt_sriov_migration_data (the per-VF struct
containing the ring) and xe_sriov_migration_data (the actual data
packet).
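Roughly (a sketch, the final shape may differ):

struct xe_gt_sriov_migration_data {
	/** @ring: queue containing VF save / restore migration data */
	struct ptr_ring ring;
};

with struct xe_sriov_migration_data taking over the role of the packet
currently called xe_sriov_pf_migration_data.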
>
> > */
> > struct xe_gt_sriov_pf_migration {
> > + /** @ring: queue containing VF save / restore migration data */
> > + struct ptr_ring ring;
> > };
> >
> > #endif
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> > index 9a856da379d39..fbb08f8030f7f 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_types.h
> > @@ -33,6 +33,9 @@ struct xe_gt_sriov_metadata {
> >
> > /** @snapshot: snapshot of the VF state data */
> > struct xe_gt_sriov_state_snapshot snapshot;
> > +
> > + /** @migration: */
>
> missing description
Ok.
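Something like (exact wording assumed):

	/** @migration: per-VF migration data */
	struct xe_gt_sriov_pf_migration migration;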
>
> > + struct xe_gt_sriov_pf_migration migration;
> > };
> >
> > /**
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > index cf6a210d5597a..347682f29a03c 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > @@ -4,7 +4,35 @@
> > */
> >
> > #include "xe_sriov.h"
> > +#include <drm/drm_managed.h>
> > +
> > +#include "xe_device.h"
> > +#include "xe_gt_sriov_pf_control.h"
> > +#include "xe_gt_sriov_pf_migration.h"
> > +#include "xe_pm.h"
> > +#include "xe_sriov_pf_helpers.h"
> > #include "xe_sriov_pf_migration.h"
> > +#include "xe_sriov_printk.h"
> > +
> > +static struct xe_sriov_pf_migration *pf_pick_migration(struct xe_device *xe, unsigned int vfid)
> > +{
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> > +
> > + return &xe->sriov.pf.vfs[vfid].migration;
> > +}
> > +
> > +/**
> > + * xe_sriov_pf_migration_waitqueue() - Get waitqueue for migration
> > + * @xe: the &struct xe_device
> > + * @vfid: the VF identifier
> > + *
> > + * Return: pointer to the migration waitqueue.
> > + */
> > +wait_queue_head_t *xe_sriov_pf_migration_waitqueue(struct xe_device *xe, unsigned int vfid)
> > +{
> > + return &pf_pick_migration(xe, vfid)->wq;
> > +}
> >
> > /**
> > * xe_sriov_pf_migration_supported() - Check if SR-IOV VF migration is supported by the device
> > @@ -35,9 +63,128 @@ static bool pf_check_migration_support(struct xe_device *xe)
> > */
> > int xe_sriov_pf_migration_init(struct xe_device *xe)
> > {
> > + unsigned int n, totalvfs;
> > +
> > xe_assert(xe, IS_SRIOV_PF(xe));
> >
> > xe->sriov.pf.migration.supported = pf_check_migration_support(xe);
> > + if (!xe_sriov_pf_migration_supported(xe))
> > + return 0;
> > +
> > + totalvfs = xe_sriov_pf_get_totalvfs(xe);
> > + for (n = 1; n <= totalvfs; n++) {
> > + struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, n);
> > +
> > + init_waitqueue_head(&migration->wq);
> > + }
> >
> > return 0;
> > }
> > +
> > +static bool pf_migration_empty(struct xe_device *xe, unsigned int vfid)
> > +{
> > + struct xe_gt *gt;
> > + u8 gt_id;
> > +
> > + for_each_gt(gt, xe, gt_id) {
> > + if (!xe_gt_sriov_pf_migration_ring_empty(gt, vfid))
> > + return false;
> > + }
> > +
> > + return true;
> > +}
> > +
> > +static struct xe_sriov_pf_migration_data *
> > +pf_migration_consume(struct xe_device *xe, unsigned int vfid)
> > +{
> > + struct xe_sriov_pf_migration_data *data;
> > + struct xe_gt *gt;
> > + u8 gt_id;
> > + bool no_data = true;
> > +
> > + for_each_gt(gt, xe, gt_id) {
> > + data = xe_gt_sriov_pf_migration_ring_consume_nowait(gt, vfid);
> > +
> > + if (!IS_ERR(data))
> > + return data;
> > + else if (PTR_ERR(data) == -EAGAIN)
> > + no_data = false;
> > + }
> > +
> > + if (no_data)
> > + return ERR_PTR(-ENODATA);
> > +
> > + return ERR_PTR(-EAGAIN);
> > +}
> > +
> > +/**
> > + * xe_sriov_pf_migration_consume() - Consume a SR-IOV VF migration data packet from the device
> > + * @xe: the &struct xe_device
> > + * @vfid: the VF identifier
> > + *
> > + * If there is no migration data to process, wait until more data is available.
> > + *
> > + * Return: Pointer to &struct xe_sriov_pf_migration_data on success,
> > + * ERR_PTR(-ENODATA) if ring is empty and no more migration data is expected,
>
> can we use NULL as indication of no data ? then all ERR_PTR will be errors
Will do.
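A sketch of how a caller would then look (assumed shape of the rework):

	data = xe_sriov_pf_migration_consume(xe, vfid);
	if (IS_ERR(data))
		return PTR_ERR(data);	/* a real error */
	if (!data)
		return 0;		/* no more migration data expected */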
>
> > + * ERR_PTR value in case of error.
> > + */
> > +struct xe_sriov_pf_migration_data *
> > +xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid)
> > +{
> > + struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, vfid);
> > + unsigned long timeout = HZ * 5;
> > + struct xe_sriov_pf_migration_data *data;
> > + int ret;
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return ERR_PTR(-ENODEV);
>
> this is "PF" function, we shouldn't get here if we are not a PF
>
> use assert here, and make sure the caller verifies the PF mode
Ok.
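I.e. (assuming all callers are already PF-only paths):

	xe_assert(xe, IS_SRIOV_PF(xe));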
>
> > +
> > + while (1) {
> > + data = pf_migration_consume(xe, vfid);
> > + if (!IS_ERR(data) || PTR_ERR(data) != -EAGAIN)
> > + goto out;
> > +
> > + ret = wait_event_interruptible_timeout(migration->wq,
> > + !pf_migration_empty(xe, vfid),
> > + timeout);
> > + if (ret == 0) {
> > + xe_sriov_warn(xe, "VF%d Timed out waiting for migration data\n", vfid);
> > + return ERR_PTR(-ETIMEDOUT);
> > + }
> > + if (ret < 0)
> > + return ERR_PTR(ret);
> > +
> > + timeout = ret;
> > + }
> > +
> > +out:
> > + return data;
> > +}
> > +
> > +/**
> > + * xe_sriov_pf_migration_produce() - Produce a SR-IOV VF migration data packet for device to process
> > + * @xe: the &struct xe_device
> > + * @vfid: the VF identifier
> > + * @data: VF migration data
> > + *
> > + * If the underlying data structure is full, wait until there is space.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
> > + struct xe_sriov_pf_migration_data *data)
> > +{
> > + struct xe_gt *gt;
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + gt = xe_device_get_gt(xe, data->gt);
> > + if (!gt || data->tile != gt->tile->id) {
> > + xe_sriov_err_ratelimited(xe, "VF%d Unknown GT - tile_id:%d, gt_id:%d\n",
> > + vfid, data->tile, data->gt);
> > + return -EINVAL;
> > + }
> > +
> > + return xe_gt_sriov_pf_migration_ring_produce(gt, vfid, data);
> > +}
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> > index d3058b6682192..f2020ba19c2da 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> > @@ -7,12 +7,18 @@
> > #define _XE_SRIOV_PF_MIGRATION_H_
> >
> > #include <linux/types.h>
> > +#include <linux/wait.h>
> >
> > struct xe_device;
> >
> > #ifdef CONFIG_PCI_IOV
> > int xe_sriov_pf_migration_init(struct xe_device *xe);
> > bool xe_sriov_pf_migration_supported(struct xe_device *xe);
> > +struct xe_sriov_pf_migration_data *
> > +xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid);
> > +int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
> > + struct xe_sriov_pf_migration_data *data);
> > +wait_queue_head_t *xe_sriov_pf_migration_waitqueue(struct xe_device *xe, unsigned int vfid);
> > #else
> > static inline int xe_sriov_pf_migration_init(struct xe_device *xe)
> > {
> > @@ -22,6 +28,20 @@ static inline bool xe_sriov_pf_migration_supported(struct xe_device *xe)
> > {
> > return false;
> > }
> > +static inline struct xe_sriov_pf_migration_data *
> > +xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid)
> > +{
> > + return ERR_PTR(-ENODEV);
> > +}
> > +static inline int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
> > + struct xe_sriov_pf_migration_data *data)
> > +{
> > + return -ENODEV;
> > +}
> > +wait_queue_head_t *xe_sriov_pf_migration_waitqueue(struct xe_device *xe, unsigned int vfid)
> > +{
> > + return ERR_PTR(-ENODEV);
> > +}
>
> didn't fully check, but do we really need all these stubs?
> likely those functions will be called from other real PF-only functions
We don't. I'll remove the stubs.
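The #else branch would then keep only the entry points reachable from
common code, e.g. (exact set assumed):

static inline int xe_sriov_pf_migration_init(struct xe_device *xe)
{
	return 0;
}
static inline bool xe_sriov_pf_migration_supported(struct xe_device *xe)
{
	return false;
}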
>
> > #endif
> >
> > #endif
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> > index e69de29bb2d1d..80fdea32b884a 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> > @@ -0,0 +1,37 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#ifndef _XE_SRIOV_PF_MIGRATION_TYPES_H_
> > +#define _XE_SRIOV_PF_MIGRATION_TYPES_H_
> > +
> > +#include <linux/types.h>
> > +#include <linux/wait.h>
> > +
>
> add kernel-doc
Ok.
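A sketch of the kernel-doc (field descriptions inferred from how the
fields are used in this series):

/**
 * struct xe_sriov_pf_migration_data - a single VF migration data packet.
 * @xe: the &struct xe_device
 * @vaddr: CPU pointer to the payload backing storage
 * @remaining: payload bytes still left to transfer
 * @hdr_remaining: header bytes still left to transfer
 * @bo: payload backing storage used for CCS / VRAM packet types
 * @buff: payload backing storage used for all other packet types
 * @hdr: fixed-layout packet header describing version, type, tile, gt,
 *	 flags, offset and size
 */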
Thanks,
-Michał
>
> > +struct xe_sriov_pf_migration_data {
> > + struct xe_device *xe;
> > + void *vaddr;
> > + size_t remaining;
> > + size_t hdr_remaining;
> > + union {
> > + struct xe_bo *bo;
> > + void *buff;
> > + };
> > + __struct_group(xe_sriov_pf_migration_hdr, hdr, __packed,
> > + u8 version;
> > + u8 type;
> > + u8 tile;
> > + u8 gt;
> > + u32 flags;
> > + u64 offset;
> > + u64 size;
> > + );
> > +};
> > +
> > +struct xe_sriov_pf_migration {
> > + /** @wq: waitqueue used to avoid busy-waiting for snapshot production/consumption */
> > + wait_queue_head_t wq;
> > +};
> > +
> > +#endif
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> > index 2d2fcc0a2f258..b3ae21a5a0490 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_types.h
> > @@ -9,6 +9,7 @@
> > #include <linux/mutex.h>
> > #include <linux/types.h>
> >
> > +#include "xe_sriov_pf_migration_types.h"
> > #include "xe_sriov_pf_service_types.h"
> >
> > /**
> > @@ -17,6 +18,8 @@
> > struct xe_sriov_metadata {
> > /** @version: negotiated VF/PF ABI version */
> > struct xe_sriov_pf_service_version version;
> > + /** @migration: migration data */
> > + struct xe_sriov_pf_migration migration;
> > };
> >
> > /**
>
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 06/26] drm/xe/pf: Add helpers for migration data allocation / free
2025-10-13 10:15 ` Michal Wajdeczko
@ 2025-10-21 0:01 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-21 0:01 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Matthew Brost, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Mon, Oct 13, 2025 at 12:15:55PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> > Now that it's possible to free the packets - connect the restore
> > handling logic with the ring.
> > The helpers will also be used in upcoming changes that will start producing
> > migration data packets.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/Makefile | 1 +
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 48 ++++++-
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 10 +-
> > drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 1 +
> > .../gpu/drm/xe/xe_sriov_pf_migration_data.c | 135 ++++++++++++++++++
> > .../gpu/drm/xe/xe_sriov_pf_migration_data.h | 32 +++++
>
> while this is used by the PF only, maybe those files don't have to include the _pf_ tag (like xe_pci_sriov.c or xe_sriov_vfio.c)?
>
> .../gpu/drm/xe/xe_sriov_migration_data.c | 135 ++++++++++++++++++
> .../gpu/drm/xe/xe_sriov_migration_data.h | 32 +++++
>
> or
>
> .../gpu/drm/xe/xe_sriov_vfio_data.c | 135 ++++++++++++++++++
> .../gpu/drm/xe/xe_sriov_vfio_data.h | 32 +++++
Ok.
>
> > 6 files changed, 224 insertions(+), 3 deletions(-)
> > create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> > create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> >
> > diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> > index 71f685a315dca..e253d65366de4 100644
> > --- a/drivers/gpu/drm/xe/Makefile
> > +++ b/drivers/gpu/drm/xe/Makefile
> > @@ -177,6 +177,7 @@ xe-$(CONFIG_PCI_IOV) += \
> > xe_sriov_pf_control.o \
> > xe_sriov_pf_debugfs.o \
> > xe_sriov_pf_migration.o \
> > + xe_sriov_pf_migration_data.o \
> > xe_sriov_pf_service.o \
> > xe_tile_sriov_pf_debugfs.o
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > index 16a88e7599f6d..04a4e92133c2e 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > @@ -20,6 +20,7 @@
> > #include "xe_sriov.h"
> > #include "xe_sriov_pf_control.h"
> > #include "xe_sriov_pf_migration.h"
> > +#include "xe_sriov_pf_migration_data.h"
> > #include "xe_sriov_pf_service.h"
> > #include "xe_tile.h"
> >
> > @@ -949,14 +950,57 @@ static void pf_exit_vf_restored(struct xe_gt *gt, unsigned int vfid)
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED);
> > }
> >
> > +static void pf_enter_vf_restore_failed(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED);
> > + pf_exit_vf_wip(gt, vfid);
> > +}
> > +
> > +static int pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid,
> > + struct xe_sriov_pf_migration_data *data)
> > +{
> > + switch (data->type) {
> > + default:
> > + xe_gt_sriov_notice(gt, "Skipping VF%u invalid data type: %d\n", vfid, data->type);
> > + pf_enter_vf_restore_failed(gt, vfid);
>
> shouldn't this be done in pf_handle_vf_restore_wip() where all other state transitions are done?
Yes - will do.
>
> > + }
> > +
> > + return -EINVAL;
> > +}
> > +
> > static bool pf_handle_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> > {
> > + struct xe_sriov_pf_migration_data *data;
> > + int ret;
> > +
> > if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
>
> in other places in VF control state machine we use slightly different pattern:
>
> // can we exit state AAA?
> if (!pf_exit_vf_state(AAA))
> return false; // no, we are not in this state
> // try to process other state
>
> // yes, we _were_ in AAA, process this state
> ret = handle_state_aaa();
>
> // now decide where to go next
> if (ret == -EAGAIN)
> pf_enter_vf_state(AAA); // back
> else if (ret < 0)
> pf_enter_vf_state(AAA_FAILED) // failed
> else
> pf_enter_vf_state(AAA_DONE) // next
>
> // state was processed, start next iteration
> return true;
>
It won't follow this exact pattern, but I'll change the state machine to
match it more closely.
> > return false;
> >
> > - pf_exit_vf_restore_wip(gt, vfid);
> > - pf_enter_vf_restored(gt, vfid);
> > + data = xe_gt_sriov_pf_migration_ring_consume(gt, vfid);
> > + if (IS_ERR(data)) {
> > + if (PTR_ERR(data) == -ENODATA &&
> > + !xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid)) {
> > + pf_exit_vf_restore_wip(gt, vfid);
> > + pf_enter_vf_restored(gt, vfid);
> > + } else {
> > + pf_enter_vf_restore_failed(gt, vfid);
> > + }
> > + return false;
>
> this should be 'true' as we completed this state processing
Ok.
> > + }
> > +
> > + xe_gt_assert(gt, gt->info.id == data->gt);
> > + xe_gt_assert(gt, gt->tile->id == data->tile);
> > +
> > + ret = pf_handle_vf_restore_data(gt, vfid, data);
> > + if (ret) {
> > + xe_gt_sriov_err(gt, "VF%u failed to restore data type: %d (%d)\n",
>
> use %pe for error
Ok.
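I.e. (%pe takes an ERR_PTR()-encoded value):

	xe_gt_sriov_err(gt, "VF%u failed to restore data type: %d (%pe)\n",
			vfid, data->type, ERR_PTR(ret));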
>
> > + vfid, data->type, ret);
>
> maybe for debug try to dump here more details about failing data packet
The start of the packet is already hex-dumped at the debug log level.
>
> > + xe_sriov_pf_migration_data_free(data);
> > + pf_enter_vf_restore_failed(gt, vfid);
> > + return false;
> > + }
> >
> > + xe_sriov_pf_migration_data_free(data);
> > return true;
> > }
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > index af5952f42fff1..582aaf062cbd4 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > @@ -15,6 +15,7 @@
> > #include "xe_guc_ct.h"
> > #include "xe_sriov.h"
> > #include "xe_sriov_pf_migration.h"
> > +#include "xe_sriov_pf_migration_data.h"
> >
> > #define XE_GT_SRIOV_PF_MIGRATION_RING_TIMEOUT (HZ * 20)
> > #define XE_GT_SRIOV_PF_MIGRATION_RING_SIZE 5
> > @@ -523,11 +524,18 @@ xe_gt_sriov_pf_migration_ring_consume_nowait(struct xe_gt *gt, unsigned int vfid
> > return ERR_PTR(-EAGAIN);
> > }
> >
> > +static void pf_mig_data_destroy(void *ptr)
> > +{
> > + struct xe_sriov_pf_migration_data *data = ptr;
> > +
> > + xe_sriov_pf_migration_data_free(data);
> > +}
> > +
> > static void pf_gt_migration_cleanup(struct drm_device *dev, void *arg)
> > {
> > struct xe_gt_sriov_pf_migration *migration = arg;
> >
> > - ptr_ring_cleanup(&migration->ring, NULL);
> > + ptr_ring_cleanup(&migration->ring, pf_mig_data_destroy);
> > }
> >
> > /**
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > index 347682f29a03c..d39cee66589b5 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > @@ -12,6 +12,7 @@
> > #include "xe_pm.h"
> > #include "xe_sriov_pf_helpers.h"
> > #include "xe_sriov_pf_migration.h"
> > +#include "xe_sriov_pf_migration_data.h"
> > #include "xe_sriov_printk.h"
> >
> > static struct xe_sriov_pf_migration *pf_pick_migration(struct xe_device *xe, unsigned int vfid)
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> > new file mode 100644
> > index 0000000000000..cfc6b512c6674
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> > @@ -0,0 +1,135 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#include "xe_bo.h"
> > +#include "xe_device.h"
> > +#include "xe_sriov_pf_migration_data.h"
> > +
> > +static bool data_needs_bo(struct xe_sriov_pf_migration_data *data)
> > +{
> > + unsigned int type = data->type;
> > +
> > + return type == XE_SRIOV_MIG_DATA_CCS ||
> > + type == XE_SRIOV_MIG_DATA_VRAM;
> > +}
> > +
> > +/**
> > + * xe_sriov_pf_migration_data_alloc() - Allocate migration data packet
> > + * @xe: the &struct xe_device
> > + *
> > + * Only allocates the "outer" structure, without initializing the migration
> > + * data backing storage.
> > + *
> > + * Return: Pointer to &struct xe_sriov_pf_migration_data on success,
> > + * NULL in case of error.
> > + */
> > +struct xe_sriov_pf_migration_data *
> > +xe_sriov_pf_migration_data_alloc(struct xe_device *xe)
> > +{
> > + struct xe_sriov_pf_migration_data *data;
> > +
> > + data = kzalloc(sizeof(*data), GFP_KERNEL);
> > + if (!data)
> > + return NULL;
> > +
> > + data->xe = xe;
> > + data->hdr_remaining = sizeof(data->hdr);
> > +
> > + return data;
> > +}
> > +
> > +/**
> > + * xe_sriov_pf_migration_data_free() - Free migration data packet
> > + * @data: the &struct xe_sriov_pf_migration_data packet
> > + */
> > +void xe_sriov_pf_migration_data_free(struct xe_sriov_pf_migration_data *data)
> > +{
> > + if (data_needs_bo(data)) {
> > + if (data->bo)
>
> not needed, xe_bo_unpin_map_no_vm() checks for NULL
Ok.
>
> > + xe_bo_unpin_map_no_vm(data->bo);
> > + } else {
> > + if (data->buff)
>
> not needed, kvfree() also checks for NULL
Ok.
>
> > + kvfree(data->buff);
> > + }
> > +
> > + kfree(data);
> > +}
> > +
> > +static int mig_data_init(struct xe_sriov_pf_migration_data *data)
> > +{
> > + struct xe_gt *gt = xe_device_get_gt(data->xe, data->gt);
> > +
> > + if (!gt || data->tile != gt->tile->id)
> > + return -EINVAL;
>
> didn't we check that already in xe_sriov_pf_migration_produce() ?
>
> in other places we call xe_sriov_pf_migration_data_init() using ids from real tile and gt
I'll remove it from here.
>
> > +
> > + if (data->size == 0)
> > + return 0;
> > +
> > + if (data_needs_bo(data)) {
> > + struct xe_bo *bo = xe_bo_create_pin_map_novm(data->xe, gt->tile,
> > + PAGE_ALIGN(data->size),
> > + ttm_bo_type_kernel,
> > + XE_BO_FLAG_SYSTEM | XE_BO_FLAG_PINNED,
> > + false);
> > + if (IS_ERR(bo))
> > + return PTR_ERR(bo);
> > +
> > + data->bo = bo;
> > + data->vaddr = bo->vmap.vaddr;
> > + } else {
> > + void *buff = kvzalloc(data->size, GFP_KERNEL);
> > + if (!buff)
> > + return -ENOMEM;
> > +
> > + data->buff = buff;
> > + data->vaddr = buff;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * xe_sriov_pf_migration_data_init() - Initialize the migration data header and backing storage
> > + * @data: the &struct xe_sriov_pf_migration_data packet
> > + * @tile_id: tile identifier
> > + * @gt_id: GT identifier
> > + * @type: &enum xe_sriov_pf_migration_data_type
>
> here type is enum
>
> > + * @offset: offset of data packet payload (within wider resource)
> > + * @size: size of data packet payload
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_pf_migration_data_init(struct xe_sriov_pf_migration_data *data, u8 tile_id, u8 gt_id,
> > + unsigned int type, loff_t offset, size_t size)
>
> but here is plain int ?
>
> > +{
> > + xe_assert(data->xe, type < XE_SRIOV_MIG_DATA_MAX);
>
> if it's "enum" - no need to check
>
> if it's "int" and type is coming from outside of our code, then assert is not sufficient anyway
>
> nit: if assert stays, add sep line here
>
> > + data->version = 1;
>
> magic "1" needs its own #define
XE_SRIOV_MIGRATION_DATA_SUPPORTED_VERSION.
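I.e. (a sketch, assuming the initial version stays at 1):

#define XE_SRIOV_MIGRATION_DATA_SUPPORTED_VERSION 1

	data->version = XE_SRIOV_MIGRATION_DATA_SUPPORTED_VERSION;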
>
> > + data->type = type;
> > + data->tile = tile_id;
> > + data->gt = gt_id;
> > + data->offset = offset;
> > + data->size = size;
> > + data->remaining = size;
> > +
> > + return mig_data_init(data);
> > +}
> > +
> > +/**
> > + * xe_sriov_pf_migration_data_init_from_hdr() - Initialize the migration data backing storage based on header
> > + * @data: the &struct xe_sriov_pf_migration_data packet
> > + *
> > + * Header data is expected to be filled prior to calling this function
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_pf_migration_data_init_from_hdr(struct xe_sriov_pf_migration_data *data)
> > +{
> > + if (WARN_ON(data->hdr_remaining))
>
> better: xe_WARN_ON(xe, ....)
>
> but does it really deserve a WARN here?
> we already know who the caller is
I'll remove it.
>
> > + return -EINVAL;
> > +
> > + data->remaining = data->size;
> > +
> > + return mig_data_init(data);
> > +}
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> > new file mode 100644
> > index 0000000000000..1dde4cfcdbc47
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> > @@ -0,0 +1,32 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#ifndef _XE_SRIOV_PF_MIGRATION_DATA_H_
> > +#define _XE_SRIOV_PF_MIGRATION_DATA_H_
> > +
> > +#include <linux/types.h>
> > +
> > +struct xe_device;
> > +
> > +enum xe_sriov_pf_migration_data_type {
>
> maybe add a note that default 0 was skipped on purpose to catch uninitialized/invalid data
>
> > + XE_SRIOV_MIG_DATA_DESCRIPTOR = 1,
>
> shouldn't we try to match enumerator names with enum name?
>
> XE_SRIOV_PF_MIGRATION_DATA_TYPE_DESCRIPTOR = 1,
> XE_SRIOV_PF_MIGRATION_DATA_TYPE_TRAILER,
> XE_SRIOV_PF_MIGRATION_DATA_TYPE_...,
>
> or change the enum (and file) name:
>
> xe_sriov_migration_data.c
>
> XE_SRIOV_MIGRATION_DATA_TYPE_DESCRIPTOR = 1,
> XE_SRIOV_MIGRATION_DATA_TYPE_TRAILER,
> XE_SRIOV_MIGRATION_DATA_TYPE_...,
> > + XE_SRIOV_MIG_DATA_TRAILER,
> > + XE_SRIOV_MIG_DATA_GGTT,
> > + XE_SRIOV_MIG_DATA_MMIO,
> > + XE_SRIOV_MIG_DATA_GUC,
> > + XE_SRIOV_MIG_DATA_CCS,
> > + XE_SRIOV_MIG_DATA_VRAM,
> > + XE_SRIOV_MIG_DATA_MAX,
>
> please drop it
Ok.
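A sketch of the revised enum, following the second naming option above
(final names assumed):

enum xe_sriov_migration_data_type {
	/* 0 is skipped on purpose to catch uninitialized/invalid data */
	XE_SRIOV_MIGRATION_DATA_TYPE_DESCRIPTOR = 1,
	XE_SRIOV_MIGRATION_DATA_TYPE_TRAILER,
	XE_SRIOV_MIGRATION_DATA_TYPE_GGTT,
	XE_SRIOV_MIGRATION_DATA_TYPE_MMIO,
	XE_SRIOV_MIGRATION_DATA_TYPE_GUC,
	XE_SRIOV_MIGRATION_DATA_TYPE_CCS,
	XE_SRIOV_MIGRATION_DATA_TYPE_VRAM,
};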
> > +};
> > +
> > +struct xe_sriov_pf_migration_data *
> > +xe_sriov_pf_migration_data_alloc(struct xe_device *xe);
> > +void xe_sriov_pf_migration_data_free(struct xe_sriov_pf_migration_data *snapshot);
> > +
> > +int xe_sriov_pf_migration_data_init(struct xe_sriov_pf_migration_data *data, u8 tile_id, u8 gt_id,
> > + unsigned int type, loff_t offset, size_t size);
> > +int xe_sriov_pf_migration_data_init_from_hdr(struct xe_sriov_pf_migration_data *snapshot);
> > +
> > +#endif
>
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 07/26] drm/xe/pf: Add support for encap/decap of bitstream to/from packet
2025-10-13 10:46 ` Michal Wajdeczko
@ 2025-10-21 0:25 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-21 0:25 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Matthew Brost, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Mon, Oct 13, 2025 at 12:46:18PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> > Add debugfs handlers for migration state and handle bitstream
> > .read()/.write() to convert from bitstream to/from migration data
> > packets.
> > As descriptor/trailer are handled at this layer - add handling for both
> > save and restore side.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 18 +
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h | 1 +
> > drivers/gpu/drm/xe/xe_sriov_pf.c | 1 +
> > drivers/gpu/drm/xe/xe_sriov_pf_control.c | 5 +
> > drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 45 +++
> > drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 56 +++
> > .../gpu/drm/xe/xe_sriov_pf_migration_data.c | 353 ++++++++++++++++++
> > .../gpu/drm/xe/xe_sriov_pf_migration_data.h | 5 +
> > .../gpu/drm/xe/xe_sriov_pf_migration_types.h | 9 +
> > 9 files changed, 493 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > index 04a4e92133c2e..092d3d710bca1 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > @@ -814,6 +814,23 @@ bool xe_gt_sriov_pf_control_check_vf_data_wip(struct xe_gt *gt, unsigned int vfi
> > return pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP);
> > }
> >
> > +/**
> > + * xe_gt_sriov_pf_control_vf_data_eof() - indicate the end of SR-IOV VF migration data production
> > + * @gt: the &struct xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * This function is for PF only.
> > + */
> > +void xe_gt_sriov_pf_control_vf_data_eof(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + struct wait_queue_head *wq = xe_sriov_pf_migration_waitqueue(gt_to_xe(gt), vfid);
> > +
> > + if (!pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP))
> > + pf_enter_vf_state_machine_bug(gt, vfid);
> > +
> > + wake_up_all(wq);
> > +}
> > +
> > static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> > {
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
> > @@ -840,6 +857,7 @@ static bool pf_handle_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> > if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP))
> > return false;
> >
> > + xe_gt_sriov_pf_control_vf_data_eof(gt, vfid);
>
> the above call can lead to state_machine_bug, but here you just continue as if nothing happened and move to the SAVED state
>
> maybe the logic of that function should be moved to a helper that at least returns a bool, so you can make the right decision?
Won't be relevant after applying changes from previous patches.
>
> > pf_exit_vf_save_wip(gt, vfid);
> > pf_enter_vf_saved(gt, vfid);
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> > index 2e121e8132dcf..caf20dd063b1b 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.h
> > @@ -15,6 +15,7 @@ int xe_gt_sriov_pf_control_init(struct xe_gt *gt);
> > void xe_gt_sriov_pf_control_restart(struct xe_gt *gt);
> >
> > bool xe_gt_sriov_pf_control_check_vf_data_wip(struct xe_gt *gt, unsigned int vfid);
> > +void xe_gt_sriov_pf_control_vf_data_eof(struct xe_gt *gt, unsigned int vfid);
> >
> > int xe_gt_sriov_pf_control_pause_vf(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_control_resume_vf(struct xe_gt *gt, unsigned int vfid);
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf.c b/drivers/gpu/drm/xe/xe_sriov_pf.c
> > index 95743c7af8050..5d115627f3f2f 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf.c
> > @@ -16,6 +16,7 @@
> > #include "xe_sriov_pf.h"
> > #include "xe_sriov_pf_helpers.h"
> > #include "xe_sriov_pf_migration.h"
> > +#include "xe_sriov_pf_migration_data.h"
> > #include "xe_sriov_pf_service.h"
> > #include "xe_sriov_printk.h"
> >
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> > index e64c7b56172c6..10e1f18aa8b11 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_control.c
> > @@ -6,6 +6,7 @@
> > #include "xe_device.h"
> > #include "xe_gt_sriov_pf_control.h"
> > #include "xe_sriov_pf_control.h"
> > +#include "xe_sriov_pf_migration_data.h"
> > #include "xe_sriov_printk.h"
> >
> > /**
> > @@ -165,6 +166,10 @@ int xe_sriov_pf_control_save_vf(struct xe_device *xe, unsigned int vfid)
> > unsigned int id;
> > int ret;
> >
> > + ret = xe_sriov_pf_migration_data_save_init(xe, vfid);
> > + if (ret)
> > + return ret;
> > +
> > for_each_gt(gt, xe, id) {
> > ret = xe_gt_sriov_pf_control_save_vf(gt, vfid);
> > if (ret)
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> > index 74eeabef91c57..ce780719760a6 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> > @@ -13,6 +13,7 @@
> > #include "xe_sriov_pf_control.h"
> > #include "xe_sriov_pf_debugfs.h"
> > #include "xe_sriov_pf_helpers.h"
> > +#include "xe_sriov_pf_migration_data.h"
> > #include "xe_sriov_pf_service.h"
> > #include "xe_sriov_printk.h"
> > #include "xe_tile_sriov_pf_debugfs.h"
> > @@ -71,6 +72,7 @@ static void pf_populate_pf(struct xe_device *xe, struct dentry *pfdent)
> > * /sys/kernel/debug/dri/BDF/
> > * ├── sriov
> > * │ ├── vf1
> > + * │ │ ├── migration_data
> > * │ │ ├── pause
> > * │ │ ├── reset
> > * │ │ ├── resume
> > @@ -159,6 +161,48 @@ DEFINE_VF_CONTROL_ATTRIBUTE(reset_vf);
> > DEFINE_VF_RW_CONTROL_ATTRIBUTE(save_vf);
> > DEFINE_VF_RW_CONTROL_ATTRIBUTE(restore_vf);
> >
> > +static ssize_t data_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
> > +{
> > + struct dentry *dent = file_dentry(file);
> > + struct dentry *vfdentry = dent->d_parent;
> > + struct dentry *migration_dentry = vfdentry->d_parent;
> > + unsigned int vfid = (uintptr_t)vfdentry->d_inode->i_private;
> > + struct xe_device *xe = migration_dentry->d_inode->i_private;
>
> we have extract_xe() / extract_vfid() helpers for that
Ok.
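With the helpers this collapses to something like (helper signatures
assumed from their existing users):

	struct dentry *dent = file_dentry(file);
	struct xe_device *xe = extract_xe(dent);
	unsigned int vfid = extract_vfid(dent);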
>
> > +
> > + xe_assert(xe, vfid);
> > + xe_sriov_pf_assert_vfid(xe, vfid);
> > +
> > + if (*pos)
> > + return -ESPIPE;
> > +
> > + return xe_sriov_pf_migration_data_write(xe, vfid, buf, count);
> > +}
> > +
> > +static ssize_t data_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
> > +{
> > + struct dentry *dent = file_dentry(file);
> > + struct dentry *vfdentry = dent->d_parent;
> > + struct dentry *migration_dentry = vfdentry->d_parent;
> > + unsigned int vfid = (uintptr_t)vfdentry->d_inode->i_private;
> > + struct xe_device *xe = migration_dentry->d_inode->i_private;
> > +
> > + xe_assert(xe, vfid);
> > + xe_sriov_pf_assert_vfid(xe, vfid);
> > +
> > + if (*ppos)
> > + return -ESPIPE;
> > +
> > + return xe_sriov_pf_migration_data_read(xe, vfid, buf, count);
> > +}
> > +
> > +static const struct file_operations data_vf_fops = {
> > + .owner = THIS_MODULE,
> > + .open = simple_open,
> > + .write = data_write,
> > + .read = data_read,
> > + .llseek = default_llseek,
> > +};
> > +
> > static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> > {
> > debugfs_create_file("pause", 0200, vfdent, xe, &pause_vf_fops);
> > @@ -167,6 +211,7 @@ static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> > debugfs_create_file("reset", 0200, vfdent, xe, &reset_vf_fops);
> > debugfs_create_file("save", 0600, vfdent, xe, &save_vf_fops);
> > debugfs_create_file("restore", 0600, vfdent, xe, &restore_vf_fops);
> > + debugfs_create_file("migration_data", 0600, vfdent, xe, &data_vf_fops);
> > }
> >
> > static void pf_populate_with_tiles(struct xe_device *xe, struct dentry *dent, unsigned int vfid)
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > index d39cee66589b5..9cc178126cbdc 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > @@ -56,6 +56,18 @@ static bool pf_check_migration_support(struct xe_device *xe)
> > return IS_ENABLED(CONFIG_DRM_XE_DEBUG);
> > }
> >
> > +static void pf_migration_cleanup(struct drm_device *dev, void *arg)
> > +{
> > + struct xe_sriov_pf_migration *migration = arg;
> > +
> > + if (!IS_ERR_OR_NULL(migration->pending))
> > + xe_sriov_pf_migration_data_free(migration->pending);
> > + if (!IS_ERR_OR_NULL(migration->trailer))
> > + xe_sriov_pf_migration_data_free(migration->trailer);
> > + if (!IS_ERR_OR_NULL(migration->descriptor))
> > + xe_sriov_pf_migration_data_free(migration->descriptor);
>
> maybe instead of checking IS_ERR_OR_NULL here, move the check to data_free() ?
Ok.
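With the check moved (and together with the earlier cleanup dropping the
redundant NULL checks), the free path becomes (a sketch):

void xe_sriov_pf_migration_data_free(struct xe_sriov_pf_migration_data *data)
{
	if (IS_ERR_OR_NULL(data))
		return;

	if (data_needs_bo(data))
		xe_bo_unpin_map_no_vm(data->bo);	/* NULL-safe */
	else
		kvfree(data->buff);			/* NULL-safe */

	kfree(data);
}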
>
> > +}
> > +
> > /**
> > * xe_sriov_pf_migration_init() - Initialize support for SR-IOV VF migration.
> > * @xe: the &struct xe_device
> > @@ -65,6 +77,7 @@ static bool pf_check_migration_support(struct xe_device *xe)
> > int xe_sriov_pf_migration_init(struct xe_device *xe)
> > {
> > unsigned int n, totalvfs;
> > + int err;
> >
> > xe_assert(xe, IS_SRIOV_PF(xe));
> >
> > @@ -76,7 +89,15 @@ int xe_sriov_pf_migration_init(struct xe_device *xe)
> > for (n = 1; n <= totalvfs; n++) {
> > struct xe_sriov_pf_migration *migration = pf_pick_migration(xe, n);
> >
> > + err = drmm_mutex_init(&xe->drm, &migration->lock);
> > + if (err)
> > + return err;
> > +
> > init_waitqueue_head(&migration->wq);
> > +
> > + err = drmm_add_action_or_reset(&xe->drm, pf_migration_cleanup, migration);
> > + if (err)
> > + return err;
> > }
> >
> > return 0;
> > @@ -162,6 +183,36 @@ xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid)
> > return data;
> > }
> >
> > +static int pf_handle_descriptor(struct xe_device *xe, unsigned int vfid,
> > + struct xe_sriov_pf_migration_data *data)
> > +{
> > + if (data->tile != 0 || data->gt != 0)
> > + return -EINVAL;
> > +
> > + xe_sriov_pf_migration_data_free(data);
> > +
> > + return 0;
> > +}
> > +
> > +static int pf_handle_trailer(struct xe_device *xe, unsigned int vfid,
> > + struct xe_sriov_pf_migration_data *data)
> > +{
> > + struct xe_gt *gt;
> > + u8 gt_id;
> > +
> > + if (data->tile != 0 || data->gt != 0)
> > + return -EINVAL;
> > + if (data->offset != 0 || data->size != 0 || data->buff || data->bo)
> > + return -EINVAL;
>
> who will free the data packet if we return errors here?
The caller is responsible for releasing the packet in case of errors.
For GT-level packets, the data goes to the ring and gets processed (and
freed) by the control worker. Descriptor / trailer packets are processed
immediately, but we still follow the same pattern: the caller releases
the packet when produce fails.
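So on the write side the pattern stays as in vf_mig_data_write_single()
above:

	ret = xe_sriov_pf_migration_produce(xe, vfid, *data);
	if (ret) {
		/* produce failed - the caller still owns the packet */
		xe_sriov_pf_migration_data_free(*data);
		return ret;
	}
	/* produce succeeded - the ring / control worker owns it now */
	*data = NULL;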
>
> > +
> > + xe_sriov_pf_migration_data_free(data);
> > +
> > + for_each_gt(gt, xe, gt_id)
> > + xe_gt_sriov_pf_control_vf_data_eof(gt, vfid);
> > +
> > + return 0;
> > +}
> > +
> > /**
> > * xe_sriov_pf_migration_produce() - Produce a SR-IOV VF migration data packet for device to process
> > * @xe: the &struct xe_device
> > @@ -180,6 +231,11 @@ int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
> > if (!IS_SRIOV_PF(xe))
> > return -ENODEV;
> >
> > + if (data->type == XE_SRIOV_MIG_DATA_DESCRIPTOR)
> > + return pf_handle_descriptor(xe, vfid, data);
> > + else if (data->type == XE_SRIOV_MIG_DATA_TRAILER)
> > + return pf_handle_trailer(xe, vfid, data);
> > +
> > gt = xe_device_get_gt(xe, data->gt);
> > if (!gt || data->tile != gt->tile->id) {
> > xe_sriov_err_ratelimited(xe, "VF%d Unknown GT - tile_id:%d, gt_id:%d\n",
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> > index cfc6b512c6674..9a2777dcf9a6b 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> > @@ -5,7 +5,45 @@
> >
> > #include "xe_bo.h"
> > #include "xe_device.h"
> > +#include "xe_sriov_pf_helpers.h"
> > +#include "xe_sriov_pf_migration.h"
> > #include "xe_sriov_pf_migration_data.h"
> > +#include "xe_sriov_printk.h"
> > +
> > +static struct mutex *pf_migration_mutex(struct xe_device *xe, unsigned int vfid)
> > +{
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> > + return &xe->sriov.pf.vfs[vfid].migration.lock;
> > +}
> > +
> > +static struct xe_sriov_pf_migration_data **pf_pick_pending(struct xe_device *xe, unsigned int vfid)
> > +{
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> > + lockdep_assert_held(pf_migration_mutex(xe, vfid));
> > +
> > + return &xe->sriov.pf.vfs[vfid].migration.pending;
> > +}
> > +
> > +static struct xe_sriov_pf_migration_data **
> > +pf_pick_descriptor(struct xe_device *xe, unsigned int vfid)
> > +{
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> > + lockdep_assert_held(pf_migration_mutex(xe, vfid));
> > +
> > + return &xe->sriov.pf.vfs[vfid].migration.descriptor;
> > +}
> > +
> > +static struct xe_sriov_pf_migration_data **pf_pick_trailer(struct xe_device *xe, unsigned int vfid)
> > +{
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > + xe_assert(xe, vfid <= xe_sriov_pf_get_totalvfs(xe));
> > + lockdep_assert_held(pf_migration_mutex(xe, vfid));
> > +
> > + return &xe->sriov.pf.vfs[vfid].migration.trailer;
> > +}
> >
> > static bool data_needs_bo(struct xe_sriov_pf_migration_data *data)
> > {
> > @@ -133,3 +171,318 @@ int xe_sriov_pf_migration_data_init_from_hdr(struct xe_sriov_pf_migration_data *
> >
> > return mig_data_init(data);
> > }
> > +
> > +static ssize_t vf_mig_data_hdr_read(struct xe_sriov_pf_migration_data *data,
> > + char __user *buf, size_t len)
> > +{
> > + loff_t offset = sizeof(data->hdr) - data->hdr_remaining;
> > +
> > + if (!data->hdr_remaining)
> > + return -EINVAL;
> > +
> > + if (len > data->hdr_remaining)
> > + len = data->hdr_remaining;
> > +
> > + if (copy_to_user(buf, (void *)&data->hdr + offset, len))
> > + return -EFAULT;
> > +
> > + data->hdr_remaining -= len;
> > +
> > + return len;
> > +}
> > +
> > +static ssize_t vf_mig_data_read(struct xe_sriov_pf_migration_data *data,
> > + char __user *buf, size_t len)
> > +{
> > + if (len > data->remaining)
> > + len = data->remaining;
> > +
> > + if (copy_to_user(buf, data->vaddr + (data->size - data->remaining), len))
> > + return -EFAULT;
> > +
> > + data->remaining -= len;
> > +
> > + return len;
> > +}
> > +
> > +static ssize_t __vf_mig_data_read_single(struct xe_sriov_pf_migration_data **data,
> > + unsigned int vfid, char __user *buf, size_t len)
> > +{
> > + ssize_t copied = 0;
> > +
> > + if ((*data)->hdr_remaining)
> > + copied = vf_mig_data_hdr_read(*data, buf, len);
> > + else
> > + copied = vf_mig_data_read(*data, buf, len);
> > +
> > + if ((*data)->remaining == 0 && (*data)->hdr_remaining == 0) {
> > + xe_sriov_pf_migration_data_free(*data);
> > + *data = NULL;
> > + }
> > +
> > + return copied;
> > +}
> > +
> > +static struct xe_sriov_pf_migration_data **vf_mig_pick_data(struct xe_device *xe, unsigned int vfid)
> > +{
> > + struct xe_sriov_pf_migration_data **data;
> > +
> > + data = pf_pick_descriptor(xe, vfid);
> > + if (*data)
> > + return data;
> > +
> > + data = pf_pick_pending(xe, vfid);
> > + if (*data == NULL)
> > + *data = xe_sriov_pf_migration_consume(xe, vfid);
> > + if (!IS_ERR_OR_NULL(*data))
> > + return data;
> > + else if (IS_ERR(*data) && PTR_ERR(*data) != -ENODATA)
> > + return data;
> > +
> > + data = pf_pick_trailer(xe, vfid);
> > + if (*data)
> > + return data;
> > +
> > + return ERR_PTR(-ENODATA);
> > +}
> > +
> > +static ssize_t vf_mig_data_read_single(struct xe_device *xe, unsigned int vfid,
> > + char __user *buf, size_t len)
> > +{
> > + struct xe_sriov_pf_migration_data **data = vf_mig_pick_data(xe, vfid);
> > +
> > + if (IS_ERR_OR_NULL(data))
> > + return PTR_ERR(data);
> > +
> > + return __vf_mig_data_read_single(data, vfid, buf, len);
> > +}
> > +
> > +/**
> > + * xe_sriov_pf_migration_data_read() - Read migration data from the device
> > + * @gt: the &struct xe_device
>
> @xe
Ok.
>
> > + * @vfid: the VF identifier
> > + * @buf: start address of userspace buffer
> > + * @len: requested read size from userspace
> > + *
> > + * Return: number of bytes that has been successfully read
> > + * 0 if no more migration data is available
> > + * -errno on failure
>
> you likely need to add some punctuation here to properly render the doc
Ok.
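E.g.:

 * Return: number of bytes that have been successfully read,
 *         0 if no more migration data is available,
 *         -errno on failure.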
>
> > + */
> > +ssize_t xe_sriov_pf_migration_data_read(struct xe_device *xe, unsigned int vfid,
> > + char __user *buf, size_t len)
> > +{
> > + ssize_t ret, consumed = 0;
> > +
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > +
> > + ret = mutex_lock_interruptible(pf_migration_mutex(xe, vfid));
> > + if (ret)
> > + return ret;
> > +
> > + while (consumed < len) {
> > + ret = vf_mig_data_read_single(xe, vfid, buf, len - consumed);
> > + if (ret == -ENODATA)
> > + goto out;
> > + if (ret < 0) {
> > + mutex_unlock(pf_migration_mutex(xe, vfid));
> > + return ret;
> > + }
> > +
> > + consumed += ret;
> > + buf += ret;
> > + }
> > +
> > +out:
> > + mutex_unlock(pf_migration_mutex(xe, vfid));
> > + return consumed;
> > +}
> > +
> > +static ssize_t vf_mig_hdr_write(struct xe_sriov_pf_migration_data *data,
> > + const char __user *buf, size_t len)
> > +{
> > + loff_t offset = sizeof(data->hdr) - data->hdr_remaining;
> > + int ret;
> > +
> > + if (WARN_ON(!data->hdr_remaining))
>
> xe_WARN_ON(xe, ... ) if having full WARN is really important
Removed.
>
> > + return -EINVAL;
> > +
> > + if (len > data->hdr_remaining)
> > + len = data->hdr_remaining;
> > +
> > + if (copy_from_user((void *)&data->hdr + offset, buf, len))
> > + return -EFAULT;
> > +
> > + data->hdr_remaining -= len;
> > +
> > + if (!data->hdr_remaining) {
> > + ret = xe_sriov_pf_migration_data_init_from_hdr(data);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + return len;
> > +}
> > +
> > +static ssize_t vf_mig_data_write(struct xe_sriov_pf_migration_data *data,
> > + const char __user *buf, size_t len)
> > +{
> > + if (len > data->remaining)
> > + len = data->remaining;
> > +
> > + if (copy_from_user(data->vaddr + (data->size - data->remaining), buf, len))
> > + return -EFAULT;
> > +
> > + data->remaining -= len;
> > +
> > + return len;
> > +}
> > +
> > +static ssize_t vf_mig_data_write_single(struct xe_device *xe, unsigned int vfid,
> > + const char __user *buf, size_t len)
> > +{
> > + struct xe_sriov_pf_migration_data **data = pf_pick_pending(xe, vfid);
> > + int ret;
> > + ssize_t copied;
> > +
> > + if (IS_ERR_OR_NULL(*data)) {
> > + *data = xe_sriov_pf_migration_data_alloc(xe);
> > + if (*data == NULL)
> > + return -ENOMEM;
> > + }
> > +
> > + if ((*data)->hdr_remaining)
> > + copied = vf_mig_hdr_write(*data, buf, len);
> > + else
> > + copied = vf_mig_data_write(*data, buf, len);
> > +
> > + if ((*data)->hdr_remaining == 0 && (*data)->remaining == 0) {
> > + ret = xe_sriov_pf_migration_produce(xe, vfid, *data);
> > + if (ret) {
> > + xe_sriov_pf_migration_data_free(*data);
> > + return ret;
> > + }
> > +
> > + *data = NULL;
> > + }
> > +
> > + return copied;
> > +}
> > +
> > +/**
> > + * xe_sriov_pf_migration_data_write() - Write migration data to the device
> > + * @gt: the &struct xe_device
>
> @xe
Ok.
>
> > + * @vfid: the VF identifier
> > + * @buf: start address of userspace buffer
> > + * @len: requested write size from userspace
> > + *
> > + * Return: number of bytes that has been successfully written
> > + * -errno on failure
> > + */
> > +ssize_t xe_sriov_pf_migration_data_write(struct xe_device *xe, unsigned int vfid,
> > + const char __user *buf, size_t len)
> > +{
> > + ssize_t ret, produced = 0;
> > +
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > +
> > + ret = mutex_lock_interruptible(pf_migration_mutex(xe, vfid));
> > + if (ret)
> > + return ret;
>
> scoped_cond_guard(mutex_intr, return -EINTR, pf_migration_mutex(xe, vfid)) ?
Ok.
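A sketch of the guarded form (per the suggestion above):

	scoped_cond_guard(mutex_intr, return -EINTR,
			  pf_migration_mutex(xe, vfid)) {
		while (produced < len) {
			ret = vf_mig_data_write_single(xe, vfid, buf,
						       len - produced);
			if (ret < 0)
				return ret;

			produced += ret;
			buf += ret;
		}
	}

	return produced;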
Thanks,
-Michał
>
> > +
> > + while (produced < len) {
> > + ret = vf_mig_data_write_single(xe, vfid, buf, len - produced);
> > + if (ret < 0) {
> > + mutex_unlock(pf_migration_mutex(xe, vfid));
> > + return ret;
> > + }
> > +
> > + produced += ret;
> > + buf += ret;
> > + }
> > +
> > + mutex_unlock(pf_migration_mutex(xe, vfid));
> > + return produced;
> > +}
> > +
> > +#define MIGRATION_DESC_SIZE 4
> > +static int pf_desc_init(struct xe_device *xe, unsigned int vfid)
> > +{
> > + struct xe_sriov_pf_migration_data **desc = pf_pick_descriptor(xe, vfid);
> > + struct xe_sriov_pf_migration_data *data;
> > + int ret;
> > +
> > + data = xe_sriov_pf_migration_data_alloc(xe);
> > + if (!data)
> > + return -ENOMEM;
> > +
> > + ret = xe_sriov_pf_migration_data_init(data, 0, 0, XE_SRIOV_MIG_DATA_DESCRIPTOR,
> > + 0, MIGRATION_DESC_SIZE);
> > + if (ret) {
> > + xe_sriov_pf_migration_data_free(data);
> > + return ret;
> > + }
> > +
> > + *desc = data;
> > +
> > + return 0;
> > +}
> > +
> > +static void pf_pending_init(struct xe_device *xe, unsigned int vfid)
> > +{
> > + struct xe_sriov_pf_migration_data **data = pf_pick_pending(xe, vfid);
> > +
> > + *data = NULL;
> > +}
> > +
> > +#define MIGRATION_TRAILER_SIZE 0
> > +static int pf_trailer_init(struct xe_device *xe, unsigned int vfid)
> > +{
> > + struct xe_sriov_pf_migration_data **trailer = pf_pick_trailer(xe, vfid);
> > + struct xe_sriov_pf_migration_data *data;
> > + int ret;
> > +
> > + data = xe_sriov_pf_migration_data_alloc(xe);
> > + if (!data)
> > + return -ENOMEM;
> > +
> > + ret = xe_sriov_pf_migration_data_init(data, 0, 0, XE_SRIOV_MIG_DATA_TRAILER,
> > + 0, MIGRATION_TRAILER_SIZE);
> > + if (ret) {
> > + xe_sriov_pf_migration_data_free(data);
> > + return ret;
> > + }
> > +
> > + *trailer = data;
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * xe_sriov_pf_migration_data_save_init() - Initialize the pending save migration data.
> > + * @xe: the &struct xe_device
> > + * @vfid: the VF identifier
> > + *
> > + * Return: 0 on success, -errno on failure
> > + */
> > +int xe_sriov_pf_migration_data_save_init(struct xe_device *xe, unsigned int vfid)
> > +{
> > + int ret;
> > +
> > + ret = mutex_lock_interruptible(pf_migration_mutex(xe, vfid));
> > + if (ret)
> > + return ret;
> > +
> > + ret = pf_desc_init(xe, vfid);
> > + if (ret)
> > + goto out;
> > +
> > + ret = pf_trailer_init(xe, vfid);
> > + if (ret)
> > + goto out;
> > +
> > + pf_pending_init(xe, vfid);
> > +
> > +out:
> > + mutex_unlock(pf_migration_mutex(xe, vfid));
> > + return ret;
> > +}
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> > index 1dde4cfcdbc47..5b96c7f224002 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> > @@ -28,5 +28,10 @@ void xe_sriov_pf_migration_data_free(struct xe_sriov_pf_migration_data *snapshot
> > int xe_sriov_pf_migration_data_init(struct xe_sriov_pf_migration_data *data, u8 tile_id, u8 gt_id,
> > unsigned int type, loff_t offset, size_t size);
> > int xe_sriov_pf_migration_data_init_from_hdr(struct xe_sriov_pf_migration_data *snapshot);
> > +ssize_t xe_sriov_pf_migration_data_read(struct xe_device *xe, unsigned int vfid,
> > + char __user *buf, size_t len);
> > +ssize_t xe_sriov_pf_migration_data_write(struct xe_device *xe, unsigned int vfid,
> > + const char __user *buf, size_t len);
> > +int xe_sriov_pf_migration_data_save_init(struct xe_device *xe, unsigned int vfid);
> >
> > #endif
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> > index 80fdea32b884a..c5d75bb7f39c0 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_types.h
> > @@ -7,6 +7,7 @@
> > #define _XE_SRIOV_PF_MIGRATION_TYPES_H_
> >
> > #include <linux/types.h>
> > +#include <linux/mutex_types.h>
> > #include <linux/wait.h>
> >
> > struct xe_sriov_pf_migration_data {
> > @@ -32,6 +33,14 @@ struct xe_sriov_pf_migration_data {
> > struct xe_sriov_pf_migration {
> > /** @wq: waitqueue used to avoid busy-waiting for snapshot production/consumption */
> > wait_queue_head_t wq;
> > + /** @lock: Mutex protecting the migration data */
> > + struct mutex lock;
> > + /** @pending: currently processed data packet of VF resource */
> > + struct xe_sriov_pf_migration_data *pending;
> > + /** @trailer: data packet used to indicate the end of stream */
> > + struct xe_sriov_pf_migration_data *trailer;
> > + /** @descriptor: data packet containing the metadata describing the device */
> > + struct xe_sriov_pf_migration_data *descriptor;
> > };
> >
> > #endif
>
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 06/26] drm/xe/pf: Add helpers for migration data allocation / free
2025-10-12 19:12 ` Matthew Brost
@ 2025-10-21 0:26 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-21 0:26 UTC (permalink / raw)
To: Matthew Brost
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Sun, Oct 12, 2025 at 12:12:12PM -0700, Matthew Brost wrote:
> On Sat, Oct 11, 2025 at 09:38:27PM +0200, Michał Winiarski wrote:
> > Now that it's possible to free the packets - connect the restore
> > handling logic with the ring.
> > The helpers will also be used in upcoming changes that will start producing
> > migration data packets.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/Makefile | 1 +
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 48 ++++++-
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 10 +-
> > drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 1 +
> > .../gpu/drm/xe/xe_sriov_pf_migration_data.c | 135 ++++++++++++++++++
> > .../gpu/drm/xe/xe_sriov_pf_migration_data.h | 32 +++++
> > 6 files changed, 224 insertions(+), 3 deletions(-)
> > create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> > create mode 100644 drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> >
> > diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> > index 71f685a315dca..e253d65366de4 100644
> > --- a/drivers/gpu/drm/xe/Makefile
> > +++ b/drivers/gpu/drm/xe/Makefile
> > @@ -177,6 +177,7 @@ xe-$(CONFIG_PCI_IOV) += \
> > xe_sriov_pf_control.o \
> > xe_sriov_pf_debugfs.o \
> > xe_sriov_pf_migration.o \
> > + xe_sriov_pf_migration_data.o \
> > xe_sriov_pf_service.o \
> > xe_tile_sriov_pf_debugfs.o
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > index 16a88e7599f6d..04a4e92133c2e 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > @@ -20,6 +20,7 @@
> > #include "xe_sriov.h"
> > #include "xe_sriov_pf_control.h"
> > #include "xe_sriov_pf_migration.h"
> > +#include "xe_sriov_pf_migration_data.h"
> > #include "xe_sriov_pf_service.h"
> > #include "xe_tile.h"
> >
> > @@ -949,14 +950,57 @@ static void pf_exit_vf_restored(struct xe_gt *gt, unsigned int vfid)
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORED);
> > }
> >
> > +static void pf_enter_vf_restore_failed(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_FAILED);
> > + pf_exit_vf_wip(gt, vfid);
> > +}
> > +
> > +static int pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid,
> > + struct xe_sriov_pf_migration_data *data)
> > +{
> > + switch (data->type) {
> > + default:
> > + xe_gt_sriov_notice(gt, "Skipping VF%u invalid data type: %d\n", vfid, data->type);
> > + pf_enter_vf_restore_failed(gt, vfid);
> > + }
> > +
> > + return -EINVAL;
> > +}
> > +
> > static bool pf_handle_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> > {
> > + struct xe_sriov_pf_migration_data *data;
> > + int ret;
> > +
> > if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESTORE_WIP))
> > return false;
> >
> > - pf_exit_vf_restore_wip(gt, vfid);
> > - pf_enter_vf_restored(gt, vfid);
> > + data = xe_gt_sriov_pf_migration_ring_consume(gt, vfid);
> > + if (IS_ERR(data)) {
> > + if (PTR_ERR(data) == -ENODATA &&
> > + !xe_gt_sriov_pf_control_check_vf_data_wip(gt, vfid)) {
> > + pf_exit_vf_restore_wip(gt, vfid);
> > + pf_enter_vf_restored(gt, vfid);
> > + } else {
> > + pf_enter_vf_restore_failed(gt, vfid);
> > + }
> > + return false;
> > + }
> > +
> > + xe_gt_assert(gt, gt->info.id == data->gt);
> > + xe_gt_assert(gt, gt->tile->id == data->tile);
> > +
> > + ret = pf_handle_vf_restore_data(gt, vfid, data);
> > + if (ret) {
> > + xe_gt_sriov_err(gt, "VF%u failed to restore data type: %d (%d)\n",
> > + vfid, data->type, ret);
> > + xe_sriov_pf_migration_data_free(data);
> > + pf_enter_vf_restore_failed(gt, vfid);
> > + return false;
> > + }
> >
> > + xe_sriov_pf_migration_data_free(data);
> > return true;
> > }
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > index af5952f42fff1..582aaf062cbd4 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > @@ -15,6 +15,7 @@
> > #include "xe_guc_ct.h"
> > #include "xe_sriov.h"
> > #include "xe_sriov_pf_migration.h"
> > +#include "xe_sriov_pf_migration_data.h"
> >
> > #define XE_GT_SRIOV_PF_MIGRATION_RING_TIMEOUT (HZ * 20)
> > #define XE_GT_SRIOV_PF_MIGRATION_RING_SIZE 5
> > @@ -523,11 +524,18 @@ xe_gt_sriov_pf_migration_ring_consume_nowait(struct xe_gt *gt, unsigned int vfid
> > return ERR_PTR(-EAGAIN);
> > }
> >
> > +static void pf_mig_data_destroy(void *ptr)
> > +{
> > + struct xe_sriov_pf_migration_data *data = ptr;
> > +
> > + xe_sriov_pf_migration_data_free(data);
> > +}
> > +
> > static void pf_gt_migration_cleanup(struct drm_device *dev, void *arg)
> > {
> > struct xe_gt_sriov_pf_migration *migration = arg;
> >
> > - ptr_ring_cleanup(&migration->ring, NULL);
> > + ptr_ring_cleanup(&migration->ring, pf_mig_data_destroy);
> > }
> >
> > /**
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > index 347682f29a03c..d39cee66589b5 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > @@ -12,6 +12,7 @@
> > #include "xe_pm.h"
> > #include "xe_sriov_pf_helpers.h"
> > #include "xe_sriov_pf_migration.h"
> > +#include "xe_sriov_pf_migration_data.h"
> > #include "xe_sriov_printk.h"
> >
> > static struct xe_sriov_pf_migration *pf_pick_migration(struct xe_device *xe, unsigned int vfid)
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> > new file mode 100644
> > index 0000000000000..cfc6b512c6674
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> > @@ -0,0 +1,135 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#include "xe_bo.h"
> > +#include "xe_device.h"
> > +#include "xe_sriov_pf_migration_data.h"
> > +
> > +static bool data_needs_bo(struct xe_sriov_pf_migration_data *data)
> > +{
> > + unsigned int type = data->type;
> > +
> > + return type == XE_SRIOV_MIG_DATA_CCS ||
> > + type == XE_SRIOV_MIG_DATA_VRAM;
> > +}
> > +
> > +/**
> > + * xe_sriov_pf_migration_data_alloc() - Allocate migration data packet
> > + * @xe: the &struct xe_device
> > + *
> > + * Only allocates the "outer" structure, without initializing the migration
> > + * data backing storage.
> > + *
> > + * Return: Pointer to &struct xe_sriov_pf_migration_data on success,
> > + * NULL in case of error.
> > + */
> > +struct xe_sriov_pf_migration_data *
> > +xe_sriov_pf_migration_data_alloc(struct xe_device *xe)
> > +{
> > + struct xe_sriov_pf_migration_data *data;
> > +
> > + data = kzalloc(sizeof(*data), GFP_KERNEL);
> > + if (!data)
> > + return NULL;
> > +
> > + data->xe = xe;
> > + data->hdr_remaining = sizeof(data->hdr);
> > +
> > + return data;
> > +}
> > +
> > +/**
> > + * xe_sriov_pf_migration_data_free() - Free migration data packet
> > + * @data: the &struct xe_sriov_pf_migration_data packet
> > + */
> > +void xe_sriov_pf_migration_data_free(struct xe_sriov_pf_migration_data *data)
> > +{
> > + if (data_needs_bo(data)) {
> > + if (data->bo)
> > + xe_bo_unpin_map_no_vm(data->bo);
> > + } else {
> > + if (data->buff)
> > + kvfree(data->buff);
> > + }
> > +
> > + kfree(data);
> > +}
> > +
> > +static int mig_data_init(struct xe_sriov_pf_migration_data *data)
> > +{
> > + struct xe_gt *gt = xe_device_get_gt(data->xe, data->gt);
> > +
> > + if (!gt || data->tile != gt->tile->id)
> > + return -EINVAL;
> > +
> > + if (data->size == 0)
> > + return 0;
> > +
> > + if (data_needs_bo(data)) {
> > + struct xe_bo *bo = xe_bo_create_pin_map_novm(data->xe, gt->tile,
> > + PAGE_ALIGN(data->size),
> > + ttm_bo_type_kernel,
> > + XE_BO_FLAG_SYSTEM | XE_BO_FLAG_PINNED,
> > + false);
> > + if (IS_ERR(bo))
> > + return PTR_ERR(bo);
> > +
> > + data->bo = bo;
> > + data->vaddr = bo->vmap.vaddr;
> > + } else {
> > + void *buff = kvzalloc(data->size, GFP_KERNEL);
> > + if (!buff)
> > + return -ENOMEM;
> > +
> > + data->buff = buff;
> > + data->vaddr = buff;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * xe_sriov_pf_migration_data_init() - Initialize the migration data header and backing storage
> > + * @data: the &struct xe_sriov_pf_migration_data packet
> > + * @tile_id: tile identifier
> > + * @gt_id: GT identifier
> > + * @type: &enum xe_sriov_pf_migration_data_type
> > + * @offset: offset of data packet payload (within wider resource)
> > + * @size: size of data packet payload
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_pf_migration_data_init(struct xe_sriov_pf_migration_data *data, u8 tile_id, u8 gt_id,
> > + unsigned int type, loff_t offset, size_t size)
> > +{
> > + xe_assert(data->xe, type < XE_SRIOV_MIG_DATA_MAX);
> > + data->version = 1;
> > + data->type = type;
> > + data->tile = tile_id;
> > + data->gt = gt_id;
> > + data->offset = offset;
> > + data->size = size;
> > + data->remaining = size;
> > +
> > + return mig_data_init(data);
> > +}
> > +
> > +/**
> > + * xe_sriov_pf_migration_data_init_from_hdr() - Initialize the migration data backing storage based on header
> > + * @data: the &struct xe_sriov_pf_migration_data packet
> > + *
> > + * Header data is expected to be filled prior to calling this function.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_pf_migration_data_init_from_hdr(struct xe_sriov_pf_migration_data *data)
> > +{
> > + if (WARN_ON(data->hdr_remaining))
> > + return -EINVAL;
> > +
> > + data->remaining = data->size;
> > +
> > + return mig_data_init(data);
> > +}
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> > new file mode 100644
> > index 0000000000000..1dde4cfcdbc47
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> > @@ -0,0 +1,32 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#ifndef _XE_SRIOV_PF_MIGRATION_DATA_H_
> > +#define _XE_SRIOV_PF_MIGRATION_DATA_H_
> > +
> > +#include <linux/types.h>
> > +
> > +struct xe_device;
> > +
> > +enum xe_sriov_pf_migration_data_type {
> > + XE_SRIOV_MIG_DATA_DESCRIPTOR = 1,
> > + XE_SRIOV_MIG_DATA_TRAILER,
> > + XE_SRIOV_MIG_DATA_GGTT,
> > + XE_SRIOV_MIG_DATA_MMIO,
> > + XE_SRIOV_MIG_DATA_GUC,
> > + XE_SRIOV_MIG_DATA_CCS,
>
> grep XE_SRIOV_MIG_DATA_CCS *.c *.h
> xe_sriov_pf_migration_data.c: return type == XE_SRIOV_MIG_DATA_CCS ||
> xe_sriov_pf_migration_data.h: XE_SRIOV_MIG_DATA_CCS,
>
> XE_SRIOV_MIG_DATA_CCS appears to be unused right now, I'd remove this
> data type for now.
>
> Matt
I'll remove it.
Thanks,
-Michał
>
> > + XE_SRIOV_MIG_DATA_VRAM,
> > + XE_SRIOV_MIG_DATA_MAX,
> > +};
> > +
> > +struct xe_sriov_pf_migration_data *
> > +xe_sriov_pf_migration_data_alloc(struct xe_device *xe);
> > +void xe_sriov_pf_migration_data_free(struct xe_sriov_pf_migration_data *snapshot);
> > +
> > +int xe_sriov_pf_migration_data_init(struct xe_sriov_pf_migration_data *data, u8 tile_id, u8 gt_id,
> > + unsigned int type, loff_t offset, size_t size);
> > +int xe_sriov_pf_migration_data_init_from_hdr(struct xe_sriov_pf_migration_data *snapshot);
> > +
> > +#endif
> > --
> > 2.50.1
> >
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 08/26] drm/xe/pf: Add minimalistic migration descriptor
2025-10-13 10:56 ` Michal Wajdeczko
@ 2025-10-21 0:31 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-21 0:31 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Matthew Brost, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Mon, Oct 13, 2025 at 12:56:34PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> > The descriptor reuses the KLV format used by GuC and contains metadata
> > that can be used to quickly fail migration when source is incompatible
> > with destination.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 6 +-
> > .../gpu/drm/xe/xe_sriov_pf_migration_data.c | 82 ++++++++++++++++++-
> > .../gpu/drm/xe/xe_sriov_pf_migration_data.h | 2 +
> > 3 files changed, 87 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > index 9cc178126cbdc..a0cfac456ba0b 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > @@ -186,10 +186,14 @@ xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid)
> > static int pf_handle_descriptor(struct xe_device *xe, unsigned int vfid,
> > struct xe_sriov_pf_migration_data *data)
> > {
> > + int ret;
> > +
> > if (data->tile != 0 || data->gt != 0)
> > return -EINVAL;
> >
> > - xe_sriov_pf_migration_data_free(data);
> > + ret = xe_sriov_pf_migration_data_process_desc(xe, vfid, data);
> > + if (ret)
> > + return ret;
> >
> > return 0;
> > }
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> > index 9a2777dcf9a6b..307b16b027a5e 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.c
> > @@ -5,6 +5,7 @@
> >
> > #include "xe_bo.h"
> > #include "xe_device.h"
> > +#include "xe_guc_klv_helpers.h"
> > #include "xe_sriov_pf_helpers.h"
> > #include "xe_sriov_pf_migration.h"
> > #include "xe_sriov_pf_migration_data.h"
> > @@ -404,11 +405,17 @@ ssize_t xe_sriov_pf_migration_data_write(struct xe_device *xe, unsigned int vfid
> > return produced;
> > }
> >
> > -#define MIGRATION_DESC_SIZE 4
> > +#define MIGRATION_KLV_DEVICE_DEVID_KEY 0xf001u
> > +#define MIGRATION_KLV_DEVICE_DEVID_LEN 1u
> > +#define MIGRATION_KLV_DEVICE_REVID_KEY 0xf002u
> > +#define MIGRATION_KLV_DEVICE_REVID_LEN 1u
>
> as we aim to have unique KLVs across GuC ABI, maybe we should ask the GuC team to reserve a KLV range (0xf000-0xffff) for our (driver) use?
I'll start the discussion.
>
> > +
> > +#define MIGRATION_DESC_DWORDS 4
>
> maybe:
> (GUC_KLV_LEN_MIN + MIGRATION_KLV_DEVICE_DEVID_LEN +
> GUC_KLV_LEN_MIN + MIGRATION_KLV_DEVICE_REVID_LEN)
Ok.
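For the record, the revised define would then presumably read:

	#define MIGRATION_DESC_DWORDS \
		(GUC_KLV_LEN_MIN + MIGRATION_KLV_DEVICE_DEVID_LEN + \
		 GUC_KLV_LEN_MIN + MIGRATION_KLV_DEVICE_REVID_LEN)

deriving the descriptor size from the KLV layout instead of a magic 4.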
>
> > static size_t pf_desc_init(struct xe_device *xe, unsigned int vfid)
> > {
> > struct xe_sriov_pf_migration_data **desc = pf_pick_descriptor(xe, vfid);
> > struct xe_sriov_pf_migration_data *data;
> > + u32 *klvs;
> > int ret;
> >
> > data = xe_sriov_pf_migration_data_alloc(xe);
> > @@ -416,17 +423,88 @@ static size_t pf_desc_init(struct xe_device *xe, unsigned int vfid)
> > return -ENOMEM;
> >
> > ret = xe_sriov_pf_migration_data_init(data, 0, 0, XE_SRIOV_MIG_DATA_DESCRIPTOR,
> > - 0, MIGRATION_DESC_SIZE);
> > + 0, MIGRATION_DESC_DWORDS * sizeof(u32));
> > if (ret) {
> > xe_sriov_pf_migration_data_free(data);
> > return ret;
> > }
> >
> > + klvs = data->vaddr;
> > + *klvs++ = PREP_GUC_KLV_CONST(MIGRATION_KLV_DEVICE_DEVID_KEY,
> > + MIGRATION_KLV_DEVICE_DEVID_LEN);
> > + *klvs++ = xe->info.devid;
> > + *klvs++ = PREP_GUC_KLV_CONST(MIGRATION_KLV_DEVICE_REVID_KEY,
> > + MIGRATION_KLV_DEVICE_REVID_LEN);
> > + *klvs++ = xe->info.revid;
> > +
> > *desc = data;
> >
> > return 0;
> > }
> >
> > +/**
> > + * xe_sriov_pf_migration_data_process_desc() - Process migration data descriptor.
> > + * @gt: the &struct xe_device
>
> @xe
Ok.
>
> > + * @vfid: the VF identifier
> > + * @data: the &struct xe_sriov_pf_migration_data containing the descriptor
> > + *
> > + * The descriptor uses the same KLV format as GuC, and contains metadata used for
> > + * checking migration data compatibility.
> > + *
> > + * Return: 0 on success, -errno on failure
> > + */
> > +int xe_sriov_pf_migration_data_process_desc(struct xe_device *xe, unsigned int vfid,
> > + struct xe_sriov_pf_migration_data *data)
> > +{
> > + u32 num_dwords = data->size / sizeof(u32);
> > + u32 *klvs = data->vaddr;
> > +
> > + xe_assert(xe, data->type == XE_SRIOV_MIG_DATA_DESCRIPTOR);
> > + if (data->size % sizeof(u32) != 0)
> > + return -EINVAL;
> > + if (data->size != num_dwords * sizeof(u32))
> > + return -EINVAL;
>
> isn't that redundant?
It is - will remove.
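So the size validation boils down to just:

	if (data->size % sizeof(u32) != 0)
		return -EINVAL;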
>
> > +
> > + while (num_dwords >= GUC_KLV_LEN_MIN) {
> > + u32 key = FIELD_GET(GUC_KLV_0_KEY, klvs[0]);
> > + u32 len = FIELD_GET(GUC_KLV_0_LEN, klvs[0]);
> > +
> > + klvs += GUC_KLV_LEN_MIN;
> > + num_dwords -= GUC_KLV_LEN_MIN;
> > +
> > + switch (key) {
> > + case MIGRATION_KLV_DEVICE_DEVID_KEY:
> > + if (*klvs != xe->info.devid) {
> > + xe_sriov_info(xe,
>
> maybe it should be more than info()?
Promoted to warn().
Thanks,
-Michał
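Assuming the usual xe_sriov_warn() variant from xe_sriov_printk.h, i.e. something like:

	xe_sriov_warn(xe, "Aborting migration, devid mismatch %#04x!=%#04x\n",
		      *klvs, xe->info.devid);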
>
> > + "Aborting migration, devid mismatch %#04x!=%#04x\n",
> > + *klvs, xe->info.devid);
> > + return -ENODEV;
> > + }
> > + break;
> > + case MIGRATION_KLV_DEVICE_REVID_KEY:
> > + if (*klvs != xe->info.revid) {
> > + xe_sriov_info(xe,
> > + "Aborting migration, revid mismatch %#04x!=%#04x\n",
> > + *klvs, xe->info.revid);
> > + return -ENODEV;
> > + }
> > + break;
> > + default:
> > + xe_sriov_dbg(xe,
> > + "Unknown migration descriptor key %#06x - skipping\n", key);
> > + break;
> > + }
> > +
> > + if (len > num_dwords)
> > + return -EINVAL;
> > +
> > + klvs += len;
> > + num_dwords -= len;
> > + }
> > +
> > + return 0;
> > +}
> > +
> > static void pf_pending_init(struct xe_device *xe, unsigned int vfid)
> > {
> > struct xe_sriov_pf_migration_data **data = pf_pick_pending(xe, vfid);
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> > index 5b96c7f224002..7cfd61005c00f 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration_data.h
> > @@ -32,6 +32,8 @@ ssize_t xe_sriov_pf_migration_data_read(struct xe_device *xe, unsigned int vfid,
> > char __user *buf, size_t len);
> > ssize_t xe_sriov_pf_migration_data_write(struct xe_device *xe, unsigned int vfid,
> > const char __user *buf, size_t len);
> > +int xe_sriov_pf_migration_data_process_desc(struct xe_device *xe, unsigned int vfid,
> > + struct xe_sriov_pf_migration_data *data);
> > int xe_sriov_pf_migration_data_save_init(struct xe_device *xe, unsigned int vfid);
> >
> > #endif
>
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 09/26] drm/xe/pf: Expose VF migration data size over debugfs
2025-10-12 19:15 ` Matthew Brost
@ 2025-10-21 0:37 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-21 0:37 UTC (permalink / raw)
To: Matthew Brost
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Sun, Oct 12, 2025 at 12:15:22PM -0700, Matthew Brost wrote:
> On Sat, Oct 11, 2025 at 09:38:30PM +0200, Michał Winiarski wrote:
> > The size is normally used to make a decision on when to stop the device
> > (mainly when it's in a pre_copy state).
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 18 ++++++++++
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 2 ++
> > drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 34 +++++++++++++++++++
> > drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 30 ++++++++++++++++
> > drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 1 +
> > 5 files changed, 85 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > index 582aaf062cbd4..50f09994e2854 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > @@ -395,6 +395,24 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
> > }
> > #endif /* CONFIG_DEBUG_FS */
> >
> > +/**
> > + * xe_gt_sriov_pf_migration_size() - Total size of migration data from all components within a GT
> > + * @gt: the &struct xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * This function is for PF only.
> > + *
> > + * Return: total migration data size in bytes or a negative error code on failure.
> > + */
> > +ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + ssize_t total = 0;
> > +
> > + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> > +
> > + return total;
> > +}
> > +
> > /**
> > * xe_gt_sriov_pf_migration_ring_empty() - Check if a migration ring is empty
> > * @gt: the &struct xe_gt
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > index 1e4dc46413823..e5298d35d7d7e 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > @@ -15,6 +15,8 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
> > int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
> >
> > +ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
> > +
> > bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_migration_ring_produce(struct xe_gt *gt, unsigned int vfid,
> > struct xe_sriov_pf_migration_data *data);
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> > index ce780719760a6..b06e893fe54cf 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> > @@ -13,6 +13,7 @@
> > #include "xe_sriov_pf_control.h"
> > #include "xe_sriov_pf_debugfs.h"
> > #include "xe_sriov_pf_helpers.h"
> > +#include "xe_sriov_pf_migration.h"
> > #include "xe_sriov_pf_migration_data.h"
> > #include "xe_sriov_pf_service.h"
> > #include "xe_sriov_printk.h"
> > @@ -203,6 +204,38 @@ static const struct file_operations data_vf_fops = {
> > .llseek = default_llseek,
> > };
> >
> > +static ssize_t size_read(struct file *file, char __user *ubuf, size_t count, loff_t *ppos)
> > +{
> > + struct dentry *dent = file_dentry(file);
> > + struct dentry *vfdentry = dent->d_parent;
> > + struct dentry *migration_dentry = vfdentry->d_parent;
> > + unsigned int vfid = (uintptr_t)vfdentry->d_inode->i_private;
> > + struct xe_device *xe = migration_dentry->d_inode->i_private;
> > + char buf[21];
> > + ssize_t ret;
> > + int len;
> > +
> > + xe_assert(xe, vfid);
> > + xe_sriov_pf_assert_vfid(xe, vfid);
> > +
> > + xe_pm_runtime_get(xe);
>
> You don't need a PM ref here as this is purely software (i.e, the
> hardware is not touched).
Not in the case of GuC migration data size. While in this patch GuC is
not yet present as a migration data resource, we should assume that
xe_sriov_pf_migration_size needs a PM ref.
Thanks,
-Michał
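To illustrate (a rough sketch only, exact shape per the later patches): once GuC state is wired in, the GT-level helper ends up issuing a GuC query over CTB, i.e. touching the hardware:

	ssize_t xe_gt_sriov_pf_migration_guc_size(struct xe_gt *gt, unsigned int vfid)
	{
		int ret;

		/* queries the GuC, hence the PM ref held by the caller */
		ret = pf_send_guc_query_vf_state_size(gt, vfid);

		/* assumed to return a dword count, as the save/restore actions do */
		return ret < 0 ? ret : ret * sizeof(u32);
	}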
>
> Matt
>
> > + ret = xe_sriov_pf_migration_size(xe, vfid);
> > + xe_pm_runtime_put(xe);
> > + if (ret < 0)
> > + return ret;
> > +
> > + len = scnprintf(buf, sizeof(buf), "%zd\n", ret);
> > +
> > + return simple_read_from_buffer(ubuf, count, ppos, buf, len);
> > +}
> > +
> > +static const struct file_operations size_vf_fops = {
> > + .owner = THIS_MODULE,
> > + .open = simple_open,
> > + .read = size_read,
> > + .llseek = default_llseek,
> > +};
> > +
> > static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> > {
> > debugfs_create_file("pause", 0200, vfdent, xe, &pause_vf_fops);
> > @@ -212,6 +245,7 @@ static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> > debugfs_create_file("save", 0600, vfdent, xe, &save_vf_fops);
> > debugfs_create_file("restore", 0600, vfdent, xe, &restore_vf_fops);
> > debugfs_create_file("migration_data", 0600, vfdent, xe, &data_vf_fops);
> > + debugfs_create_file("migration_size", 0400, vfdent, xe, &size_vf_fops);
> > }
> >
> > static void pf_populate_with_tiles(struct xe_device *xe, struct dentry *dent, unsigned int vfid)
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > index a0cfac456ba0b..6b247581dec65 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > @@ -249,3 +249,33 @@ int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
> >
> > return xe_gt_sriov_pf_migration_ring_produce(gt, vfid, data);
> > }
> > +
> > +/**
> > + * xe_sriov_pf_migration_size() - Total size of migration data from all components within a device
> > + * @xe: the &struct xe_device
> > + * @vfid: the VF identifier
> > + *
> > + * This function is for PF only.
> > + *
> > + * Return: total migration data size in bytes or a negative error code on failure.
> > + */
> > +ssize_t xe_sriov_pf_migration_size(struct xe_device *xe, unsigned int vfid)
> > +{
> > + size_t size = 0;
> > + struct xe_gt *gt;
> > + ssize_t ret;
> > + u8 gt_id;
> > +
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > +
> > + for_each_gt(gt, xe, gt_id) {
> > + ret = xe_gt_sriov_pf_migration_size(gt, vfid);
> > + if (ret < 0) {
> > + size = ret;
> > + break;
> > + }
> > + size += ret;
> > + }
> > +
> > + return size;
> > +}
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> > index f2020ba19c2da..887ea3e9632bd 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> > @@ -14,6 +14,7 @@ struct xe_device;
> > #ifdef CONFIG_PCI_IOV
> > int xe_sriov_pf_migration_init(struct xe_device *xe);
> > bool xe_sriov_pf_migration_supported(struct xe_device *xe);
> > +ssize_t xe_sriov_pf_migration_size(struct xe_device *xe, unsigned int vfid);
> > struct xe_sriov_pf_migration_data *
> > xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid);
> > int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
> > --
> > 2.50.1
> >
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 09/26] drm/xe/pf: Expose VF migration data size over debugfs
2025-10-13 11:04 ` Michal Wajdeczko
@ 2025-10-21 0:42 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-21 0:42 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Matthew Brost, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Mon, Oct 13, 2025 at 01:04:22PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> > The size is normally used to make a decision on when to stop the device
> > (mainly when it's in a pre_copy state).
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 18 ++++++++++
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 2 ++
> > drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c | 34 +++++++++++++++++++
> > drivers/gpu/drm/xe/xe_sriov_pf_migration.c | 30 ++++++++++++++++
> > drivers/gpu/drm/xe/xe_sriov_pf_migration.h | 1 +
> > 5 files changed, 85 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > index 582aaf062cbd4..50f09994e2854 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > @@ -395,6 +395,24 @@ ssize_t xe_gt_sriov_pf_migration_write_guc_state(struct xe_gt *gt, unsigned int
> > }
> > #endif /* CONFIG_DEBUG_FS */
> >
> > +/**
> > + * xe_gt_sriov_pf_migration_size() - Total size of migration data from all components within a GT
> > + * @gt: the &struct xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * This function is for PF only.
> > + *
> > + * Return: total migration data size in bytes or a negative error code on failure.
> > + */
> > +ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + ssize_t total = 0;
> > +
> > + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> > +
>
> as this is so trivial now, maybe add a note explaining why it is like that for now
Ok.
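E.g. a short note along the lines of:

	/*
	 * Nothing contributes to the migration data size yet; per-GT
	 * resources (GuC state, GGTT, MMIO, VRAM) are accounted here
	 * by later patches.
	 */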
>
> > + return total;
> > +}
> > +
> > /**
> > * xe_gt_sriov_pf_migration_ring_empty() - Check if a migration ring is empty
> > * @gt: the &struct xe_gt
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > index 1e4dc46413823..e5298d35d7d7e 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > @@ -15,6 +15,8 @@ int xe_gt_sriov_pf_migration_init(struct xe_gt *gt);
> > int xe_gt_sriov_pf_migration_save_guc_state(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_migration_restore_guc_state(struct xe_gt *gt, unsigned int vfid);
> >
> > +ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
> > +
> > bool xe_gt_sriov_pf_migration_ring_empty(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_migration_ring_produce(struct xe_gt *gt, unsigned int vfid,
> > struct xe_sriov_pf_migration_data *data);
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> > index ce780719760a6..b06e893fe54cf 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_debugfs.c
> > @@ -13,6 +13,7 @@
> > #include "xe_sriov_pf_control.h"
> > #include "xe_sriov_pf_debugfs.h"
> > #include "xe_sriov_pf_helpers.h"
> > +#include "xe_sriov_pf_migration.h"
> > #include "xe_sriov_pf_migration_data.h"
> > #include "xe_sriov_pf_service.h"
> > #include "xe_sriov_printk.h"
> > @@ -203,6 +204,38 @@ static const struct file_operations data_vf_fops = {
> > .llseek = default_llseek,
> > };
> >
> > +static ssize_t size_read(struct file *file, char __user *ubuf, size_t count, loff_t *ppos)
> > +{
> > + struct dentry *dent = file_dentry(file);
> > + struct dentry *vfdentry = dent->d_parent;
> > + struct dentry *migration_dentry = vfdentry->d_parent;
> > + unsigned int vfid = (uintptr_t)vfdentry->d_inode->i_private;
> > + struct xe_device *xe = migration_dentry->d_inode->i_private;
>
> use helpers
Ok.
>
> > + char buf[21];
> > + ssize_t ret;
> > + int len;
> > +
> > + xe_assert(xe, vfid);
> > + xe_sriov_pf_assert_vfid(xe, vfid);
>
> it doesn't matter for this function, so why assert here?
I'll drop it.
>
> > +
> > + xe_pm_runtime_get(xe);
> > + ret = xe_sriov_pf_migration_size(xe, vfid);
> > + xe_pm_runtime_put(xe);
> > + if (ret < 0)
> > + return ret;
> > +
> > + len = scnprintf(buf, sizeof(buf), "%zd\n", ret);
> > +
> > + return simple_read_from_buffer(ubuf, count, ppos, buf, len);
> > +}
> > +
> > +static const struct file_operations size_vf_fops = {
> > + .owner = THIS_MODULE,
> > + .open = simple_open,
> > + .read = size_read,
> > + .llseek = default_llseek,
> > +};
> > +
> > static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> > {
> > debugfs_create_file("pause", 0200, vfdent, xe, &pause_vf_fops);
> > @@ -212,6 +245,7 @@ static void pf_populate_vf(struct xe_device *xe, struct dentry *vfdent)
> > debugfs_create_file("save", 0600, vfdent, xe, &save_vf_fops);
> > debugfs_create_file("restore", 0600, vfdent, xe, &restore_vf_fops);
> > debugfs_create_file("migration_data", 0600, vfdent, xe, &data_vf_fops);
> > + debugfs_create_file("migration_size", 0400, vfdent, xe, &size_vf_fops);
> > }
> >
> > static void pf_populate_with_tiles(struct xe_device *xe, struct dentry *dent, unsigned int vfid)
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > index a0cfac456ba0b..6b247581dec65 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.c
> > @@ -249,3 +249,33 @@ int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
> >
> > return xe_gt_sriov_pf_migration_ring_produce(gt, vfid, data);
> > }
> > +
> > +/**
> > + * xe_sriov_pf_migration_size() - Total size of migration data from all components within a device
> > + * @xe: the &struct xe_device
> > + * @vfid: the VF identifier
> > + *
> > + * This function is for PF only.
> > + *
> > + * Return: total migration data size in bytes or a negative error code on failure.
> > + */
> > +ssize_t xe_sriov_pf_migration_size(struct xe_device *xe, unsigned int vfid)
> > +{
> > + size_t size = 0;
> > + struct xe_gt *gt;
> > + ssize_t ret;
> > + u8 gt_id;
> > +
> > + xe_assert(xe, IS_SRIOV_PF(xe));
> > +
> > + for_each_gt(gt, xe, gt_id) {
> > + ret = xe_gt_sriov_pf_migration_size(gt, vfid);
> > + if (ret < 0) {
> > + size = ret;
> > + break;
>
> just:
> return ret;
Ok.
Thanks,
-Michał
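The loop then simply becomes:

	for_each_gt(gt, xe, gt_id) {
		ret = xe_gt_sriov_pf_migration_size(gt, vfid);
		if (ret < 0)
			return ret;
		size += ret;
	}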
> > + }
> > + size += ret;
> > + }
> > +
> > + return size;
> > +}
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> > index f2020ba19c2da..887ea3e9632bd 100644
> > --- a/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> > +++ b/drivers/gpu/drm/xe/xe_sriov_pf_migration.h
> > @@ -14,6 +14,7 @@ struct xe_device;
> > #ifdef CONFIG_PCI_IOV
> > int xe_sriov_pf_migration_init(struct xe_device *xe);
> > bool xe_sriov_pf_migration_supported(struct xe_device *xe);
> > +ssize_t xe_sriov_pf_migration_size(struct xe_device *xe, unsigned int vfid);
> > struct xe_sriov_pf_migration_data *
> > xe_sriov_pf_migration_consume(struct xe_device *xe, unsigned int vfid);
> > int xe_sriov_pf_migration_produce(struct xe_device *xe, unsigned int vfid,
>
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 10/26] drm/xe: Add sa/guc_buf_cache sync interface
2025-10-13 11:20 ` Michal Wajdeczko
@ 2025-10-21 0:44 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-21 0:44 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Matthew Brost, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Mon, Oct 13, 2025 at 01:20:53PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> > In upcoming changes the cached buffers are going to be used to read data
> > produced by the GuC. Add a counterpart to flush, which synchronizes the
> > CPU-side of suballocation with the GPU data and propagate the interface
> > to GuC Buffer Cache.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_guc_buf.c | 9 +++++++++
> > drivers/gpu/drm/xe/xe_guc_buf.h | 1 +
> > drivers/gpu/drm/xe/xe_sa.c | 21 +++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_sa.h | 1 +
> > 4 files changed, 32 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_guc_buf.c b/drivers/gpu/drm/xe/xe_guc_buf.c
> > index 502ca3a4ee606..1be26145f0b98 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_buf.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_buf.c
> > @@ -127,6 +127,15 @@ u64 xe_guc_buf_flush(const struct xe_guc_buf buf)
> > return xe_sa_bo_gpu_addr(buf.sa);
> > }
> >
> > +/**
> > + * xe_guc_buf_sync() - Copy the data from the GPU memory to the sub-allocation.
> > + * @buf: the &xe_guc_buf to sync
>
> for convenience, can we return the buf CPU pointer here?
>
> something that I already had in my initial impl [1]
>
> [1] https://patchwork.freedesktop.org/patch/619024/?series=139801&rev=1
Will do.
Thanks,
-Michał
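For reference, a sketch of the revised helper returning the CPU pointer, mirroring the linked patch:

	void *xe_guc_buf_sync(const struct xe_guc_buf buf)
	{
		xe_sa_bo_sync(buf.sa);

		return xe_guc_buf_cpu_ptr(buf);
	}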
>
>
> > + */
> > +void xe_guc_buf_sync(const struct xe_guc_buf buf)
> > +{
> > + xe_sa_bo_sync(buf.sa);
> > +}
> > +
> > /**
> > * xe_guc_buf_cpu_ptr() - Obtain a CPU pointer to the sub-allocation.
> > * @buf: the &xe_guc_buf to query
> > diff --git a/drivers/gpu/drm/xe/xe_guc_buf.h b/drivers/gpu/drm/xe/xe_guc_buf.h
> > index 0d67604d96bdd..fe6b5ffe0d6eb 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_buf.h
> > +++ b/drivers/gpu/drm/xe/xe_guc_buf.h
> > @@ -31,6 +31,7 @@ static inline bool xe_guc_buf_is_valid(const struct xe_guc_buf buf)
> >
> > void *xe_guc_buf_cpu_ptr(const struct xe_guc_buf buf);
> > u64 xe_guc_buf_flush(const struct xe_guc_buf buf);
> > +void xe_guc_buf_sync(const struct xe_guc_buf buf);
> > u64 xe_guc_buf_gpu_addr(const struct xe_guc_buf buf);
> > u64 xe_guc_cache_gpu_addr_from_ptr(struct xe_guc_buf_cache *cache, const void *ptr, u32 size);
> >
> > diff --git a/drivers/gpu/drm/xe/xe_sa.c b/drivers/gpu/drm/xe/xe_sa.c
> > index fedd017d6dd36..2115789c2bfb7 100644
> > --- a/drivers/gpu/drm/xe/xe_sa.c
> > +++ b/drivers/gpu/drm/xe/xe_sa.c
> > @@ -110,6 +110,10 @@ struct drm_suballoc *__xe_sa_bo_new(struct xe_sa_manager *sa_manager, u32 size,
> > return drm_suballoc_new(&sa_manager->base, size, gfp, true, 0);
> > }
> >
> > +/**
> > + * xe_sa_bo_flush_write() - Copy the data from the sub-allocation to the GPU memory.
> > + * @sa_bo: the &drm_suballoc to flush
> > + */
> > void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo)
> > {
> > struct xe_sa_manager *sa_manager = to_xe_sa_manager(sa_bo->manager);
> > @@ -123,6 +127,23 @@ void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo)
> > drm_suballoc_size(sa_bo));
> > }
> >
> > +/**
> > + * xe_sa_bo_sync() - Copy the data from GPU memory to the sub-allocation.
> > + * @sa_bo: the &drm_suballoc to sync
> > + */
> > +void xe_sa_bo_sync(struct drm_suballoc *sa_bo)
> > +{
> > + struct xe_sa_manager *sa_manager = to_xe_sa_manager(sa_bo->manager);
> > + struct xe_device *xe = tile_to_xe(sa_manager->bo->tile);
> > +
> > + if (!sa_manager->bo->vmap.is_iomem)
> > + return;
> > +
> > + xe_map_memcpy_from(xe, xe_sa_bo_cpu_addr(sa_bo), &sa_manager->bo->vmap,
> > + drm_suballoc_soffset(sa_bo),
> > + drm_suballoc_size(sa_bo));
> > +}
> > +
> > void xe_sa_bo_free(struct drm_suballoc *sa_bo,
> > struct dma_fence *fence)
> > {
> > diff --git a/drivers/gpu/drm/xe/xe_sa.h b/drivers/gpu/drm/xe/xe_sa.h
> > index 99dbf0eea5402..28fd8bb6450c2 100644
> > --- a/drivers/gpu/drm/xe/xe_sa.h
> > +++ b/drivers/gpu/drm/xe/xe_sa.h
> > @@ -37,6 +37,7 @@ static inline struct drm_suballoc *xe_sa_bo_new(struct xe_sa_manager *sa_manager
> > }
> >
> > void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo);
> > +void xe_sa_bo_sync(struct drm_suballoc *sa_bo);
> > void xe_sa_bo_free(struct drm_suballoc *sa_bo, struct dma_fence *fence);
> >
> > static inline struct xe_sa_manager *
>
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 10/26] drm/xe: Add sa/guc_buf_cache sync interface
2025-10-12 18:06 ` Matthew Brost
@ 2025-10-21 0:45 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-21 0:45 UTC (permalink / raw)
To: Matthew Brost
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Sun, Oct 12, 2025 at 11:06:28AM -0700, Matthew Brost wrote:
> On Sat, Oct 11, 2025 at 09:38:31PM +0200, Michał Winiarski wrote:
> > In upcoming changes the cached buffers are going to be used to read data
> > produced by the GuC. Add a counterpart to flush, which synchronizes the
> > CPU-side of suballocation with the GPU data and propagate the interface
> > to GuC Buffer Cache.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_guc_buf.c | 9 +++++++++
> > drivers/gpu/drm/xe/xe_guc_buf.h | 1 +
> > drivers/gpu/drm/xe/xe_sa.c | 21 +++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_sa.h | 1 +
> > 4 files changed, 32 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_guc_buf.c b/drivers/gpu/drm/xe/xe_guc_buf.c
> > index 502ca3a4ee606..1be26145f0b98 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_buf.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_buf.c
> > @@ -127,6 +127,15 @@ u64 xe_guc_buf_flush(const struct xe_guc_buf buf)
> > return xe_sa_bo_gpu_addr(buf.sa);
> > }
> >
> > +/**
> > + * xe_guc_buf_sync() - Copy the data from the GPU memory to the sub-allocation.
> > + * @buf: the &xe_guc_buf to sync
> > + */
> > +void xe_guc_buf_sync(const struct xe_guc_buf buf)
>
> s/sync/sync_read ?
>
> or
>
> s/sync/flush_read ?
>
> Patch itself LGTM.
>
> Matt
I'll rename it to sync_read.
Thanks,
-Michał
>
> > +{
> > + xe_sa_bo_sync(buf.sa);
> > +}
> > +
> > /**
> > * xe_guc_buf_cpu_ptr() - Obtain a CPU pointer to the sub-allocation.
> > * @buf: the &xe_guc_buf to query
> > diff --git a/drivers/gpu/drm/xe/xe_guc_buf.h b/drivers/gpu/drm/xe/xe_guc_buf.h
> > index 0d67604d96bdd..fe6b5ffe0d6eb 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_buf.h
> > +++ b/drivers/gpu/drm/xe/xe_guc_buf.h
> > @@ -31,6 +31,7 @@ static inline bool xe_guc_buf_is_valid(const struct xe_guc_buf buf)
> >
> > void *xe_guc_buf_cpu_ptr(const struct xe_guc_buf buf);
> > u64 xe_guc_buf_flush(const struct xe_guc_buf buf);
> > +void xe_guc_buf_sync(const struct xe_guc_buf buf);
> > u64 xe_guc_buf_gpu_addr(const struct xe_guc_buf buf);
> > u64 xe_guc_cache_gpu_addr_from_ptr(struct xe_guc_buf_cache *cache, const void *ptr, u32 size);
> >
> > diff --git a/drivers/gpu/drm/xe/xe_sa.c b/drivers/gpu/drm/xe/xe_sa.c
> > index fedd017d6dd36..2115789c2bfb7 100644
> > --- a/drivers/gpu/drm/xe/xe_sa.c
> > +++ b/drivers/gpu/drm/xe/xe_sa.c
> > @@ -110,6 +110,10 @@ struct drm_suballoc *__xe_sa_bo_new(struct xe_sa_manager *sa_manager, u32 size,
> > return drm_suballoc_new(&sa_manager->base, size, gfp, true, 0);
> > }
> >
> > +/**
> > + * xe_sa_bo_flush_write() - Copy the data from the sub-allocation to the GPU memory.
> > + * @sa_bo: the &drm_suballoc to flush
> > + */
> > void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo)
> > {
> > struct xe_sa_manager *sa_manager = to_xe_sa_manager(sa_bo->manager);
> > @@ -123,6 +127,23 @@ void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo)
> > drm_suballoc_size(sa_bo));
> > }
> >
> > +/**
> > + * xe_sa_bo_sync() - Copy the data from GPU memory to the sub-allocation.
> > + * @sa_bo: the &drm_suballoc to sync
> > + */
> > +void xe_sa_bo_sync(struct drm_suballoc *sa_bo)
> > +{
> > + struct xe_sa_manager *sa_manager = to_xe_sa_manager(sa_bo->manager);
> > + struct xe_device *xe = tile_to_xe(sa_manager->bo->tile);
> > +
> > + if (!sa_manager->bo->vmap.is_iomem)
> > + return;
> > +
> > + xe_map_memcpy_from(xe, xe_sa_bo_cpu_addr(sa_bo), &sa_manager->bo->vmap,
> > + drm_suballoc_soffset(sa_bo),
> > + drm_suballoc_size(sa_bo));
> > +}
> > +
> > void xe_sa_bo_free(struct drm_suballoc *sa_bo,
> > struct dma_fence *fence)
> > {
> > diff --git a/drivers/gpu/drm/xe/xe_sa.h b/drivers/gpu/drm/xe/xe_sa.h
> > index 99dbf0eea5402..28fd8bb6450c2 100644
> > --- a/drivers/gpu/drm/xe/xe_sa.h
> > +++ b/drivers/gpu/drm/xe/xe_sa.h
> > @@ -37,6 +37,7 @@ static inline struct drm_suballoc *xe_sa_bo_new(struct xe_sa_manager *sa_manager
> > }
> >
> > void xe_sa_bo_flush_write(struct drm_suballoc *sa_bo);
> > +void xe_sa_bo_sync(struct drm_suballoc *sa_bo);
> > void xe_sa_bo_free(struct drm_suballoc *sa_bo, struct dma_fence *fence);
> >
> > static inline struct xe_sa_manager *
> > --
> > 2.50.1
> >
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 11/26] drm/xe: Allow the caller to pass guc_buf_cache size
2025-10-13 11:08 ` Michal Wajdeczko
@ 2025-10-21 0:47 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-21 0:47 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Matthew Brost, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Mon, Oct 13, 2025 at 01:08:39PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> > An upcoming change will use GuC buffer cache as a place where GuC
> > migration data will be stored, and the memory requirement for that is
> > larger than indirect data.
> > Allow the caller to pass the size based on the intended usecase.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c | 2 +-
> > drivers/gpu/drm/xe/xe_guc.c | 4 ++--
> > drivers/gpu/drm/xe/xe_guc_buf.c | 6 +++---
> > drivers/gpu/drm/xe/xe_guc_buf.h | 2 +-
> > 4 files changed, 7 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c b/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c
> > index d266882adc0e0..c273ce8087f56 100644
> > --- a/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c
> > +++ b/drivers/gpu/drm/xe/tests/xe_guc_buf_kunit.c
> > @@ -72,7 +72,7 @@ static int guc_buf_test_init(struct kunit *test)
> > kunit_activate_static_stub(test, xe_managed_bo_create_pin_map,
> > replacement_xe_managed_bo_create_pin_map);
> >
> > - KUNIT_ASSERT_EQ(test, 0, xe_guc_buf_cache_init(&guc->buf));
> > + KUNIT_ASSERT_EQ(test, 0, xe_guc_buf_cache_init(&guc->buf), SZ_8K);
>
> SZ_8K added to wrong place ;)
As buildbots & CI already noticed :)
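The intended call being, of course:

	KUNIT_ASSERT_EQ(test, 0, xe_guc_buf_cache_init(&guc->buf, SZ_8K));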
> >
> > test->priv = &guc->buf;
> > return 0;
> > diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
> > index d94490979adc0..ccc7c60ae9b77 100644
> > --- a/drivers/gpu/drm/xe/xe_guc.c
> > +++ b/drivers/gpu/drm/xe/xe_guc.c
> > @@ -809,7 +809,7 @@ static int vf_guc_init_post_hwconfig(struct xe_guc *guc)
> > if (err)
> > return err;
> >
> > - err = xe_guc_buf_cache_init(&guc->buf);
> > + err = xe_guc_buf_cache_init(&guc->buf, SZ_8K);
> > if (err)
> > return err;
> >
> > @@ -857,7 +857,7 @@ int xe_guc_init_post_hwconfig(struct xe_guc *guc)
> > if (ret)
> > return ret;
> >
> > - ret = xe_guc_buf_cache_init(&guc->buf);
> > + ret = xe_guc_buf_cache_init(&guc->buf, SZ_8K);
> > if (ret)
> > return ret;
> >
> > diff --git a/drivers/gpu/drm/xe/xe_guc_buf.c b/drivers/gpu/drm/xe/xe_guc_buf.c
> > index 1be26145f0b98..418ada00b99e3 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_buf.c
> > +++ b/drivers/gpu/drm/xe/xe_guc_buf.c
> > @@ -28,16 +28,16 @@ static struct xe_gt *cache_to_gt(struct xe_guc_buf_cache *cache)
> > * @cache: the &xe_guc_buf_cache to initialize
> > *
> > * The Buffer Cache allows to obtain a reusable buffer that can be used to pass
> > - * indirect H2G data to GuC without a need to create a ad-hoc allocation.
> > + * data to GuC or read data from GuC without a need to create an ad-hoc allocation.
> > *
> > * Return: 0 on success or a negative error code on failure.
> > */
> > -int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache)
> > +int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache, u32 size)
> > {
> > struct xe_gt *gt = cache_to_gt(cache);
> > struct xe_sa_manager *sam;
> >
> > - sam = __xe_sa_bo_manager_init(gt_to_tile(gt), SZ_8K, 0, sizeof(u32));
>
> maybe we should promote this magic SZ_8K as
>
> #define XE_GUC_BUF_CACHE_DEFAULT_SIZE SZ_8K
Ok.
Thanks,
-Michał
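So presumably:

	#define XE_GUC_BUF_CACHE_DEFAULT_SIZE	SZ_8K

with the init sites becoming:

	ret = xe_guc_buf_cache_init(&guc->buf, XE_GUC_BUF_CACHE_DEFAULT_SIZE);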
>
> > + sam = __xe_sa_bo_manager_init(gt_to_tile(gt), size, 0, sizeof(u32));
> > if (IS_ERR(sam))
> > return PTR_ERR(sam);
> > cache->sam = sam;
> > diff --git a/drivers/gpu/drm/xe/xe_guc_buf.h b/drivers/gpu/drm/xe/xe_guc_buf.h
> > index fe6b5ffe0d6eb..fe5cf3b183497 100644
> > --- a/drivers/gpu/drm/xe/xe_guc_buf.h
> > +++ b/drivers/gpu/drm/xe/xe_guc_buf.h
> > @@ -11,7 +11,7 @@
> >
> > #include "xe_guc_buf_types.h"
> >
> > -int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache);
> > +int xe_guc_buf_cache_init(struct xe_guc_buf_cache *cache, u32 size);
> > u32 xe_guc_buf_cache_dwords(struct xe_guc_buf_cache *cache);
> > struct xe_guc_buf xe_guc_buf_reserve(struct xe_guc_buf_cache *cache, u32 dwords);
> > struct xe_guc_buf xe_guc_buf_from_data(struct xe_guc_buf_cache *cache,
>
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 12/26] drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF migration
2025-10-13 11:27 ` Michal Wajdeczko
@ 2025-10-21 0:50 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-21 0:50 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Matthew Brost, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Mon, Oct 13, 2025 at 01:27:55PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> > Contiguous PF GGTT VMAs can be scarce after creating VFs.
> > Increase the GuC buffer cache size to 8M for PF so that we can fit GuC
> > migration data (which currently maxes out at just over 4M) and use the
> > cache instead of allocating fresh BOs.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 54 +++++++------------
> > drivers/gpu/drm/xe/xe_guc.c | 2 +-
> > 2 files changed, 20 insertions(+), 36 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > index 50f09994e2854..8b96eff8df93b 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > @@ -11,7 +11,7 @@
> > #include "xe_gt_sriov_pf_helpers.h"
> > #include "xe_gt_sriov_pf_migration.h"
> > #include "xe_gt_sriov_printk.h"
> > -#include "xe_guc.h"
> > +#include "xe_guc_buf.h"
> > #include "xe_guc_ct.h"
> > #include "xe_sriov.h"
> > #include "xe_sriov_pf_migration.h"
> > @@ -57,73 +57,57 @@ static int pf_send_guc_query_vf_state_size(struct xe_gt *gt, unsigned int vfid)
> >
> > /* Return: number of state dwords saved or a negative error code on failure */
> > static int pf_send_guc_save_vf_state(struct xe_gt *gt, unsigned int vfid,
> > - void *buff, size_t size)
> > + void *dst, size_t size)
> > {
> > const int ndwords = size / sizeof(u32);
> > - struct xe_tile *tile = gt_to_tile(gt);
> > - struct xe_device *xe = tile_to_xe(tile);
> > struct xe_guc *guc = >->uc.guc;
> > - struct xe_bo *bo;
> > + CLASS(xe_guc_buf, buf)(&guc->buf, ndwords);
> > int ret;
> >
> > xe_gt_assert(gt, size % sizeof(u32) == 0);
> > xe_gt_assert(gt, size == ndwords * sizeof(u32));
> >
> > - bo = xe_bo_create_pin_map_novm(xe, tile,
> > - ALIGN(size, PAGE_SIZE),
> > - ttm_bo_type_kernel,
> > - XE_BO_FLAG_SYSTEM |
> > - XE_BO_FLAG_GGTT |
> > - XE_BO_FLAG_GGTT_INVALIDATE, false);
> > - if (IS_ERR(bo))
> > - return PTR_ERR(bo);
> > + if (!xe_guc_buf_is_valid(buf))
> > + return -ENOBUFS;
> > +
> > + memset(xe_guc_buf_cpu_ptr(buf), 0, size);
>
> is that necessary? GuC will overwrite that anyway
It doesn't, so the memset actually is necessary.
>
> >
> > ret = guc_action_vf_save_restore(guc, vfid, GUC_PF_OPCODE_VF_SAVE,
> > - xe_bo_ggtt_addr(bo), ndwords);
> > - if (!ret)
> > + xe_guc_buf_flush(buf), ndwords);
> > + if (!ret) {
> > ret = -ENODATA;
> > - else if (ret > ndwords)
> > + } else if (ret > ndwords) {
> > ret = -EPROTO;
> > - else if (ret > 0)
> > - xe_map_memcpy_from(xe, buff, &bo->vmap, 0, ret * sizeof(u32));
> > + } else if (ret > 0) {
> > + xe_guc_buf_sync(buf);
> > + memcpy(dst, xe_guc_buf_cpu_ptr(buf), ret * sizeof(u32));
>
> with a small change suggested earlier, this could be just:
>
> memcpy(dst, xe_guc_buf_sync(buf), ret * sizeof(u32));
Ok.
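With the sync_read rename from the patch 10 subthread applied, that would presumably end up as:

	memcpy(dst, xe_guc_buf_sync_read(buf), ret * sizeof(u32));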
>
> > + }
> >
> > - xe_bo_unpin_map_no_vm(bo);
> > return ret;
> > }
> >
> > /* Return: number of state dwords restored or a negative error code on failure */
> > static int pf_send_guc_restore_vf_state(struct xe_gt *gt, unsigned int vfid,
> > - const void *buff, size_t size)
> > + const void *src, size_t size)
> > {
> > const int ndwords = size / sizeof(u32);
> > - struct xe_tile *tile = gt_to_tile(gt);
> > - struct xe_device *xe = tile_to_xe(tile);
> > struct xe_guc *guc = >->uc.guc;
> > - struct xe_bo *bo;
> > + CLASS(xe_guc_buf_from_data, buf)(&guc->buf, src, size);
> > int ret;
> >
> > xe_gt_assert(gt, size % sizeof(u32) == 0);
> > xe_gt_assert(gt, size == ndwords * sizeof(u32));
> >
> > - bo = xe_bo_create_pin_map_novm(xe, tile,
> > - ALIGN(size, PAGE_SIZE),
> > - ttm_bo_type_kernel,
> > - XE_BO_FLAG_SYSTEM |
> > - XE_BO_FLAG_GGTT |
> > - XE_BO_FLAG_GGTT_INVALIDATE, false);
> > - if (IS_ERR(bo))
> > - return PTR_ERR(bo);
> > -
> > - xe_map_memcpy_to(xe, &bo->vmap, 0, buff, size);
> > + if (!xe_guc_buf_is_valid(buf))
> > + return -ENOBUFS;
> >
> > ret = guc_action_vf_save_restore(guc, vfid, GUC_PF_OPCODE_VF_RESTORE,
> > - xe_bo_ggtt_addr(bo), ndwords);
> > + xe_guc_buf_flush(buf), ndwords);
> > if (!ret)
> > ret = -ENODATA;
> > else if (ret > ndwords)
> > ret = -EPROTO;
> >
> > - xe_bo_unpin_map_no_vm(bo);
> > return ret;
> > }
> >
> > diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
> > index ccc7c60ae9b77..71ca06d1af62b 100644
> > --- a/drivers/gpu/drm/xe/xe_guc.c
> > +++ b/drivers/gpu/drm/xe/xe_guc.c
> > @@ -857,7 +857,7 @@ int xe_guc_init_post_hwconfig(struct xe_guc *guc)
> > if (ret)
> > return ret;
> >
> > - ret = xe_guc_buf_cache_init(&guc->buf, SZ_8K);
> > + ret = xe_guc_buf_cache_init(&guc->buf, IS_SRIOV_PF(guc_to_xe(guc)) ? SZ_8M : SZ_8K);
>
> shouldn't we also check for xe_sriov_pf_migration_supported()?
Ok.
>
> also, shouldn't we get this SZ_8M somewhere from the PF code?
> and maybe PF could (one day) query that somehow from the GuC?
I'll start a discussion, but for now we'll stick to hardcoded max.
And it turns out it's just shy of 4M, so I'll reduce the size to SZ_4M.
-Michał
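Combined with the supported() check above, the cache sizing would then presumably be along the lines of:

	struct xe_device *xe = guc_to_xe(guc);
	u32 size = IS_SRIOV_PF(xe) && xe_sriov_pf_migration_supported(xe) ?
		   SZ_4M : XE_GUC_BUF_CACHE_DEFAULT_SIZE;

	ret = xe_guc_buf_cache_init(&guc->buf, size);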
>
>
> > if (ret)
> > return ret;
> >
>
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 16/26] drm/xe/pf: Handle GuC migration data as part of PF control
2025-10-13 11:56 ` Michal Wajdeczko
@ 2025-10-21 0:52 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-21 0:52 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Matthew Brost, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Mon, Oct 13, 2025 at 01:56:48PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> > Connect the helpers to allow save and restore of GuC migration data in
> > stop_copy / resume device state.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 28 ++++++++++++++++++-
> > .../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 1 +
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 8 ++++++
> > 3 files changed, 36 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > index 6ece775b2e80e..f73a3bf40037c 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > @@ -187,6 +187,7 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
> > CASE2STR(PAUSED);
> > CASE2STR(MIGRATION_DATA_WIP);
> > CASE2STR(SAVE_WIP);
> > + CASE2STR(SAVE_DATA_GUC);
> > CASE2STR(SAVE_FAILED);
> > CASE2STR(SAVED);
> > CASE2STR(RESTORE_WIP);
> > @@ -338,6 +339,7 @@ static void pf_exit_vf_mismatch(struct xe_gt *gt, unsigned int vfid)
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_STOP_FAILED);
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_PAUSE_FAILED);
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_RESUME_FAILED);
> > + pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_FLR_FAILED);
> > }
> >
> > @@ -801,6 +803,7 @@ void xe_gt_sriov_pf_control_vf_data_eof(struct xe_gt *gt, unsigned int vfid)
> >
> > static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> > {
> > + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
> > }
> >
> > @@ -820,16 +823,35 @@ static void pf_exit_vf_saved(struct xe_gt *gt, unsigned int vfid)
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVED);
> > }
> >
> > +static void pf_enter_vf_save_failed(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_FAILED);
> > + pf_exit_vf_wip(gt, vfid);
> > +}
> > +
> > static bool pf_handle_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> > {
> > + int ret;
> > +
> > if (!pf_check_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP))
> > return false;
> >
> > + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC)) {
> > + ret = xe_gt_sriov_pf_migration_guc_save(gt, vfid);
> > + if (ret)
> > + goto err;
> > + return true;
> > + }
> > +
> > xe_gt_sriov_pf_control_vf_data_eof(gt, vfid);
> > pf_exit_vf_save_wip(gt, vfid);
> > pf_enter_vf_saved(gt, vfid);
> >
> > return true;
> > +
> > +err:
> > + pf_enter_vf_save_failed(gt, vfid);
> > + return false;
>
> return true - as this is an indication that the state was processed (regardless of whether it succeeded)
Ok.
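I.e.:

	err:
		pf_enter_vf_save_failed(gt, vfid);
		return true;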
>
> > }
> >
> > static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> > @@ -838,6 +860,8 @@ static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> > pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP);
> > pf_exit_vf_restored(gt, vfid);
> > pf_enter_vf_wip(gt, vfid);
> > + if (xe_gt_sriov_pf_migration_guc_size(gt, vfid) > 0)
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
> > pf_queue_vf(gt, vfid);
> > return true;
> > }
> > @@ -946,6 +970,8 @@ static int pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid,
> > struct xe_sriov_pf_migration_data *data)
> > {
> > switch (data->type) {
> > + case XE_SRIOV_MIG_DATA_GUC:
> > + return xe_gt_sriov_pf_migration_guc_restore(gt, vfid, data);
> > default:
> > xe_gt_sriov_notice(gt, "Skipping VF%u invalid data type: %d\n", vfid, data->type);
> > pf_enter_vf_restore_failed(gt, vfid);
> > @@ -996,7 +1022,7 @@ static bool pf_enter_vf_restore_wip(struct xe_gt *gt, unsigned int vfid)
> > pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP);
> > pf_exit_vf_saved(gt, vfid);
> > pf_enter_vf_wip(gt, vfid);
> > - pf_enter_vf_restored(gt, vfid);
> > + pf_queue_vf(gt, vfid);
> > return true;
> > }
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > index 68ec9d1fc3daf..b9787c425d9f6 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > @@ -71,6 +71,7 @@ enum xe_gt_sriov_control_bits {
> > XE_GT_SRIOV_STATE_MIGRATION_DATA_WIP,
> >
> > XE_GT_SRIOV_STATE_SAVE_WIP,
> > + XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
> > XE_GT_SRIOV_STATE_SAVE_FAILED,
> > XE_GT_SRIOV_STATE_SAVED,
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > index e1031465e65c4..0c10284f0b09a 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > @@ -279,9 +279,17 @@ int xe_gt_sriov_pf_migration_guc_restore(struct xe_gt *gt, unsigned int vfid,
> > ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
> > {
> > ssize_t total = 0;
> > + ssize_t size;
> >
> > xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> >
> > + size = xe_gt_sriov_pf_migration_guc_size(gt, vfid);
> > + if (size < 0)
> > + return size;
> > + else if (size > 0)
>
> no need for "else"
>
> and isn't zero GuC state size an error anyway?
Replaced with an assert.
Thanks,
-Michał
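I.e. roughly:

	size = xe_gt_sriov_pf_migration_guc_size(gt, vfid);
	if (size < 0)
		return size;

	xe_gt_assert(gt, size);
	total += size + sizeof(struct xe_sriov_pf_migration_hdr);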
>
> > + size += sizeof(struct xe_sriov_pf_migration_hdr);
> > + total += size;
> > +
> > return total;
> > }
> >
>
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: [PATCH 17/26] drm/xe/pf: Add helpers for VF GGTT migration data handling
2025-10-13 12:17 ` Michal Wajdeczko
@ 2025-10-21 1:00 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-21 1:00 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Matthew Brost, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Mon, Oct 13, 2025 at 02:17:56PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> > In an upcoming change, the VF GGTT migration data will be handled as
> > part of VF control state machine. Add the necessary helpers to allow the
> > migration data transfer to/from the HW GGTT resource.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_ggtt.c | 92 ++++++++++++++++++++++
> > drivers/gpu/drm/xe/xe_ggtt.h | 2 +
> > drivers/gpu/drm/xe/xe_ggtt_types.h | 2 +
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 64 +++++++++++++++
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h | 5 ++
> > 5 files changed, 165 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
> > index aca7ae5489b91..89c0ad56c6a8a 100644
> > --- a/drivers/gpu/drm/xe/xe_ggtt.c
> > +++ b/drivers/gpu/drm/xe/xe_ggtt.c
> > @@ -138,6 +138,14 @@ static void xe_ggtt_set_pte_and_flush(struct xe_ggtt *ggtt, u64 addr, u64 pte)
> > ggtt_update_access_counter(ggtt);
> > }
> >
> > +static u64 xe_ggtt_get_pte(struct xe_ggtt *ggtt, u64 addr)
> > +{
> > + xe_tile_assert(ggtt->tile, !(addr & XE_PTE_MASK));
> > + xe_tile_assert(ggtt->tile, addr < ggtt->size);
> > +
> > + return readq(&ggtt->gsm[addr >> XE_PTE_SHIFT]);
> > +}
> > +
> > static void xe_ggtt_clear(struct xe_ggtt *ggtt, u64 start, u64 size)
> > {
> > u16 pat_index = tile_to_xe(ggtt->tile)->pat.idx[XE_CACHE_WB];
> > @@ -220,16 +228,19 @@ void xe_ggtt_might_lock(struct xe_ggtt *ggtt)
> > static const struct xe_ggtt_pt_ops xelp_pt_ops = {
> > .pte_encode_flags = xelp_ggtt_pte_flags,
> > .ggtt_set_pte = xe_ggtt_set_pte,
> > + .ggtt_get_pte = xe_ggtt_get_pte,
> > };
> >
> > static const struct xe_ggtt_pt_ops xelpg_pt_ops = {
> > .pte_encode_flags = xelpg_ggtt_pte_flags,
> > .ggtt_set_pte = xe_ggtt_set_pte,
> > + .ggtt_get_pte = xe_ggtt_get_pte,
> > };
> >
> > static const struct xe_ggtt_pt_ops xelpg_pt_wa_ops = {
> > .pte_encode_flags = xelpg_ggtt_pte_flags,
> > .ggtt_set_pte = xe_ggtt_set_pte_and_flush,
> > + .ggtt_get_pte = xe_ggtt_get_pte,
> > };
> >
> > static void __xe_ggtt_init_early(struct xe_ggtt *ggtt, u32 reserved)
> > @@ -914,6 +925,87 @@ void xe_ggtt_assign(const struct xe_ggtt_node *node, u16 vfid)
> > xe_ggtt_assign_locked(node->ggtt, &node->base, vfid);
> > mutex_unlock(&node->ggtt->lock);
> > }
> > +
> > +/**
> > + * xe_ggtt_node_save - Save a &struct xe_ggtt_node to a buffer
> > + * @node: the &struct xe_ggtt_node to be saved
> > + * @dst: destination buffer
>
> correct me: this is buffer for the PTEs
>
> > + * @size: destination buffer size in bytes
>
> and this is size of above buffer
>
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_ggtt_node_save(struct xe_ggtt_node *node, void *dst, size_t size)
> > +{
> > + struct xe_ggtt *ggtt;
> > + u64 start, end;
> > + u64 *buf = dst;
> > +
> > + if (!node || !node->ggtt)
> > + return -ENOENT;
>
> hmm, non-NULL node must be initialized by xe_ggtt_node_init() which sets the .ggtt so this second check is redundant
Ok.
>
> > +
> > + mutex_lock(&node->ggtt->lock);
>
> guard(mutex)(&node->ggtt->lock);
Ok.
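For reference, the scoped-lock form from <linux/cleanup.h> looks like:

	guard(mutex)(&node->ggtt->lock);	/* locks here, auto-unlocks on every return path */

which also lets the early-return branches below drop their explicit
mutex_unlock() calls.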
>
> > +
> > + ggtt = node->ggtt;
> > + start = node->base.start;
> > + end = start + node->base.size - 1;
> > +
> > + if (node->base.size < size) {
>
> so that looks wrong, we are about to save 64-bit PTEs of that node
>
> we should compare size of all PTEs not the size of address space allocated by this node
I'll replace it with:
	if (xe_ggtt_pte_size(ggtt, node->base.size) > size)
		return -EINVAL;
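For completeness, a sketch of the whole save path with the comments from
this thread folded in (node-only check, guard-based locking, PTE-size
comparison via the proposed xe_ggtt_pte_size() helper - names are
tentative, not final):

	int xe_ggtt_node_save(struct xe_ggtt_node *node, void *dst, size_t size)
	{
		struct xe_ggtt *ggtt;
		u64 start, end;
		u64 *buf = dst;

		if (!node)
			return -ENOENT;

		guard(mutex)(&node->ggtt->lock);

		ggtt = node->ggtt;
		start = node->base.start;
		end = start + node->base.size - 1;

		if (xe_ggtt_pte_size(ggtt, node->base.size) > size)
			return -EINVAL;

		while (start < end) {
			*buf++ = ggtt->pt_ops->ggtt_get_pte(ggtt, start) & ~GGTT_PTE_VFID;
			start += XE_PAGE_SIZE;
		}

		return 0;
	}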
>
> > + mutex_unlock(&node->ggtt->lock);
> > + return -EINVAL;
> > + }
> > +
> > + while (start < end) {
> > + *buf++ = ggtt->pt_ops->ggtt_get_pte(ggtt, start) & ~GGTT_PTE_VFID;
> > + start += XE_PAGE_SIZE;
> > + }
> > +
> > + mutex_unlock(&node->ggtt->lock);
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * xe_ggtt_node_load - Load a &struct xe_ggtt_node from a buffer
> > + * @node: the &struct xe_ggtt_node to be loaded
> > + * @src: source buffer
> > + * @size: source buffer size in bytes
> > + * @vfid: VF identifier
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_ggtt_node_load(struct xe_ggtt_node *node, const void *src, size_t size, u16 vfid)
> > +{
> > + struct xe_ggtt *ggtt;
> > + u64 start, end;
> > + const u64 *buf = src;
> > + u64 vfid_pte = xe_encode_vfid_pte(vfid);
>
> try to define vars in reverse xmas tree order
Ok.
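For reference, the same declarations in reverse Christmas tree order
(longest line first):

	u64 vfid_pte = xe_encode_vfid_pte(vfid);
	const u64 *buf = src;
	struct xe_ggtt *ggtt;
	u64 start, end;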
>
> > +
> > + if (!node || !node->ggtt)
> > + return -ENOENT;
> > +
> > + mutex_lock(&node->ggtt->lock);
>
> use guard(mutex)
Ok.
>
> > +
> > + ggtt = node->ggtt;
> > + start = node->base.start;
> > + end = start + size - 1;
> > +
> > + if (node->base.size != size) {
> > + mutex_unlock(&node->ggtt->lock);
> > + return -EINVAL;
> > + }
> > +
> > + while (start < end) {
> > + ggtt->pt_ops->ggtt_set_pte(ggtt, start, (*buf & ~GGTT_PTE_VFID) | vfid_pte);
> > + start += XE_PAGE_SIZE;
> > + buf++;
> > + }
> > + xe_ggtt_invalidate(ggtt);
> > +
> > + mutex_unlock(&node->ggtt->lock);
> > +
> > + return 0;
> > +}
> > +
> > #endif
> >
> > /**
> > diff --git a/drivers/gpu/drm/xe/xe_ggtt.h b/drivers/gpu/drm/xe/xe_ggtt.h
> > index 75fc7a1efea76..469b3a6ca14b4 100644
> > --- a/drivers/gpu/drm/xe/xe_ggtt.h
> > +++ b/drivers/gpu/drm/xe/xe_ggtt.h
> > @@ -43,6 +43,8 @@ u64 xe_ggtt_print_holes(struct xe_ggtt *ggtt, u64 alignment, struct drm_printer
> >
> > #ifdef CONFIG_PCI_IOV
> > void xe_ggtt_assign(const struct xe_ggtt_node *node, u16 vfid);
> > +int xe_ggtt_node_save(struct xe_ggtt_node *node, void *dst, size_t size);
> > +int xe_ggtt_node_load(struct xe_ggtt_node *node, const void *src, size_t size, u16 vfid);
> > #endif
> >
> > #ifndef CONFIG_LOCKDEP
> > diff --git a/drivers/gpu/drm/xe/xe_ggtt_types.h b/drivers/gpu/drm/xe/xe_ggtt_types.h
> > index c5e999d58ff2a..dacd796f81844 100644
> > --- a/drivers/gpu/drm/xe/xe_ggtt_types.h
> > +++ b/drivers/gpu/drm/xe/xe_ggtt_types.h
> > @@ -78,6 +78,8 @@ struct xe_ggtt_pt_ops {
> > u64 (*pte_encode_flags)(struct xe_bo *bo, u16 pat_index);
> > /** @ggtt_set_pte: Directly write into GGTT's PTE */
> > void (*ggtt_set_pte)(struct xe_ggtt *ggtt, u64 addr, u64 pte);
> > + /** @ggtt_get_pte: Directly read from GGTT's PTE */
> > + u64 (*ggtt_get_pte)(struct xe_ggtt *ggtt, u64 addr);
> > };
> >
> > #endif
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> > index b2e5c52978e6a..51027921b2988 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> > @@ -726,6 +726,70 @@ int xe_gt_sriov_pf_config_set_fair_ggtt(struct xe_gt *gt, unsigned int vfid,
> > return xe_gt_sriov_pf_config_bulk_set_ggtt(gt, vfid, num_vfs, fair);
> > }
> >
> > +/**
> > + * xe_gt_sriov_pf_config_ggtt_save - Save a VF provisioned GGTT data into a buffer.
> > + * @gt: the &struct xe_gt
> > + * @vfid: VF identifier
> > + * @buf: the GGTT data destination buffer
> > + * @size: the size of the buffer
> > + *
> > + * This function can only be called on PF.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_config_ggtt_save(struct xe_gt *gt, unsigned int vfid,
> > + void *buf, size_t size)
> > +{
> > + struct xe_gt_sriov_config *config;
> > + ssize_t ret;
>
> int
>
> > +
> > + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> > + xe_gt_assert(gt, vfid);
> > + xe_gt_assert(gt, !(!buf ^ !size));
>
> there seems to be no "query" option for this call, so both buf & size must be valid
>
> > +
> > + mutex_lock(xe_gt_sriov_pf_master_mutex(gt));
> > + config = pf_pick_vf_config(gt, vfid);
> > + size = size / sizeof(u64) * XE_PAGE_SIZE;
>
> ?? something is wrong here - why do we have to change the size of the buf?
Should be simplified after tweaking the logic with size conversions
to/from PTE.
>
> > +
> > + ret = xe_ggtt_node_save(config->ggtt_region, buf, size);
> > +
> > + mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
> > +
> > + return ret;
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_config_ggtt_restore - Restore a VF provisioned GGTT data from a buffer.
> > + * @gt: the &struct xe_gt
> > + * @vfid: VF identifier
> > + * @buf: the GGTT data source buffer
> > + * @size: the size of the buffer
> > + *
> > + * This function can only be called on PF.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_config_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
> > + const void *buf, size_t size)
> > +{
> > + struct xe_gt_sriov_config *config;
> > + ssize_t ret;
> > +
> > + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> > + xe_gt_assert(gt, vfid);
> > + xe_gt_assert(gt, !(!buf ^ !size));
> > +
> > + mutex_lock(xe_gt_sriov_pf_master_mutex(gt));
> > + config = pf_pick_vf_config(gt, vfid);
> > + size = size / sizeof(u64) * XE_PAGE_SIZE;
> > +
> > + ret = xe_ggtt_node_load(config->ggtt_region, buf, size, vfid);
> > +
> > + mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
> > +
> > + return ret;
> > +}
>
> ditto
Ok.
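Per the comments above, both config helpers would then look roughly like
this (a sketch for the save side - int ret, asserts on both params, and
the size conversion moved into the ggtt helper; restore changes the same
way):

	int xe_gt_sriov_pf_config_ggtt_save(struct xe_gt *gt, unsigned int vfid,
					    void *buf, size_t size)
	{
		struct xe_gt_sriov_config *config;
		int ret;

		xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
		xe_gt_assert(gt, vfid);
		xe_gt_assert(gt, buf);
		xe_gt_assert(gt, size);

		mutex_lock(xe_gt_sriov_pf_master_mutex(gt));
		config = pf_pick_vf_config(gt, vfid);
		ret = xe_ggtt_node_save(config->ggtt_region, buf, size);
		mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));

		return ret;
	}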
Thanks,
-Michał
>
> > +
> > static u32 pf_get_min_spare_ctxs(struct xe_gt *gt)
> > {
> > /* XXX: preliminary */
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> > index 513e6512a575b..6916b8f58ebf2 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
> > @@ -61,6 +61,11 @@ ssize_t xe_gt_sriov_pf_config_save(struct xe_gt *gt, unsigned int vfid, void *bu
> > int xe_gt_sriov_pf_config_restore(struct xe_gt *gt, unsigned int vfid,
> > const void *buf, size_t size);
> >
> > +int xe_gt_sriov_pf_config_ggtt_save(struct xe_gt *gt, unsigned int vfid,
> > + void *buf, size_t size);
> > +int xe_gt_sriov_pf_config_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
> > + const void *buf, size_t size);
> > +
> > bool xe_gt_sriov_pf_config_is_empty(struct xe_gt *gt, unsigned int vfid);
> >
> > int xe_gt_sriov_pf_config_init(struct xe_gt *gt);
>
* Re: [PATCH 18/26] drm/xe/pf: Handle GGTT migration data as part of PF control
2025-10-13 12:36 ` Michal Wajdeczko
@ 2025-10-21 1:16 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-21 1:16 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Matthew Brost, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Mon, Oct 13, 2025 at 02:36:56PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> > Connect the helpers to allow save and restore of GGTT migration data in
> > stop_copy / resume device state.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c | 13 ++
> > .../gpu/drm/xe/xe_gt_sriov_pf_control_types.h | 1 +
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c | 119 ++++++++++++++++++
> > drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h | 4 +
> > 4 files changed, 137 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > index f73a3bf40037c..a74f6feca4830 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control.c
> > @@ -188,6 +188,7 @@ static const char *control_bit_to_string(enum xe_gt_sriov_control_bits bit)
> > CASE2STR(MIGRATION_DATA_WIP);
> > CASE2STR(SAVE_WIP);
> > CASE2STR(SAVE_DATA_GUC);
> > + CASE2STR(SAVE_DATA_GGTT);
> > CASE2STR(SAVE_FAILED);
> > CASE2STR(SAVED);
> > CASE2STR(RESTORE_WIP);
> > @@ -803,6 +804,7 @@ void xe_gt_sriov_pf_control_vf_data_eof(struct xe_gt *gt, unsigned int vfid)
> >
> > static void pf_exit_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> > {
> > + pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT);
> > pf_escape_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
> > pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_WIP);
> > }
> > @@ -843,6 +845,13 @@ static bool pf_handle_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> > return true;
> > }
> >
> > + if (pf_exit_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT)) {
> > + ret = xe_gt_sriov_pf_migration_ggtt_save(gt, vfid);
> > + if (ret)
> > + goto err;
> > + return true;
> > + }
> > +
> > xe_gt_sriov_pf_control_vf_data_eof(gt, vfid);
> > pf_exit_vf_save_wip(gt, vfid);
> > pf_enter_vf_saved(gt, vfid);
> > @@ -862,6 +871,8 @@ static bool pf_enter_vf_save_wip(struct xe_gt *gt, unsigned int vfid)
> > pf_enter_vf_wip(gt, vfid);
> > if (xe_gt_sriov_pf_migration_guc_size(gt, vfid) > 0)
> > pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GUC);
> > + if (xe_gt_sriov_pf_migration_ggtt_size(gt, vfid) > 0)
> > + pf_enter_vf_state(gt, vfid, XE_GT_SRIOV_STATE_SAVE_DATA_GGTT);
> > pf_queue_vf(gt, vfid);
> > return true;
> > }
> > @@ -970,6 +981,8 @@ static int pf_handle_vf_restore_data(struct xe_gt *gt, unsigned int vfid,
> > struct xe_sriov_pf_migration_data *data)
> > {
> > switch (data->type) {
> > + case XE_SRIOV_MIG_DATA_GGTT:
> > + return xe_gt_sriov_pf_migration_ggtt_restore(gt, vfid, data);
> > case XE_SRIOV_MIG_DATA_GUC:
> > return xe_gt_sriov_pf_migration_guc_restore(gt, vfid, data);
> > default:
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > index b9787c425d9f6..c94ff0258306a 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_control_types.h
> > @@ -72,6 +72,7 @@ enum xe_gt_sriov_control_bits {
> >
> > XE_GT_SRIOV_STATE_SAVE_WIP,
> > XE_GT_SRIOV_STATE_SAVE_DATA_GUC,
> > + XE_GT_SRIOV_STATE_SAVE_DATA_GGTT,
> > XE_GT_SRIOV_STATE_SAVE_FAILED,
> > XE_GT_SRIOV_STATE_SAVED,
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > index 0c10284f0b09a..92ecf47e71bc7 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.c
> > @@ -7,6 +7,7 @@
> >
> > #include "abi/guc_actions_sriov_abi.h"
> > #include "xe_bo.h"
> > +#include "xe_gt_sriov_pf_config.h"
> > #include "xe_gt_sriov_pf_control.h"
> > #include "xe_gt_sriov_pf_helpers.h"
> > #include "xe_gt_sriov_pf_migration.h"
> > @@ -37,6 +38,117 @@ static void pf_dump_mig_data(struct xe_gt *gt, unsigned int vfid,
> > }
> > }
> >
> > +static int pf_save_vf_ggtt_mig_data(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + struct xe_sriov_pf_migration_data *data;
> > + size_t size;
> > + int ret;
> > +
> > + size = xe_gt_sriov_pf_config_get_ggtt(gt, vfid);
> > + if (size == 0)
> > + return 0;
> > + size = size / XE_PAGE_SIZE * sizeof(u64);
>
> maybe it would be better to avoid reusing the var and have two:
>
> u64 alloc_size = xe_gt_sriov_pf_config_get_ggtt(...);
> u64 pte_size = xe_ggtt_pte_size(alloc_size);
We just need the pte size.
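i.e. roughly (assuming the xe_ggtt_pte_size() helper being discussed in
the GGTT patch; exact signature still open):

	size = xe_ggtt_pte_size(xe_gt_sriov_pf_config_get_ggtt(gt, vfid));
	if (!size)
		return 0;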
>
> > +
> > + data = xe_sriov_pf_migration_data_alloc(gt_to_xe(gt));
> > + if (!data)
> > + return -ENOMEM;
> > +
> > + ret = xe_sriov_pf_migration_data_init(data, gt->tile->id, gt->info.id,
> > + XE_SRIOV_MIG_DATA_GGTT, 0, size);
> > + if (ret)
> > + goto fail;
> > +
> > + ret = xe_gt_sriov_pf_config_ggtt_save(gt, vfid, data->vaddr, size);
> > + if (ret)
> > + goto fail;
> > +
> > + pf_dump_mig_data(gt, vfid, data);
> > +
> > + ret = xe_gt_sriov_pf_migration_ring_produce(gt, vfid, data);
> > + if (ret)
> > + goto fail;
> > +
> > + return 0;
> > +
> > +fail:
> > + xe_sriov_pf_migration_data_free(data);
> > + xe_gt_sriov_err(gt, "Unable to save VF%u GGTT data (%d)\n", vfid, ret);
>
> use %pe for errors
Ok.
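i.e. something like:

	xe_gt_sriov_err(gt, "Unable to save VF%u GGTT data (%pe)\n",
			vfid, ERR_PTR(ret));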
>
> > + return ret;
> > +}
> > +
> > +static int pf_restore_vf_ggtt_mig_data(struct xe_gt *gt, unsigned int vfid,
> > + struct xe_sriov_pf_migration_data *data)
> > +{
> > + size_t size;
> > + int ret;
> > +
> > + size = xe_gt_sriov_pf_config_get_ggtt(gt, vfid) / XE_PAGE_SIZE * sizeof(u64);
> > + if (size != data->hdr.size)
> > + return -EINVAL;
>
> do we need this ?
>
> there seems to be similar check in xe_ggtt_node_load() called by restore() below
I'll remove it.
>
> > +
> > + pf_dump_mig_data(gt, vfid, data);
> > +
> > + ret = xe_gt_sriov_pf_config_ggtt_restore(gt, vfid, data->vaddr, size);
> > + if (ret)
> > + return ret;
> > +
> > + return 0;
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_migration_ggtt_size() - Get the size of VF GGTT migration data.
> > + * @gt: the &struct xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * This function is for PF only.
> > + *
> > + * Return: size in bytes or a negative error code on failure.
> > + */
> > +ssize_t xe_gt_sriov_pf_migration_ggtt_size(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + if (gt != xe_root_mmio_gt(gt_to_xe(gt)))
> > + return 0;
> > +
> > + return xe_gt_sriov_pf_config_get_ggtt(gt, vfid) / XE_PAGE_SIZE * sizeof(u64);
>
> this conversion logic should be done by xe_ggtt layer helper
I'll add a helper.
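A minimal sketch of such a helper (placement and final signature TBD):

	/* bytes needed to store the PTEs covering @size bytes of GGTT address space */
	u64 xe_ggtt_pte_size(u64 size)
	{
		return size / XE_PAGE_SIZE * sizeof(u64);
	}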
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_migration_ggtt_save() - Save VF GGTT migration data.
> > + * @gt: the &struct xe_gt
> > + * @vfid: the VF identifier
>
> since there is an assert, probably you should also say: "(can't be 0)"
Ok.
Thanks,
-Michał
>
> > + *
> > + * This function is for PF only.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_migration_ggtt_save(struct xe_gt *gt, unsigned int vfid)
> > +{
> > + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> > + xe_gt_assert(gt, vfid != PFID);
> > + xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
> > +
> > + return pf_save_vf_ggtt_mig_data(gt, vfid);
> > +}
> > +
> > +/**
> > + * xe_gt_sriov_pf_migration_ggtt_restore() - Restore VF GGTT migration data.
> > + * @gt: the &struct xe_gt
> > + * @vfid: the VF identifier
> > + *
> > + * This function is for PF only.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_gt_sriov_pf_migration_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
> > + struct xe_sriov_pf_migration_data *data)
> > +{
> > + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> > + xe_gt_assert(gt, vfid != PFID);
> > + xe_gt_assert(gt, vfid <= xe_sriov_pf_get_totalvfs(gt_to_xe(gt)));
> > +
> > + return pf_restore_vf_ggtt_mig_data(gt, vfid, data);
> > +}
> > +
> > /* Return: number of dwords saved/restored/required or a negative error code on failure */
> > static int guc_action_vf_save_restore(struct xe_guc *guc, u32 vfid, u32 opcode,
> > u64 addr, u32 ndwords)
> > @@ -290,6 +402,13 @@ ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid)
> > size += sizeof(struct xe_sriov_pf_migration_hdr);
> > total += size;
> >
> > + size = xe_gt_sriov_pf_migration_ggtt_size(gt, vfid);
> > + if (size < 0)
> > + return size;
> > + else if (size > 0)
> > + size += sizeof(struct xe_sriov_pf_migration_hdr);
> > + total += size;
> > +
> > return total;
> > }
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > index 5df64449232bc..5bb8cba2ea0cb 100644
> > --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_migration.h
> > @@ -16,6 +16,10 @@ ssize_t xe_gt_sriov_pf_migration_guc_size(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_migration_guc_save(struct xe_gt *gt, unsigned int vfid);
> > int xe_gt_sriov_pf_migration_guc_restore(struct xe_gt *gt, unsigned int vfid,
> > struct xe_sriov_pf_migration_data *data);
> > +ssize_t xe_gt_sriov_pf_migration_ggtt_size(struct xe_gt *gt, unsigned int vfid);
> > +int xe_gt_sriov_pf_migration_ggtt_save(struct xe_gt *gt, unsigned int vfid);
> > +int xe_gt_sriov_pf_migration_ggtt_restore(struct xe_gt *gt, unsigned int vfid,
> > + struct xe_sriov_pf_migration_data *data);
> >
> > ssize_t xe_gt_sriov_pf_migration_size(struct xe_gt *gt, unsigned int vfid);
> >
>
* Re: [PATCH 25/26] drm/xe/pf: Export helpers for VFIO
2025-10-12 18:32 ` Matthew Brost
@ 2025-10-21 1:38 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-21 1:38 UTC (permalink / raw)
To: Matthew Brost
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Sun, Oct 12, 2025 at 11:32:47AM -0700, Matthew Brost wrote:
> On Sat, Oct 11, 2025 at 09:38:46PM +0200, Michał Winiarski wrote:
> > Vendor-specific VFIO driver for Xe will implement VF migration.
> > Export everything that's needed for migration ops.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/Makefile | 2 +
> > drivers/gpu/drm/xe/xe_sriov_vfio.c | 252 +++++++++++++++++++++++++++++
> > include/drm/intel/xe_sriov_vfio.h | 28 ++++
> > 3 files changed, 282 insertions(+)
> > create mode 100644 drivers/gpu/drm/xe/xe_sriov_vfio.c
> > create mode 100644 include/drm/intel/xe_sriov_vfio.h
> >
> > diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> > index e253d65366de4..a5c5afff42aa6 100644
> > --- a/drivers/gpu/drm/xe/Makefile
> > +++ b/drivers/gpu/drm/xe/Makefile
> > @@ -181,6 +181,8 @@ xe-$(CONFIG_PCI_IOV) += \
> > xe_sriov_pf_service.o \
> > xe_tile_sriov_pf_debugfs.o
> >
> > +xe-$(CONFIG_XE_VFIO_PCI) += xe_sriov_vfio.o
> > +
> > # include helpers for tests even when XE is built-in
> > ifdef CONFIG_DRM_XE_KUNIT_TEST
> > xe-y += tests/xe_kunit_helpers.o
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_vfio.c b/drivers/gpu/drm/xe/xe_sriov_vfio.c
> > new file mode 100644
> > index 0000000000000..a510d1bde93f0
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/xe_sriov_vfio.c
> > @@ -0,0 +1,252 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#include <drm/intel/xe_sriov_vfio.h>
> > +
> > +#include "xe_pm.h"
> > +#include "xe_sriov.h"
> > +#include "xe_sriov_pf_control.h"
> > +#include "xe_sriov_pf_migration.h"
> > +#include "xe_sriov_pf_migration_data.h"
> > +
> > +/**
> > + * xe_sriov_vfio_migration_supported() - Check if migration is supported.
> > + * @pdev: PF PCI device
> > + *
> > + * Return: true if migration is supported, false otherwise.
> > + */
> > +bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + return xe_sriov_pf_migration_supported(xe);
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_migration_supported, "xe-vfio-pci");
> > +
> > +/**
> > + * xe_sriov_vfio_wait_flr_done - Wait for VF FLR completion.
> > + * @pdev: PF PCI device
> > + * @vfid: VF identifier
> > + *
> > + * This function will wait until VF FLR is processed by PF on all tiles (or
> > + * until timeout occurs).
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + return xe_sriov_pf_control_wait_flr(xe, vfid);
>
> Ideally I think you'd want the exported suffix to match on all these
> functions.
>
> i.e.,
>
> xe_sriov_vfio_SUFFIX
> xe_sriov_pf_control_SUFFIX
>
> Maybe this doesn't sense in all cases, so take as a suggestion, not a
> blocker.
The VFIO side uses different naming than the PF control layer:
PF control pause   == VFIO stop
PF control stop    == VFIO error
PF control restore == VFIO resume
So the translation needs to happen somewhere, and I guess the current
choice is at the exports.
>
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_wait_flr_done, "xe-vfio-pci");
> > +
> > +/**
> > + * xe_sriov_vfio_stop - Stop VF.
> > + * @pdev: PF PCI device
> > + * @vfid: VF identifier
> > + *
> > + * This function will pause VF on all tiles/GTs.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > + int ret;
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + xe_pm_runtime_get(xe);
>
> The PF must hold PM ref behalf of the VF' (right?) as VF's don't have
> access to the runtime PM.
>
> So either you can assert a PM ref is held here and drop the put / get or
> use xe_pm_runtime_get_noresume here.
>
> Exporting the waking runtime PM IMO is risky as waking runtime PM takes
> as bunch of locks which could create a problem at the caller if it is
> holding locks, best to avoid this if possible.
I'll replace it with an assert.
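With that, the export reduces to something like (a sketch; the exact
"PF is awake" check is an implementation detail):

	int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid)
	{
		struct xe_device *xe = pci_get_drvdata(pdev);

		if (!IS_SRIOV_PF(xe))
			return -ENODEV;

		/*
		 * The PF holds a runtime PM ref for as long as VFs are
		 * enabled, so the device must already be awake here.
		 */
		xe_assert(xe, !pm_runtime_suspended(xe->drm.dev));

		return xe_sriov_pf_control_pause_vf(xe, vfid);
	}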
>
> > + ret = xe_sriov_pf_control_pause_vf(xe, vfid);
> > + xe_pm_runtime_put(xe);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop, "xe-vfio-pci");
> > +
> > +/**
> > + * xe_sriov_vfio_run - Run VF.
> > + * @pdev: PF PCI device
> > + * @vfid: VF identifier
> > + *
> > + * This function will resume VF on all tiles.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > + int ret;
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + xe_pm_runtime_get(xe);
> > + ret = xe_sriov_pf_control_resume_vf(xe, vfid);
> > + xe_pm_runtime_put(xe);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_run, "xe-vfio-pci");
> > +
> > +/**
> > + * xe_sriov_vfio_stop_copy_enter - Copy VF migration data from device (while stopped).
> > + * @pdev: PF PCI device
> > + * @vfid: VF identifier
> > + *
> > + * This function will save VF migration data on all tiles.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > + int ret;
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + xe_pm_runtime_get(xe);
> > + ret = xe_sriov_pf_control_save_vf(xe, vfid);
> > + xe_pm_runtime_put(xe);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_enter, "xe-vfio-pci");
> > +
> > +/**
> > + * xe_sriov_vfio_stop_copy_exit - Wait until VF migration data save is done.
> > + * @pdev: PF PCI device
> > + * @vfid: VF identifier
> > + *
> > + * This function will wait until VF migration data is saved on all tiles.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_vfio_stop_copy_exit(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > + int ret;
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + xe_pm_runtime_get(xe);
> > + ret = xe_sriov_pf_control_wait_save_vf(xe, vfid);
> > + xe_pm_runtime_put(xe);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_exit, "xe-vfio-pci");
> > +
> > +/**
> > + * xe_sriov_vfio_resume_enter - Copy VF migration data to device (while stopped).
> > + * @pdev: PF PCI device
> > + * @vfid: VF identifier
> > + *
> > + * This function will restore VF migration data on all tiles.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_vfio_resume_enter(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > + int ret;
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + xe_pm_runtime_get(xe);
> > + ret = xe_sriov_pf_control_restore_vf(xe, vfid);
> > + xe_pm_runtime_put(xe);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_resume_enter, "xe-vfio-pci");
> > +
> > +/**
> > + * xe_sriov_vfio_resume_exit - Wait until VF migration data is copied to the device.
> > + * @pdev: PF PCI device
> > + * @vfid: VF identifier
> > + *
> > + * This function will wait until VF migration data is restored on all tiles.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_vfio_resume_exit(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > + int ret;
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + xe_pm_runtime_get(xe);
> > + ret = xe_sriov_pf_control_wait_restore_vf(xe, vfid);
> > + xe_pm_runtime_put(xe);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_resume_exit, "xe-vfio-pci");
> > +
> > +/**
> > + * xe_sriov_vfio_error - Move VF to error state.
> > + * @pdev: PF PCI device
> > + * @vfid: VF identifier
> > + *
> > + * This function will stop VF on all tiles.
> > + * Reset is needed to move it out of error state.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_vfio_error(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > + int ret;
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + xe_pm_runtime_get(xe);
> > + ret = xe_sriov_pf_control_stop_vf(xe, vfid);
> > + xe_pm_runtime_put(xe);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_error, "xe-vfio-pci");
> > +
>
> Kernel doc for the below functions.
Ok.
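For the record, a sketch of the missing kernel-doc, following the pattern
of the exports above (the exact Return semantics are whatever
xe_sriov_pf_migration_data_read() implements):

	/**
	 * xe_sriov_vfio_data_read() - Read VF migration data.
	 * @pdev: PF PCI device
	 * @vfid: VF identifier
	 * @buf: start address of the userspace buffer
	 * @len: requested read size
	 *
	 * Return: number of bytes read, 0 when no more migration data is
	 *         available, or a negative error code on failure.
	 */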
Thanks,
-Michał
>
> Matt
>
> > +ssize_t xe_sriov_vfio_data_read(struct pci_dev *pdev, unsigned int vfid,
> > + char __user *buf, size_t len)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > +
> > + return xe_sriov_pf_migration_data_read(xe, vfid, buf, len);
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_data_read, "xe-vfio-pci");
> > +
> > +ssize_t xe_sriov_vfio_data_write(struct pci_dev *pdev, unsigned int vfid,
> > + const char __user *buf, size_t len)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > +
> > + return xe_sriov_pf_migration_data_write(xe, vfid, buf, len);
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_data_write, "xe-vfio-pci");
> > +
> > +ssize_t xe_sriov_vfio_stop_copy_size(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > +
> > + return xe_sriov_pf_migration_size(xe, vfid);
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_size, "xe-vfio-pci");
> > diff --git a/include/drm/intel/xe_sriov_vfio.h b/include/drm/intel/xe_sriov_vfio.h
> > new file mode 100644
> > index 0000000000000..24e272f84c0e6
> > --- /dev/null
> > +++ b/include/drm/intel/xe_sriov_vfio.h
> > @@ -0,0 +1,28 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#ifndef _XE_SRIOV_VFIO_H_
> > +#define _XE_SRIOV_VFIO_H_
> > +
> > +#include <linux/types.h>
> > +
> > +struct pci_dev;
> > +
> > +bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev);
> > +int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid);
> > +int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid);
> > +int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid);
> > +int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid);
> > +int xe_sriov_vfio_stop_copy_exit(struct pci_dev *pdev, unsigned int vfid);
> > +int xe_sriov_vfio_resume_enter(struct pci_dev *pdev, unsigned int vfid);
> > +int xe_sriov_vfio_resume_exit(struct pci_dev *pdev, unsigned int vfid);
> > +int xe_sriov_vfio_error(struct pci_dev *pdev, unsigned int vfid);
> > +ssize_t xe_sriov_vfio_data_read(struct pci_dev *pdev, unsigned int vfid,
> > + char __user *buf, size_t len);
> > +ssize_t xe_sriov_vfio_data_write(struct pci_dev *pdev, unsigned int vfid,
> > + const char __user *buf, size_t len);
> > +ssize_t xe_sriov_vfio_stop_copy_size(struct pci_dev *pdev, unsigned int vfid);
> > +
> > +#endif /* _XE_SRIOV_VFIO_H_ */
> > --
> > 2.50.1
> >
* Re: [PATCH 25/26] drm/xe/pf: Export helpers for VFIO
2025-10-13 14:02 ` Michal Wajdeczko
@ 2025-10-21 1:49 ` Michał Winiarski
0 siblings, 0 replies; 82+ messages in thread
From: Michał Winiarski @ 2025-10-21 1:49 UTC (permalink / raw)
To: Michal Wajdeczko
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Jason Gunthorpe, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Matthew Brost, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Mon, Oct 13, 2025 at 04:02:36PM +0200, Michal Wajdeczko wrote:
>
>
> On 10/11/2025 9:38 PM, Michał Winiarski wrote:
> > Vendor-specific VFIO driver for Xe will implement VF migration.
> > Export everything that's needed for migration ops.
> >
> > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
> > ---
> > drivers/gpu/drm/xe/Makefile | 2 +
> > drivers/gpu/drm/xe/xe_sriov_vfio.c | 252 +++++++++++++++++++++++++++++
> > include/drm/intel/xe_sriov_vfio.h | 28 ++++
> > 3 files changed, 282 insertions(+)
> > create mode 100644 drivers/gpu/drm/xe/xe_sriov_vfio.c
> > create mode 100644 include/drm/intel/xe_sriov_vfio.h
> >
> > diff --git a/drivers/gpu/drm/xe/Makefile b/drivers/gpu/drm/xe/Makefile
> > index e253d65366de4..a5c5afff42aa6 100644
> > --- a/drivers/gpu/drm/xe/Makefile
> > +++ b/drivers/gpu/drm/xe/Makefile
> > @@ -181,6 +181,8 @@ xe-$(CONFIG_PCI_IOV) += \
> > xe_sriov_pf_service.o \
> > xe_tile_sriov_pf_debugfs.o
> >
> > +xe-$(CONFIG_XE_VFIO_PCI) += xe_sriov_vfio.o
> > +
> > # include helpers for tests even when XE is built-in
> > ifdef CONFIG_DRM_XE_KUNIT_TEST
> > xe-y += tests/xe_kunit_helpers.o
> > diff --git a/drivers/gpu/drm/xe/xe_sriov_vfio.c b/drivers/gpu/drm/xe/xe_sriov_vfio.c
> > new file mode 100644
> > index 0000000000000..a510d1bde93f0
> > --- /dev/null
> > +++ b/drivers/gpu/drm/xe/xe_sriov_vfio.c
> > @@ -0,0 +1,252 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#include <drm/intel/xe_sriov_vfio.h>
> > +
> > +#include "xe_pm.h"
> > +#include "xe_sriov.h"
> > +#include "xe_sriov_pf_control.h"
> > +#include "xe_sriov_pf_migration.h"
> > +#include "xe_sriov_pf_migration_data.h"
> > +
> > +/**
> > + * xe_sriov_vfio_migration_supported() - Check if migration is supported.
> > + * @pdev: PF PCI device
> > + *
> > + * Return: true if migration is supported, false otherwise.
> > + */
> > +bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + return xe_sriov_pf_migration_supported(xe);
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_migration_supported, "xe-vfio-pci");
> > +
> > +/**
> > + * xe_sriov_vfio_wait_flr_done - Wait for VF FLR completion.
> > + * @pdev: PF PCI device
> > + * @vfid: VF identifier
>
> or
>
> * @pdev: the PF struct &pci_dev device
> * @vfid: the VF identifier (can't be 0)
Ok
>
> > + *
> > + * This function will wait until VF FLR is processed by PF on all tiles (or
> > + * until timeout occurs).
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
>
> you also need to validate:
>
> vfid != PFID
> and
> vfid <= xe_sriov_pf_get_totalvfs()
>
> this applies to all exported functions below
Ok.
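i.e. roughly:

	if (vfid == PFID || vfid > xe_sriov_pf_get_totalvfs(xe))
		return -EINVAL;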
> > +
> > + return xe_sriov_pf_control_wait_flr(xe, vfid);
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_wait_flr_done, "xe-vfio-pci");
> > +
> > +/**
> > + * xe_sriov_vfio_stop - Stop VF.
> > + * @pdev: PF PCI device
> > + * @vfid: VF identifier
> > + *
> > + * This function will pause VF on all tiles/GTs.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > + int ret;
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + xe_pm_runtime_get(xe);
>
> maybe we should use xe_pm_runtime_get_if_active() to avoid waking the PF if there are no VFs?
>
> when VFs are enabled xe_pm_runtime_get_if_active() will always return true
I'll replace it with assert.
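For comparison, the xe_pm_runtime_get_if_active() variant suggested above
would look roughly like:

	if (!xe_pm_runtime_get_if_active(xe))
		return -ENODEV;	/* PF asleep implies no VFs are enabled */
	ret = xe_sriov_pf_control_pause_vf(xe, vfid);
	xe_pm_runtime_put(xe);

	return ret;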
>
>
> > + ret = xe_sriov_pf_control_pause_vf(xe, vfid);
> > + xe_pm_runtime_put(xe);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop, "xe-vfio-pci");
> > +
> > +/**
> > + * xe_sriov_vfio_run - Run VF.
> > + * @pdev: PF PCI device
> > + * @vfid: VF identifier
> > + *
> > + * This function will resume VF on all tiles.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > + int ret;
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + xe_pm_runtime_get(xe);
> > + ret = xe_sriov_pf_control_resume_vf(xe, vfid);
> > + xe_pm_runtime_put(xe);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_run, "xe-vfio-pci");
> > +
> > +/**
> > + * xe_sriov_vfio_stop_copy_enter - Copy VF migration data from device (while stopped).
> > + * @pdev: PF PCI device
> > + * @vfid: VF identifier
> > + *
> > + * This function will save VF migration data on all tiles.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > + int ret;
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + xe_pm_runtime_get(xe);
> > + ret = xe_sriov_pf_control_save_vf(xe, vfid);
> > + xe_pm_runtime_put(xe);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_enter, "xe-vfio-pci");
> > +
> > +/**
> > + * xe_sriov_vfio_stop_copy_exit - Wait until VF migration data save is done.
> > + * @pdev: PF PCI device
> > + * @vfid: VF identifier
> > + *
> > + * This function will wait until VF migration data is saved on all tiles.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_vfio_stop_copy_exit(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > + int ret;
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + xe_pm_runtime_get(xe);
> > + ret = xe_sriov_pf_control_wait_save_vf(xe, vfid);
> > + xe_pm_runtime_put(xe);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_exit, "xe-vfio-pci");
> > +
> > +/**
> > + * xe_sriov_vfio_resume_enter - Copy VF migration data to device (while stopped).
> > + * @pdev: PF PCI device
> > + * @vfid: VF identifier
> > + *
> > + * This function will restore VF migration data on all tiles.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_vfio_resume_enter(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > + int ret;
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + xe_pm_runtime_get(xe);
> > + ret = xe_sriov_pf_control_restore_vf(xe, vfid);
> > + xe_pm_runtime_put(xe);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_resume_enter, "xe-vfio-pci");
> > +
> > +/**
> > + * xe_sriov_vfio_resume_exit - Wait until VF migration data is copied to the device.
> > + * @pdev: PF PCI device
> > + * @vfid: VF identifier
> > + *
> > + * This function will wait until VF migration data is restored on all tiles.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_vfio_resume_exit(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > + int ret;
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + xe_pm_runtime_get(xe);
> > + ret = xe_sriov_pf_control_wait_restore_vf(xe, vfid);
> > + xe_pm_runtime_put(xe);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_resume_exit, "xe-vfio-pci");
> > +
> > +/**
> > + * xe_sriov_vfio_error - Move VF to error state.
> > + * @pdev: PF PCI device
> > + * @vfid: VF identifier
> > + *
> > + * This function will stop VF on all tiles.
> > + * Reset is needed to move it out of error state.
> > + *
> > + * Return: 0 on success or a negative error code on failure.
> > + */
> > +int xe_sriov_vfio_error(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > + int ret;
> > +
> > + if (!IS_SRIOV_PF(xe))
> > + return -ENODEV;
> > +
> > + xe_pm_runtime_get(xe);
> > + ret = xe_sriov_pf_control_stop_vf(xe, vfid);
> > + xe_pm_runtime_put(xe);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_error, "xe-vfio-pci");
> > +
>
> add kernel-doc
Ok.
>
> > +ssize_t xe_sriov_vfio_data_read(struct pci_dev *pdev, unsigned int vfid,
> > + char __user *buf, size_t len)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
>
> missing param validation
>
> is PF
> is valid VFID
Ok.
>
> no RPM?
In this case, there's no need for RPM, as read/write calls are only
interacting with the ring.
>
> > +
> > + return xe_sriov_pf_migration_data_read(xe, vfid, buf, len);
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_data_read, "xe-vfio-pci");
> > +
> > +ssize_t xe_sriov_vfio_data_write(struct pci_dev *pdev, unsigned int vfid,
> > + const char __user *buf, size_t len)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > +
> > + return xe_sriov_pf_migration_data_write(xe, vfid, buf, len);
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_data_write, "xe-vfio-pci");
> > +
> > +ssize_t xe_sriov_vfio_stop_copy_size(struct pci_dev *pdev, unsigned int vfid)
> > +{
> > + struct xe_device *xe = pci_get_drvdata(pdev);
> > +
> > + return xe_sriov_pf_migration_size(xe, vfid);
> > +}
> > +EXPORT_SYMBOL_FOR_MODULES(xe_sriov_vfio_stop_copy_size, "xe-vfio-pci");
> > diff --git a/include/drm/intel/xe_sriov_vfio.h b/include/drm/intel/xe_sriov_vfio.h
> > new file mode 100644
> > index 0000000000000..24e272f84c0e6
> > --- /dev/null
> > +++ b/include/drm/intel/xe_sriov_vfio.h
> > @@ -0,0 +1,28 @@
> > +/* SPDX-License-Identifier: MIT */
> > +/*
> > + * Copyright © 2025 Intel Corporation
> > + */
> > +
> > +#ifndef _XE_SRIOV_VFIO_H_
> > +#define _XE_SRIOV_VFIO_H_
> > +
> > +#include <linux/types.h>
> > +
> > +struct pci_dev;
> > +
> > +bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev);
> > +int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid);
> > +int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid);
> > +int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid);
> > +int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid);
> > +int xe_sriov_vfio_stop_copy_exit(struct pci_dev *pdev, unsigned int vfid);
> > +int xe_sriov_vfio_resume_enter(struct pci_dev *pdev, unsigned int vfid);
> > +int xe_sriov_vfio_resume_exit(struct pci_dev *pdev, unsigned int vfid);
> > +int xe_sriov_vfio_error(struct pci_dev *pdev, unsigned int vfid);
> > +ssize_t xe_sriov_vfio_data_read(struct pci_dev *pdev, unsigned int vfid,
> > + char __user *buf, size_t len);
> > +ssize_t xe_sriov_vfio_data_write(struct pci_dev *pdev, unsigned int vfid,
> > + const char __user *buf, size_t len);
> > +ssize_t xe_sriov_vfio_stop_copy_size(struct pci_dev *pdev, unsigned int vfid);
> > +
> > +#endif /* _XE_SRIOV_VFIO_H_ */
>
> this is a very simple header, no need to repeat include guard name here
Ok.
Thanks,
-Michał
>
* Re: [PATCH 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-11 19:38 ` [PATCH 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics Michał Winiarski
2025-10-13 19:00 ` Rodrigo Vivi
@ 2025-10-21 23:03 ` Jason Gunthorpe
2025-10-21 23:14 ` Matthew Brost
2025-10-22 9:05 ` Michał Winiarski
1 sibling, 2 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2025-10-21 23:03 UTC (permalink / raw)
To: Michał Winiarski
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Yishai Hadas, Kevin Tian, Shameer Kolothum,
intel-xe, linux-kernel, kvm, dri-devel, Matthew Brost,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Sat, Oct 11, 2025 at 09:38:47PM +0200, Michał Winiarski wrote:
> + /*
> + * "STOP" handling is reused for "RUNNING_P2P", as the device doesn't have the capability to
> + * selectively block p2p DMA transfers.
> + * The device is not processing new workload requests when the VF is stopped, and both
> + * memory and MMIO communication channels are transferred to destination (where processing
> + * will be resumed).
> + */
> + if ((cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_STOP) ||
> + (cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_RUNNING_P2P)) {
> + ret = xe_sriov_vfio_stop(xe_vdev->pf, xe_vdev->vfid);
This comment is not right, RUNNING_P2P means the device can still
receive P2P activity on its BAR. E.g. a GPU will still allow read/write
to its framebuffer.
But it is not initiating any new transactions.
> +static void xe_vfio_pci_migration_init(struct vfio_device *core_vdev)
> +{
> + struct xe_vfio_pci_core_device *xe_vdev =
> + container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
> + struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
> +
> + if (!xe_sriov_vfio_migration_supported(pdev->physfn))
> + return;
> +
> + /* vfid starts from 1 for xe */
> + xe_vdev->vfid = pci_iov_vf_id(pdev) + 1;
> + xe_vdev->pf = pdev->physfn;
No, this has to use pci_iov_get_pf_drvdata, and this driver should
never have a naked pf pointer flowing around.
The entire exported interface is wrongly formed:
+bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev);
+int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid);
+int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid);
+int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid);
+int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid);
None of these should be taking in a naked pci_dev, it should all work
on whatever type the drvdata is.
And this gross thing needs to go away too:
> + if (pdev->is_virtfn && strcmp(pdev->physfn->dev.driver->name, "xe") == 0)
> + xe_vfio_pci_migration_init(core_vdev);
Jason
* Re: [PATCH 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-21 23:03 ` Jason Gunthorpe
@ 2025-10-21 23:14 ` Matthew Brost
2025-10-21 23:38 ` Jason Gunthorpe
2025-10-22 9:05 ` Michał Winiarski
1 sibling, 1 reply; 82+ messages in thread
From: Matthew Brost @ 2025-10-21 23:14 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Tue, Oct 21, 2025 at 08:03:28PM -0300, Jason Gunthorpe wrote:
> On Sat, Oct 11, 2025 at 09:38:47PM +0200, Michał Winiarski wrote:
> > + /*
> > + * "STOP" handling is reused for "RUNNING_P2P", as the device doesn't have the capability to
> > + * selectively block p2p DMA transfers.
> > + * The device is not processing new workload requests when the VF is stopped, and both
> > + * memory and MMIO communication channels are transferred to destination (where processing
> > + * will be resumed).
> > + */
> > + if ((cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_STOP) ||
> > + (cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_RUNNING_P2P)) {
> > + ret = xe_sriov_vfio_stop(xe_vdev->pf, xe_vdev->vfid);
>
> This comment is not right, RUNNING_P2P means the device can still
> receive P2P activity on its BAR. E.g. a GPU will still allow read/write
> to its framebuffer.
>
> But it is not initiating any new transactions.
>
> > +static void xe_vfio_pci_migration_init(struct vfio_device *core_vdev)
> > +{
> > + struct xe_vfio_pci_core_device *xe_vdev =
> > + container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
> > + struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
> > +
> > + if (!xe_sriov_vfio_migration_supported(pdev->physfn))
> > + return;
> > +
> > + /* vfid starts from 1 for xe */
> > + xe_vdev->vfid = pci_iov_vf_id(pdev) + 1;
> > + xe_vdev->pf = pdev->physfn;
>
> No, this has to use pci_iov_get_pf_drvdata, and this driver should
> never have a naked pf pointer flowing around.
>
> The entire exported interface is wrongly formed:
>
> +bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev);
> +int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid);
>
> None of these should be taking in a naked pci_dev, it should all work
> on whatever type the drvdata is.
This seems entirely backwards. Why would the Xe module export its driver
structure to the VFIO module? That opens up potential vectors for
abuse—for example, the VFIO module accessing internal Xe device
structures. In my opinion, it's much cleaner to keep interfaces between
modules as opaque / generic as possible.
Matt
>
> And this gross thing needs to go away too:
>
> > + if (pdev->is_virtfn && strcmp(pdev->physfn->dev.driver->name, "xe") == 0)
> > + xe_vfio_pci_migration_init(core_vdev);
>
> Jason
* Re: [PATCH 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-21 23:14 ` Matthew Brost
@ 2025-10-21 23:38 ` Jason Gunthorpe
2025-10-22 1:15 ` Matthew Brost
0 siblings, 1 reply; 82+ messages in thread
From: Jason Gunthorpe @ 2025-10-21 23:38 UTC (permalink / raw)
To: Matthew Brost
Cc: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Tue, Oct 21, 2025 at 04:14:30PM -0700, Matthew Brost wrote:
> On Tue, Oct 21, 2025 at 08:03:28PM -0300, Jason Gunthorpe wrote:
> > On Sat, Oct 11, 2025 at 09:38:47PM +0200, Michał Winiarski wrote:
> > > + /*
> > > + * "STOP" handling is reused for "RUNNING_P2P", as the device doesn't have the capability to
> > > + * selectively block p2p DMA transfers.
> > > + * The device is not processing new workload requests when the VF is stopped, and both
> > > + * memory and MMIO communication channels are transferred to destination (where processing
> > > + * will be resumed).
> > > + */
> > > + if ((cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_STOP) ||
> > > + (cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_RUNNING_P2P)) {
> > > + ret = xe_sriov_vfio_stop(xe_vdev->pf, xe_vdev->vfid);
> >
> > This comment is not right, RUNNING_P2P means the device can still
> > receive P2P activity on its BAR. E.g. a GPU will still allow read/write
> > to its framebuffer.
> >
> > But it is not initiating any new transactions.
> >
> > > +static void xe_vfio_pci_migration_init(struct vfio_device *core_vdev)
> > > +{
> > > + struct xe_vfio_pci_core_device *xe_vdev =
> > > + container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
> > > + struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
> > > +
> > > + if (!xe_sriov_vfio_migration_supported(pdev->physfn))
> > > + return;
> > > +
> > > + /* vfid starts from 1 for xe */
> > > + xe_vdev->vfid = pci_iov_vf_id(pdev) + 1;
> > > + xe_vdev->pf = pdev->physfn;
> >
> > No, this has to use pci_iov_get_pf_drvdata, and this driver should
> > never have a naked pf pointer flowing around.
> >
> > The entire exported interface is wrongly formed:
> >
> > +bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev);
> > +int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid);
> > +int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid);
> > +int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid);
> > +int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid);
> >
> > None of these should be taking in a naked pci_dev, it should all work
> > on whatever type the drvdata is.
>
> This seems entirely backwards. Why would the Xe module export its driver
> structure to the VFIO module?
Because that is how we designed this to work. You've completely
ignored the safety protocols built into this method.
> That opens up potential vectors for abuse—for example, the VFIO
> module accessing internal Xe device structures.
It does not, just use an opaque struct type.
> much cleaner to keep interfaces between modules as opaque / generic
> as possible.
Nope, don't do that. They should be limited and locked down. Passing
random pci_devs into these APIs is going to be bad.
Jason
* Re: [PATCH 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-21 23:38 ` Jason Gunthorpe
@ 2025-10-22 1:15 ` Matthew Brost
2025-10-22 13:02 ` Jason Gunthorpe
0 siblings, 1 reply; 82+ messages in thread
From: Matthew Brost @ 2025-10-22 1:15 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Tue, Oct 21, 2025 at 08:38:11PM -0300, Jason Gunthorpe wrote:
> On Tue, Oct 21, 2025 at 04:14:30PM -0700, Matthew Brost wrote:
> > On Tue, Oct 21, 2025 at 08:03:28PM -0300, Jason Gunthorpe wrote:
> > > On Sat, Oct 11, 2025 at 09:38:47PM +0200, Michał Winiarski wrote:
> > > > + /*
> > > > + * "STOP" handling is reused for "RUNNING_P2P", as the device doesn't have the capability to
> > > > + * selectively block p2p DMA transfers.
> > > > + * The device is not processing new workload requests when the VF is stopped, and both
> > > > + * memory and MMIO communication channels are transferred to destination (where processing
> > > > + * will be resumed).
> > > > + */
> > > > + if ((cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_STOP) ||
> > > > + (cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_RUNNING_P2P)) {
> > > > + ret = xe_sriov_vfio_stop(xe_vdev->pf, xe_vdev->vfid);
> > >
> > > This comment is not right, RUNNING_P2P means the device can still
> > > receive P2P activity on it's BAR. Eg a GPU will still allow read/write
> > > to its framebuffer.
> > >
> > > But it is not initiating any new transactions.
> > >
> > > > +static void xe_vfio_pci_migration_init(struct vfio_device *core_vdev)
> > > > +{
> > > > + struct xe_vfio_pci_core_device *xe_vdev =
> > > > + container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
> > > > + struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
> > > > +
> > > > + if (!xe_sriov_vfio_migration_supported(pdev->physfn))
> > > > + return;
> > > > +
> > > > + /* vfid starts from 1 for xe */
> > > > + xe_vdev->vfid = pci_iov_vf_id(pdev) + 1;
> > > > + xe_vdev->pf = pdev->physfn;
> > >
> > > No, this has to use pci_iov_get_pf_drvdata, and this driver should
> > > never have a naked pf pointer flowing around.
> > >
> > > The entire exported interface is wrongly formed:
> > >
> > > +bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev);
> > > +int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid);
> > > +int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid);
> > > +int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid);
> > > +int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid);
> > >
> > > None of these should be taking in a naked pci_dev, it should all work
> > > on whatever type the drvdata is.
> >
> > This seems entirely backwards. Why would the Xe module export its driver
> > structure to the VFIO module?
>
> Because that is how we designed this to work. You've completely
> ignored the safety protocols built into this method.
>
> > That opens up potential vectors for abuse—for example, the VFIO
> > module accessing internal Xe device structures.
>
> It does not, just use an opaque struct type.
>
> > much cleaner to keep interfaces between modules as opaque / generic
> > as possible.
>
> Nope, don't do that. They should be limited and locked down. Passing
> random pci_devs into these API is going to be bad.
Ok, I think I see what you're getting at. The idea is to call
dev_set_drvdata on the Xe side, then use pci_iov_get_pf_drvdata on the
VFIO side to retrieve that data. This allows passing whatever Xe sets
via dev_set_drvdata between the module interfaces, while only
forward-declaring the interface struct in the shared header.
Am I understanding this correctly?
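In two lines, the contract would look something like this (with
xe_pci_driver standing in for whatever Xe's struct pci_driver is
actually called):

	/* Xe (PF) side, at probe time; pci_set_drvdata() is the PCI
	 * wrapper around dev_set_drvdata():
	 */
	pci_set_drvdata(pdev, xe);

	/* xe-vfio-pci (VF) side; returns an ERR_PTR if vf_pdev is not
	 * a VF or if its PF is not bound to xe_pci_driver:
	 */
	xe = pci_iov_get_pf_drvdata(vf_pdev, &xe_pci_driver);
	if (IS_ERR(xe))
		return PTR_ERR(xe);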
Matt
>
> Jason
* Re: [PATCH 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-21 23:03 ` Jason Gunthorpe
2025-10-21 23:14 ` Matthew Brost
@ 2025-10-22 9:05 ` Michał Winiarski
2025-10-27 7:02 ` Tian, Kevin
1 sibling, 1 reply; 82+ messages in thread
From: Michał Winiarski @ 2025-10-22 9:05 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Alex Williamson, Lucas De Marchi, Thomas Hellström,
Rodrigo Vivi, Yishai Hadas, Kevin Tian, Shameer Kolothum,
intel-xe, linux-kernel, kvm, dri-devel, Matthew Brost,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Tue, Oct 21, 2025 at 08:03:28PM -0300, Jason Gunthorpe wrote:
> On Sat, Oct 11, 2025 at 09:38:47PM +0200, Michał Winiarski wrote:
> > + /*
> > + * "STOP" handling is reused for "RUNNING_P2P", as the device doesn't have the capability to
> > + * selectively block p2p DMA transfers.
> > + * The device is not processing new workload requests when the VF is stopped, and both
> > + * memory and MMIO communication channels are transferred to destination (where processing
> > + * will be resumed).
> > + */
> > + if ((cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_STOP) ||
> > + (cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_RUNNING_P2P)) {
> > + ret = xe_sriov_vfio_stop(xe_vdev->pf, xe_vdev->vfid);
>
> This comment is not right, RUNNING_P2P means the device can still
> receive P2P activity on its BAR. E.g. a GPU will still allow read/write
> to its framebuffer.
>
> But it is not initiating any new transactions.
/*
* "STOP" handling is reused for "RUNNING_P2P", as the device doesn't
* have the capability to selectively block outgoing p2p DMA transfers.
 * While the device still allows BAR accesses when the VF is stopped, it
 * is not processing any new workload requests, effectively stopping
 * all outgoing DMA transfers (not just p2p).
* Both memory and MMIO communication channels with the workload
* scheduling firmware are transferred to destination (where processing
* will be resumed).
*/
Does this work better?
>
> > +static void xe_vfio_pci_migration_init(struct vfio_device *core_vdev)
> > +{
> > + struct xe_vfio_pci_core_device *xe_vdev =
> > + container_of(core_vdev, struct xe_vfio_pci_core_device, core_device.vdev);
> > + struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
> > +
> > + if (!xe_sriov_vfio_migration_supported(pdev->physfn))
> > + return;
> > +
> > + /* vfid starts from 1 for xe */
> > + xe_vdev->vfid = pci_iov_vf_id(pdev) + 1;
> > + xe_vdev->pf = pdev->physfn;
>
> No, this has to use pci_iov_get_pf_drvdata, and this driver should
> never have a naked pf pointer flowing around.
>
> The entire exported interface is wrongly formed:
>
> +bool xe_sriov_vfio_migration_supported(struct pci_dev *pdev);
> +int xe_sriov_vfio_wait_flr_done(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_stop(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_run(struct pci_dev *pdev, unsigned int vfid);
> +int xe_sriov_vfio_stop_copy_enter(struct pci_dev *pdev, unsigned int vfid);
>
> None of these should be taking in a naked pci_dev, it should all work
> on whatever type the drvdata is.
I'll change it to:
struct xe_device *xe_sriov_vfio_get_xe_device(struct pci_dev *pdev);
bool xe_sriov_vfio_migration_supported(struct xe_device *xe);
int xe_sriov_vfio_wait_flr_done(struct xe_device *xe, unsigned int vfid);
int xe_sriov_vfio_stop(struct xe_device *xe, unsigned int vfid);
int xe_sriov_vfio_run(struct xe_device *xe, unsigned int vfid);
int xe_sriov_vfio_stop_copy_enter(struct xe_device *xe, unsigned int vfid);
(...)
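With that, the shared header could shrink to little more than a forward
declaration; a sketch (the header name and guard below are placeholders,
the prototypes are the ones listed above):

#ifndef _XE_SRIOV_VFIO_H_
#define _XE_SRIOV_VFIO_H_

#include <linux/types.h>

struct pci_dev;
struct xe_device;	/* opaque to xe-vfio-pci, never dereferenced there */

struct xe_device *xe_sriov_vfio_get_xe_device(struct pci_dev *pdev);
bool xe_sriov_vfio_migration_supported(struct xe_device *xe);
int xe_sriov_vfio_wait_flr_done(struct xe_device *xe, unsigned int vfid);
int xe_sriov_vfio_stop(struct xe_device *xe, unsigned int vfid);
int xe_sriov_vfio_run(struct xe_device *xe, unsigned int vfid);
int xe_sriov_vfio_stop_copy_enter(struct xe_device *xe, unsigned int vfid);

#endif /* _XE_SRIOV_VFIO_H_ */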
>
> And this gross thing needs to go away too:
>
> > + if (pdev->is_virtfn && strcmp(pdev->physfn->dev.driver->name, "xe") == 0)
> > + xe_vfio_pci_migration_init(core_vdev);
Right. By using pci_iov_get_pf_drvdata() it just goes away
automatically.
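A sketch of how the init path could then look (the ->xe field name and
the NULL-on-error convention of the new helper are assumptions):

static void xe_vfio_pci_migration_init(struct vfio_device *core_vdev)
{
	struct xe_vfio_pci_core_device *xe_vdev =
		container_of(core_vdev, struct xe_vfio_pci_core_device,
			     core_device.vdev);
	struct pci_dev *pdev = to_pci_dev(core_vdev->dev);
	struct xe_device *xe = xe_sriov_vfio_get_xe_device(pdev);

	/* NULL when pdev is not a VF or its PF is not bound to xe,
	 * which is what makes the driver-name strcmp() unnecessary.
	 */
	if (!xe || !xe_sriov_vfio_migration_supported(xe))
		return;

	/* vfid starts from 1 for xe */
	xe_vdev->vfid = pci_iov_vf_id(pdev) + 1;
	xe_vdev->xe = xe;
}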
Thanks,
-Michał
>
> Jason
* Re: [PATCH 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-22 1:15 ` Matthew Brost
@ 2025-10-22 13:02 ` Jason Gunthorpe
0 siblings, 0 replies; 82+ messages in thread
From: Jason Gunthorpe @ 2025-10-22 13:02 UTC (permalink / raw)
To: Matthew Brost
Cc: Michał Winiarski, Alex Williamson, Lucas De Marchi,
Thomas Hellström, Rodrigo Vivi, Yishai Hadas, Kevin Tian,
Shameer Kolothum, intel-xe, linux-kernel, kvm, dri-devel,
Michal Wajdeczko, Jani Nikula, Joonas Lahtinen, Tvrtko Ursulin,
David Airlie, Simona Vetter, Lukasz Laguna
On Tue, Oct 21, 2025 at 06:15:19PM -0700, Matthew Brost wrote:
> Ok, I think I see what you're getting at. The idea is to call
> dev_set_drvdata on the Xe side, then use pci_iov_get_pf_drvdata on the
> VFIO side to retrieve that data. This allows passing whatever Xe sets
> via dev_set_drvdata between the module interfaces, while only
> forward-declaring the interface struct in the shared header.
Yes. The other email looks good:
struct xe_device *xe_sriov_vfio_get_xe_device(struct pci_dev *pdev);
Should call pci_iov_get_pf_drvdata() internally.
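Roughly (with xe_pci_driver as a stand-in for Xe's actual struct
pci_driver symbol):

	struct xe_device *xe_sriov_vfio_get_xe_device(struct pci_dev *pdev)
	{
		struct xe_device *xe =
			pci_iov_get_pf_drvdata(pdev, &xe_pci_driver);

		/* ERR_PTR when pdev is not a VF or the PF driver doesn't match */
		return IS_ERR(xe) ? NULL : xe;
	}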
And 'struct xe_device' can be a forward-declared type that cannot be
dereferenced by VFIO, to enforce some code modularity.
Using strong types is obviously better than passing around pci_dev and
hoping for the best :)
Jason
* RE: [PATCH 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics
2025-10-22 9:05 ` Michał Winiarski
@ 2025-10-27 7:02 ` Tian, Kevin
0 siblings, 0 replies; 82+ messages in thread
From: Tian, Kevin @ 2025-10-27 7:02 UTC (permalink / raw)
To: Winiarski, Michal, Jason Gunthorpe
Cc: Alex Williamson, De Marchi, Lucas, Thomas Hellström,
Vivi, Rodrigo, Yishai Hadas, Shameer Kolothum,
intel-xe@lists.freedesktop.org, linux-kernel@vger.kernel.org,
kvm@vger.kernel.org, dri-devel@lists.freedesktop.org,
Brost, Matthew, Wajdeczko, Michal, Jani Nikula, Joonas Lahtinen,
Tvrtko Ursulin, David Airlie, Simona Vetter, Laguna, Lukasz
> From: Winiarski, Michal <michal.winiarski@intel.com>
> Sent: Wednesday, October 22, 2025 5:05 PM
>
> On Tue, Oct 21, 2025 at 08:03:28PM -0300, Jason Gunthorpe wrote:
> > On Sat, Oct 11, 2025 at 09:38:47PM +0200, Michał Winiarski wrote:
> > > +	/*
> > > +	 * "STOP" handling is reused for "RUNNING_P2P", as the device doesn't have the capability to
> > > +	 * selectively block p2p DMA transfers.
> > > +	 * The device is not processing new workload requests when the VF is stopped, and both
> > > +	 * memory and MMIO communication channels are transferred to destination (where processing
> > > +	 * will be resumed).
> > > +	 */
> > > +	if ((cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_STOP) ||
> > > +	    (cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_RUNNING_P2P)) {
> > > + ret = xe_sriov_vfio_stop(xe_vdev->pf, xe_vdev->vfid);
> >
> > This comment is not right, RUNNING_P2P means the device can still
> > receive P2P activity on its BAR. E.g. a GPU will still allow read/write
> > to its framebuffer.
> >
> > But it is not initiating any new transactions.
>
> /*
> * "STOP" handling is reused for "RUNNING_P2P", as the device doesn't
> * have the capability to selectively block outgoing p2p DMA transfers.
> * While the device still allows BAR accesses when the VF is stopped, it
> * is not processing any new workload requests, effectively stopping
> * all outgoing DMA transfers (not just p2p).
> * Both memory and MMIO communication channels with the workload
> * scheduling firmware are transferred to destination (where processing
> * will be resumed).
> */
>
> Does this work better?
It's better to articulate that not only does the device allow accesses to its
MMIO regs/framebuffer, but also that the state of those accesses is queued/
kept and can be migrated as part of the device state later.
The last sentence is somewhat related to that point, but let's make it clearer.
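Something along these lines, perhaps (wording is only a suggestion):

/*
 * "STOP" handling is reused for "RUNNING_P2P", as the device doesn't
 * have the capability to selectively block outgoing p2p DMA transfers.
 * The device still allows accesses to its MMIO regs/framebuffer while
 * the VF is stopped; the state resulting from those accesses is kept
 * and later migrated as part of the device state.
 * The device is not processing any new workload requests, effectively
 * stopping all outgoing DMA transfers (not just p2p).
 * Both memory and MMIO communication channels with the workload
 * scheduling firmware are transferred to destination (where processing
 * will be resumed).
 */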
end of thread, other threads:[~2025-10-27 7:02 UTC | newest]
Thread overview: 82+ messages
2025-10-11 19:38 [PATCH 00/26] vfio/xe: Add driver variant for Xe VF migration Michał Winiarski
2025-10-11 19:38 ` [PATCH 01/26] drm/xe/pf: Remove GuC version check for migration support Michał Winiarski
2025-10-12 18:31 ` Michal Wajdeczko
2025-10-20 14:46 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 02/26] drm/xe: Move migration support to device-level struct Michał Winiarski
2025-10-12 18:58 ` Michal Wajdeczko
2025-10-20 14:48 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 03/26] drm/xe/pf: Add save/restore control state stubs and connect to debugfs Michał Winiarski
2025-10-12 20:09 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 04/26] drm/xe/pf: Extract migration mutex out of its struct Michał Winiarski
2025-10-12 19:08 ` Matthew Brost
2025-10-20 14:50 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 05/26] drm/xe/pf: Add data structures and handlers for migration rings Michał Winiarski
2025-10-12 21:06 ` Michal Wajdeczko
2025-10-20 14:56 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 06/26] drm/xe/pf: Add helpers for migration data allocation / free Michał Winiarski
2025-10-12 19:12 ` Matthew Brost
2025-10-21 0:26 ` Michał Winiarski
2025-10-13 10:15 ` Michal Wajdeczko
2025-10-21 0:01 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 07/26] drm/xe/pf: Add support for encap/decap of bitstream to/from packet Michał Winiarski
2025-10-11 22:28 ` kernel test robot
2025-10-13 10:46 ` Michal Wajdeczko
2025-10-21 0:25 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 08/26] drm/xe/pf: Add minimalistic migration descriptor Michał Winiarski
2025-10-11 22:52 ` kernel test robot
2025-10-13 10:56 ` Michal Wajdeczko
2025-10-21 0:31 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 09/26] drm/xe/pf: Expose VF migration data size over debugfs Michał Winiarski
2025-10-12 19:15 ` Matthew Brost
2025-10-21 0:37 ` Michał Winiarski
2025-10-13 11:04 ` Michal Wajdeczko
2025-10-21 0:42 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 10/26] drm/xe: Add sa/guc_buf_cache sync interface Michał Winiarski
2025-10-12 18:06 ` Matthew Brost
2025-10-21 0:45 ` Michał Winiarski
2025-10-13 11:20 ` Michal Wajdeczko
2025-10-21 0:44 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 11/26] drm/xe: Allow the caller to pass guc_buf_cache size Michał Winiarski
2025-10-11 23:35 ` kernel test robot
2025-10-13 11:08 ` Michal Wajdeczko
2025-10-21 0:47 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 12/26] drm/xe/pf: Increase PF GuC Buffer Cache size and use it for VF migration Michał Winiarski
2025-10-13 11:27 ` Michal Wajdeczko
2025-10-21 0:50 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 13/26] drm/xe/pf: Remove GuC migration data save/restore from GT debugfs Michał Winiarski
2025-10-13 11:36 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 14/26] drm/xe/pf: Don't save GuC VF migration data on pause Michał Winiarski
2025-10-13 11:42 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 15/26] drm/xe/pf: Switch VF migration GuC save/restore to struct migration data Michał Winiarski
2025-10-11 19:38 ` [PATCH 16/26] drm/xe/pf: Handle GuC migration data as part of PF control Michał Winiarski
2025-10-13 11:56 ` Michal Wajdeczko
2025-10-21 0:52 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 17/26] drm/xe/pf: Add helpers for VF GGTT migration data handling Michał Winiarski
2025-10-13 12:17 ` Michal Wajdeczko
2025-10-21 1:00 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 18/26] drm/xe/pf: Handle GGTT migration data as part of PF control Michał Winiarski
2025-10-13 12:36 ` Michal Wajdeczko
2025-10-21 1:16 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 19/26] drm/xe/pf: Add helpers for VF MMIO migration data handling Michał Winiarski
2025-10-13 13:28 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 20/26] drm/xe/pf: Handle MMIO migration data as part of PF control Michał Winiarski
2025-10-11 19:38 ` [PATCH 21/26] drm/xe/pf: Add helper to retrieve VF's LMEM object Michał Winiarski
2025-10-11 19:38 ` [PATCH 22/26] drm/xe/migrate: Add function for raw copy of VRAM and CCS Michał Winiarski
2025-10-12 18:54 ` Matthew Brost
2025-10-11 19:38 ` [PATCH 23/26] drm/xe/pf: Handle VRAM migration data as part of PF control Michał Winiarski
2025-10-11 19:38 ` [PATCH 24/26] drm/xe/pf: Add wait helper for VF FLR Michał Winiarski
2025-10-13 13:49 ` Michal Wajdeczko
2025-10-11 19:38 ` [PATCH 25/26] drm/xe/pf: Export helpers for VFIO Michał Winiarski
2025-10-12 18:32 ` Matthew Brost
2025-10-21 1:38 ` Michał Winiarski
2025-10-13 14:02 ` Michal Wajdeczko
2025-10-21 1:49 ` Michał Winiarski
2025-10-11 19:38 ` [PATCH 26/26] vfio/xe: Add vendor-specific vfio_pci driver for Intel graphics Michał Winiarski
2025-10-13 19:00 ` Rodrigo Vivi
2025-10-21 23:03 ` Jason Gunthorpe
2025-10-21 23:14 ` Matthew Brost
2025-10-21 23:38 ` Jason Gunthorpe
2025-10-22 1:15 ` Matthew Brost
2025-10-22 13:02 ` Jason Gunthorpe
2025-10-22 9:05 ` Michał Winiarski
2025-10-27 7:02 ` Tian, Kevin