* [PATCH v2 00/11] Introduce SRIOV scheduler groups
@ 2025-12-06 23:03 Daniele Ceraolo Spurio
2025-12-06 23:03 ` [PATCH v2 01/11] drm/xe/gt: Add engine masks for each class Daniele Ceraolo Spurio
` (12 more replies)
0 siblings, 13 replies; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-06 23:03 UTC (permalink / raw)
To: intel-xe; +Cc: Daniele Ceraolo Spurio, Michal Wajdeczko
The normal SRIOV setup timeslices the whole GT across VFs. While this is
fine in the great majority of cases, in some cases the admin knows that
a VF is not going to use all the GT HW and that some engines are going
to be permanently idle.
To increase HW utilization in such a scenario, starting from v70.53.0 the
GuC supports scheduler groups (a.k.a. Engine Group Scheduling or EGS);
this feature allows the driver to subdivide a GT into groups of engines,
which the GuC will then independently timeslice across VFs, thus allowing
multiple VFs to access the HW at the same time. Given that each group is
independently scheduled, execution quantums and preemption timeouts are
settable per-group-per-VF. Note that while the GuC supports the feature
from v70.53.0, some fixes for it were merged in v70.55.1, so we require
the latter version in the driver.
While the GuC supports any group assignment (as long as each engine
belongs to only one group), we only allow the admin to set specific
tested configurations that are tailored to specific use-cases. This
series introduces one of those use-cases: if each VF is doing frame
rendering + encoding at a not-too-high resolution (e.g. 1080p@30fps),
as happens e.g. with a simple remote desktop, the render engine
can produce frames faster than the video engine can encode them.
However, our HW can have multiple video engines, so while one of them is
encoding a frame for a VF the other ones can be used for encoding frames
for other VFs. Given that media slices share some resources (e.g. SFC),
to obtain this parallel execution without impacting VF isolation we can
simply assign each media slice to a different group.
This series only allows enabling/disabling of this feature via debugfs
for now (like several other SRIOV features). Sysfs will be implemented
as a follow-up, once the review of this series and of the proposed
interface is complete.
The feature is enabled and disabled via the sched_groups_mode PF debugfs
file. If any configs are supported on the GT, reading this file will dump
the available configs and which one is selected, e.g.:
# cat sriov/pf/tile0/gt1/sched_groups_mode
[disabled] media_slices
Writing the config name to the file will enable that configuration.
Debugfs files are also available to set the per-group exec_quantum and
preempt_timeout, while a series of files under the sched_groups folder
lists the engines belonging to each group. Overall, the tree looks like
the following:
/sys/kernel/debug/dri/BDF/
├── sriov
:   ├── pf
:   :   ├── tile0
:   :   :   ├── gt0
:   :   :   :   ├── sched_groups_mode
:   :   :   :   ├── sched_groups_exec_quantums_ms
:   :   :   :   ├── sched_groups_preempt_timeout_us
:   :   :   :   ├── sched_groups
:   :   :   :   :   ├── group0
:   :   :   :   :   :
:   :   :   :   :   └── groupN
:   ├── vf1
:   :   ├── tile0
:   :   :   ├── gt0
:   :   :   :   ├── sched_groups_exec_quantums_ms
:   :   :   :   ├── sched_groups_preempt_timeout_us
:
IMPORTANT NOTE: this series now requires GuC 70.55.1 or newer, while in
linux-firmware we're still on 70.54.0. A new GuC should be published
within the next few days, but this series can't be merged until that
happens and 70.55.1 or a newer GuC is in linux-firmware. We can however
still continue the review and get the patches ready for when the GuC is
published. Once the new GuC is available I'll also re-submit the series
for CI testing.
v2: several fixes, flow and style improvements, drop per-group EQ/PT
settings, allocate debugfs files statically, bump requirements to BMG+
and GuC v70.55.1.
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Daniele Ceraolo Spurio (11):
drm/xe/gt: Add engine masks for each class
drm/xe/sriov: Initialize scheduler groups
drm/xe/sriov: Add support for enabling scheduler groups
drm/xe/sriov: Scheduler groups are incompatible with multi-lrc
drm/xe/sriov: Add handling for MLRC adverse event threshold
drm/xe/sriov: Add debugfs to enable scheduler groups
drm/xe/sriov: Add debugfs with scheduler groups information
drm/xe/sriov: Prep for multiple exec quantums and preemption timeouts
drm/xe/sriov: Add functions to set exec quantums for each group
drm/xe/sriov: Add functions to set preempt timeouts for each group
drm/xe/sriov: Add debugfs to set EQ and PT for scheduler groups
drivers/gpu/drm/xe/abi/guc_klvs_abi.h | 68 ++++
drivers/gpu/drm/xe/xe_exec_queue.c | 19 +
drivers/gpu/drm/xe/xe_gt.h | 15 +-
drivers/gpu/drm/xe/xe_gt_ccs_mode.c | 8 +-
drivers/gpu/drm/xe/xe_gt_ccs_mode.h | 2 +-
drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 20 ++
drivers/gpu/drm/xe/xe_gt_sriov_pf.h | 8 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 298 +++++++++++++++-
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h | 10 +
.../gpu/drm/xe/xe_gt_sriov_pf_config_types.h | 5 +-
drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 325 +++++++++++++++++-
drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c | 311 +++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h | 6 +
.../gpu/drm/xe/xe_gt_sriov_pf_policy_types.h | 33 ++
drivers/gpu/drm/xe/xe_gt_sriov_vf.c | 60 ++++
drivers/gpu/drm/xe/xe_gt_sriov_vf.h | 1 +
drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h | 2 +
drivers/gpu/drm/xe/xe_guc.c | 2 +-
drivers/gpu/drm/xe/xe_guc.h | 7 +-
drivers/gpu/drm/xe/xe_guc_fwif.h | 2 +
drivers/gpu/drm/xe/xe_guc_klv_helpers.c | 9 +
.../drm/xe/xe_guc_klv_thresholds_set_types.h | 18 +-
drivers/gpu/drm/xe/xe_guc_submit.c | 23 +-
drivers/gpu/drm/xe/xe_guc_submit.h | 2 +
drivers/gpu/drm/xe/xe_guc_version.h | 36 ++
25 files changed, 1245 insertions(+), 45 deletions(-)
create mode 100644 drivers/gpu/drm/xe/xe_guc_version.h
--
2.43.0
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH v2 01/11] drm/xe/gt: Add engine masks for each class
2025-12-06 23:03 [PATCH v2 00/11] Introduce SRIOV scheduler groups Daniele Ceraolo Spurio
@ 2025-12-06 23:03 ` Daniele Ceraolo Spurio
2025-12-07 15:35 ` Michal Wajdeczko
2025-12-06 23:03 ` [PATCH v2 02/11] drm/xe/sriov: Initialize scheduler groups Daniele Ceraolo Spurio
` (11 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-06 23:03 UTC (permalink / raw)
To: intel-xe; +Cc: Daniele Ceraolo Spurio, Michal Wajdeczko
Follow-up patches will need the engine masks for VCS and VECS engines.
Since we already have a macro for the CCS engines, just extend the same
approach to all classes.
To avoid confusion with the XE_HW_ENGINE_*_MASK masks, the new macros
use the _INSTANCES suffix instead. For consistency, rename CCS_MASK to
CCS_INSTANCES as well.
v2: Use _INSTANCES suffix (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/xe/xe_gt.h | 9 ++++++++-
drivers/gpu/drm/xe/xe_gt_ccs_mode.c | 8 ++++----
drivers/gpu/drm/xe/xe_gt_ccs_mode.h | 2 +-
drivers/gpu/drm/xe/xe_guc.c | 2 +-
drivers/gpu/drm/xe/xe_guc_submit.c | 2 +-
5 files changed, 15 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
index 94969ddd9d88..2eeeeeb6b912 100644
--- a/drivers/gpu/drm/xe/xe_gt.h
+++ b/drivers/gpu/drm/xe/xe_gt.h
@@ -20,7 +20,14 @@
for_each_if(((hwe__) = (gt__)->hw_engines + (id__)) && \
xe_hw_engine_is_valid((hwe__)))
-#define CCS_MASK(gt) (((gt)->info.engine_mask & XE_HW_ENGINE_CCS_MASK) >> XE_HW_ENGINE_CCS0)
+#define __ENGINE_INSTANCES_MASK(gt, name) \
+ (((gt)->info.engine_mask & XE_HW_ENGINE_##name##_MASK) >> XE_HW_ENGINE_##name##0)
+
+#define RCS_INSTANCES(gt) __ENGINE_INSTANCES_MASK(gt, RCS)
+#define VCS_INSTANCES(gt) __ENGINE_INSTANCES_MASK(gt, VCS)
+#define VECS_INSTANCES(gt) __ENGINE_INSTANCES_MASK(gt, VECS)
+#define CCS_INSTANCES(gt) __ENGINE_INSTANCES_MASK(gt, CCS)
+#define GSCCS_INSTANCES(gt) __ENGINE_INSTANCES_MASK(gt, GSCCS)
#define GT_VER(gt) ({ \
typeof(gt) gt_ = (gt); \
diff --git a/drivers/gpu/drm/xe/xe_gt_ccs_mode.c b/drivers/gpu/drm/xe/xe_gt_ccs_mode.c
index 50fffc9ebf62..91ac22ef5703 100644
--- a/drivers/gpu/drm/xe/xe_gt_ccs_mode.c
+++ b/drivers/gpu/drm/xe/xe_gt_ccs_mode.c
@@ -17,7 +17,7 @@
static void __xe_gt_apply_ccs_mode(struct xe_gt *gt, u32 num_engines)
{
u32 mode = CCS_MODE_CSLICE_0_3_MASK; /* disable all by default */
- int num_slices = hweight32(CCS_MASK(gt));
+ int num_slices = hweight32(CCS_INSTANCES(gt));
struct xe_device *xe = gt_to_xe(gt);
int width, cslice = 0;
u32 config = 0;
@@ -59,7 +59,7 @@ static void __xe_gt_apply_ccs_mode(struct xe_gt *gt, u32 num_engines)
config |= BIT(hwe->instance) << XE_HW_ENGINE_CCS0;
/* If a slice is fused off, leave disabled */
- while ((CCS_MASK(gt) & BIT(cslice)) == 0)
+ while ((CCS_INSTANCES(gt) & BIT(cslice)) == 0)
cslice++;
mode &= ~CCS_MODE_CSLICE(cslice, CCS_MODE_CSLICE_MASK);
@@ -94,7 +94,7 @@ num_cslices_show(struct device *kdev,
{
struct xe_gt *gt = kobj_to_gt(&kdev->kobj);
- return sysfs_emit(buf, "%u\n", hweight32(CCS_MASK(gt)));
+ return sysfs_emit(buf, "%u\n", hweight32(CCS_INSTANCES(gt)));
}
static DEVICE_ATTR_RO(num_cslices);
@@ -131,7 +131,7 @@ ccs_mode_store(struct device *kdev, struct device_attribute *attr,
* Ensure number of engines specified is valid and there is an
* exact multiple of engines for slices.
*/
- num_slices = hweight32(CCS_MASK(gt));
+ num_slices = hweight32(CCS_INSTANCES(gt));
if (!num_engines || num_engines > num_slices || num_slices % num_engines) {
xe_gt_dbg(gt, "Invalid compute config, %d engines %d slices\n",
num_engines, num_slices);
diff --git a/drivers/gpu/drm/xe/xe_gt_ccs_mode.h b/drivers/gpu/drm/xe/xe_gt_ccs_mode.h
index f8779852cf0d..ef3b853f5c8c 100644
--- a/drivers/gpu/drm/xe/xe_gt_ccs_mode.h
+++ b/drivers/gpu/drm/xe/xe_gt_ccs_mode.h
@@ -17,7 +17,7 @@ int xe_gt_ccs_mode_sysfs_init(struct xe_gt *gt);
static inline bool xe_gt_ccs_mode_enabled(const struct xe_gt *gt)
{
/* Check if there are more than one compute engines available */
- return hweight32(CCS_MASK(gt)) > 1;
+ return hweight32(CCS_INSTANCES(gt)) > 1;
}
#endif
diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
index 88376bc2a483..ccc914563ca0 100644
--- a/drivers/gpu/drm/xe/xe_guc.c
+++ b/drivers/gpu/drm/xe/xe_guc.c
@@ -175,7 +175,7 @@ static bool needs_wa_dual_queue(struct xe_gt *gt)
* the DUAL_QUEUE_WA on all newer platforms on GTs that have CCS engines
* to move management back to the GuC.
*/
- if (CCS_MASK(gt) && GRAPHICS_VERx100(gt_to_xe(gt)) >= 1270)
+ if (CCS_INSTANCES(gt) && GRAPHICS_VERx100(gt_to_xe(gt)) >= 1270)
return true;
return false;
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index 3ca2558c8c96..af43acf7baae 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -388,7 +388,7 @@ static int guc_init_global_schedule_policy(struct xe_guc *guc)
*emit++ = XE_GUC_ACTION_UPDATE_SCHEDULING_POLICIES_KLV;
- if (CCS_MASK(guc_to_gt(guc)))
+ if (CCS_INSTANCES(guc_to_gt(guc)))
emit = emit_render_compute_yield_klv(emit);
count = emit - data;
--
2.43.0
* [PATCH v2 02/11] drm/xe/sriov: Initialize scheduler groups
2025-12-06 23:03 [PATCH v2 00/11] Introduce SRIOV scheduler groups Daniele Ceraolo Spurio
2025-12-06 23:03 ` [PATCH v2 01/11] drm/xe/gt: Add engine masks for each class Daniele Ceraolo Spurio
@ 2025-12-06 23:03 ` Daniele Ceraolo Spurio
2025-12-07 21:57 ` Michal Wajdeczko
2025-12-06 23:03 ` [PATCH v2 03/11] drm/xe/sriov: Add support for enabling " Daniele Ceraolo Spurio
` (10 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-06 23:03 UTC (permalink / raw)
To: intel-xe; +Cc: Daniele Ceraolo Spurio, Michal Wajdeczko
Scheduler groups (a.k.a. Engine Group Scheduling, or EGS) are a GuC
feature that allows the driver to define groups of engines that are
independently scheduled across VFs, which allows different VFs to be
active on the HW at the same time on different groups. The feature is
available for BMG and newer HW starting from GuC 70.53.0, but some
required fixes were only added in GuC 70.55.1.
This is intended for specific scenarios where the admin knows that the
VFs are not going to fully utilize the HW, so that assigning all of it
to a single VF at a time would leave part of it permanently idle.
We do not allow the admin to decide how to divide the engines across
groups, but we instead support specific configurations that are designed
for specific use-cases. During PF initialization we detect which
configurations are possible on a given GT and create the relevant
groups. Since the GuC expects a mask for each class for each group, that
is what we save when we init the configs.
Right now we only have one use-case on the media GT. If the VFs are
running a frame render + encoding at a not-too-high resolution (e.g.
1080p@30fps), the render can produce frames faster than the video engine
can encode them, which means that the maximum number of parallel VFs is
limited by the VCS bandwidth. Since our products can have multiple VCS
engines, allowing multiple VFs to be active on the different VCS engines
at the same time allows us to run more parallel VFs on the same HW.
Given that engines in the same media slice share some resources (e.g.
SFC), we assign each media slice to a different scheduling group. We
refer to this configuration as "media_slices", given that each slice
gets its own group.
Note that while the GuC interface supports a maximum of 8 groups, the
actual number of groups that can be enabled can be lower than that and
can be different on different devices. For now, all devices support up
to 2 groups.
v2: Use asserts for coding errors, code cleanups, better docs (Michal),
limit groups to 2, limit to BMG and newer, bump required GuC to
70.55.1.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/xe/xe_gt.h | 6 +
drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 3 +
drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c | 142 ++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h | 1 +
.../gpu/drm/xe/xe_gt_sriov_pf_policy_types.h | 29 ++++
5 files changed, 181 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
index 2eeeeeb6b912..0a34e862406e 100644
--- a/drivers/gpu/drm/xe/xe_gt.h
+++ b/drivers/gpu/drm/xe/xe_gt.h
@@ -29,6 +29,12 @@
#define CCS_INSTANCES(gt) __ENGINE_INSTANCES_MASK(gt, CCS)
#define GSCCS_INSTANCES(gt) __ENGINE_INSTANCES_MASK(gt, GSCCS)
+/*
+ * Each media slice has 1x VECS, so the max number of VECS instances gives us
+ * the max number of slices that a GT can have.
+ */
+#define MAX_MEDIA_SLICES hweight32(XE_HW_ENGINE_VECS_MASK)
+
#define GT_VER(gt) ({ \
typeof(gt) gt_ = (gt); \
struct xe_device *xe = gt_to_xe(gt_); \
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
index 0714c758b9c1..0d97a823e702 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
@@ -14,6 +14,7 @@
#include "xe_gt_sriov_pf_control.h"
#include "xe_gt_sriov_pf_helpers.h"
#include "xe_gt_sriov_pf_migration.h"
+#include "xe_gt_sriov_pf_policy.h"
#include "xe_gt_sriov_pf_service.h"
#include "xe_gt_sriov_printk.h"
#include "xe_guc_submit.h"
@@ -123,6 +124,8 @@ int xe_gt_sriov_pf_init(struct xe_gt *gt)
if (err)
return err;
+ xe_gt_sriov_pf_policy_init(gt);
+
err = xe_gt_sriov_pf_migration_init(gt);
if (err)
return err;
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
index 4445f660e6d1..158d68aff4b7 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
@@ -3,6 +3,8 @@
* Copyright © 2023-2024 Intel Corporation
*/
+#include <drm/drm_managed.h>
+
#include "abi/guc_actions_sriov_abi.h"
#include "xe_bo.h"
@@ -10,6 +12,7 @@
#include "xe_gt_sriov_pf_helpers.h"
#include "xe_gt_sriov_pf_policy.h"
#include "xe_gt_sriov_printk.h"
+#include "xe_guc.h"
#include "xe_guc_buf.h"
#include "xe_guc_ct.h"
#include "xe_guc_klv_helpers.h"
@@ -351,6 +354,133 @@ u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt)
return value;
}
+static void pf_sched_group_media_slices(struct xe_gt *gt, u32 **masks, u32 *num_masks)
+{
+ u8 slice_to_group[MAX_MEDIA_SLICES];
+ u32 vecs_mask = VECS_INSTANCES(gt);
+ u32 gsc_mask = GSCCS_INSTANCES(gt);
+ u32 vcs_mask = VCS_INSTANCES(gt);
+ struct xe_hw_engine *hwe;
+ enum xe_hw_engine_id id;
+ int groups = 0;
+ u32 *values;
+ int slice;
+
+ xe_gt_assert(gt, xe_gt_is_media_type(gt));
+
+ /* A media slice has 2 VCS and a VECS. We bundle the GSC with the first slice */
+ for (slice = 0; slice < MAX_MEDIA_SLICES; slice++) {
+ if ((vcs_mask & 0x3) || (vecs_mask & 0x1) || (gsc_mask & 0x1))
+ slice_to_group[slice] = groups++;
+
+ vcs_mask >>= 2;
+ vecs_mask >>= 1;
+ gsc_mask >>= 1;
+ }
+
+ xe_gt_assert(gt, !vcs_mask);
+ xe_gt_assert(gt, !vecs_mask);
+ xe_gt_assert(gt, !gsc_mask);
+
+ /* We need at least 2 slices to split them up */
+ if (groups < 2)
+ return;
+
+ if (groups > gt->sriov.pf.policy.guc.sched_groups.max_num_of_groups) {
+ xe_gt_sriov_notice(gt, "media_slice mode has too many groups: %u vs %u\n",
+ groups,
+ gt->sriov.pf.policy.guc.sched_groups.max_num_of_groups);
+ return;
+ }
+
+ /*
+ * The GuC expects an array with GUC_MAX_ENGINE_CLASSES entries for
+ * each group.
+ */
+ values = drmm_kzalloc(>_to_xe(gt)->drm,
+ GUC_MAX_ENGINE_CLASSES * groups * sizeof(u32),
+ GFP_KERNEL);
+ if (!values)
+ return;
+
+ for_each_hw_engine(hwe, gt, id) {
+ u8 guc_class = xe_engine_class_to_guc_class(hwe->class);
+ u8 entry;
+
+ switch (hwe->class) {
+ case XE_ENGINE_CLASS_VIDEO_DECODE:
+ slice = hwe->instance / 2;
+ break;
+ case XE_ENGINE_CLASS_VIDEO_ENHANCE:
+ slice = hwe->instance;
+ break;
+ case XE_ENGINE_CLASS_OTHER:
+ slice = 0;
+ break;
+ default:
+ xe_gt_assert_msg(gt, false,
+ "unknown media gt class %u (%s) during EGS setup\n",
+ hwe->class, hwe->name);
+ drmm_kfree(>_to_xe(gt)->drm, values);
+ return;
+ }
+
+ entry = slice_to_group[slice] * GUC_MAX_ENGINE_CLASSES + guc_class;
+ values[entry] |= BIT(hwe->logical_instance);
+ }
+
+ *masks = values;
+ *num_masks = GUC_MAX_ENGINE_CLASSES * groups;
+}
+
+static void pf_init_sched_groups(struct xe_gt *gt)
+{
+ int m;
+
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+
+ /*
+ * The GuC supports scheduler groups from v70.53.0, but a fix for it has
+ * been merged in v70.55.1, so we require the latter. The feature is
+ * also only enabled on BMG and newer FW.
+ */
+ if (GUC_FIRMWARE_VER(>->uc.guc) < MAKE_GUC_VER(70, 55, 1) ||
+ gt_to_xe(gt)->info.platform < XE_BATTLEMAGE)
+ return;
+
+ /*
+ * The GuC interface supports up to 8 groups. However, the GuC only
+ * fully allocates resources for a subset of groups, based on the number
+ * of engines and expected usage. The plan is for this to become
+ * queryable via H2G, but for now GuC FW for all devices supports a
+ * maximum of 2 groups so we can just hardcode that.
+ */
+ gt->sriov.pf.policy.guc.sched_groups.max_num_of_groups = 2;
+
+ for (m = 0; m < XE_SRIOV_SCHED_GROUPS_MODES_COUNT; m++) {
+ u32 *masks = NULL;
+ u32 num_masks = 0;
+
+ switch (m) {
+ case XE_SRIOV_SCHED_GROUPS_NONE:
+ break;
+ case XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES:
+ /* this mode only has groups on the media GT */
+ if (xe_gt_is_media_type(gt))
+ pf_sched_group_media_slices(gt, &masks, &num_masks);
+ break;
+ default:
+ xe_gt_assert_msg(gt, false, "unknown sched group mode %u\n", m);
+ return;
+ }
+
+ xe_gt_assert(gt, (num_masks % GUC_MAX_ENGINE_CLASSES) == 0);
+
+ gt->sriov.pf.policy.guc.sched_groups.modes[m].masks = masks;
+ gt->sriov.pf.policy.guc.sched_groups.modes[m].num_masks = num_masks;
+ }
+}
+
static void pf_sanitize_guc_policies(struct xe_gt *gt)
{
pf_sanitize_sched_if_idle(gt);
@@ -401,6 +531,18 @@ int xe_gt_sriov_pf_policy_reprovision(struct xe_gt *gt, bool reset)
return err ? -ENXIO : 0;
}
+/**
+ * xe_gt_sriov_pf_policy_init - Initializes the SW state of the PF policies.
+ * @gt: the &xe_gt
+ *
+ * This function can only be called on PF. This function does not touch the HW,
+ * but must be called after the engines have been initialized.
+ */
+void xe_gt_sriov_pf_policy_init(struct xe_gt *gt)
+{
+ pf_init_sched_groups(gt);
+}
+
static void print_guc_policies(struct drm_printer *p, struct xe_gt_sriov_guc_policies *policy)
{
drm_printf(p, "%s:\t%s\n",
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
index 2a5dc33dc6d7..52312d24d527 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
@@ -18,6 +18,7 @@ bool xe_gt_sriov_pf_policy_get_reset_engine(struct xe_gt *gt);
int xe_gt_sriov_pf_policy_set_sample_period(struct xe_gt *gt, u32 value);
u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt);
+void xe_gt_sriov_pf_policy_init(struct xe_gt *gt);
void xe_gt_sriov_pf_policy_sanitize(struct xe_gt *gt);
int xe_gt_sriov_pf_policy_reprovision(struct xe_gt *gt, bool reset);
int xe_gt_sriov_pf_policy_print(struct xe_gt *gt, struct drm_printer *p);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
index 4de532af135e..1d4cdc87e069 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
@@ -8,16 +8,45 @@
#include <linux/types.h>
+/**
+ * enum xe_sriov_sched_group_modes - list of possible scheduler group modes
+ * @XE_SRIOV_SCHED_GROUPS_NONE - no separate groups (i.e., all engines in group 0)
+ * @XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES - separate groups for each media slice
+ * @XE_SRIOV_SCHED_GROUPS_MODES_COUNT - number of valid modes
+ */
+enum xe_sriov_sched_group_modes {
+ XE_SRIOV_SCHED_GROUPS_NONE = 0,
+ XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES,
+ XE_SRIOV_SCHED_GROUPS_MODES_COUNT
+};
+
+/**
+ * struct xe_gt_sriov_scheduler_groups - Scheduler groups policy info
+ * @max_num_of_groups: number of groups supported by the GuC for the platform
+ * @modes: array of masks and their number for each mode
+ * @modes.masks: array of masks for a given mode
+ * @modes.num_masks: number of masks in the array
+ */
+struct xe_gt_sriov_scheduler_groups {
+ u8 max_num_of_groups;
+ struct {
+ u32 *masks;
+ u32 num_masks;
+ } modes[XE_SRIOV_SCHED_GROUPS_MODES_COUNT];
+};
+
/**
* struct xe_gt_sriov_guc_policies - GuC SR-IOV policies.
* @sched_if_idle: controls strict scheduling policy.
* @reset_engine: controls engines reset on VF switch policy.
* @sample_period: adverse events sampling period (in milliseconds).
+ * @sched_groups: available scheduling group configurations.
*/
struct xe_gt_sriov_guc_policies {
bool sched_if_idle;
bool reset_engine;
u32 sample_period;
+ struct xe_gt_sriov_scheduler_groups sched_groups;
};
/**
--
2.43.0
* [PATCH v2 03/11] drm/xe/sriov: Add support for enabling scheduler groups
2025-12-06 23:03 [PATCH v2 00/11] Introduce SRIOV scheduler groups Daniele Ceraolo Spurio
2025-12-06 23:03 ` [PATCH v2 01/11] drm/xe/gt: Add engine masks for each class Daniele Ceraolo Spurio
2025-12-06 23:03 ` [PATCH v2 02/11] drm/xe/sriov: Initialize scheduler groups Daniele Ceraolo Spurio
@ 2025-12-06 23:03 ` Daniele Ceraolo Spurio
2025-12-07 21:57 ` Michal Wajdeczko
2025-12-06 23:04 ` [PATCH v2 04/11] drm/xe/sriov: Scheduler groups are incompatible with multi-lrc Daniele Ceraolo Spurio
` (9 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-06 23:03 UTC (permalink / raw)
To: intel-xe; +Cc: Daniele Ceraolo Spurio, Michal Wajdeczko
Scheduler groups are enabled by sending a specific policy configuration
KLV to the GuC. We don't allow changing this policy while there are VFs
active, since the expectation is that a VF will only check whether the
feature is enabled during driver initialization.
The functions added by this patch will be used by sysfs/debugfs, coming
in follow-up patches.
v2: code improvements, add GUC_MAX_SCHED_GROUPS define, don't add
XE_SRIOV_SCHED_GROUPS_NONE to supported_modes (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/xe/abi/guc_klvs_abi.h | 17 +++
drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c | 136 ++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h | 3 +
.../gpu/drm/xe/xe_gt_sriov_pf_policy_types.h | 4 +
drivers/gpu/drm/xe/xe_guc_fwif.h | 2 +
drivers/gpu/drm/xe/xe_guc_klv_helpers.c | 2 +
6 files changed, 164 insertions(+)
diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
index 265a135e7061..45733a87183a 100644
--- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
+++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
@@ -200,6 +200,20 @@ enum {
* :0: adverse events are not counted (default)
* :n: sample period in milliseconds
*
+ * _`GUC_KLV_VGT_POLICY_ENGINE_GROUP_CONFIG` : 0x8004
+ * This config allows the PF to split the engines across scheduling groups.
+ * Each group is independently timesliced across VFs, allowing different
+ * VFs to be active on the HW at the same time. When enabling this feature,
+ * all engines must be assigned to a group (and only one group), or they
+ * will be excluded from scheduling after this KLV is sent. To enable
+ * the groups, the driver must provide a masks array with
+ * GUC_MAX_ENGINE_CLASSES entries for each group, with each mask indicating
+ * which logical instances of that class belong to the group. Therefore,
+ * the length of this KLV when enabling groups is
+ * num_groups * GUC_MAX_ENGINE_CLASSES. To disable the groups, the driver
+ * must send the KLV without any payload (i.e. len = 0). The maximum
+ * number of groups is 8.
+ *
* _`GUC_KLV_VGT_POLICY_RESET_AFTER_VF_SWITCH` : 0x8D00
* This enum is to reset utilized HW engine after VF Switch (i.e to clean
* up Stale HW register left behind by previous VF)
@@ -214,6 +228,9 @@ enum {
#define GUC_KLV_VGT_POLICY_ADVERSE_SAMPLE_PERIOD_KEY 0x8002
#define GUC_KLV_VGT_POLICY_ADVERSE_SAMPLE_PERIOD_LEN 1u
+#define GUC_KLV_VGT_POLICY_ENGINE_GROUP_CONFIG_KEY 0x8004
+#define GUC_KLV_VGT_POLICY_ENGINE_GROUP_MAX_COUNT 8u
+
#define GUC_KLV_VGT_POLICY_RESET_AFTER_VF_SWITCH_KEY 0x8D00
#define GUC_KLV_VGT_POLICY_RESET_AFTER_VF_SWITCH_LEN 1u
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
index 158d68aff4b7..1109fec99fc3 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
@@ -97,6 +97,23 @@ static int pf_push_policy_u32(struct xe_gt *gt, u16 key, u32 value)
return pf_push_policy_klvs(gt, 1, klv, ARRAY_SIZE(klv));
}
+static int pf_push_policy_payload(struct xe_gt *gt, u16 key, u32 *payload, u32 num_dwords)
+{
+ CLASS(xe_guc_buf, buf)(>->uc.guc.buf, GUC_KLV_LEN_MIN + num_dwords);
+ u32 *klv;
+
+ if (!xe_guc_buf_is_valid(buf))
+ return -ENOBUFS;
+
+ klv = xe_guc_buf_cpu_ptr(buf);
+
+ klv[0] = PREP_GUC_KLV(key, num_dwords);
+ if (num_dwords)
+ memcpy(&klv[1], payload, num_dwords * sizeof(u32));
+
+ return pf_push_policy_buf_klvs(gt, 1, buf, GUC_KLV_LEN_MIN + num_dwords);
+}
+
static int pf_update_policy_bool(struct xe_gt *gt, u16 key, bool *policy, bool value)
{
int err;
@@ -476,16 +493,134 @@ static void pf_init_sched_groups(struct xe_gt *gt)
xe_gt_assert(gt, (num_masks % GUC_MAX_ENGINE_CLASSES) == 0);
+ xe_gt_assert(gt, num_masks / GUC_MAX_ENGINE_CLASSES < GUC_MAX_SCHED_GROUPS);
+
+ if (num_masks)
+ gt->sriov.pf.policy.guc.sched_groups.supported_modes |= BIT(m);
+
gt->sriov.pf.policy.guc.sched_groups.modes[m].masks = masks;
gt->sriov.pf.policy.guc.sched_groups.modes[m].num_masks = num_masks;
}
}
+/**
+ * xe_sriov_gt_pf_policy_has_multi_group_modes() - check whether the GT supports
+ * any scheduler modes that have multiple groups
+ * @gt: the &xe_gt to check
+ *
+ * This function can only be called on PF.
+ *
+ * Return: true if the GT supports modes with multiple groups, false otherwise.
+ */
+bool xe_sriov_gt_pf_policy_has_multi_group_modes(struct xe_gt *gt)
+{
+ return gt->sriov.pf.policy.guc.sched_groups.supported_modes;
+}
+
+/**
+ * xe_sriov_gt_pf_policy_has_sched_group_mode() - check whether the GT supports
+ * a specific scheduler group mode
+ * @gt: the &xe_gt to check
+ * @mode: the mode to check
+ *
+ * This function can only be called on PF.
+ *
+ * Return: true if the GT supports the specified mode, false otherwise.
+ */
+bool xe_sriov_gt_pf_policy_has_sched_group_mode(struct xe_gt *gt, u32 mode)
+{
+ if (mode == XE_SRIOV_SCHED_GROUPS_NONE)
+ return true;
+
+ return gt->sriov.pf.policy.guc.sched_groups.supported_modes & BIT(mode);
+}
+
+static int __pf_provision_sched_groups(struct xe_gt *gt, u32 mode)
+{
+ u32 *masks = gt->sriov.pf.policy.guc.sched_groups.modes[mode].masks;
+ u32 num_masks = gt->sriov.pf.policy.guc.sched_groups.modes[mode].num_masks;
+
+ xe_gt_assert(gt, (num_masks % GUC_MAX_ENGINE_CLASSES) == 0);
+
+ return pf_push_policy_payload(gt, GUC_KLV_VGT_POLICY_ENGINE_GROUP_CONFIG_KEY,
+ masks, num_masks);
+}
+
+static int pf_provision_sched_groups(struct xe_gt *gt, u32 mode)
+{
+ int err;
+
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ lockdep_assert_held(xe_gt_sriov_pf_master_mutex(gt));
+
+ if (!xe_sriov_gt_pf_policy_has_sched_group_mode(gt, mode))
+ return -EINVAL;
+
+ /* already in the desired mode */
+ if (gt->sriov.pf.policy.guc.sched_groups.current_mode == mode)
+ return 0;
+
+ /*
+ * We don't allow changing this with VFs active since it is hard for
+ * VFs to check.
+ */
+ if (xe_sriov_pf_num_vfs(gt_to_xe(gt)))
+ return -EBUSY;
+
+ err = __pf_provision_sched_groups(gt, mode);
+ if (err)
+ return err;
+
+ gt->sriov.pf.policy.guc.sched_groups.current_mode = mode;
+
+ return 0;
+}
+
+static int pf_reprovision_sched_groups(struct xe_gt *gt)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ lockdep_assert_held(xe_gt_sriov_pf_master_mutex(gt));
+
+ /* We only have something to provision if we have possible groups */
+ if (!xe_sriov_gt_pf_policy_has_multi_group_modes(gt))
+ return 0;
+
+ return __pf_provision_sched_groups(gt, gt->sriov.pf.policy.guc.sched_groups.current_mode);
+}
+
+static void pf_sanitize_sched_groups(struct xe_gt *gt)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+ lockdep_assert_held(xe_gt_sriov_pf_master_mutex(gt));
+
+ gt->sriov.pf.policy.guc.sched_groups.current_mode = XE_SRIOV_SCHED_GROUPS_NONE;
+}
+
+/**
+ * xe_gt_sriov_pf_policy_set_sched_groups_mode() - Control the 'sched_groups' policy.
+ * @gt: the &xe_gt where to apply the policy
+ * @value: the sched_group mode to be activated
+ *
+ * This function can only be called on PF.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_policy_set_sched_groups_mode(struct xe_gt *gt,
+ enum xe_sriov_sched_group_modes value)
+{
+ if (!xe_sriov_gt_pf_policy_has_multi_group_modes(gt))
+ return -ENODEV;
+
+ guard(mutex)(xe_gt_sriov_pf_master_mutex(gt));
+ return pf_provision_sched_groups(gt, value);
+}
+
static void pf_sanitize_guc_policies(struct xe_gt *gt)
{
pf_sanitize_sched_if_idle(gt);
pf_sanitize_reset_engine(gt);
pf_sanitize_sample_period(gt);
+ pf_sanitize_sched_groups(gt);
}
/**
@@ -524,6 +659,7 @@ int xe_gt_sriov_pf_policy_reprovision(struct xe_gt *gt, bool reset)
err |= pf_reprovision_sched_if_idle(gt);
err |= pf_reprovision_reset_engine(gt);
err |= pf_reprovision_sample_period(gt);
+ err |= pf_reprovision_sched_groups(gt);
mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
xe_pm_runtime_put(gt_to_xe(gt));
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
index 52312d24d527..6b3e294bc934 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
@@ -17,6 +17,9 @@ int xe_gt_sriov_pf_policy_set_reset_engine(struct xe_gt *gt, bool enable);
bool xe_gt_sriov_pf_policy_get_reset_engine(struct xe_gt *gt);
int xe_gt_sriov_pf_policy_set_sample_period(struct xe_gt *gt, u32 value);
u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt);
+bool xe_sriov_gt_pf_policy_has_multi_group_modes(struct xe_gt *gt);
+bool xe_sriov_gt_pf_policy_has_sched_group_mode(struct xe_gt *gt, u32 mode);
+int xe_gt_sriov_pf_policy_set_sched_groups_mode(struct xe_gt *gt, u32 value);
void xe_gt_sriov_pf_policy_init(struct xe_gt *gt);
void xe_gt_sriov_pf_policy_sanitize(struct xe_gt *gt);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
index 1d4cdc87e069..d9928c200d72 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
@@ -23,12 +23,16 @@ enum xe_sriov_sched_group_modes {
/**
* struct xe_gt_sriov_scheduler_groups - Scheduler groups policy info
* @max_num_of_groups: number of groups supported by the GuC for the platform
+ * @supported_modes: mask of supported modes
+ * @current_mode: active scheduler groups mode
* @modes: array of masks and their number for each mode
* @modes.masks: array of masks for a given mode
* @modes.num_masks: number of masks in the array
*/
struct xe_gt_sriov_scheduler_groups {
u8 max_num_of_groups;
+ u32 supported_modes;
+ enum xe_sriov_sched_group_modes current_mode;
struct {
u32 *masks;
u32 num_masks;
diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
index 7d93c2749485..c2e0a2dae586 100644
--- a/drivers/gpu/drm/xe/xe_guc_fwif.h
+++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
@@ -46,6 +46,8 @@
#define GUC_MAX_ENGINE_CLASSES 16
#define GUC_MAX_INSTANCES_PER_CLASS 32
+#define GUC_MAX_SCHED_GROUPS GUC_KLV_VGT_POLICY_ENGINE_GROUP_MAX_COUNT
+
#define GUC_CONTEXT_NORMAL 0
#define GUC_CONTEXT_COMPRESSION_SAVE 1
#define GUC_CONTEXT_COMPRESSION_RESTORE 2
diff --git a/drivers/gpu/drm/xe/xe_guc_klv_helpers.c b/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
index 146a6eda9e06..1b08b443606e 100644
--- a/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
+++ b/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
@@ -26,6 +26,8 @@ const char *xe_guc_klv_key_to_string(u16 key)
return "sched_if_idle";
case GUC_KLV_VGT_POLICY_ADVERSE_SAMPLE_PERIOD_KEY:
return "sample_period";
+ case GUC_KLV_VGT_POLICY_ENGINE_GROUP_CONFIG_KEY:
+ return "engine_group_config";
case GUC_KLV_VGT_POLICY_RESET_AFTER_VF_SWITCH_KEY:
return "reset_engine";
/* VF CFG keys */
--
2.43.0
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 04/11] drm/xe/sriov: Scheduler groups are incompatible with multi-lrc
2025-12-06 23:03 [PATCH v2 00/11] Introduce SRIOV scheduler groups Daniele Ceraolo Spurio
` (2 preceding siblings ...)
2025-12-06 23:03 ` [PATCH v2 03/11] drm/xe/sriov: Add support for enabling " Daniele Ceraolo Spurio
@ 2025-12-06 23:04 ` Daniele Ceraolo Spurio
2025-12-07 21:58 ` Michal Wajdeczko
2025-12-06 23:04 ` [PATCH v2 05/11] drm/xe/sriov: Add handling for MLRC adverse event threshold Daniele Ceraolo Spurio
` (8 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-06 23:04 UTC (permalink / raw)
To: intel-xe; +Cc: Daniele Ceraolo Spurio, Michal Wajdeczko
Since engines in the same class can be divided across multiple groups,
the GuC does not allow scheduler groups to be active if there are
multi-lrc contexts. This means that:
1) if a MLRC context is registered when we enable scheduler groups, the
GuC will silently ignore the configuration
2) if a MLRC context is registered after scheduler groups are enabled,
the GuC will disable the groups and generate an adverse event.
The expectation is that the admin will ensure that all apps that use
MLRC on PF have been terminated before scheduler groups are created. A
check on PF is added anyway to make sure we don't still have contexts
waiting to be cleaned up lying around.
On both PF and VF we block creation of new MLRC queues once scheduler
groups have been enabled.
v2: move threshold handling to its own patch, move MLRC check to
guc_submit.c, hide SRIOV internals from exec_queue creation code,
better comments/docs (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/xe/abi/guc_klvs_abi.h | 7 +++
drivers/gpu/drm/xe/xe_exec_queue.c | 19 +++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 17 ++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf.h | 8 +++
drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c | 28 ++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h | 1 +
drivers/gpu/drm/xe/xe_gt_sriov_vf.c | 60 ++++++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_vf.h | 1 +
drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h | 2 +
drivers/gpu/drm/xe/xe_guc_klv_helpers.c | 3 ++
drivers/gpu/drm/xe/xe_guc_submit.c | 21 ++++++++
drivers/gpu/drm/xe/xe_guc_submit.h | 2 +
12 files changed, 169 insertions(+)
diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
index 45733a87183a..edb0546fb163 100644
--- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
+++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
@@ -46,11 +46,18 @@
* Refers to 32 bit architecture version as reported by the HW IP.
* This key is supported on MTL+ platforms only.
* Requires GuC ABI 1.2+.
+ *
+ * _`GUC_KLV_GLOBAL_CFG_GROUP_SCHEDULING_AVAILABLE` : 0x3001
+ * Tells the driver whether scheduler groups are enabled or not.
+ * Requires GuC ABI 1.26+
*/
#define GUC_KLV_GLOBAL_CFG_GMD_ID_KEY 0x3000u
#define GUC_KLV_GLOBAL_CFG_GMD_ID_LEN 1u
+#define GUC_KLV_GLOBAL_CFG_GROUP_SCHEDULING_AVAILABLE_KEY 0x3001u
+#define GUC_KLV_GLOBAL_CFG_GROUP_SCHEDULING_AVAILABLE_LEN 1u
+
/**
* DOC: GuC Self Config KLVs
*
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index 226d07a3d852..df01c0664965 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -16,6 +16,7 @@
#include "xe_dep_scheduler.h"
#include "xe_device.h"
#include "xe_gt.h"
+#include "xe_gt_sriov_pf.h"
#include "xe_gt_sriov_vf.h"
#include "xe_hw_engine_class_sysfs.h"
#include "xe_hw_engine_group.h"
@@ -718,6 +719,17 @@ static u32 calc_validate_logical_mask(struct xe_device *xe,
return return_mask;
}
+static bool has_sched_groups(struct xe_gt *gt)
+{
+ if (IS_SRIOV_PF(gt_to_xe(gt)) && xe_gt_sriov_pf_sched_groups_enabled(gt))
+ return true;
+
+ if (IS_SRIOV_VF(gt_to_xe(gt)) && xe_gt_sriov_vf_sched_groups_enabled(gt))
+ return true;
+
+ return false;
+}
+
int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
struct drm_file *file)
{
@@ -810,6 +822,13 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
return -ENOENT;
}
+ /* SRIOV sched groups are not compatible with multi-lrc */
+ if (XE_IOCTL_DBG(xe, args->width > 1 && has_sched_groups(hwe->gt))) {
+ up_read(&vm->lock);
+ xe_vm_put(vm);
+ return -EINVAL;
+ }
+
q = xe_exec_queue_create(xe, vm, logical_mask,
args->width, hwe, flags,
args->extensions);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
index 0d97a823e702..fb5c9101e275 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
@@ -284,3 +284,20 @@ int xe_gt_sriov_pf_wait_ready(struct xe_gt *gt)
pf_flush_restart(gt);
return 0;
}
+
+/**
+ * xe_gt_sriov_pf_sched_groups_enabled() - Check if multiple scheduler groups are
+ * enabled
+ * @gt: the &xe_gt
+ *
+ * This function is for PF use only.
+ *
+ * Return: true if sched groups were enabled, false otherwise.
+ */
+bool xe_gt_sriov_pf_sched_groups_enabled(struct xe_gt *gt)
+{
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+
+ return xe_gt_sriov_pf_policy_sched_groups_enabled(gt);
+}
+
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
index e7fde3f9937a..1ccfc7137b98 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
@@ -6,6 +6,8 @@
#ifndef _XE_GT_SRIOV_PF_H_
#define _XE_GT_SRIOV_PF_H_
+#include <linux/types.h>
+
struct xe_gt;
#ifdef CONFIG_PCI_IOV
@@ -16,6 +18,7 @@ void xe_gt_sriov_pf_init_hw(struct xe_gt *gt);
void xe_gt_sriov_pf_sanitize_hw(struct xe_gt *gt, unsigned int vfid);
void xe_gt_sriov_pf_stop_prepare(struct xe_gt *gt);
void xe_gt_sriov_pf_restart(struct xe_gt *gt);
+bool xe_gt_sriov_pf_sched_groups_enabled(struct xe_gt *gt);
#else
static inline int xe_gt_sriov_pf_init_early(struct xe_gt *gt)
{
@@ -38,6 +41,11 @@ static inline void xe_gt_sriov_pf_stop_prepare(struct xe_gt *gt)
static inline void xe_gt_sriov_pf_restart(struct xe_gt *gt)
{
}
+
+static inline bool xe_gt_sriov_pf_sched_groups_enabled(struct xe_gt *gt)
+{
+ return false;
+}
#endif
#endif
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
index 1109fec99fc3..6a682d788b02 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
@@ -16,6 +16,7 @@
#include "xe_guc_buf.h"
#include "xe_guc_ct.h"
#include "xe_guc_klv_helpers.h"
+#include "xe_guc_submit.h"
#include "xe_pm.h"
/*
@@ -567,6 +568,19 @@ static int pf_provision_sched_groups(struct xe_gt *gt, u32 mode)
if (xe_sriov_pf_num_vfs(gt_to_xe(gt)))
return -EBUSY;
+ /*
+ * The GuC silently ignores the setting if any MLRC contexts are
+ * registered. We expect the admin to make sure that all apps that use
+ * MLRC are terminated before scheduler groups are enabled, so this
+ * check is just to make sure that the exec_queue destruction has been
+ * completed.
+ */
+ if (mode != XE_SRIOV_SCHED_GROUPS_NONE &&
+ xe_guc_has_registered_mlrc_queues(&gt->uc.guc)) {
+ xe_gt_sriov_notice(gt, "can't enable sched groups with active mlrc queues\n");
+ return -EPERM;
+ }
+
err = __pf_provision_sched_groups(gt, mode);
if (err)
return err;
@@ -615,6 +629,20 @@ int xe_gt_sriov_pf_policy_set_sched_groups_mode(struct xe_gt *gt,
return pf_provision_sched_groups(gt, value);
}
+/**
+ * xe_gt_sriov_pf_policy_sched_groups_enabled() - check whether the GT has
+ * multiple scheduler groups enabled
+ * @gt: the &xe_gt to check
+ *
+ * This function can only be called on PF.
+ *
+ * Return: true if the GT has multiple groups enabled, false otherwise.
+ */
+bool xe_gt_sriov_pf_policy_sched_groups_enabled(struct xe_gt *gt)
+{
+ return gt->sriov.pf.policy.guc.sched_groups.current_mode != XE_SRIOV_SCHED_GROUPS_NONE;
+}
+
static void pf_sanitize_guc_policies(struct xe_gt *gt)
{
pf_sanitize_sched_if_idle(gt);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
index 6b3e294bc934..ceaf797ca21b 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
@@ -20,6 +20,7 @@ u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt);
bool xe_sriov_gt_pf_policy_has_multi_group_modes(struct xe_gt *gt);
bool xe_sriov_gt_pf_policy_has_sched_group_mode(struct xe_gt *gt, u32 mode);
int xe_gt_sriov_pf_policy_set_sched_groups_mode(struct xe_gt *gt, u32 value);
+bool xe_gt_sriov_pf_policy_sched_groups_enabled(struct xe_gt *gt);
void xe_gt_sriov_pf_policy_init(struct xe_gt *gt);
void xe_gt_sriov_pf_policy_sanitize(struct xe_gt *gt);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
index 97c29c55f885..48e11c1a2d08 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
@@ -438,6 +438,30 @@ u32 xe_gt_sriov_vf_gmdid(struct xe_gt *gt)
return value;
}
+static int query_vf_sched_groups(struct xe_gt *gt)
+{
+ struct xe_guc *guc = &gt->uc.guc;
+ u32 value = 0;
+ int err;
+
+ xe_gt_assert(gt, IS_SRIOV_VF(gt_to_xe(gt)));
+
+ if (MAKE_GUC_VER_STRUCT(gt->sriov.vf.guc_version) < MAKE_GUC_VER(1, 26, 0))
+ return 0;
+
+ err = guc_action_query_single_klv32(guc,
+ GUC_KLV_GLOBAL_CFG_GROUP_SCHEDULING_AVAILABLE_KEY,
+ &value);
+ if (unlikely(err)) {
+ xe_gt_sriov_err(gt, "Failed to obtain sched groups status (%pe)\n",
+ ERR_PTR(err));
+ return err;
+ }
+
+ xe_gt_sriov_dbg(gt, "sched groups %s\n", str_enabled_disabled(value));
+ return value;
+}
+
static int vf_get_ggtt_info(struct xe_gt *gt)
{
struct xe_tile *tile = gt_to_tile(gt);
@@ -564,6 +588,21 @@ static void vf_cache_gmdid(struct xe_gt *gt)
gt->sriov.vf.runtime.gmdid = xe_gt_sriov_vf_gmdid(gt);
}
+static int vf_cache_sched_groups_status(struct xe_gt *gt)
+{
+ int ret;
+
+ xe_gt_assert(gt, IS_SRIOV_VF(gt_to_xe(gt)));
+
+ ret = query_vf_sched_groups(gt);
+ if (ret < 0)
+ return ret;
+
+ gt->sriov.vf.runtime.uses_sched_groups = ret;
+
+ return 0;
+}
+
/**
* xe_gt_sriov_vf_query_config - Query SR-IOV config data over MMIO.
* @gt: the &xe_gt
@@ -593,12 +632,33 @@ int xe_gt_sriov_vf_query_config(struct xe_gt *gt)
if (unlikely(err))
return err;
+ err = vf_cache_sched_groups_status(gt);
+ if (unlikely(err))
+ return err;
+
if (has_gmdid(xe))
vf_cache_gmdid(gt);
return 0;
}
+/**
+ * xe_gt_sriov_vf_sched_groups_enabled() - Check if PF has enabled multiple
+ * scheduler groups
+ * @gt: the &xe_gt
+ *
+ * This function is for VF use only.
+ *
+ * Return: true if sched groups were enabled, false otherwise.
+ */
+bool xe_gt_sriov_vf_sched_groups_enabled(struct xe_gt *gt)
+{
+ xe_gt_assert(gt, IS_SRIOV_VF(gt_to_xe(gt)));
+ xe_gt_assert(gt, gt->sriov.vf.guc_version.major);
+
+ return gt->sriov.vf.runtime.uses_sched_groups;
+}
+
/**
* xe_gt_sriov_vf_guc_ids - VF GuC context IDs configuration.
* @gt: the &xe_gt
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.h b/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
index af40276790fa..7d97189c2d3d 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
@@ -30,6 +30,7 @@ bool xe_gt_sriov_vf_recovery_pending(struct xe_gt *gt);
u32 xe_gt_sriov_vf_gmdid(struct xe_gt *gt);
u16 xe_gt_sriov_vf_guc_ids(struct xe_gt *gt);
u64 xe_gt_sriov_vf_lmem(struct xe_gt *gt);
+bool xe_gt_sriov_vf_sched_groups_enabled(struct xe_gt *gt);
u32 xe_gt_sriov_vf_read32(struct xe_gt *gt, struct xe_reg reg);
void xe_gt_sriov_vf_write32(struct xe_gt *gt, struct xe_reg reg, u32 val);
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h
index 420b0e6089de..5267c097ecd0 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h
@@ -27,6 +27,8 @@ struct xe_gt_sriov_vf_selfconfig {
struct xe_gt_sriov_vf_runtime {
/** @gmdid: cached value of the GDMID register. */
u32 gmdid;
+ /** @uses_sched_groups: whether PF enabled sched groups or not. */
+ bool uses_sched_groups;
/** @regs_size: size of runtime register array. */
u32 regs_size;
/** @num_regs: number of runtime registers in the array. */
diff --git a/drivers/gpu/drm/xe/xe_guc_klv_helpers.c b/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
index 1b08b443606e..dd504b77cb17 100644
--- a/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
+++ b/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
@@ -21,6 +21,9 @@
const char *xe_guc_klv_key_to_string(u16 key)
{
switch (key) {
+ /* GuC Global Config KLVs */
+ case GUC_KLV_GLOBAL_CFG_GROUP_SCHEDULING_AVAILABLE_KEY:
+ return "group_scheduling_available";
/* VGT POLICY keys */
case GUC_KLV_VGT_POLICY_SCHED_IF_IDLE_KEY:
return "sched_if_idle";
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
index af43acf7baae..e8921219ac4e 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -2985,6 +2985,27 @@ void xe_guc_submit_print(struct xe_guc *guc, struct drm_printer *p)
mutex_unlock(&guc->submission_state.lock);
}
+/**
+ * xe_guc_has_registered_mlrc_queues() - check whether there are any MLRC queues
+ * registered with the GuC
+ * @guc: GuC.
+ *
+ * Return: true if any MLRC queue is registered with the GuC, false otherwise.
+ */
+bool xe_guc_has_registered_mlrc_queues(struct xe_guc *guc)
+{
+ struct xe_exec_queue *q;
+ unsigned long index;
+
+ guard(mutex)(&guc->submission_state.lock);
+
+ xa_for_each(&guc->submission_state.exec_queue_lookup, index, q)
+ if (q->width > 1)
+ return true;
+
+ return false;
+}
+
/**
* xe_guc_contexts_hwsp_rebase - Re-compute GGTT references within all
* exec queues registered to given GuC.
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.h b/drivers/gpu/drm/xe/xe_guc_submit.h
index 100a7891b918..49e608500a4e 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.h
+++ b/drivers/gpu/drm/xe/xe_guc_submit.h
@@ -49,6 +49,8 @@ xe_guc_exec_queue_snapshot_free(struct xe_guc_submit_exec_queue_snapshot *snapsh
void xe_guc_submit_print(struct xe_guc *guc, struct drm_printer *p);
void xe_guc_register_vf_exec_queue(struct xe_exec_queue *q, int ctx_type);
+bool xe_guc_has_registered_mlrc_queues(struct xe_guc *guc);
+
int xe_guc_contexts_hwsp_rebase(struct xe_guc *guc, void *scratch);
#endif
--
2.43.0
* [PATCH v2 05/11] drm/xe/sriov: Add handling for MLRC adverse event threshold
2025-12-06 23:03 [PATCH v2 00/11] Introduce SRIOV scheduler groups Daniele Ceraolo Spurio
` (3 preceding siblings ...)
2025-12-06 23:04 ` [PATCH v2 04/11] drm/xe/sriov: Scheduler groups are incompatible with multi-lrc Daniele Ceraolo Spurio
@ 2025-12-06 23:04 ` Daniele Ceraolo Spurio
2025-12-07 22:03 ` Michal Wajdeczko
2025-12-06 23:04 ` [PATCH v2 06/11] drm/xe/sriov: Add debugfs to enable scheduler groups Daniele Ceraolo Spurio
` (7 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-06 23:04 UTC (permalink / raw)
To: intel-xe; +Cc: Daniele Ceraolo Spurio, Michal Wajdeczko
Since it is illegal to register a MLRC context when scheduler groups are
enabled, the GuC considers a VF doing so to have caused an adverse event.
As with other adverse events, there is a threshold for how many times the
event can occur before the GuC raises an error, and we need to add
support for configuring that threshold.
Since this is the first threshold with a minimum GuC version
requirement, support for checking the version has been added to the
generic threshold handling. As part of that, some of the version code
has been moved to its own file, and some SRIOV documentation has been
added along the way.
v2: split from previous patch, add GuC version checking
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/xe/abi/guc_klvs_abi.h | 9 +++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 19 ++++++----
drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 9 +++--
drivers/gpu/drm/xe/xe_guc.h | 7 +---
.../drm/xe/xe_guc_klv_thresholds_set_types.h | 18 +++++-----
drivers/gpu/drm/xe/xe_guc_version.h | 36 +++++++++++++++++++
6 files changed, 74 insertions(+), 24 deletions(-)
create mode 100644 drivers/gpu/drm/xe/xe_guc_version.h
diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
index edb0546fb163..30a051a0b4ee 100644
--- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
+++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
@@ -376,6 +376,12 @@ enum {
* :1: NORMAL = schedule VF always, irrespective of whether it has work or not
* :2: HIGH = schedule VF in the next time-slice after current active
* time-slice completes if it has active work
+ *
+ * _`GUC_KLV_VF_CFG_THRESHOLD_MULTI_LRC_COUNT` : 0x8A0D
+ * Given that multi-LRC contexts are incompatible with SRIOV scheduler
+ * groups and cause the latter to be turned off when registered with the
+ * GuC, this config allows the PF to set a threshold for multi-LRC context
+ * registrations by VFs to monitor their behavior.
*/
#define GUC_KLV_VF_CFG_GGTT_START_KEY 0x0001
@@ -434,6 +440,9 @@ enum {
#define GUC_SCHED_PRIORITY_NORMAL 1u
#define GUC_SCHED_PRIORITY_HIGH 2u
+#define GUC_KLV_VF_CFG_THRESHOLD_MULTI_LRC_COUNT_KEY 0x8a0d
+#define GUC_KLV_VF_CFG_THRESHOLD_MULTI_LRC_COUNT_LEN 1u
+
/*
* Workaround keys:
*/
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
index 59c5c6b4d994..dda671d05b89 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
@@ -269,7 +269,8 @@ static u32 encode_config_ggtt(u32 *cfg, const struct xe_gt_sriov_config *config,
}
/* Return: number of configuration dwords written */
-static u32 encode_config(u32 *cfg, const struct xe_gt_sriov_config *config, bool details)
+static u32 encode_config(struct xe_gt *gt, u32 *cfg,
+ const struct xe_gt_sriov_config *config, bool details)
{
u32 n = 0;
@@ -303,9 +304,11 @@ static u32 encode_config(u32 *cfg, const struct xe_gt_sriov_config *config, bool
cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_PREEMPT_TIMEOUT);
cfg[n++] = config->preempt_timeout;
-#define encode_threshold_config(TAG, ...) ({ \
- cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_THRESHOLD_##TAG); \
- cfg[n++] = config->thresholds[MAKE_XE_GUC_KLV_THRESHOLD_INDEX(TAG)]; \
+#define encode_threshold_config(TAG, NAME, MIN_GUC_VER) ({ \
+ if (!MIN_GUC_VER || GUC_FIRMWARE_VER(&gt->uc.guc) >= MIN_GUC_VER) { \
+ cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_THRESHOLD_##TAG); \
+ cfg[n++] = config->thresholds[MAKE_XE_GUC_KLV_THRESHOLD_INDEX(TAG)]; \
+ } \
});
MAKE_XE_GUC_KLV_THRESHOLDS_SET(encode_threshold_config);
@@ -328,7 +331,7 @@ static int pf_push_full_vf_config(struct xe_gt *gt, unsigned int vfid)
return -ENOBUFS;
cfg = xe_guc_buf_cpu_ptr(buf);
- num_dwords = encode_config(cfg, config, true);
+ num_dwords = encode_config(gt, cfg, config, true);
xe_gt_assert(gt, num_dwords <= max_cfg_dwords);
if (xe_gt_is_media_type(gt)) {
@@ -2518,7 +2521,7 @@ ssize_t xe_gt_sriov_pf_config_save(struct xe_gt *gt, unsigned int vfid, void *bu
ret = -ENOBUFS;
} else {
config = pf_pick_vf_config(gt, vfid);
- ret = encode_config(buf, config, false) * sizeof(u32);
+ ret = encode_config(gt, buf, config, false) * sizeof(u32);
}
}
mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
@@ -2551,9 +2554,11 @@ static int pf_restore_vf_config_klv(struct xe_gt *gt, unsigned int vfid,
return pf_provision_preempt_timeout(gt, vfid, value[0]);
/* auto-generate case statements */
-#define define_threshold_key_to_provision_case(TAG, ...) \
+#define define_threshold_key_to_provision_case(TAG, NAME, MIN_GUC_VER) \
case MAKE_GUC_KLV_VF_CFG_THRESHOLD_KEY(TAG): \
BUILD_BUG_ON(MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG) != 1u); \
+ if (MIN_GUC_VER && GUC_FIRMWARE_VER(&gt->uc.guc) < MIN_GUC_VER) \
+ return -ENOKEY; \
if (len != MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG)) \
return -EBADMSG; \
return pf_provision_threshold(gt, vfid, \
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
index 0fd863609848..5123ff1fb116 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
@@ -21,6 +21,7 @@
#include "xe_gt_sriov_pf_monitor.h"
#include "xe_gt_sriov_pf_policy.h"
#include "xe_gt_sriov_pf_service.h"
+#include "xe_guc.h"
#include "xe_pm.h"
#include "xe_sriov_pf.h"
#include "xe_sriov_pf_provision.h"
@@ -301,9 +302,11 @@ static void pf_add_config_attrs(struct xe_gt *gt, struct dentry *parent, unsigne
&sched_priority_fops);
/* register all threshold attributes */
-#define register_threshold_attribute(TAG, NAME, ...) \
- debugfs_create_file_unsafe("threshold_" #NAME, 0644, parent, parent, \
- &NAME##_fops);
+#define register_threshold_attribute(TAG, NAME, MIN_GUC_VER) ({ \
+ if (!MIN_GUC_VER || GUC_FIRMWARE_VER(&gt->uc.guc) >= MIN_GUC_VER) \
+ debugfs_create_file_unsafe("threshold_" #NAME, 0644, parent, parent, \
+ &NAME##_fops); \
+});
MAKE_XE_GUC_KLV_THRESHOLDS_SET(register_threshold_attribute)
#undef register_threshold_attribute
}
diff --git a/drivers/gpu/drm/xe/xe_guc.h b/drivers/gpu/drm/xe/xe_guc.h
index fdb08658d05a..9028718189ed 100644
--- a/drivers/gpu/drm/xe/xe_guc.h
+++ b/drivers/gpu/drm/xe/xe_guc.h
@@ -8,15 +8,10 @@
#include "xe_gt.h"
#include "xe_guc_types.h"
+#include "xe_guc_version.h"
#include "xe_hw_engine_types.h"
#include "xe_macros.h"
-/*
- * GuC version number components are defined to be only 8-bit size,
- * so converting to a 32bit 8.8.8 integer allows simple (and safe)
- * numerical comparisons.
- */
-#define MAKE_GUC_VER(maj, min, pat) (((maj) << 16) | ((min) << 8) | (pat))
#define MAKE_GUC_VER_STRUCT(ver) MAKE_GUC_VER((ver).major, (ver).minor, (ver).patch)
#define GUC_SUBMIT_VER(guc) \
MAKE_GUC_VER_STRUCT((guc)->fw.versions.found[XE_UC_FW_VER_COMPATIBILITY])
diff --git a/drivers/gpu/drm/xe/xe_guc_klv_thresholds_set_types.h b/drivers/gpu/drm/xe/xe_guc_klv_thresholds_set_types.h
index 0a028c94756d..f7ed32244c6b 100644
--- a/drivers/gpu/drm/xe/xe_guc_klv_thresholds_set_types.h
+++ b/drivers/gpu/drm/xe/xe_guc_klv_thresholds_set_types.h
@@ -7,6 +7,7 @@
#define _XE_GUC_KLV_THRESHOLDS_SET_TYPES_H_
#include "xe_args.h"
+#include "xe_guc_version.h"
/**
* MAKE_XE_GUC_KLV_THRESHOLDS_SET - Generate various GuC thresholds definitions.
@@ -23,15 +24,16 @@
* with the &TAG, that corresponds to the GuC threshold KLV key name defined by
* ABI and the associated &NAME, that may be used in code or debugfs/sysfs::
*
- * define(TAG, NAME)
+ * define(TAG, NAME, MIN_GUC_VER)
*/
-#define MAKE_XE_GUC_KLV_THRESHOLDS_SET(define) \
- define(CAT_ERR, cat_error_count) \
- define(ENGINE_RESET, engine_reset_count) \
- define(PAGE_FAULT, page_fault_count) \
- define(H2G_STORM, guc_time_us) \
- define(IRQ_STORM, irq_time_us) \
- define(DOORBELL_STORM, doorbell_time_us) \
+#define MAKE_XE_GUC_KLV_THRESHOLDS_SET(define) \
+ define(CAT_ERR, cat_error_count, 0) \
+ define(ENGINE_RESET, engine_reset_count, 0) \
+ define(PAGE_FAULT, page_fault_count, 0) \
+ define(H2G_STORM, guc_time_us, 0) \
+ define(IRQ_STORM, irq_time_us, 0) \
+ define(DOORBELL_STORM, doorbell_time_us, 0) \
+ define(MULTI_LRC_COUNT, multi_lrc_count, MAKE_GUC_VER(70, 53, 0)) \
/* end */
/**
diff --git a/drivers/gpu/drm/xe/xe_guc_version.h b/drivers/gpu/drm/xe/xe_guc_version.h
new file mode 100644
index 000000000000..e6f80abd2f05
--- /dev/null
+++ b/drivers/gpu/drm/xe/xe_guc_version.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2025 Intel Corporation
+ */
+
+#ifndef _XE_GUC_VERSION_H_
+#define _XE_GUC_VERSION_H_
+
+/*
+ * GuC version number components are defined to be only 8-bit size,
+ * so converting to a 32bit 8.8.8 integer allows simple (and safe)
+ * numerical comparisons.
+ */
+#define MAKE_GUC_VER(maj, min, pat) (((maj) << 16) | ((min) << 8) | (pat))
+
+/**
+ * DOC: SRIOV-changes
+ *
+ * We record SRIOV-specific changes here as those need to be tracked carefully.
+ *
+ * GuC 70.53.0 (VF interface 1.26.0):
+ *
+ * Added support for EGS. See:
+ * * GUC_KLV_VGT_POLICY_ENGINE_GROUP_CONFIG
+ * * GUC_KLV_VF_CFG_THRESHOLD_MULTI_LRC_COUNT
+ *
+ * GuC 70.54.0 (VF interface 1.27.0):
+ *
+ * Updated VF migration support. See RESFIX actions
+ *
+ * GuC 70.55.1 (VF interface 1.28.1):
+ *
+ * Fixes for EGS.
+ */
+
+#endif
--
2.43.0
* [PATCH v2 06/11] drm/xe/sriov: Add debugfs to enable scheduler groups
2025-12-06 23:03 [PATCH v2 00/11] Introduce SRIOV scheduler groups Daniele Ceraolo Spurio
` (4 preceding siblings ...)
2025-12-06 23:04 ` [PATCH v2 05/11] drm/xe/sriov: Add handling for MLRC adverse event threshold Daniele Ceraolo Spurio
@ 2025-12-06 23:04 ` Daniele Ceraolo Spurio
2025-12-08 23:38 ` Michal Wajdeczko
2025-12-06 23:04 ` [PATCH v2 07/11] drm/xe/sriov: Add debugfs with scheduler groups information Daniele Ceraolo Spurio
` (6 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-06 23:04 UTC (permalink / raw)
To: intel-xe; +Cc: Daniele Ceraolo Spurio, Michal Wajdeczko
Reading the debugfs file lists the available configurations by name.
Writing the name of a configuration to the file will enable it.
v2: don't print anything if the feature is unsupported (Michal), add
TODO for reworking init order to know if there are valid groups
when we register debugfs, check for basic feature support.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 126 ++++++++++++++++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c | 19 +--
drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h | 1 +
3 files changed, 139 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
index 5123ff1fb116..1be23809e624 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
@@ -156,6 +156,131 @@ static void pf_add_policy_attrs(struct xe_gt *gt, struct dentry *parent)
debugfs_create_file_unsafe("sample_period_ms", 0644, parent, parent, &sample_period_fops);
}
+/*
+ * /sys/kernel/debug/dri/BDF/
+ * ├── sriov
+ * : ├── pf
+ * : ├── tile0
+ * : ├── gt0
+ * : ├── sched_groups_mode
+ */
+
+static const char *sched_group_mode_to_string(enum xe_sriov_sched_group_modes mode)
+{
+ switch (mode) {
+ case XE_SRIOV_SCHED_GROUPS_NONE:
+ return "disabled";
+ case XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES:
+ return "media_slices";
+ default:
+ return "unknown";
+ }
+}
+
+static int sched_groups_info(struct seq_file *m, void *data)
+{
+ struct drm_printer p = drm_seq_file_printer(m);
+ struct xe_gt *gt = extract_gt(m->private);
+ u32 current_mode = gt->sriov.pf.policy.guc.sched_groups.current_mode;
+ int mode = 0;
+
+ if (!xe_sriov_gt_pf_policy_has_multi_group_modes(gt))
+ return 0;
+
+ for (mode = 0; mode < XE_SRIOV_SCHED_GROUPS_MODES_COUNT; mode++) {
+ if (!xe_sriov_gt_pf_policy_has_sched_group_mode(gt, mode))
+ continue;
+
+ if (mode)
+ drm_printf(&p, " ");
+
+ if (mode == current_mode)
+ drm_printf(&p, "[");
+
+ drm_printf(&p, "%s", sched_group_mode_to_string(mode));
+
+ if (mode == current_mode)
+ drm_printf(&p, "]");
+ }
+
+ drm_printf(&p, "\n");
+
+ return 0;
+}
+
+static int sched_groups_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, sched_groups_info, inode->i_private);
+}
+
+static ssize_t sched_groups_write(struct file *file, const char __user *ubuf,
+ size_t size, loff_t *pos)
+{
+ struct xe_gt *gt = extract_gt(file_inode(file)->i_private);
+ char name[32];
+ int ret;
+ int m;
+
+ if (*pos)
+ return -ESPIPE;
+
+ if (!size)
+ return -ENODATA;
+
+ if (!xe_sriov_gt_pf_policy_has_multi_group_modes(gt))
+ return -ENODEV;
+
+ if (size > sizeof(name) - 1)
+ return -EINVAL;
+
+ ret = simple_write_to_buffer(name, sizeof(name) - 1, pos, ubuf, size);
+ if (ret < 0)
+ return ret;
+ name[ret] = '\0';
+
+ for (m = 0; m < XE_SRIOV_SCHED_GROUPS_MODES_COUNT; m++)
+ if (sysfs_streq(name, sched_group_mode_to_string(m)))
+ break;
+
+ if (m == XE_SRIOV_SCHED_GROUPS_MODES_COUNT)
+ return -EINVAL;
+
+ guard(xe_pm_runtime)(gt_to_xe(gt));
+ ret = xe_gt_sriov_pf_policy_set_sched_groups_mode(gt, m);
+
+ return (ret < 0) ? ret : size;
+}
+
+static const struct file_operations sched_groups_fops = {
+ .owner = THIS_MODULE,
+ .open = sched_groups_open,
+ .read = seq_read,
+ .write = sched_groups_write,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+static void pf_add_sched_groups(struct xe_gt *gt, struct dentry *parent)
+{
+ xe_gt_assert(gt, gt == extract_gt(parent));
+ xe_gt_assert(gt, PFID == extract_vfid(parent));
+
+ /*
+ * TODO: we currently call this function before we initialize scheduler
+ * groups, so at this point in time we don't know if there are any
+ * valid groups on the GT and we can't selectively register the debugfs
+ * only if there are any. Therefore, we always register the debugfs
+ * files if we're on a platform that has support for groups.
+ * We should rework the flow so that debugfs is registered after the
+ * policy init, so that we check if there are valid groups before
+ * adding the debugfs files.
+ */
+ if (!xe_sriov_gt_pf_policy_has_sched_groups_support(gt))
+ return;
+
+ debugfs_create_file("sched_groups_mode", 0644, parent, parent, &sched_groups_fops);
+}
+
/*
* /sys/kernel/debug/dri/BDF/
* ├── sriov
@@ -531,6 +656,7 @@ static void pf_populate_gt(struct xe_gt *gt, struct dentry *dent, unsigned int v
} else {
pf_add_config_attrs(gt, dent, PFID);
pf_add_policy_attrs(gt, dent);
+ pf_add_sched_groups(gt, dent);
drm_debugfs_create_files(pf_info, ARRAY_SIZE(pf_info), dent, minor);
}
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
index 6a682d788b02..2cafacac5d8e 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
@@ -451,19 +451,24 @@ static void pf_sched_group_media_slices(struct xe_gt *gt, u32 **masks, u32 *num_
*num_masks = GUC_MAX_ENGINE_CLASSES * groups;
}
-static void pf_init_sched_groups(struct xe_gt *gt)
+bool xe_sriov_gt_pf_policy_has_sched_groups_support(struct xe_gt *gt)
{
- int m;
-
- xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
-
/*
* The GuC supports scheduler groups from v70.53.0, but a fix for it has
* been merged in v70.55.1, so we require the latter. The feature is
* also only enabled on BMG and newer FW.
*/
- if (GUC_FIRMWARE_VER(&gt->uc.guc) < MAKE_GUC_VER(70, 55, 1) ||
- gt_to_xe(gt)->info.platform < XE_BATTLEMAGE)
+ return GUC_FIRMWARE_VER(&gt->uc.guc) >= MAKE_GUC_VER(70, 55, 1) &&
+ gt_to_xe(gt)->info.platform >= XE_BATTLEMAGE;
+}
+
+static void pf_init_sched_groups(struct xe_gt *gt)
+{
+ int m;
+
+ xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
+
+ if (!xe_sriov_gt_pf_policy_has_sched_groups_support(gt))
return;
/*
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
index ceaf797ca21b..f5ea44dcaf82 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
@@ -17,6 +17,7 @@ int xe_gt_sriov_pf_policy_set_reset_engine(struct xe_gt *gt, bool enable);
bool xe_gt_sriov_pf_policy_get_reset_engine(struct xe_gt *gt);
int xe_gt_sriov_pf_policy_set_sample_period(struct xe_gt *gt, u32 value);
u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt);
+bool xe_sriov_gt_pf_policy_has_sched_groups_support(struct xe_gt *gt);
bool xe_sriov_gt_pf_policy_has_multi_group_modes(struct xe_gt *gt);
bool xe_sriov_gt_pf_policy_has_sched_group_mode(struct xe_gt *gt, u32 mode);
int xe_gt_sriov_pf_policy_set_sched_groups_mode(struct xe_gt *gt, u32 value);
--
2.43.0
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 07/11] drm/xe/sriov: Add debugfs with scheduler groups information
2025-12-06 23:03 [PATCH v2 00/11] Introduce SRIOV scheduler groups Daniele Ceraolo Spurio
` (5 preceding siblings ...)
2025-12-06 23:04 ` [PATCH v2 06/11] drm/xe/sriov: Add debugfs to enable scheduler groups Daniele Ceraolo Spurio
@ 2025-12-06 23:04 ` Daniele Ceraolo Spurio
2025-12-09 0:08 ` Michal Wajdeczko
2025-12-06 23:04 ` [PATCH v2 08/11] drm/xe/sriov: Prep for multiple exec quantums and preemption timeouts Daniele Ceraolo Spurio
` (5 subsequent siblings)
12 siblings, 1 reply; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-06 23:04 UTC (permalink / raw)
To: intel-xe; +Cc: Daniele Ceraolo Spurio, Michal Wajdeczko
Under a new subfolder, an entry is created for each group to list the
engines assigned to it. We create an entry for every possible group,
with disabled groups simply returning an empty list.
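The per-group engine lookup that the new debugfs read performs can be
sketched in plain C. This is an illustrative model, not the patch itself:
engine_in_group is a hypothetical helper, and MAX_ENGINE_CLASSES stands in
for GUC_MAX_ENGINE_CLASSES; the flat masks[] layout mirrors the indexing in
the hunk below.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_ENGINE_CLASSES 5 /* assumed stand-in for GUC_MAX_ENGINE_CLASSES */

/*
 * masks[] holds one entry per engine class per group:
 * masks[group * MAX_ENGINE_CLASSES + guc_class] is a bitmask of the
 * logical engine instances of that class assigned to that group.
 */
static bool engine_in_group(const uint32_t *masks, unsigned int num_masks,
			    unsigned int group, unsigned int guc_class,
			    unsigned int logical_instance)
{
	/* No masks encoded for this group: only group 0 owns engines. */
	if (num_masks < (group + 1) * MAX_ENGINE_CLASSES)
		return group == 0;

	return masks[group * MAX_ENGINE_CLASSES + guc_class] &
	       (1u << logical_instance);
}
```

With no masks at all (num_masks == 0) every engine reports as belonging to
group 0, matching the "all the engines are in group 0" comment in the read
handler.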
v2: drop subfolders, always register debugfs for all groups (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 70 +++++++++++++++++++++
1 file changed, 70 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
index 1be23809e624..15f5f3a40471 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
@@ -163,6 +163,10 @@ static void pf_add_policy_attrs(struct xe_gt *gt, struct dentry *parent)
* : ├── tile0
* : ├── gt0
* : ├── sched_groups_mode
+ * ├── sched_groups
+ * : ├── group0
+ * :
+ * └── groupN
*/
static const char *sched_group_mode_to_string(enum xe_sriov_sched_group_modes mode)
@@ -260,8 +264,60 @@ static const struct file_operations sched_groups_fops = {
.release = single_release,
};
+static ssize_t sched_group_engines_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct dentry *dent = file_dentry(file);
+ struct xe_gt *gt = extract_gt(dent->d_parent);
+ struct xe_gt_sriov_scheduler_groups *groups = &gt->sriov.pf.policy.guc.sched_groups;
+ u32 num_masks = groups->modes[groups->current_mode].num_masks;
+ u32 *masks = groups->modes[groups->current_mode].masks;
+ unsigned int group = GUC_MAX_SCHED_GROUPS;
+ struct xe_hw_engine *hwe;
+ enum xe_hw_engine_id id;
+ char engines[128];
+ int ret;
+
+ ret = sscanf(dent->d_name.name, "group%u", &group);
+ xe_gt_assert(gt, ret == 1 && group < GUC_MAX_SCHED_GROUPS);
+
+ engines[0] = '\0';
+
+ /* If there are no masks it means that all the engines are in group 0 */
+ if (num_masks >= (group + 1) * GUC_MAX_ENGINE_CLASSES) {
+ for_each_hw_engine(hwe, gt, id) {
+ u8 guc_class = xe_engine_class_to_guc_class(hwe->class);
+ u32 mask = masks[group * GUC_MAX_ENGINE_CLASSES + guc_class];
+
+ if (mask & BIT(hwe->logical_instance)) {
+ strlcat(engines, hwe->name, sizeof(engines));
+ strlcat(engines, " ", sizeof(engines));
+ }
+ }
+ strlcat(engines, "\n", sizeof(engines));
+ } else if (group == 0) {
+ for_each_hw_engine(hwe, gt, id) {
+ strlcat(engines, hwe->name, sizeof(engines));
+ strlcat(engines, " ", sizeof(engines));
+ }
+ strlcat(engines, "\n", sizeof(engines));
+ }
+
+ return simple_read_from_buffer(buf, count, ppos, engines, strlen(engines));
+}
+
+static const struct file_operations sched_group_engines_fops = {
+ .owner = THIS_MODULE,
+ .open = simple_open,
+ .read = sched_group_engines_read,
+ .llseek = default_llseek,
+};
+
static void pf_add_sched_groups(struct xe_gt *gt, struct dentry *parent)
{
+ struct dentry *groups;
+ u8 g;
+
xe_gt_assert(gt, gt == extract_gt(parent));
xe_gt_assert(gt, PFID == extract_vfid(parent));
@@ -274,11 +330,25 @@ static void pf_add_sched_groups(struct xe_gt *gt, struct dentry *parent)
* We should rework the flow so that debugfs is registered after the
* policy init, so that we check if there are valid groups before
* adding the debugfs files.
+ * Similarly, instead of using GUC_MAX_SCHED_GROUPS we could use
+ * gt->sriov.pf.policy.guc.sched_groups.max_number_of_groups.
*/
if (!xe_sriov_gt_pf_policy_has_sched_groups_support(gt))
return;
debugfs_create_file("sched_groups_mode", 0644, parent, parent, &sched_groups_fops);
+
+ groups = debugfs_create_dir("sched_groups", parent);
+ if (IS_ERR(groups))
+ return;
+ groups->d_inode->i_private = gt;
+
+ for (g = 0; g < GUC_MAX_SCHED_GROUPS; g++) {
+ char name[10];
+
+ snprintf(name, sizeof(name), "group%u", g);
+ debugfs_create_file(name, 0644, groups, parent, &sched_group_engines_fops);
+ }
}
/*
--
2.43.0
* [PATCH v2 08/11] drm/xe/sriov: Prep for multiple exec quantums and preemption timeouts
2025-12-06 23:03 [PATCH v2 00/11] Introduce SRIOV scheduler groups Daniele Ceraolo Spurio
` (6 preceding siblings ...)
2025-12-06 23:04 ` [PATCH v2 07/11] drm/xe/sriov: Add debugfs with scheduler groups information Daniele Ceraolo Spurio
@ 2025-12-06 23:04 ` Daniele Ceraolo Spurio
2025-12-06 23:04 ` [PATCH v2 09/11] drm/xe/sriov: Add functions to set exec quantums for each group Daniele Ceraolo Spurio
` (4 subsequent siblings)
12 siblings, 0 replies; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-06 23:04 UTC (permalink / raw)
To: intel-xe; +Cc: Daniele Ceraolo Spurio, Michal Wajdeczko
Each scheduler group can be independently configured with its own exec
quantum and preemption timeout. The existing KLVs that configure those
parameters apply the value to all groups (even if they're not currently
enabled).
When scheduler groups are disabled, the GuC uses the values from Group 0.
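The fan-out described above can be sketched as below. This is an
illustrative model under stated assumptions: broadcast_exec_quantum is a
hypothetical helper and MAX_SCHED_GROUPS stands in for GUC_MAX_SCHED_GROUPS;
it mirrors the loop added to pf_provision_exec_quantum in this patch.

```c
#include <stdint.h>

#define MAX_SCHED_GROUPS 8 /* assumed stand-in for GUC_MAX_SCHED_GROUPS */

/*
 * A legacy single-value KLV now fans out to every group slot, so a
 * config created before groups existed stays consistent for all groups.
 */
static void broadcast_exec_quantum(uint32_t *eq, uint32_t value)
{
	for (unsigned int i = 0; i < MAX_SCHED_GROUPS; i++)
		eq[i] = value;
}
```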
v2: improve doc, use ARRAY_SIZE for loops (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/xe/abi/guc_klvs_abi.h | 8 ++++++
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 25 +++++++++++++------
.../gpu/drm/xe/xe_gt_sriov_pf_config_types.h | 5 ++--
3 files changed, 28 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
index 30a051a0b4ee..6331836ddea7 100644
--- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
+++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
@@ -292,6 +292,10 @@ enum {
* it to take effect. Such cases might typically happen on a 1PF+1VF
* Virtualization config enabled for heavier workloads like AI/ML.
*
+ * If scheduling groups are supported, the provided value is applied to all
+ * groups (even if they've not yet been enabled). Support for this feature
+ * is available from GuC 70.53.0.
+ *
* The max value for this KLV is 100 seconds, anything exceeding that
* will be clamped to the max.
*
@@ -314,6 +318,10 @@ enum {
* on a 1PF+1VF Virtualization config enabled for heavier workloads like
* AI/ML.
*
+ * If scheduling groups are supported, the provided value is applied to all
+ * groups (even if they've not yet been enabled). Support for this feature
+ * is available from GuC 70.53.0.
+ *
* The max value for this KLV is 100 seconds, anything exceeding that
* will be clamped to the max.
*
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
index dda671d05b89..be25c439a10b 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
@@ -299,10 +299,10 @@ static u32 encode_config(struct xe_gt *gt, u32 *cfg,
}
cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_EXEC_QUANTUM);
- cfg[n++] = config->exec_quantum;
+ cfg[n++] = config->exec_quantum[0];
cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_PREEMPT_TIMEOUT);
- cfg[n++] = config->preempt_timeout;
+ cfg[n++] = config->preempt_timeout[0];
#define encode_threshold_config(TAG, NAME, MIN_GUC_VER) ({ \
if (!MIN_GUC_VER || GUC_FIRMWARE_VER(&gt->uc.guc) >= MIN_GUC_VER) { \
@@ -1860,12 +1860,15 @@ static int pf_provision_exec_quantum(struct xe_gt *gt, unsigned int vfid,
{
struct xe_gt_sriov_config *config = pf_pick_vf_config(gt, vfid);
int err;
+ int i;
err = pf_push_vf_cfg_exec_quantum(gt, vfid, &exec_quantum);
if (unlikely(err))
return err;
- config->exec_quantum = exec_quantum;
+ for (i = 0; i < ARRAY_SIZE(config->exec_quantum); i++)
+ config->exec_quantum[i] = exec_quantum;
+
return 0;
}
@@ -1873,7 +1876,7 @@ static u32 pf_get_exec_quantum(struct xe_gt *gt, unsigned int vfid)
{
struct xe_gt_sriov_config *config = pf_pick_vf_config(gt, vfid);
- return config->exec_quantum;
+ return config->exec_quantum[0];
}
/**
@@ -1990,12 +1993,14 @@ static int pf_provision_preempt_timeout(struct xe_gt *gt, unsigned int vfid,
{
struct xe_gt_sriov_config *config = pf_pick_vf_config(gt, vfid);
int err;
+ int i;
err = pf_push_vf_cfg_preempt_timeout(gt, vfid, &preempt_timeout);
if (unlikely(err))
return err;
- config->preempt_timeout = preempt_timeout;
+ for (i = 0; i < ARRAY_SIZE(config->preempt_timeout); i++)
+ config->preempt_timeout[i] = preempt_timeout;
return 0;
}
@@ -2004,7 +2009,7 @@ static u32 pf_get_preempt_timeout(struct xe_gt *gt, unsigned int vfid)
{
struct xe_gt_sriov_config *config = pf_pick_vf_config(gt, vfid);
- return config->preempt_timeout;
+ return config->preempt_timeout[0];
}
/**
@@ -2183,10 +2188,14 @@ u32 xe_gt_sriov_pf_config_get_sched_priority(struct xe_gt *gt, unsigned int vfid
static void pf_reset_config_sched(struct xe_gt *gt, struct xe_gt_sriov_config *config)
{
+ int i;
+
lockdep_assert_held(xe_gt_sriov_pf_master_mutex(gt));
- config->exec_quantum = 0;
- config->preempt_timeout = 0;
+ for (i = 0; i < ARRAY_SIZE(config->exec_quantum); i++) {
+ config->exec_quantum[i] = 0;
+ config->preempt_timeout[i] = 0;
+ }
}
static int pf_provision_threshold(struct xe_gt *gt, unsigned int vfid,
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config_types.h
index 686c7b3b6d7a..a417d099688d 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config_types.h
@@ -7,6 +7,7 @@
#define _XE_GT_SRIOV_PF_CONFIG_TYPES_H_
#include "xe_ggtt_types.h"
+#include "xe_guc_fwif.h"
#include "xe_guc_klv_thresholds_set_types.h"
struct xe_bo;
@@ -30,9 +31,9 @@ struct xe_gt_sriov_config {
/** @begin_db: start index of GuC doorbell ID range. */
u16 begin_db;
/** @exec_quantum: execution-quantum in milliseconds. */
- u32 exec_quantum;
+ u32 exec_quantum[GUC_MAX_SCHED_GROUPS];
/** @preempt_timeout: preemption timeout in microseconds. */
- u32 preempt_timeout;
+ u32 preempt_timeout[GUC_MAX_SCHED_GROUPS];
/** @sched_priority: scheduling priority. */
u32 sched_priority;
/** @thresholds: GuC thresholds for adverse events notifications. */
--
2.43.0
* [PATCH v2 09/11] drm/xe/sriov: Add functions to set exec quantums for each group
2025-12-06 23:03 [PATCH v2 00/11] Introduce SRIOV scheduler groups Daniele Ceraolo Spurio
` (7 preceding siblings ...)
2025-12-06 23:04 ` [PATCH v2 08/11] drm/xe/sriov: Prep for multiple exec quantums and preemption timeouts Daniele Ceraolo Spurio
@ 2025-12-06 23:04 ` Daniele Ceraolo Spurio
2025-12-06 23:04 ` [PATCH v2 10/11] drm/xe/sriov: Add functions to set preempt timeouts " Daniele Ceraolo Spurio
` (3 subsequent siblings)
12 siblings, 0 replies; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-06 23:04 UTC (permalink / raw)
To: intel-xe; +Cc: Daniele Ceraolo Spurio, Michal Wajdeczko
The GuC has a new dedicated KLV to set the EQs for the groups. The GuC
always sets the EQs for all the groups (even the ones not enabled). If
we provide fewer values than the max number of groups (8), the GuC will
set the remaining ones to 0.
Note that the new KLV can be used even when groups are disabled (as the
GuC always considers group 0 to be active), so we can use it when encoding
the SRIOV config.
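The clamp-and-zero-fill behavior the driver mirrors in its cached config can
be modeled as follows. This is a sketch under stated assumptions:
apply_group_eqs is a hypothetical helper, MAX_SCHED_GROUPS stands in for
GUC_MAX_SCHED_GROUPS, and EQ_MAX_MS assumes the 100-second cap documented
for the existing exec-quantum KLV.

```c
#include <stdint.h>

#define MAX_SCHED_GROUPS 8 /* assumed stand-in for GUC_MAX_SCHED_GROUPS */
#define EQ_MAX_MS 100000u  /* assumed 100 s cap from the EQ KLV docs */

/*
 * Model of what the GuC does with a partial per-group payload: values
 * above the cap are clamped, and groups the payload did not cover are
 * zeroed (0 meaning an infinite quantum).
 */
static void apply_group_eqs(uint32_t *cfg, const uint32_t *vals,
			    unsigned int count)
{
	for (unsigned int i = 0; i < MAX_SCHED_GROUPS; i++)
		cfg[i] = i < count ?
			 (vals[i] < EQ_MAX_MS ? vals[i] : EQ_MAX_MS) : 0;
}
```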
v2: drop the option of setting a single group, add a helper to encode
the scheduler configs, rework the setting-change logging code, code
improvements (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/xe/abi/guc_klvs_abi.h | 14 ++
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 165 ++++++++++++++++++++-
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h | 5 +
drivers/gpu/drm/xe/xe_guc_klv_helpers.c | 2 +
4 files changed, 181 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
index 6331836ddea7..f8c7bc4d110a 100644
--- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
+++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
@@ -390,6 +390,16 @@ enum {
* groups and cause the latter to be turned off when registered with the
* GuC, this config allows the PF to set a threshold for multi-LRC context
* registrations by VFs to monitor their behavior.
+ *
+ * _`GUC_KLV_VF_CFG_ENGINE_GROUP_EXEC_QUANTUM' : 0x8A0E
+ * This config sets the VFs-execution-quantum for each scheduling group in
+ * milliseconds. The driver must provide an array of values, with each of
+ * them matching the respective group index (first value goes to group 0,
+ * second to group 1, etc). The setting of group values follows the same
+ * behavior and rules as setting via GUC_KLV_VF_CFG_EXEC_QUANTUM. Note that
+ * the GuC always sets the EQ for all groups (even the non-enabled ones),
+ * so if we provide fewer values than the max the GuC will use 0 for the
+ * remaining groups.
*/
#define GUC_KLV_VF_CFG_GGTT_START_KEY 0x0001
@@ -451,6 +461,10 @@ enum {
#define GUC_KLV_VF_CFG_THRESHOLD_MULTI_LRC_COUNT_KEY 0x8a0d
#define GUC_KLV_VF_CFG_THRESHOLD_MULTI_LRC_COUNT_LEN 1u
+#define GUC_KLV_VF_CFG_ENGINE_GROUP_EXEC_QUANTUM_KEY 0x8a0e
+#define GUC_KLV_VF_CFG_ENGINE_GROUP_EXEC_QUANTUM_MIN_LEN 1u
+#define GUC_KLV_VF_CFG_ENGINE_GROUP_EXEC_QUANTUM_MAX_LEN 8u
+
/*
* Workaround keys:
*/
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
index be25c439a10b..ac3583594603 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
@@ -195,6 +195,25 @@ static int pf_push_vf_cfg_dbs(struct xe_gt *gt, unsigned int vfid, u32 begin, u3
return pf_push_vf_cfg_klvs(gt, vfid, 2, klvs, ARRAY_SIZE(klvs));
}
+static int pf_push_vf_grp_cfg_u32(struct xe_gt *gt, unsigned int vfid,
+ u16 key, const u32 *values, u32 count)
+{
+ CLASS(xe_guc_buf, buf)(&gt->uc.guc.buf, GUC_KLV_LEN_MIN + GUC_MAX_SCHED_GROUPS);
+ u32 *klv;
+
+ xe_gt_assert(gt, count && count <= GUC_MAX_SCHED_GROUPS);
+
+ if (!xe_guc_buf_is_valid(buf))
+ return -ENOBUFS;
+
+ klv = xe_guc_buf_cpu_ptr(buf);
+
+ klv[0] = FIELD_PREP(GUC_KLV_0_KEY, key) | FIELD_PREP(GUC_KLV_0_LEN, count);
+ memcpy(&klv[1], values, count * sizeof(u32));
+
+ return pf_push_vf_buf_klvs(gt, vfid, 1, buf, GUC_KLV_LEN_MIN + count);
+}
+
static int pf_push_vf_cfg_exec_quantum(struct xe_gt *gt, unsigned int vfid, u32 *exec_quantum)
{
/* GuC will silently clamp values exceeding max */
@@ -268,6 +287,32 @@ static u32 encode_config_ggtt(u32 *cfg, const struct xe_gt_sriov_config *config,
return encode_ggtt(cfg, node->base.start, node->base.size, details);
}
+static u32 encode_config_sched(struct xe_gt *gt, u32 *cfg, u32 n,
+ const struct xe_gt_sriov_config *config)
+{
+ int i;
+
+ if (xe_sriov_gt_pf_policy_has_multi_group_modes(gt)) {
+ BUILD_BUG_ON(ARRAY_SIZE(config->exec_quantum) >
+ GUC_KLV_VF_CFG_ENGINE_GROUP_EXEC_QUANTUM_MAX_LEN);
+
+ cfg[n++] = PREP_GUC_KLV_CONST(GUC_KLV_VF_CFG_ENGINE_GROUP_EXEC_QUANTUM_KEY,
+ ARRAY_SIZE(config->exec_quantum));
+ for (i = 0; i < ARRAY_SIZE(config->exec_quantum); i++)
+ cfg[n++] = config->exec_quantum[i];
+
+ /* TODO: add group preempt timeout setting */
+ } else {
+ cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_EXEC_QUANTUM);
+ cfg[n++] = config->exec_quantum[0];
+
+ cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_PREEMPT_TIMEOUT);
+ cfg[n++] = config->preempt_timeout[0];
+ }
+
+ return n;
+}
+
/* Return: number of configuration dwords written */
static u32 encode_config(struct xe_gt *gt, u32 *cfg,
const struct xe_gt_sriov_config *config, bool details)
@@ -298,11 +343,7 @@ static u32 encode_config(struct xe_gt *gt, u32 *cfg,
cfg[n++] = upper_32_bits(xe_bo_size(config->lmem_obj));
}
- cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_EXEC_QUANTUM);
- cfg[n++] = config->exec_quantum[0];
-
- cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_PREEMPT_TIMEOUT);
- cfg[n++] = config->preempt_timeout[0];
+ n = encode_config_sched(gt, cfg, n, config);
#define encode_threshold_config(TAG, NAME, MIN_GUC_VER) ({ \
if (!MIN_GUC_VER || GUC_FIRMWARE_VER(&gt->uc.guc) >= MIN_GUC_VER) { \
@@ -976,6 +1017,33 @@ static int pf_config_set_u32_done(struct xe_gt *gt, unsigned int vfid, u32 value
return 0;
}
+static char *to_group_name(const char *what, u8 group, char *buf, size_t size)
+{
+ snprintf(buf, size, "group%u%s%s", group, what ? " " : "", what ?: "");
+ return buf;
+}
+
+static int
+pf_groups_cfg_set_u32_done(struct xe_gt *gt, unsigned int vfid, u32 *values, u32 count,
+ void (*get_actual)(struct xe_gt *, unsigned int, u32 *, u32),
+ const char *what, const char *(*unit)(u32), int err)
+{
+ u32 actual[GUC_MAX_SCHED_GROUPS];
+ char group_name[32];
+ u8 g;
+
+ xe_gt_assert(gt, count <= ARRAY_SIZE(actual));
+
+ get_actual(gt, vfid, actual, count);
+
+ for (g = 0; g < count; g++)
+ pf_config_set_u32_done(gt, vfid, values[g], actual[g],
+ to_group_name(what, g, group_name, sizeof(group_name)),
+ unit, err);
+
+ return err;
+}
+
/**
* xe_gt_sriov_pf_config_set_ctxs - Configure GuC contexts IDs quota for the VF.
* @gt: the &xe_gt
@@ -1983,6 +2051,88 @@ int xe_gt_sriov_pf_config_bulk_set_exec_quantum_locked(struct xe_gt *gt, u32 exe
exec_quantum_unit, n, err);
}
+static int pf_provision_groups_exec_quantums(struct xe_gt *gt, unsigned int vfid,
+ const u32 *exec_quantums, u32 count)
+{
+ struct xe_gt_sriov_config *config = pf_pick_vf_config(gt, vfid);
+ int err;
+ int i;
+
+ err = pf_push_vf_grp_cfg_u32(gt, vfid, GUC_KLV_VF_CFG_ENGINE_GROUP_EXEC_QUANTUM_KEY,
+ exec_quantums, count);
+ if (unlikely(err))
+ return err;
+
+ /*
+ * GuC silently clamps values exceeding the max and zeroes out the
+ * quantum for groups not in the klv payload
+ */
+ for (i = 0; i < ARRAY_SIZE(config->exec_quantum); i++) {
+ if (i < count)
+ config->exec_quantum[i] = min_t(u32, exec_quantums[i],
+ GUC_KLV_VF_CFG_EXEC_QUANTUM_MAX_VALUE);
+ else
+ config->exec_quantum[i] = 0;
+ }
+
+ return 0;
+}
+
+static void pf_get_groups_exec_quantums(struct xe_gt *gt, unsigned int vfid,
+ u32 *exec_quantums, u32 max_count)
+{
+ struct xe_gt_sriov_config *config = pf_pick_vf_config(gt, vfid);
+ u32 count = min_t(u32, max_count, ARRAY_SIZE(config->exec_quantum));
+
+ memcpy(exec_quantums, config->exec_quantum, sizeof(u32) * count);
+}
+
+/**
+ * xe_gt_sriov_pf_config_set_groups_exec_quantums() - Configure PF/VF EQs for sched groups.
+ * @gt: the &xe_gt
+ * @vfid: the PF or VF identifier
+ * @exec_quantums: array of requested EQs in milliseconds (0 is infinity)
+ * @count: number of entries in the array
+ *
+ * This function can only be called on PF.
+ * It will log the provisioned value or an error in case of the failure.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_config_set_groups_exec_quantums(struct xe_gt *gt, unsigned int vfid,
+ u32 *exec_quantums, u32 count)
+{
+ int err;
+
+ guard(mutex)(xe_gt_sriov_pf_master_mutex(gt));
+
+ err = pf_provision_groups_exec_quantums(gt, vfid, exec_quantums, count);
+
+ return pf_groups_cfg_set_u32_done(gt, vfid, exec_quantums, count,
+ pf_get_groups_exec_quantums,
+ "execution quantum",
+ exec_quantum_unit, err);
+}
+
+/**
+ * xe_gt_sriov_pf_config_get_groups_exec_quantums - Get PF/VF sched groups EQs
+ * @gt: the &xe_gt
+ * @vfid: the PF or VF identifier
+ * @exec_quantums: array in which to store the execution quantums values
+ * @count: maximum number of entries to store
+ *
+ * This function can only be called on PF.
+ */
+void xe_gt_sriov_pf_config_get_groups_exec_quantums(struct xe_gt *gt, unsigned int vfid,
+ u32 *exec_quantums, u32 count)
+{
+ guard(mutex)(xe_gt_sriov_pf_master_mutex(gt));
+
+ xe_gt_assert(gt, count <= GUC_MAX_SCHED_GROUPS);
+
+ pf_get_groups_exec_quantums(gt, vfid, exec_quantums, count);
+}
+
static const char *preempt_timeout_unit(u32 preempt_timeout)
{
return preempt_timeout ? "us" : "(infinity)";
@@ -2557,6 +2707,11 @@ static int pf_restore_vf_config_klv(struct xe_gt *gt, unsigned int vfid,
return -EBADMSG;
return pf_provision_exec_quantum(gt, vfid, value[0]);
+ case GUC_KLV_VF_CFG_ENGINE_GROUP_EXEC_QUANTUM_KEY:
+ if (len > GUC_KLV_VF_CFG_ENGINE_GROUP_EXEC_QUANTUM_MAX_LEN)
+ return -EBADMSG;
+ return pf_provision_groups_exec_quantums(gt, vfid, value, len);
+
case GUC_KLV_VF_CFG_PREEMPT_TIMEOUT_KEY:
if (len != GUC_KLV_VF_CFG_PREEMPT_TIMEOUT_LEN)
return -EBADMSG;
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
index 4975730423d7..aaed1f490da8 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
@@ -46,6 +46,11 @@ int xe_gt_sriov_pf_config_set_exec_quantum_locked(struct xe_gt *gt, unsigned int
u32 exec_quantum);
int xe_gt_sriov_pf_config_bulk_set_exec_quantum_locked(struct xe_gt *gt, u32 exec_quantum);
+void xe_gt_sriov_pf_config_get_groups_exec_quantums(struct xe_gt *gt, unsigned int vfid,
+ u32 *exec_quantum, u32 max_count);
+int xe_gt_sriov_pf_config_set_groups_exec_quantums(struct xe_gt *gt, unsigned int vfid,
+ u32 *exec_quantum, u32 count);
+
u32 xe_gt_sriov_pf_config_get_preempt_timeout(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_config_set_preempt_timeout(struct xe_gt *gt, unsigned int vfid,
u32 preempt_timeout);
diff --git a/drivers/gpu/drm/xe/xe_guc_klv_helpers.c b/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
index dd504b77cb17..b696a21f87e8 100644
--- a/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
+++ b/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
@@ -56,6 +56,8 @@ const char *xe_guc_klv_key_to_string(u16 key)
return "begin_ctx_id";
case GUC_KLV_VF_CFG_SCHED_PRIORITY_KEY:
return "sched_priority";
+ case GUC_KLV_VF_CFG_ENGINE_GROUP_EXEC_QUANTUM_KEY:
+ return "sched_groups_exec_quantum";
/* VF CFG threshold keys */
#define define_threshold_key_to_string_case(TAG, NAME, ...) \
--
2.43.0
* [PATCH v2 10/11] drm/xe/sriov: Add functions to set preempt timeouts for each group
2025-12-06 23:03 [PATCH v2 00/11] Introduce SRIOV scheduler groups Daniele Ceraolo Spurio
` (8 preceding siblings ...)
2025-12-06 23:04 ` [PATCH v2 09/11] drm/xe/sriov: Add functions to set exec quantums for each group Daniele Ceraolo Spurio
@ 2025-12-06 23:04 ` Daniele Ceraolo Spurio
2025-12-06 23:04 ` [PATCH v2 11/11] drm/xe/sriov: Add debugfs to set EQ and PT for scheduler groups Daniele Ceraolo Spurio
` (2 subsequent siblings)
12 siblings, 0 replies; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-06 23:04 UTC (permalink / raw)
To: intel-xe; +Cc: Daniele Ceraolo Spurio, Michal Wajdeczko
The KLV to set the preemption timeout for each group works in exactly the
same way as the one for the exec quantums, so we add similar functions.
v2: drop the option of setting a single group, minor updates (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/xe/abi/guc_klvs_abi.h | 13 +++
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 95 +++++++++++++++++++++-
drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h | 5 ++
drivers/gpu/drm/xe/xe_guc_klv_helpers.c | 2 +
4 files changed, 114 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
index f8c7bc4d110a..d605c31aae6c 100644
--- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
+++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
@@ -400,6 +400,16 @@ enum {
* the GuC always sets the EQ for all groups (even the non-enabled ones),
* so if we provide fewer values than the max the GuC will use 0 for the
* remaining groups.
+ *
+ * _`GUC_KLV_VF_CFG_ENGINE_GROUP_PREEMPT_TIMEOUT' : 0x8A0F
+ * This config sets the VFs-preemption-timeout for each scheduling group in
+ * microseconds. The driver must provide an array of values, with each of
+ * them matching the respective group index (first value goes to group 0,
+ * second to group 1, etc). The setting of group values follows the same
+ * behavior and rules as setting via GUC_KLV_VF_CFG_PREEMPT_TIMEOUT. Note
+ * that the GuC always sets the PT for all groups (even the non-enabled
+ * ones), so if we provide fewer values than the max the GuC will use 0 for
+ * the remaining groups.
*/
#define GUC_KLV_VF_CFG_GGTT_START_KEY 0x0001
@@ -465,6 +475,9 @@ enum {
#define GUC_KLV_VF_CFG_ENGINE_GROUP_EXEC_QUANTUM_MIN_LEN 1u
#define GUC_KLV_VF_CFG_ENGINE_GROUP_EXEC_QUANTUM_MAX_LEN 8u
+#define GUC_KLV_VF_CFG_ENGINE_GROUP_PREEMPT_TIMEOUT_KEY 0x8a0f
+#define GUC_KLV_VF_CFG_ENGINE_GROUP_PREEMPT_TIMEOUT_MIN_LEN 1u
+#define GUC_KLV_VF_CFG_ENGINE_GROUP_PREEMPT_TIMEOUT_MAX_LEN 8u
/*
* Workaround keys:
*/
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
index ac3583594603..57f32b6c7814 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
@@ -295,13 +295,18 @@ static u32 encode_config_sched(struct xe_gt *gt, u32 *cfg, u32 n,
if (xe_sriov_gt_pf_policy_has_multi_group_modes(gt)) {
BUILD_BUG_ON(ARRAY_SIZE(config->exec_quantum) >
GUC_KLV_VF_CFG_ENGINE_GROUP_EXEC_QUANTUM_MAX_LEN);
+ BUILD_BUG_ON(ARRAY_SIZE(config->preempt_timeout) >
+ GUC_KLV_VF_CFG_ENGINE_GROUP_PREEMPT_TIMEOUT_MAX_LEN);
cfg[n++] = PREP_GUC_KLV_CONST(GUC_KLV_VF_CFG_ENGINE_GROUP_EXEC_QUANTUM_KEY,
ARRAY_SIZE(config->exec_quantum));
for (i = 0; i < ARRAY_SIZE(config->exec_quantum); i++)
cfg[n++] = config->exec_quantum[i];
- /* TODO: add group preempt timeout setting */
+ cfg[n++] = PREP_GUC_KLV_CONST(GUC_KLV_VF_CFG_ENGINE_GROUP_PREEMPT_TIMEOUT_KEY,
+ ARRAY_SIZE(config->preempt_timeout));
+ for (i = 0; i < ARRAY_SIZE(config->preempt_timeout); i++)
+ cfg[n++] = config->preempt_timeout[i];
} else {
cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_EXEC_QUANTUM);
cfg[n++] = config->exec_quantum[0];
@@ -2265,6 +2270,89 @@ int xe_gt_sriov_pf_config_bulk_set_preempt_timeout_locked(struct xe_gt *gt, u32
preempt_timeout_unit, n, err);
}
+static int pf_provision_groups_preempt_timeouts(struct xe_gt *gt, unsigned int vfid,
+ const u32 *preempt_timeouts, u32 count)
+{
+ struct xe_gt_sriov_config *config = pf_pick_vf_config(gt, vfid);
+ int err;
+ int i;
+
+ err = pf_push_vf_grp_cfg_u32(gt, vfid, GUC_KLV_VF_CFG_ENGINE_GROUP_PREEMPT_TIMEOUT_KEY,
+ preempt_timeouts, count);
+ if (unlikely(err))
+ return err;
+
+ /*
+ * GuC silently clamps values exceeding the max and zeroes out the
+ * timeout for groups not in the klv payload
+ */
+ for (i = 0; i < ARRAY_SIZE(config->preempt_timeout); i++) {
+ if (i < count)
+ config->preempt_timeout[i] =
+ min_t(u32, preempt_timeouts[i],
+ GUC_KLV_VF_CFG_PREEMPT_TIMEOUT_MAX_VALUE);
+ else
+ config->preempt_timeout[i] = 0;
+ }
+
+ return 0;
+}
+
+static void pf_get_groups_preempt_timeouts(struct xe_gt *gt, unsigned int vfid,
+ u32 *preempt_timeouts, u32 max_count)
+{
+ struct xe_gt_sriov_config *config = pf_pick_vf_config(gt, vfid);
+ u32 count = min_t(u32, max_count, ARRAY_SIZE(config->preempt_timeout));
+
+ memcpy(preempt_timeouts, config->preempt_timeout, sizeof(u32) * count);
+}
+
+/**
+ * xe_gt_sriov_pf_config_set_groups_preempt_timeouts() - Configure PF/VF PTs for sched groups.
+ * @gt: the &xe_gt
+ * @vfid: the PF or VF identifier
+ * @preempt_timeouts: array of requested PTs in microseconds (0 is infinity)
+ * @count: number of entries in the array
+ *
+ * This function can only be called on PF.
+ * It will log the provisioned value or an error in case of the failure.
+ *
+ * Return: 0 on success or a negative error code on failure.
+ */
+int xe_gt_sriov_pf_config_set_groups_preempt_timeouts(struct xe_gt *gt, unsigned int vfid,
+ u32 *preempt_timeouts, u32 count)
+{
+ int err;
+
+ guard(mutex)(xe_gt_sriov_pf_master_mutex(gt));
+
+ err = pf_provision_groups_preempt_timeouts(gt, vfid, preempt_timeouts, count);
+
+ return pf_groups_cfg_set_u32_done(gt, vfid, preempt_timeouts, count,
+ pf_get_groups_preempt_timeouts,
+ "preemption timeout",
+ preempt_timeout_unit, err);
+}
+
+/**
+ * xe_gt_sriov_pf_config_get_groups_preempt_timeouts - Get PF/VF sched groups PTs
+ * @gt: the &xe_gt
+ * @vfid: the PF or VF identifier
+ * @preempt_timeouts: array in which to store the preemption timeouts values
+ * @count: maximum number of entries to store
+ *
+ * This function can only be called on PF.
+ */
+void xe_gt_sriov_pf_config_get_groups_preempt_timeouts(struct xe_gt *gt, unsigned int vfid,
+ u32 *preempt_timeouts, u32 count)
+{
+ guard(mutex)(xe_gt_sriov_pf_master_mutex(gt));
+
+ xe_gt_assert(gt, count <= GUC_MAX_SCHED_GROUPS);
+
+ pf_get_groups_preempt_timeouts(gt, vfid, preempt_timeouts, count);
+}
+
static const char *sched_priority_unit(u32 priority)
{
return priority == GUC_SCHED_PRIORITY_LOW ? "(low)" :
@@ -2712,6 +2800,11 @@ static int pf_restore_vf_config_klv(struct xe_gt *gt, unsigned int vfid,
return -EBADMSG;
return pf_provision_groups_exec_quantums(gt, vfid, value, len);
+ case GUC_KLV_VF_CFG_ENGINE_GROUP_PREEMPT_TIMEOUT_KEY:
+ if (len > GUC_KLV_VF_CFG_ENGINE_GROUP_PREEMPT_TIMEOUT_MAX_LEN)
+ return -EBADMSG;
+ return pf_provision_groups_preempt_timeouts(gt, vfid, value, len);
+
case GUC_KLV_VF_CFG_PREEMPT_TIMEOUT_KEY:
if (len != GUC_KLV_VF_CFG_PREEMPT_TIMEOUT_LEN)
return -EBADMSG;
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
index aaed1f490da8..3c6c8b6655af 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h
@@ -60,6 +60,11 @@ int xe_gt_sriov_pf_config_set_preempt_timeout_locked(struct xe_gt *gt, unsigned
u32 preempt_timeout);
int xe_gt_sriov_pf_config_bulk_set_preempt_timeout_locked(struct xe_gt *gt, u32 preempt_timeout);
+void xe_gt_sriov_pf_config_get_groups_preempt_timeouts(struct xe_gt *gt, unsigned int vfid,
+ u32 *preempt_timeouts, u32 count);
+int xe_gt_sriov_pf_config_set_groups_preempt_timeouts(struct xe_gt *gt, unsigned int vfid,
+ u32 *preempt_timeouts, u32 count);
+
u32 xe_gt_sriov_pf_config_get_sched_priority(struct xe_gt *gt, unsigned int vfid);
int xe_gt_sriov_pf_config_set_sched_priority(struct xe_gt *gt, unsigned int vfid, u32 priority);
diff --git a/drivers/gpu/drm/xe/xe_guc_klv_helpers.c b/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
index b696a21f87e8..97600edda837 100644
--- a/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
+++ b/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
@@ -58,6 +58,8 @@ const char *xe_guc_klv_key_to_string(u16 key)
return "sched_priority";
case GUC_KLV_VF_CFG_ENGINE_GROUP_EXEC_QUANTUM_KEY:
return "sched_groups_exec_quantum";
+ case GUC_KLV_VF_CFG_ENGINE_GROUP_PREEMPT_TIMEOUT_KEY:
+ return "sched_groups_preempt_timeout";
/* VF CFG threshold keys */
#define define_threshold_key_to_string_case(TAG, NAME, ...) \
--
2.43.0
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH v2 11/11] drm/xe/sriov: Add debugfs to set EQ and PT for scheduler groups
2025-12-06 23:03 [PATCH v2 00/11] Introduce SRIOV scheduler groups Daniele Ceraolo Spurio
` (9 preceding siblings ...)
2025-12-06 23:04 ` [PATCH v2 10/11] drm/xe/sriov: Add functions to set preempt timeouts " Daniele Ceraolo Spurio
@ 2025-12-06 23:04 ` Daniele Ceraolo Spurio
2025-12-06 23:10 ` ✗ CI.checkpatch: warning for Introduce SRIOV scheduler groups (rev2) Patchwork
2025-12-06 23:11 ` ✓ CI.KUnit: success " Patchwork
12 siblings, 0 replies; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-06 23:04 UTC (permalink / raw)
To: intel-xe; +Cc: Daniele Ceraolo Spurio, Michal Wajdeczko
Debugfs files are added to allow a user to provide a comma-separated list
of values to assign to each group for each VF.
v2: drop files for individual groups, check input length (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
---
drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 128 +++++++++++++++++++-
1 file changed, 124 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
index 15f5f3a40471..014dcdc233bd 100644
--- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
+++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
@@ -163,10 +163,18 @@ static void pf_add_policy_attrs(struct xe_gt *gt, struct dentry *parent)
* : ├── tile0
* : ├── gt0
* : ├── sched_groups_mode
+ * ├── sched_groups_exec_quantums_ms
+ * ├── sched_groups_preempt_timeouts_us
* ├── sched_groups
* : ├── group0
* :
- * └── groupN
+ * : └── groupN
+ * ├── vf1
+ * : ├── tile0
+ * : ├── gt0
+ * : ├── sched_groups_exec_quantums_ms
+ * ├── sched_groups_preempt_timeouts_us
+ * :
*/
static const char *sched_group_mode_to_string(enum xe_sriov_sched_group_modes mode)
@@ -264,6 +272,109 @@ static const struct file_operations sched_groups_fops = {
.release = single_release,
};
+static int sched_groups_config_show(struct seq_file *m, void *data,
+ void (*get)(struct xe_gt *, unsigned int, u32 *, u32))
+{
+ struct drm_printer p = drm_seq_file_printer(m);
+ unsigned int vfid = extract_vfid(m->private);
+ struct xe_gt *gt = extract_gt(m->private);
+ u32 values[GUC_MAX_SCHED_GROUPS];
+ bool first = true;
+ u8 g;
+
+ get(gt, vfid, values, ARRAY_SIZE(values));
+
+ for (g = 0; g < ARRAY_SIZE(values); g++) {
+ drm_printf(&p, "%s%u", first ? "" : ",", values[g]);
+
+ first = false;
+ }
+
+ drm_printf(&p, "\n");
+
+ return 0;
+}
+
+static ssize_t sched_groups_config_write(struct file *file, const char __user *ubuf,
+ size_t size, loff_t *pos,
+ int (*set)(struct xe_gt *, unsigned int, u32 *, u32))
+{
+ struct dentry *parent = file_inode(file)->i_private;
+ unsigned int vfid = extract_vfid(parent);
+ struct xe_gt *gt = extract_gt(parent);
+ u32 values[GUC_MAX_SCHED_GROUPS];
+ int *input;
+ u32 count;
+ int ret;
+ int i;
+
+ if (*pos)
+ return -ESPIPE;
+
+ if (!size)
+ return -ENODATA;
+
+ ret = parse_int_array_user(ubuf, min(size, GUC_MAX_SCHED_GROUPS * sizeof(u32)), &input);
+ if (ret)
+ return ret;
+
+ count = input[0];
+ if (count > GUC_MAX_SCHED_GROUPS) {
+ ret = -E2BIG;
+ goto out;
+ }
+
+ for (i = 0; i < count; i++) {
+ if (input[i + 1] < 0 || input[i + 1] > S32_MAX) {
+ ret = -EINVAL;
+ goto out;
+ }
+
+ values[i] = input[i + 1];
+ }
+
+ xe_pm_runtime_get(gt_to_xe(gt));
+ ret = set(gt, vfid, values, count);
+ xe_pm_runtime_put(gt_to_xe(gt));
+
+out:
+ kfree(input);
+ return (ret < 0) ? ret : size;
+}
+
+#define DEFINE_SRIOV_GT_GRP_CFG_DEBUGFS_ATTRIBUTE(CONFIG) \
+static int sched_groups_##CONFIG##_show(struct seq_file *m, void *data) \
+{ \
+ return sched_groups_config_show(m, data, \
+ xe_gt_sriov_pf_config_get_groups_##CONFIG); \
+} \
+ \
+static int sched_groups_##CONFIG##_open(struct inode *inode, struct file *file) \
+{ \
+ return single_open(file, sched_groups_##CONFIG##_show, \
+ inode->i_private); \
+} \
+ \
+static ssize_t sched_groups_##CONFIG##_write(struct file *file, \
+ const char __user *ubuf, \
+ size_t size, loff_t *pos) \
+{ \
+ return sched_groups_config_write(file, ubuf, size, pos, \
+ xe_gt_sriov_pf_config_set_groups_##CONFIG); \
+} \
+ \
+static const struct file_operations sched_groups_##CONFIG##_fops = { \
+ .owner = THIS_MODULE, \
+ .open = sched_groups_##CONFIG##_open, \
+ .read = seq_read, \
+ .llseek = seq_lseek, \
+ .write = sched_groups_##CONFIG##_write, \
+ .release = single_release, \
+}
+
+DEFINE_SRIOV_GT_GRP_CFG_DEBUGFS_ATTRIBUTE(exec_quantums);
+DEFINE_SRIOV_GT_GRP_CFG_DEBUGFS_ATTRIBUTE(preempt_timeouts);
+
static ssize_t sched_group_engines_read(struct file *file, char __user *buf,
size_t count, loff_t *ppos)
{
@@ -313,13 +424,13 @@ static const struct file_operations sched_group_engines_fops = {
.llseek = default_llseek,
};
-static void pf_add_sched_groups(struct xe_gt *gt, struct dentry *parent)
+static void pf_add_sched_groups(struct xe_gt *gt, struct dentry *parent, unsigned int vfid)
{
struct dentry *groups;
u8 g;
xe_gt_assert(gt, gt == extract_gt(parent));
- xe_gt_assert(gt, PFID == extract_vfid(parent));
+ xe_gt_assert(gt, vfid == extract_vfid(parent));
/*
* TODO: we currently call this function before we initialize scheduler
@@ -336,6 +447,14 @@ static void pf_add_sched_groups(struct xe_gt *gt, struct dentry *parent)
if (!xe_sriov_gt_pf_policy_has_sched_groups_support(gt))
return;
+ debugfs_create_file("sched_groups_exec_quantums_ms", 0644, parent, parent,
+ &sched_groups_exec_quantums_fops);
+ debugfs_create_file("sched_groups_preempt_timeouts_us", 0644, parent, parent,
+ &sched_groups_preempt_timeouts_fops);
+
+ if (vfid != PFID)
+ return;
+
debugfs_create_file("sched_groups_mode", 0644, parent, parent, &sched_groups_fops);
groups = debugfs_create_dir("sched_groups", parent);
@@ -713,6 +832,7 @@ static void pf_populate_gt(struct xe_gt *gt, struct dentry *dent, unsigned int v
if (vfid) {
pf_add_config_attrs(gt, dent, vfid);
+ pf_add_sched_groups(gt, dent, vfid);
debugfs_create_file("control", 0600, dent, NULL, &control_ops);
@@ -726,7 +846,7 @@ static void pf_populate_gt(struct xe_gt *gt, struct dentry *dent, unsigned int v
} else {
pf_add_config_attrs(gt, dent, PFID);
pf_add_policy_attrs(gt, dent);
- pf_add_sched_groups(gt, dent);
+ pf_add_sched_groups(gt, dent, PFID);
drm_debugfs_create_files(pf_info, ARRAY_SIZE(pf_info), dent, minor);
}
--
2.43.0
* ✗ CI.checkpatch: warning for Introduce SRIOV scheduler groups (rev2)
2025-12-06 23:03 [PATCH v2 00/11] Introduce SRIOV scheduler groups Daniele Ceraolo Spurio
` (10 preceding siblings ...)
2025-12-06 23:04 ` [PATCH v2 11/11] drm/xe/sriov: Add debugfs to set EQ and PT for scheduler groups Daniele Ceraolo Spurio
@ 2025-12-06 23:10 ` Patchwork
2025-12-06 23:11 ` ✓ CI.KUnit: success " Patchwork
12 siblings, 0 replies; 30+ messages in thread
From: Patchwork @ 2025-12-06 23:10 UTC (permalink / raw)
To: Daniele Ceraolo Spurio; +Cc: intel-xe
== Series Details ==
Series: Introduce SRIOV scheduler groups (rev2)
URL : https://patchwork.freedesktop.org/series/158142/
State : warning
== Summary ==
+ KERNEL=/kernel
+ git clone https://gitlab.freedesktop.org/drm/maintainer-tools mt
Cloning into 'mt'...
warning: redirecting to https://gitlab.freedesktop.org/drm/maintainer-tools.git/
+ git -C mt rev-list -n1 origin/master
2de9a3901bc28757c7906b454717b64e2a214021
+ cd /kernel
+ git config --global --add safe.directory /kernel
+ git log -n1
commit 1f6e1f507cadef94aa53f2574ecf6b736545dc81
Author: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Date: Sat Dec 6 15:04:07 2025 -0800
drm/xe/sriov: Add debugfs to set EQ and PT for scheduler groups
Debugfs files are added to allow a user to provide a comma-separated list
of values to assign to each group for each VF.
v2: drop files for individual groups, check input length (Michal)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
+ /mt/dim checkpatch db6505187efb9c255df1dd6e78c00d95fadfef79 drm-intel
c1478a9e936c drm/xe/gt: Add engine masks for each class
debfe37cf169 drm/xe/sriov: Initialize scheduler groups
3f3ff0982d59 drm/xe/sriov: Add support for enabling scheduler groups
74ac2637e362 drm/xe/sriov: Scheduler groups are incompatible with multi-lrc
2d345b7a1092 drm/xe/sriov: Add handling for MLRC adverse event threshold
-:71: WARNING:MACRO_ARG_UNUSED: Argument 'NAME' is not used in function-like macro
#71: FILE: drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c:307:
+#define encode_threshold_config(TAG, NAME, MIN_GUC_VER) ({ \
+ if (!MIN_GUC_VER || GUC_FIRMWARE_VER(&gt->uc.guc) >= MIN_GUC_VER) { \
+ cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_THRESHOLD_##TAG); \
+ cfg[n++] = config->thresholds[MAKE_XE_GUC_KLV_THRESHOLD_INDEX(TAG)]; \
+ } \
});
-:71: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'MIN_GUC_VER' - possible side-effects?
#71: FILE: drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c:307:
+#define encode_threshold_config(TAG, NAME, MIN_GUC_VER) ({ \
+ if (!MIN_GUC_VER || GUC_FIRMWARE_VER(&gt->uc.guc) >= MIN_GUC_VER) { \
+ cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_THRESHOLD_##TAG); \
+ cfg[n++] = config->thresholds[MAKE_XE_GUC_KLV_THRESHOLD_INDEX(TAG)]; \
+ } \
});
-:71: CHECK:MACRO_ARG_PRECEDENCE: Macro argument 'MIN_GUC_VER' may be better as '(MIN_GUC_VER)' to avoid precedence issues
#71: FILE: drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c:307:
+#define encode_threshold_config(TAG, NAME, MIN_GUC_VER) ({ \
+ if (!MIN_GUC_VER || GUC_FIRMWARE_VER(&gt->uc.guc) >= MIN_GUC_VER) { \
+ cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_THRESHOLD_##TAG); \
+ cfg[n++] = config->thresholds[MAKE_XE_GUC_KLV_THRESHOLD_INDEX(TAG)]; \
+ } \
});
-:71: WARNING:TRAILING_SEMICOLON: macros should not use a trailing semicolon
#71: FILE: drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c:307:
+#define encode_threshold_config(TAG, NAME, MIN_GUC_VER) ({ \
+ if (!MIN_GUC_VER || GUC_FIRMWARE_VER(&gt->uc.guc) >= MIN_GUC_VER) { \
+ cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_THRESHOLD_##TAG); \
+ cfg[n++] = config->thresholds[MAKE_XE_GUC_KLV_THRESHOLD_INDEX(TAG)]; \
+ } \
});
-:102: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'TAG' - possible side-effects?
#102: FILE: drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c:2557:
+#define define_threshold_key_to_provision_case(TAG, NAME, MIN_GUC_VER) \
case MAKE_GUC_KLV_VF_CFG_THRESHOLD_KEY(TAG): \
BUILD_BUG_ON(MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG) != 1u); \
+ if (MIN_GUC_VER && GUC_FIRMWARE_VER(&gt->uc.guc) < MIN_GUC_VER) \
+ return -ENOKEY; \
if (len != MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG)) \
return -EBADMSG; \
return pf_provision_threshold(gt, vfid, \
-:102: WARNING:MACRO_ARG_UNUSED: Argument 'NAME' is not used in function-like macro
#102: FILE: drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c:2557:
+#define define_threshold_key_to_provision_case(TAG, NAME, MIN_GUC_VER) \
case MAKE_GUC_KLV_VF_CFG_THRESHOLD_KEY(TAG): \
BUILD_BUG_ON(MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG) != 1u); \
+ if (MIN_GUC_VER && GUC_FIRMWARE_VER(&gt->uc.guc) < MIN_GUC_VER) \
+ return -ENOKEY; \
if (len != MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG)) \
return -EBADMSG; \
return pf_provision_threshold(gt, vfid, \
-:102: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'MIN_GUC_VER' - possible side-effects?
#102: FILE: drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c:2557:
+#define define_threshold_key_to_provision_case(TAG, NAME, MIN_GUC_VER) \
case MAKE_GUC_KLV_VF_CFG_THRESHOLD_KEY(TAG): \
BUILD_BUG_ON(MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG) != 1u); \
+ if (MIN_GUC_VER && GUC_FIRMWARE_VER(&gt->uc.guc) < MIN_GUC_VER) \
+ return -ENOKEY; \
if (len != MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG)) \
return -EBADMSG; \
return pf_provision_threshold(gt, vfid, \
-:102: CHECK:MACRO_ARG_PRECEDENCE: Macro argument 'MIN_GUC_VER' may be better as '(MIN_GUC_VER)' to avoid precedence issues
#102: FILE: drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c:2557:
+#define define_threshold_key_to_provision_case(TAG, NAME, MIN_GUC_VER) \
case MAKE_GUC_KLV_VF_CFG_THRESHOLD_KEY(TAG): \
BUILD_BUG_ON(MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG) != 1u); \
+ if (MIN_GUC_VER && GUC_FIRMWARE_VER(&gt->uc.guc) < MIN_GUC_VER) \
+ return -ENOKEY; \
if (len != MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG)) \
return -EBADMSG; \
return pf_provision_threshold(gt, vfid, \
-:102: WARNING:MACRO_WITH_FLOW_CONTROL: Macros with flow control statements should be avoided
#102: FILE: drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c:2557:
+#define define_threshold_key_to_provision_case(TAG, NAME, MIN_GUC_VER) \
case MAKE_GUC_KLV_VF_CFG_THRESHOLD_KEY(TAG): \
BUILD_BUG_ON(MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG) != 1u); \
+ if (MIN_GUC_VER && GUC_FIRMWARE_VER(&gt->uc.guc) < MIN_GUC_VER) \
+ return -ENOKEY; \
if (len != MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG)) \
return -EBADMSG; \
return pf_provision_threshold(gt, vfid, \
-:129: WARNING:MACRO_ARG_UNUSED: Argument 'TAG' is not used in function-like macro
#129: FILE: drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c:305:
+#define register_threshold_attribute(TAG, NAME, MIN_GUC_VER) ({ \
+ if (!MIN_GUC_VER || GUC_FIRMWARE_VER(&gt->uc.guc) >= MIN_GUC_VER) \
+ debugfs_create_file_unsafe("threshold_" #NAME, 0644, parent, parent, \
+ &NAME##_fops); \
+});
-:129: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'MIN_GUC_VER' - possible side-effects?
#129: FILE: drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c:305:
+#define register_threshold_attribute(TAG, NAME, MIN_GUC_VER) ({ \
+ if (!MIN_GUC_VER || GUC_FIRMWARE_VER(&gt->uc.guc) >= MIN_GUC_VER) \
+ debugfs_create_file_unsafe("threshold_" #NAME, 0644, parent, parent, \
+ &NAME##_fops); \
+});
-:129: CHECK:MACRO_ARG_PRECEDENCE: Macro argument 'MIN_GUC_VER' may be better as '(MIN_GUC_VER)' to avoid precedence issues
#129: FILE: drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c:305:
+#define register_threshold_attribute(TAG, NAME, MIN_GUC_VER) ({ \
+ if (!MIN_GUC_VER || GUC_FIRMWARE_VER(&gt->uc.guc) >= MIN_GUC_VER) \
+ debugfs_create_file_unsafe("threshold_" #NAME, 0644, parent, parent, \
+ &NAME##_fops); \
+});
-:129: WARNING:TRAILING_SEMICOLON: macros should not use a trailing semicolon
#129: FILE: drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c:305:
+#define register_threshold_attribute(TAG, NAME, MIN_GUC_VER) ({ \
+ if (!MIN_GUC_VER || GUC_FIRMWARE_VER(&gt->uc.guc) >= MIN_GUC_VER) \
+ debugfs_create_file_unsafe("threshold_" #NAME, 0644, parent, parent, \
+ &NAME##_fops); \
+});
-:184: ERROR:COMPLEX_MACRO: Macros with complex values should be enclosed in parentheses
#184: FILE: drivers/gpu/drm/xe/xe_guc_klv_thresholds_set_types.h:29:
+#define MAKE_XE_GUC_KLV_THRESHOLDS_SET(define) \
+ define(CAT_ERR, cat_error_count, 0) \
+ define(ENGINE_RESET, engine_reset_count, 0) \
+ define(PAGE_FAULT, page_fault_count, 0) \
+ define(H2G_STORM, guc_time_us, 0) \
+ define(IRQ_STORM, irq_time_us, 0) \
+ define(DOORBELL_STORM, doorbell_time_us, 0) \
+ define(MULTI_LRC_COUNT, multi_lrc_count, MAKE_GUC_VER(70, 53, 0)) \
/* end */
BUT SEE:
do {} while (0) advice is over-stated in a few situations:
The more obvious case is macros, like MODULE_PARM_DESC, invoked at
file-scope, where C disallows code (it must be in functions). See
$exceptions if you have one to add by name.
More troublesome is declarative macros used at top of new scope,
like DECLARE_PER_CPU. These might just compile with a do-while-0
wrapper, but would be incorrect. Most of these are handled by
detecting struct,union,etc declaration primitives in $exceptions.
Theres also macros called inside an if (block), which "return" an
expression. These cannot do-while, and need a ({}) wrapper.
Enjoy this qualification while we work to improve our heuristics.
-:184: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'define' - possible side-effects?
#184: FILE: drivers/gpu/drm/xe/xe_guc_klv_thresholds_set_types.h:29:
+#define MAKE_XE_GUC_KLV_THRESHOLDS_SET(define) \
+ define(CAT_ERR, cat_error_count, 0) \
+ define(ENGINE_RESET, engine_reset_count, 0) \
+ define(PAGE_FAULT, page_fault_count, 0) \
+ define(H2G_STORM, guc_time_us, 0) \
+ define(IRQ_STORM, irq_time_us, 0) \
+ define(DOORBELL_STORM, doorbell_time_us, 0) \
+ define(MULTI_LRC_COUNT, multi_lrc_count, MAKE_GUC_VER(70, 53, 0)) \
/* end */
-:196: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#196:
new file mode 100644
total: 1 errors, 7 warnings, 8 checks, 176 lines checked
417297f80ac3 drm/xe/sriov: Add debugfs to enable scheduler groups
bdfbb4896c33 drm/xe/sriov: Add debugfs with scheduler groups information
185e7ae0c6f5 drm/xe/sriov: Prep for multiple exec quantums and preemption timeouts
b9b4bb1dc695 drm/xe/sriov: Add functions to set exec quantums for each group
264207e14aac drm/xe/sriov: Add functions to set preempt timeouts for each group
1f6e1f507cad drm/xe/sriov: Add debugfs to set EQ and PT for scheduler groups
* ✓ CI.KUnit: success for Introduce SRIOV scheduler groups (rev2)
2025-12-06 23:03 [PATCH v2 00/11] Introduce SRIOV scheduler groups Daniele Ceraolo Spurio
` (11 preceding siblings ...)
2025-12-06 23:10 ` ✗ CI.checkpatch: warning for Introduce SRIOV scheduler groups (rev2) Patchwork
@ 2025-12-06 23:11 ` Patchwork
12 siblings, 0 replies; 30+ messages in thread
From: Patchwork @ 2025-12-06 23:11 UTC (permalink / raw)
To: Daniele Ceraolo Spurio; +Cc: intel-xe
== Series Details ==
Series: Introduce SRIOV scheduler groups (rev2)
URL : https://patchwork.freedesktop.org/series/158142/
State : success
== Summary ==
+ trap cleanup EXIT
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/xe/.kunitconfig
[23:10:18] Configuring KUnit Kernel ...
Generating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[23:10:22] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=48
../drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c:627:5: warning: conflicting types for ‘xe_gt_sriov_pf_policy_set_sched_groups_mode’ due to enum/integer mismatch; have ‘int(struct xe_gt *, enum xe_sriov_sched_group_modes)’ [-Wenum-int-mismatch]
627 | int xe_gt_sriov_pf_policy_set_sched_groups_mode(struct xe_gt *gt,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from ../drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c:13:
../drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h:23:5: note: previous declaration of ‘xe_gt_sriov_pf_policy_set_sched_groups_mode’ with type ‘int(struct xe_gt *, u32)’ {aka ‘int(struct xe_gt *, unsigned int)’}
23 | int xe_gt_sriov_pf_policy_set_sched_groups_mode(struct xe_gt *gt, u32 value);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[23:10:52] Starting KUnit Kernel (1/1)...
[23:10:52] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[23:10:53] ================== guc_buf (11 subtests) ===================
[23:10:53] [PASSED] test_smallest
[23:10:53] [PASSED] test_largest
[23:10:53] [PASSED] test_granular
[23:10:53] [PASSED] test_unique
[23:10:53] [PASSED] test_overlap
[23:10:53] [PASSED] test_reusable
[23:10:53] [PASSED] test_too_big
[23:10:53] [PASSED] test_flush
[23:10:53] [PASSED] test_lookup
[23:10:53] [PASSED] test_data
[23:10:53] [PASSED] test_class
[23:10:53] ===================== [PASSED] guc_buf =====================
[23:10:53] =================== guc_dbm (7 subtests) ===================
[23:10:53] [PASSED] test_empty
[23:10:53] [PASSED] test_default
[23:10:53] ======================== test_size ========================
[23:10:53] [PASSED] 4
[23:10:53] [PASSED] 8
[23:10:53] [PASSED] 32
[23:10:53] [PASSED] 256
[23:10:53] ==================== [PASSED] test_size ====================
[23:10:53] ======================= test_reuse ========================
[23:10:53] [PASSED] 4
[23:10:53] [PASSED] 8
[23:10:53] [PASSED] 32
[23:10:53] [PASSED] 256
[23:10:53] =================== [PASSED] test_reuse ====================
[23:10:53] =================== test_range_overlap ====================
[23:10:53] [PASSED] 4
[23:10:53] [PASSED] 8
[23:10:53] [PASSED] 32
[23:10:53] [PASSED] 256
[23:10:53] =============== [PASSED] test_range_overlap ================
[23:10:53] =================== test_range_compact ====================
[23:10:53] [PASSED] 4
[23:10:53] [PASSED] 8
[23:10:53] [PASSED] 32
[23:10:53] [PASSED] 256
[23:10:53] =============== [PASSED] test_range_compact ================
[23:10:53] ==================== test_range_spare =====================
[23:10:53] [PASSED] 4
[23:10:53] [PASSED] 8
[23:10:53] [PASSED] 32
[23:10:53] [PASSED] 256
[23:10:53] ================ [PASSED] test_range_spare =================
[23:10:53] ===================== [PASSED] guc_dbm =====================
[23:10:53] =================== guc_idm (6 subtests) ===================
[23:10:53] [PASSED] bad_init
[23:10:53] [PASSED] no_init
[23:10:53] [PASSED] init_fini
[23:10:53] [PASSED] check_used
[23:10:53] [PASSED] check_quota
[23:10:53] [PASSED] check_all
[23:10:53] ===================== [PASSED] guc_idm =====================
[23:10:53] ================== no_relay (3 subtests) ===================
[23:10:53] [PASSED] xe_drops_guc2pf_if_not_ready
[23:10:53] [PASSED] xe_drops_guc2vf_if_not_ready
[23:10:53] [PASSED] xe_rejects_send_if_not_ready
[23:10:53] ==================== [PASSED] no_relay =====================
[23:10:53] ================== pf_relay (14 subtests) ==================
[23:10:53] [PASSED] pf_rejects_guc2pf_too_short
[23:10:53] [PASSED] pf_rejects_guc2pf_too_long
[23:10:53] [PASSED] pf_rejects_guc2pf_no_payload
[23:10:53] [PASSED] pf_fails_no_payload
[23:10:53] [PASSED] pf_fails_bad_origin
[23:10:53] [PASSED] pf_fails_bad_type
[23:10:53] [PASSED] pf_txn_reports_error
[23:10:53] [PASSED] pf_txn_sends_pf2guc
[23:10:53] [PASSED] pf_sends_pf2guc
[23:10:53] [SKIPPED] pf_loopback_nop
[23:10:53] [SKIPPED] pf_loopback_echo
[23:10:53] [SKIPPED] pf_loopback_fail
[23:10:53] [SKIPPED] pf_loopback_busy
[23:10:53] [SKIPPED] pf_loopback_retry
[23:10:53] ==================== [PASSED] pf_relay =====================
[23:10:53] ================== vf_relay (3 subtests) ===================
[23:10:53] [PASSED] vf_rejects_guc2vf_too_short
[23:10:53] [PASSED] vf_rejects_guc2vf_too_long
[23:10:53] [PASSED] vf_rejects_guc2vf_no_payload
[23:10:53] ==================== [PASSED] vf_relay =====================
[23:10:53] ================ pf_gt_config (6 subtests) =================
[23:10:53] [PASSED] fair_contexts_1vf
[23:10:53] [PASSED] fair_doorbells_1vf
[23:10:53] [PASSED] fair_ggtt_1vf
[23:10:53] ====================== fair_contexts ======================
[23:10:53] [PASSED] 1 VF
[23:10:53] [PASSED] 2 VFs
[23:10:53] [PASSED] 3 VFs
[23:10:53] [PASSED] 4 VFs
[23:10:53] [PASSED] 5 VFs
[23:10:53] [PASSED] 6 VFs
[23:10:53] [PASSED] 7 VFs
[23:10:53] [PASSED] 8 VFs
[23:10:53] [PASSED] 9 VFs
[23:10:53] [PASSED] 10 VFs
[23:10:53] [PASSED] 11 VFs
[23:10:53] [PASSED] 12 VFs
[23:10:53] [PASSED] 13 VFs
[23:10:53] [PASSED] 14 VFs
[23:10:53] [PASSED] 15 VFs
[23:10:53] [PASSED] 16 VFs
[23:10:53] [PASSED] 17 VFs
[23:10:53] [PASSED] 18 VFs
[23:10:53] [PASSED] 19 VFs
[23:10:53] [PASSED] 20 VFs
[23:10:53] [PASSED] 21 VFs
[23:10:53] [PASSED] 22 VFs
[23:10:53] [PASSED] 23 VFs
[23:10:53] [PASSED] 24 VFs
[23:10:53] [PASSED] 25 VFs
[23:10:53] [PASSED] 26 VFs
[23:10:53] [PASSED] 27 VFs
[23:10:53] [PASSED] 28 VFs
[23:10:53] [PASSED] 29 VFs
[23:10:53] [PASSED] 30 VFs
[23:10:53] [PASSED] 31 VFs
[23:10:53] [PASSED] 32 VFs
[23:10:53] [PASSED] 33 VFs
[23:10:53] [PASSED] 34 VFs
[23:10:53] [PASSED] 35 VFs
[23:10:53] [PASSED] 36 VFs
[23:10:53] [PASSED] 37 VFs
[23:10:53] [PASSED] 38 VFs
[23:10:53] [PASSED] 39 VFs
[23:10:53] [PASSED] 40 VFs
[23:10:53] [PASSED] 41 VFs
[23:10:53] [PASSED] 42 VFs
[23:10:53] [PASSED] 43 VFs
[23:10:53] [PASSED] 44 VFs
[23:10:53] [PASSED] 45 VFs
[23:10:53] [PASSED] 46 VFs
[23:10:53] [PASSED] 47 VFs
[23:10:53] [PASSED] 48 VFs
[23:10:53] [PASSED] 49 VFs
[23:10:53] [PASSED] 50 VFs
[23:10:53] [PASSED] 51 VFs
[23:10:53] [PASSED] 52 VFs
[23:10:53] [PASSED] 53 VFs
[23:10:53] [PASSED] 54 VFs
[23:10:53] [PASSED] 55 VFs
[23:10:53] [PASSED] 56 VFs
[23:10:53] [PASSED] 57 VFs
[23:10:53] [PASSED] 58 VFs
[23:10:53] [PASSED] 59 VFs
[23:10:53] [PASSED] 60 VFs
[23:10:53] [PASSED] 61 VFs
[23:10:53] [PASSED] 62 VFs
[23:10:53] [PASSED] 63 VFs
[23:10:53] ================== [PASSED] fair_contexts ==================
[23:10:53] ===================== fair_doorbells ======================
[23:10:53] [PASSED] 1 VF
[23:10:53] [PASSED] 2 VFs
[23:10:53] [PASSED] 3 VFs
[23:10:53] [PASSED] 4 VFs
[23:10:53] [PASSED] 5 VFs
[23:10:53] [PASSED] 6 VFs
[23:10:53] [PASSED] 7 VFs
[23:10:53] [PASSED] 8 VFs
[23:10:53] [PASSED] 9 VFs
[23:10:53] [PASSED] 10 VFs
[23:10:53] [PASSED] 11 VFs
[23:10:53] [PASSED] 12 VFs
[23:10:53] [PASSED] 13 VFs
[23:10:53] [PASSED] 14 VFs
[23:10:53] [PASSED] 15 VFs
[23:10:53] [PASSED] 16 VFs
[23:10:53] [PASSED] 17 VFs
[23:10:53] [PASSED] 18 VFs
[23:10:53] [PASSED] 19 VFs
[23:10:53] [PASSED] 20 VFs
[23:10:53] [PASSED] 21 VFs
[23:10:53] [PASSED] 22 VFs
[23:10:53] [PASSED] 23 VFs
[23:10:53] [PASSED] 24 VFs
[23:10:53] [PASSED] 25 VFs
[23:10:53] [PASSED] 26 VFs
[23:10:53] [PASSED] 27 VFs
[23:10:53] [PASSED] 28 VFs
[23:10:53] [PASSED] 29 VFs
[23:10:53] [PASSED] 30 VFs
[23:10:53] [PASSED] 31 VFs
[23:10:53] [PASSED] 32 VFs
[23:10:53] [PASSED] 33 VFs
[23:10:53] [PASSED] 34 VFs
[23:10:53] [PASSED] 35 VFs
[23:10:53] [PASSED] 36 VFs
[23:10:53] [PASSED] 37 VFs
[23:10:53] [PASSED] 38 VFs
[23:10:53] [PASSED] 39 VFs
[23:10:53] [PASSED] 40 VFs
[23:10:53] [PASSED] 41 VFs
[23:10:53] [PASSED] 42 VFs
[23:10:53] [PASSED] 43 VFs
[23:10:53] [PASSED] 44 VFs
[23:10:53] [PASSED] 45 VFs
[23:10:53] [PASSED] 46 VFs
[23:10:53] [PASSED] 47 VFs
[23:10:53] [PASSED] 48 VFs
[23:10:53] [PASSED] 49 VFs
[23:10:53] [PASSED] 50 VFs
[23:10:53] [PASSED] 51 VFs
[23:10:53] [PASSED] 52 VFs
[23:10:53] [PASSED] 53 VFs
[23:10:53] [PASSED] 54 VFs
[23:10:53] [PASSED] 55 VFs
[23:10:53] [PASSED] 56 VFs
[23:10:53] [PASSED] 57 VFs
[23:10:53] [PASSED] 58 VFs
[23:10:53] [PASSED] 59 VFs
[23:10:53] [PASSED] 60 VFs
[23:10:53] [PASSED] 61 VFs
[23:10:53] [PASSED] 62 VFs
[23:10:53] [PASSED] 63 VFs
[23:10:53] ================= [PASSED] fair_doorbells ==================
[23:10:53] ======================== fair_ggtt ========================
[23:10:53] [PASSED] 1 VF
[23:10:53] [PASSED] 2 VFs
[23:10:53] [PASSED] 3 VFs
[23:10:53] [PASSED] 4 VFs
[23:10:53] [PASSED] 5 VFs
[23:10:53] [PASSED] 6 VFs
[23:10:53] [PASSED] 7 VFs
[23:10:53] [PASSED] 8 VFs
[23:10:53] [PASSED] 9 VFs
[23:10:53] [PASSED] 10 VFs
[23:10:53] [PASSED] 11 VFs
[23:10:53] [PASSED] 12 VFs
[23:10:53] [PASSED] 13 VFs
[23:10:53] [PASSED] 14 VFs
[23:10:53] [PASSED] 15 VFs
[23:10:53] [PASSED] 16 VFs
[23:10:53] [PASSED] 17 VFs
[23:10:53] [PASSED] 18 VFs
[23:10:53] [PASSED] 19 VFs
[23:10:53] [PASSED] 20 VFs
[23:10:53] [PASSED] 21 VFs
[23:10:53] [PASSED] 22 VFs
[23:10:53] [PASSED] 23 VFs
[23:10:53] [PASSED] 24 VFs
[23:10:53] [PASSED] 25 VFs
[23:10:53] [PASSED] 26 VFs
[23:10:53] [PASSED] 27 VFs
[23:10:53] [PASSED] 28 VFs
[23:10:53] [PASSED] 29 VFs
[23:10:53] [PASSED] 30 VFs
[23:10:53] [PASSED] 31 VFs
[23:10:53] [PASSED] 32 VFs
[23:10:53] [PASSED] 33 VFs
[23:10:53] [PASSED] 34 VFs
[23:10:53] [PASSED] 35 VFs
[23:10:53] [PASSED] 36 VFs
[23:10:53] [PASSED] 37 VFs
[23:10:53] [PASSED] 38 VFs
[23:10:53] [PASSED] 39 VFs
[23:10:53] [PASSED] 40 VFs
[23:10:53] [PASSED] 41 VFs
[23:10:53] [PASSED] 42 VFs
[23:10:53] [PASSED] 43 VFs
[23:10:53] [PASSED] 44 VFs
[23:10:53] [PASSED] 45 VFs
[23:10:53] [PASSED] 46 VFs
[23:10:53] [PASSED] 47 VFs
[23:10:53] [PASSED] 48 VFs
[23:10:53] [PASSED] 49 VFs
[23:10:53] [PASSED] 50 VFs
[23:10:53] [PASSED] 51 VFs
[23:10:53] [PASSED] 52 VFs
[23:10:53] [PASSED] 53 VFs
[23:10:53] [PASSED] 54 VFs
[23:10:53] [PASSED] 55 VFs
[23:10:53] [PASSED] 56 VFs
[23:10:53] [PASSED] 57 VFs
[23:10:53] [PASSED] 58 VFs
[23:10:53] [PASSED] 59 VFs
[23:10:53] [PASSED] 60 VFs
[23:10:53] [PASSED] 61 VFs
[23:10:53] [PASSED] 62 VFs
[23:10:53] [PASSED] 63 VFs
[23:10:53] ==================== [PASSED] fair_ggtt ====================
[23:10:53] ================== [PASSED] pf_gt_config ===================
[23:10:53] ===================== lmtt (1 subtest) =====================
[23:10:53] ======================== test_ops =========================
[23:10:53] [PASSED] 2-level
[23:10:53] [PASSED] multi-level
[23:10:53] ==================== [PASSED] test_ops =====================
[23:10:53] ====================== [PASSED] lmtt =======================
[23:10:53] ================= pf_service (11 subtests) =================
[23:10:53] [PASSED] pf_negotiate_any
[23:10:53] [PASSED] pf_negotiate_base_match
[23:10:53] [PASSED] pf_negotiate_base_newer
[23:10:53] [PASSED] pf_negotiate_base_next
[23:10:53] [SKIPPED] pf_negotiate_base_older
[23:10:53] [PASSED] pf_negotiate_base_prev
[23:10:53] [PASSED] pf_negotiate_latest_match
[23:10:53] [PASSED] pf_negotiate_latest_newer
[23:10:53] [PASSED] pf_negotiate_latest_next
[23:10:53] [SKIPPED] pf_negotiate_latest_older
[23:10:53] [SKIPPED] pf_negotiate_latest_prev
[23:10:53] =================== [PASSED] pf_service ====================
[23:10:53] ================= xe_guc_g2g (2 subtests) ==================
[23:10:53] ============== xe_live_guc_g2g_kunit_default ==============
[23:10:53] ========= [SKIPPED] xe_live_guc_g2g_kunit_default ==========
[23:10:53] ============== xe_live_guc_g2g_kunit_allmem ===============
[23:10:53] ========== [SKIPPED] xe_live_guc_g2g_kunit_allmem ==========
[23:10:53] =================== [SKIPPED] xe_guc_g2g ===================
[23:10:53] =================== xe_mocs (2 subtests) ===================
[23:10:53] ================ xe_live_mocs_kernel_kunit ================
[23:10:53] =========== [SKIPPED] xe_live_mocs_kernel_kunit ============
[23:10:53] ================ xe_live_mocs_reset_kunit =================
[23:10:53] ============ [SKIPPED] xe_live_mocs_reset_kunit ============
[23:10:53] ==================== [SKIPPED] xe_mocs =====================
[23:10:53] ================= xe_migrate (2 subtests) ==================
[23:10:53] ================= xe_migrate_sanity_kunit =================
[23:10:53] ============ [SKIPPED] xe_migrate_sanity_kunit =============
[23:10:53] ================== xe_validate_ccs_kunit ==================
[23:10:53] ============= [SKIPPED] xe_validate_ccs_kunit ==============
[23:10:53] =================== [SKIPPED] xe_migrate ===================
[23:10:53] ================== xe_dma_buf (1 subtest) ==================
[23:10:53] ==================== xe_dma_buf_kunit =====================
[23:10:53] ================ [SKIPPED] xe_dma_buf_kunit ================
[23:10:53] =================== [SKIPPED] xe_dma_buf ===================
[23:10:53] ================= xe_bo_shrink (1 subtest) =================
[23:10:53] =================== xe_bo_shrink_kunit ====================
[23:10:53] =============== [SKIPPED] xe_bo_shrink_kunit ===============
[23:10:53] ================== [SKIPPED] xe_bo_shrink ==================
[23:10:53] ==================== xe_bo (2 subtests) ====================
[23:10:53] ================== xe_ccs_migrate_kunit ===================
[23:10:53] ============== [SKIPPED] xe_ccs_migrate_kunit ==============
[23:10:53] ==================== xe_bo_evict_kunit ====================
[23:10:53] =============== [SKIPPED] xe_bo_evict_kunit ================
[23:10:53] ===================== [SKIPPED] xe_bo ======================
[23:10:53] ==================== args (11 subtests) ====================
[23:10:53] [PASSED] count_args_test
[23:10:53] [PASSED] call_args_example
[23:10:53] [PASSED] call_args_test
[23:10:53] [PASSED] drop_first_arg_example
[23:10:53] [PASSED] drop_first_arg_test
[23:10:53] [PASSED] first_arg_example
[23:10:53] [PASSED] first_arg_test
[23:10:53] [PASSED] last_arg_example
[23:10:53] [PASSED] last_arg_test
[23:10:53] [PASSED] pick_arg_example
[23:10:53] [PASSED] sep_comma_example
[23:10:53] ====================== [PASSED] args =======================
[23:10:53] =================== xe_pci (3 subtests) ====================
[23:10:53] ==================== check_graphics_ip ====================
[23:10:53] [PASSED] 12.00 Xe_LP
[23:10:53] [PASSED] 12.10 Xe_LP+
[23:10:53] [PASSED] 12.55 Xe_HPG
[23:10:53] [PASSED] 12.60 Xe_HPC
[23:10:53] [PASSED] 12.70 Xe_LPG
[23:10:53] [PASSED] 12.71 Xe_LPG
[23:10:53] [PASSED] 12.74 Xe_LPG+
[23:10:53] [PASSED] 20.01 Xe2_HPG
[23:10:53] [PASSED] 20.02 Xe2_HPG
[23:10:53] [PASSED] 20.04 Xe2_LPG
[23:10:53] [PASSED] 30.00 Xe3_LPG
[23:10:53] [PASSED] 30.01 Xe3_LPG
[23:10:53] [PASSED] 30.03 Xe3_LPG
[23:10:53] [PASSED] 30.04 Xe3_LPG
[23:10:53] [PASSED] 30.05 Xe3_LPG
[23:10:53] [PASSED] 35.11 Xe3p_XPC
[23:10:53] ================ [PASSED] check_graphics_ip ================
[23:10:53] ===================== check_media_ip ======================
[23:10:53] [PASSED] 12.00 Xe_M
[23:10:53] [PASSED] 12.55 Xe_HPM
[23:10:53] [PASSED] 13.00 Xe_LPM+
[23:10:53] [PASSED] 13.01 Xe2_HPM
[23:10:53] [PASSED] 20.00 Xe2_LPM
[23:10:53] [PASSED] 30.00 Xe3_LPM
[23:10:53] [PASSED] 30.02 Xe3_LPM
[23:10:53] [PASSED] 35.00 Xe3p_LPM
[23:10:53] [PASSED] 35.03 Xe3p_HPM
[23:10:53] ================= [PASSED] check_media_ip ==================
[23:10:53] =================== check_platform_desc ===================
[23:10:53] [PASSED] 0x9A60 (TIGERLAKE)
[23:10:53] [PASSED] 0x9A68 (TIGERLAKE)
[23:10:53] [PASSED] 0x9A70 (TIGERLAKE)
[23:10:53] [PASSED] 0x9A40 (TIGERLAKE)
[23:10:53] [PASSED] 0x9A49 (TIGERLAKE)
[23:10:53] [PASSED] 0x9A59 (TIGERLAKE)
[23:10:53] [PASSED] 0x9A78 (TIGERLAKE)
[23:10:53] [PASSED] 0x9AC0 (TIGERLAKE)
[23:10:53] [PASSED] 0x9AC9 (TIGERLAKE)
[23:10:53] [PASSED] 0x9AD9 (TIGERLAKE)
[23:10:53] [PASSED] 0x9AF8 (TIGERLAKE)
[23:10:53] [PASSED] 0x4C80 (ROCKETLAKE)
[23:10:53] [PASSED] 0x4C8A (ROCKETLAKE)
[23:10:53] [PASSED] 0x4C8B (ROCKETLAKE)
[23:10:53] [PASSED] 0x4C8C (ROCKETLAKE)
[23:10:53] [PASSED] 0x4C90 (ROCKETLAKE)
[23:10:53] [PASSED] 0x4C9A (ROCKETLAKE)
[23:10:53] [PASSED] 0x4680 (ALDERLAKE_S)
[23:10:53] [PASSED] 0x4682 (ALDERLAKE_S)
[23:10:53] [PASSED] 0x4688 (ALDERLAKE_S)
[23:10:53] [PASSED] 0x468A (ALDERLAKE_S)
[23:10:53] [PASSED] 0x468B (ALDERLAKE_S)
[23:10:53] [PASSED] 0x4690 (ALDERLAKE_S)
[23:10:53] [PASSED] 0x4692 (ALDERLAKE_S)
[23:10:53] [PASSED] 0x4693 (ALDERLAKE_S)
[23:10:53] [PASSED] 0x46A0 (ALDERLAKE_P)
[23:10:53] [PASSED] 0x46A1 (ALDERLAKE_P)
[23:10:53] [PASSED] 0x46A2 (ALDERLAKE_P)
[23:10:53] [PASSED] 0x46A3 (ALDERLAKE_P)
[23:10:53] [PASSED] 0x46A6 (ALDERLAKE_P)
[23:10:53] [PASSED] 0x46A8 (ALDERLAKE_P)
[23:10:53] [PASSED] 0x46AA (ALDERLAKE_P)
[23:10:53] [PASSED] 0x462A (ALDERLAKE_P)
[23:10:53] [PASSED] 0x4626 (ALDERLAKE_P)
[23:10:53] [PASSED] 0x4628 (ALDERLAKE_P)
[23:10:53] [PASSED] 0x46B0 (ALDERLAKE_P)
[23:10:53] [PASSED] 0x46B1 (ALDERLAKE_P)
[23:10:53] [PASSED] 0x46B2 (ALDERLAKE_P)
[23:10:53] [PASSED] 0x46B3 (ALDERLAKE_P)
[23:10:53] [PASSED] 0x46C0 (ALDERLAKE_P)
[23:10:53] [PASSED] 0x46C1 (ALDERLAKE_P)
[23:10:53] [PASSED] 0x46C2 (ALDERLAKE_P)
[23:10:53] [PASSED] 0x46C3 (ALDERLAKE_P)
[23:10:53] [PASSED] 0x46D0 (ALDERLAKE_N)
[23:10:53] [PASSED] 0x46D1 (ALDERLAKE_N)
[23:10:53] [PASSED] 0x46D2 (ALDERLAKE_N)
[23:10:53] [PASSED] 0x46D3 (ALDERLAKE_N)
[23:10:53] [PASSED] 0x46D4 (ALDERLAKE_N)
[23:10:53] [PASSED] 0xA721 (ALDERLAKE_P)
[23:10:53] [PASSED] 0xA7A1 (ALDERLAKE_P)
[23:10:53] [PASSED] 0xA7A9 (ALDERLAKE_P)
[23:10:53] [PASSED] 0xA7AC (ALDERLAKE_P)
[23:10:53] [PASSED] 0xA7AD (ALDERLAKE_P)
[23:10:53] [PASSED] 0xA720 (ALDERLAKE_P)
[23:10:53] [PASSED] 0xA7A0 (ALDERLAKE_P)
[23:10:53] [PASSED] 0xA7A8 (ALDERLAKE_P)
[23:10:53] [PASSED] 0xA7AA (ALDERLAKE_P)
[23:10:53] [PASSED] 0xA7AB (ALDERLAKE_P)
[23:10:53] [PASSED] 0xA780 (ALDERLAKE_S)
[23:10:53] [PASSED] 0xA781 (ALDERLAKE_S)
[23:10:53] [PASSED] 0xA782 (ALDERLAKE_S)
[23:10:53] [PASSED] 0xA783 (ALDERLAKE_S)
[23:10:53] [PASSED] 0xA788 (ALDERLAKE_S)
[23:10:53] [PASSED] 0xA789 (ALDERLAKE_S)
[23:10:53] [PASSED] 0xA78A (ALDERLAKE_S)
[23:10:53] [PASSED] 0xA78B (ALDERLAKE_S)
[23:10:53] [PASSED] 0x4905 (DG1)
[23:10:53] [PASSED] 0x4906 (DG1)
[23:10:53] [PASSED] 0x4907 (DG1)
[23:10:53] [PASSED] 0x4908 (DG1)
[23:10:53] [PASSED] 0x4909 (DG1)
[23:10:53] [PASSED] 0x56C0 (DG2)
[23:10:53] [PASSED] 0x56C2 (DG2)
[23:10:53] [PASSED] 0x56C1 (DG2)
[23:10:53] [PASSED] 0x7D51 (METEORLAKE)
[23:10:53] [PASSED] 0x7DD1 (METEORLAKE)
[23:10:53] [PASSED] 0x7D41 (METEORLAKE)
[23:10:53] [PASSED] 0x7D67 (METEORLAKE)
[23:10:53] [PASSED] 0xB640 (METEORLAKE)
[23:10:53] [PASSED] 0x56A0 (DG2)
[23:10:53] [PASSED] 0x56A1 (DG2)
[23:10:53] [PASSED] 0x56A2 (DG2)
[23:10:53] [PASSED] 0x56BE (DG2)
[23:10:53] [PASSED] 0x56BF (DG2)
[23:10:53] [PASSED] 0x5690 (DG2)
[23:10:53] [PASSED] 0x5691 (DG2)
[23:10:53] [PASSED] 0x5692 (DG2)
[23:10:53] [PASSED] 0x56A5 (DG2)
[23:10:53] [PASSED] 0x56A6 (DG2)
[23:10:53] [PASSED] 0x56B0 (DG2)
[23:10:53] [PASSED] 0x56B1 (DG2)
[23:10:53] [PASSED] 0x56BA (DG2)
[23:10:53] [PASSED] 0x56BB (DG2)
[23:10:53] [PASSED] 0x56BC (DG2)
[23:10:53] [PASSED] 0x56BD (DG2)
[23:10:53] [PASSED] 0x5693 (DG2)
[23:10:53] [PASSED] 0x5694 (DG2)
[23:10:53] [PASSED] 0x5695 (DG2)
[23:10:53] [PASSED] 0x56A3 (DG2)
[23:10:53] [PASSED] 0x56A4 (DG2)
[23:10:53] [PASSED] 0x56B2 (DG2)
[23:10:53] [PASSED] 0x56B3 (DG2)
[23:10:53] [PASSED] 0x5696 (DG2)
[23:10:53] [PASSED] 0x5697 (DG2)
[23:10:53] [PASSED] 0xB69 (PVC)
[23:10:53] [PASSED] 0xB6E (PVC)
[23:10:53] [PASSED] 0xBD4 (PVC)
[23:10:53] [PASSED] 0xBD5 (PVC)
[23:10:53] [PASSED] 0xBD6 (PVC)
[23:10:53] [PASSED] 0xBD7 (PVC)
[23:10:53] [PASSED] 0xBD8 (PVC)
[23:10:53] [PASSED] 0xBD9 (PVC)
[23:10:53] [PASSED] 0xBDA (PVC)
[23:10:53] [PASSED] 0xBDB (PVC)
[23:10:53] [PASSED] 0xBE0 (PVC)
[23:10:53] [PASSED] 0xBE1 (PVC)
[23:10:53] [PASSED] 0xBE5 (PVC)
[23:10:53] [PASSED] 0x7D40 (METEORLAKE)
[23:10:53] [PASSED] 0x7D45 (METEORLAKE)
[23:10:53] [PASSED] 0x7D55 (METEORLAKE)
[23:10:53] [PASSED] 0x7D60 (METEORLAKE)
[23:10:53] [PASSED] 0x7DD5 (METEORLAKE)
[23:10:53] [PASSED] 0x6420 (LUNARLAKE)
[23:10:53] [PASSED] 0x64A0 (LUNARLAKE)
[23:10:53] [PASSED] 0x64B0 (LUNARLAKE)
[23:10:53] [PASSED] 0xE202 (BATTLEMAGE)
[23:10:53] [PASSED] 0xE209 (BATTLEMAGE)
[23:10:53] [PASSED] 0xE20B (BATTLEMAGE)
[23:10:53] [PASSED] 0xE20C (BATTLEMAGE)
[23:10:53] [PASSED] 0xE20D (BATTLEMAGE)
[23:10:53] [PASSED] 0xE210 (BATTLEMAGE)
[23:10:53] [PASSED] 0xE211 (BATTLEMAGE)
[23:10:53] [PASSED] 0xE212 (BATTLEMAGE)
[23:10:53] [PASSED] 0xE216 (BATTLEMAGE)
[23:10:53] [PASSED] 0xE220 (BATTLEMAGE)
[23:10:53] [PASSED] 0xE221 (BATTLEMAGE)
[23:10:53] [PASSED] 0xE222 (BATTLEMAGE)
[23:10:53] [PASSED] 0xE223 (BATTLEMAGE)
[23:10:53] [PASSED] 0xB080 (PANTHERLAKE)
[23:10:53] [PASSED] 0xB081 (PANTHERLAKE)
[23:10:53] [PASSED] 0xB082 (PANTHERLAKE)
[23:10:53] [PASSED] 0xB083 (PANTHERLAKE)
[23:10:53] [PASSED] 0xB084 (PANTHERLAKE)
[23:10:53] [PASSED] 0xB085 (PANTHERLAKE)
[23:10:53] [PASSED] 0xB086 (PANTHERLAKE)
[23:10:53] [PASSED] 0xB087 (PANTHERLAKE)
[23:10:53] [PASSED] 0xB08F (PANTHERLAKE)
[23:10:53] [PASSED] 0xB090 (PANTHERLAKE)
[23:10:53] [PASSED] 0xB0A0 (PANTHERLAKE)
[23:10:53] [PASSED] 0xB0B0 (PANTHERLAKE)
[23:10:53] [PASSED] 0xD740 (NOVALAKE_S)
[23:10:53] [PASSED] 0xD741 (NOVALAKE_S)
[23:10:53] [PASSED] 0xD742 (NOVALAKE_S)
[23:10:53] [PASSED] 0xD743 (NOVALAKE_S)
[23:10:53] [PASSED] 0xD744 (NOVALAKE_S)
[23:10:53] [PASSED] 0xD745 (NOVALAKE_S)
[23:10:53] [PASSED] 0x674C (CRESCENTISLAND)
[23:10:53] [PASSED] 0xFD80 (PANTHERLAKE)
[23:10:53] [PASSED] 0xFD81 (PANTHERLAKE)
[23:10:53] =============== [PASSED] check_platform_desc ===============
[23:10:53] ===================== [PASSED] xe_pci ======================
[23:10:53] =================== xe_rtp (2 subtests) ====================
[23:10:53] =============== xe_rtp_process_to_sr_tests ================
[23:10:53] [PASSED] coalesce-same-reg
[23:10:53] [PASSED] no-match-no-add
[23:10:53] [PASSED] match-or
[23:10:53] [PASSED] match-or-xfail
[23:10:53] [PASSED] no-match-no-add-multiple-rules
[23:10:53] [PASSED] two-regs-two-entries
[23:10:53] [PASSED] clr-one-set-other
[23:10:53] [PASSED] set-field
[23:10:53] [PASSED] conflict-duplicate
[23:10:53] [PASSED] conflict-not-disjoint
[23:10:53] [PASSED] conflict-reg-type
[23:10:53] =========== [PASSED] xe_rtp_process_to_sr_tests ============
[23:10:53] ================== xe_rtp_process_tests ===================
[23:10:53] [PASSED] active1
[23:10:53] [PASSED] active2
[23:10:53] [PASSED] active-inactive
[23:10:53] [PASSED] inactive-active
[23:10:53] [PASSED] inactive-1st_or_active-inactive
[23:10:53] [PASSED] inactive-2nd_or_active-inactive
[23:10:53] [PASSED] inactive-last_or_active-inactive
[23:10:53] [PASSED] inactive-no_or_active-inactive
[23:10:53] ============== [PASSED] xe_rtp_process_tests ===============
[23:10:53] ===================== [PASSED] xe_rtp ======================
[23:10:53] ==================== xe_wa (1 subtest) =====================
[23:10:53] ======================== xe_wa_gt =========================
[23:10:53] [PASSED] TIGERLAKE B0
[23:10:53] [PASSED] DG1 A0
[23:10:53] [PASSED] DG1 B0
[23:10:53] [PASSED] ALDERLAKE_S A0
[23:10:53] [PASSED] ALDERLAKE_S B0
[23:10:53] [PASSED] ALDERLAKE_S C0
[23:10:53] [PASSED] ALDERLAKE_S D0
[23:10:53] [PASSED] ALDERLAKE_P A0
[23:10:53] [PASSED] ALDERLAKE_P B0
[23:10:53] [PASSED] ALDERLAKE_P C0
[23:10:53] [PASSED] ALDERLAKE_S RPLS D0
[23:10:53] [PASSED] ALDERLAKE_P RPLU E0
[23:10:53] [PASSED] DG2 G10 C0
[23:10:53] [PASSED] DG2 G11 B1
[23:10:53] [PASSED] DG2 G12 A1
[23:10:53] [PASSED] METEORLAKE 12.70(Xe_LPG) A0 13.00(Xe_LPM+) A0
[23:10:53] [PASSED] METEORLAKE 12.71(Xe_LPG) A0 13.00(Xe_LPM+) A0
[23:10:53] [PASSED] METEORLAKE 12.74(Xe_LPG+) A0 13.00(Xe_LPM+) A0
[23:10:53] [PASSED] LUNARLAKE 20.04(Xe2_LPG) A0 20.00(Xe2_LPM) A0
[23:10:53] [PASSED] LUNARLAKE 20.04(Xe2_LPG) B0 20.00(Xe2_LPM) A0
[23:10:53] [PASSED] BATTLEMAGE 20.01(Xe2_HPG) A0 13.01(Xe2_HPM) A1
[23:10:53] [PASSED] PANTHERLAKE 30.00(Xe3_LPG) A0 30.00(Xe3_LPM) A0
[23:10:53] ==================== [PASSED] xe_wa_gt =====================
[23:10:53] ====================== [PASSED] xe_wa ======================
[23:10:53] ============================================================
[23:10:53] Testing complete. Ran 510 tests: passed: 492, skipped: 18
[23:10:53] Elapsed time: 35.163s total, 4.209s configuring, 30.436s building, 0.467s running
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/tests/.kunitconfig
[23:10:53] Configuring KUnit Kernel ...
Regenerating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[23:10:55] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=48
[23:11:19] Starting KUnit Kernel (1/1)...
[23:11:19] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[23:11:19] ============ drm_test_pick_cmdline (2 subtests) ============
[23:11:19] [PASSED] drm_test_pick_cmdline_res_1920_1080_60
[23:11:19] =============== drm_test_pick_cmdline_named ===============
[23:11:19] [PASSED] NTSC
[23:11:19] [PASSED] NTSC-J
[23:11:19] [PASSED] PAL
[23:11:19] [PASSED] PAL-M
[23:11:19] =========== [PASSED] drm_test_pick_cmdline_named ===========
[23:11:19] ============== [PASSED] drm_test_pick_cmdline ==============
[23:11:19] == drm_test_atomic_get_connector_for_encoder (1 subtest) ===
[23:11:19] [PASSED] drm_test_drm_atomic_get_connector_for_encoder
[23:11:19] ==== [PASSED] drm_test_atomic_get_connector_for_encoder ====
[23:11:19] =========== drm_validate_clone_mode (2 subtests) ===========
[23:11:19] ============== drm_test_check_in_clone_mode ===============
[23:11:19] [PASSED] in_clone_mode
[23:11:19] [PASSED] not_in_clone_mode
[23:11:19] ========== [PASSED] drm_test_check_in_clone_mode ===========
[23:11:19] =============== drm_test_check_valid_clones ===============
[23:11:19] [PASSED] not_in_clone_mode
[23:11:19] [PASSED] valid_clone
[23:11:19] [PASSED] invalid_clone
[23:11:19] =========== [PASSED] drm_test_check_valid_clones ===========
[23:11:19] ============= [PASSED] drm_validate_clone_mode =============
[23:11:19] ============= drm_validate_modeset (1 subtest) =============
[23:11:19] [PASSED] drm_test_check_connector_changed_modeset
[23:11:19] ============== [PASSED] drm_validate_modeset ===============
[23:11:19] ====== drm_test_bridge_get_current_state (2 subtests) ======
[23:11:19] [PASSED] drm_test_drm_bridge_get_current_state_atomic
[23:11:19] [PASSED] drm_test_drm_bridge_get_current_state_legacy
[23:11:19] ======== [PASSED] drm_test_bridge_get_current_state ========
[23:11:19] ====== drm_test_bridge_helper_reset_crtc (3 subtests) ======
[23:11:19] [PASSED] drm_test_drm_bridge_helper_reset_crtc_atomic
[23:11:19] [PASSED] drm_test_drm_bridge_helper_reset_crtc_atomic_disabled
[23:11:19] [PASSED] drm_test_drm_bridge_helper_reset_crtc_legacy
[23:11:19] ======== [PASSED] drm_test_bridge_helper_reset_crtc ========
[23:11:19] ============== drm_bridge_alloc (2 subtests) ===============
[23:11:19] [PASSED] drm_test_drm_bridge_alloc_basic
[23:11:19] [PASSED] drm_test_drm_bridge_alloc_get_put
[23:11:19] ================ [PASSED] drm_bridge_alloc =================
[23:11:19] ================== drm_buddy (8 subtests) ==================
[23:11:19] [PASSED] drm_test_buddy_alloc_limit
[23:11:19] [PASSED] drm_test_buddy_alloc_optimistic
[23:11:19] [PASSED] drm_test_buddy_alloc_pessimistic
[23:11:19] [PASSED] drm_test_buddy_alloc_pathological
[23:11:19] [PASSED] drm_test_buddy_alloc_contiguous
[23:11:19] [PASSED] drm_test_buddy_alloc_clear
[23:11:20] [PASSED] drm_test_buddy_alloc_range_bias
[23:11:20] [PASSED] drm_test_buddy_fragmentation_performance
[23:11:20] ==================== [PASSED] drm_buddy ====================
[23:11:20] ============= drm_cmdline_parser (40 subtests) =============
[23:11:20] [PASSED] drm_test_cmdline_force_d_only
[23:11:20] [PASSED] drm_test_cmdline_force_D_only_dvi
[23:11:20] [PASSED] drm_test_cmdline_force_D_only_hdmi
[23:11:20] [PASSED] drm_test_cmdline_force_D_only_not_digital
[23:11:20] [PASSED] drm_test_cmdline_force_e_only
[23:11:20] [PASSED] drm_test_cmdline_res
[23:11:20] [PASSED] drm_test_cmdline_res_vesa
[23:11:20] [PASSED] drm_test_cmdline_res_vesa_rblank
[23:11:20] [PASSED] drm_test_cmdline_res_rblank
[23:11:20] [PASSED] drm_test_cmdline_res_bpp
[23:11:20] [PASSED] drm_test_cmdline_res_refresh
[23:11:20] [PASSED] drm_test_cmdline_res_bpp_refresh
[23:11:20] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced
[23:11:20] [PASSED] drm_test_cmdline_res_bpp_refresh_margins
[23:11:20] [PASSED] drm_test_cmdline_res_bpp_refresh_force_off
[23:11:20] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on
[23:11:20] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_analog
[23:11:20] [PASSED] drm_test_cmdline_res_bpp_refresh_force_on_digital
[23:11:20] [PASSED] drm_test_cmdline_res_bpp_refresh_interlaced_margins_force_on
[23:11:20] [PASSED] drm_test_cmdline_res_margins_force_on
[23:11:20] [PASSED] drm_test_cmdline_res_vesa_margins
[23:11:20] [PASSED] drm_test_cmdline_name
[23:11:20] [PASSED] drm_test_cmdline_name_bpp
[23:11:20] [PASSED] drm_test_cmdline_name_option
[23:11:20] [PASSED] drm_test_cmdline_name_bpp_option
[23:11:20] [PASSED] drm_test_cmdline_rotate_0
[23:11:20] [PASSED] drm_test_cmdline_rotate_90
[23:11:20] [PASSED] drm_test_cmdline_rotate_180
[23:11:20] [PASSED] drm_test_cmdline_rotate_270
[23:11:20] [PASSED] drm_test_cmdline_hmirror
[23:11:20] [PASSED] drm_test_cmdline_vmirror
[23:11:20] [PASSED] drm_test_cmdline_margin_options
[23:11:20] [PASSED] drm_test_cmdline_multiple_options
[23:11:20] [PASSED] drm_test_cmdline_bpp_extra_and_option
[23:11:20] [PASSED] drm_test_cmdline_extra_and_option
[23:11:20] [PASSED] drm_test_cmdline_freestanding_options
[23:11:20] [PASSED] drm_test_cmdline_freestanding_force_e_and_options
[23:11:20] [PASSED] drm_test_cmdline_panel_orientation
[23:11:20] ================ drm_test_cmdline_invalid =================
[23:11:20] [PASSED] margin_only
[23:11:20] [PASSED] interlace_only
[23:11:20] [PASSED] res_missing_x
[23:11:20] [PASSED] res_missing_y
[23:11:20] [PASSED] res_bad_y
[23:11:20] [PASSED] res_missing_y_bpp
[23:11:20] [PASSED] res_bad_bpp
[23:11:20] [PASSED] res_bad_refresh
[23:11:20] [PASSED] res_bpp_refresh_force_on_off
[23:11:20] [PASSED] res_invalid_mode
[23:11:20] [PASSED] res_bpp_wrong_place_mode
[23:11:20] [PASSED] name_bpp_refresh
[23:11:20] [PASSED] name_refresh
[23:11:20] [PASSED] name_refresh_wrong_mode
[23:11:20] [PASSED] name_refresh_invalid_mode
[23:11:20] [PASSED] rotate_multiple
[23:11:20] [PASSED] rotate_invalid_val
[23:11:20] [PASSED] rotate_truncated
[23:11:20] [PASSED] invalid_option
[23:11:20] [PASSED] invalid_tv_option
[23:11:20] [PASSED] truncated_tv_option
[23:11:20] ============ [PASSED] drm_test_cmdline_invalid =============
[23:11:20] =============== drm_test_cmdline_tv_options ===============
[23:11:20] [PASSED] NTSC
[23:11:20] [PASSED] NTSC_443
[23:11:20] [PASSED] NTSC_J
[23:11:20] [PASSED] PAL
[23:11:20] [PASSED] PAL_M
[23:11:20] [PASSED] PAL_N
[23:11:20] [PASSED] SECAM
[23:11:20] [PASSED] MONO_525
[23:11:20] [PASSED] MONO_625
[23:11:20] =========== [PASSED] drm_test_cmdline_tv_options ===========
[23:11:20] =============== [PASSED] drm_cmdline_parser ================
[23:11:20] ========== drmm_connector_hdmi_init (20 subtests) ==========
[23:11:20] [PASSED] drm_test_connector_hdmi_init_valid
[23:11:20] [PASSED] drm_test_connector_hdmi_init_bpc_8
[23:11:20] [PASSED] drm_test_connector_hdmi_init_bpc_10
[23:11:20] [PASSED] drm_test_connector_hdmi_init_bpc_12
[23:11:20] [PASSED] drm_test_connector_hdmi_init_bpc_invalid
[23:11:20] [PASSED] drm_test_connector_hdmi_init_bpc_null
[23:11:20] [PASSED] drm_test_connector_hdmi_init_formats_empty
[23:11:20] [PASSED] drm_test_connector_hdmi_init_formats_no_rgb
[23:11:20] === drm_test_connector_hdmi_init_formats_yuv420_allowed ===
[23:11:20] [PASSED] supported_formats=0x9 yuv420_allowed=1
[23:11:20] [PASSED] supported_formats=0x9 yuv420_allowed=0
[23:11:20] [PASSED] supported_formats=0x3 yuv420_allowed=1
[23:11:20] [PASSED] supported_formats=0x3 yuv420_allowed=0
[23:11:20] === [PASSED] drm_test_connector_hdmi_init_formats_yuv420_allowed ===
[23:11:20] [PASSED] drm_test_connector_hdmi_init_null_ddc
[23:11:20] [PASSED] drm_test_connector_hdmi_init_null_product
[23:11:20] [PASSED] drm_test_connector_hdmi_init_null_vendor
[23:11:20] [PASSED] drm_test_connector_hdmi_init_product_length_exact
[23:11:20] [PASSED] drm_test_connector_hdmi_init_product_length_too_long
[23:11:20] [PASSED] drm_test_connector_hdmi_init_product_valid
[23:11:20] [PASSED] drm_test_connector_hdmi_init_vendor_length_exact
[23:11:20] [PASSED] drm_test_connector_hdmi_init_vendor_length_too_long
[23:11:20] [PASSED] drm_test_connector_hdmi_init_vendor_valid
[23:11:20] ========= drm_test_connector_hdmi_init_type_valid =========
[23:11:20] [PASSED] HDMI-A
[23:11:20] [PASSED] HDMI-B
[23:11:20] ===== [PASSED] drm_test_connector_hdmi_init_type_valid =====
[23:11:20] ======== drm_test_connector_hdmi_init_type_invalid ========
[23:11:20] [PASSED] Unknown
[23:11:20] [PASSED] VGA
[23:11:20] [PASSED] DVI-I
[23:11:20] [PASSED] DVI-D
[23:11:20] [PASSED] DVI-A
[23:11:20] [PASSED] Composite
[23:11:20] [PASSED] SVIDEO
[23:11:20] [PASSED] LVDS
[23:11:20] [PASSED] Component
[23:11:20] [PASSED] DIN
[23:11:20] [PASSED] DP
[23:11:20] [PASSED] TV
[23:11:20] [PASSED] eDP
[23:11:20] [PASSED] Virtual
[23:11:20] [PASSED] DSI
[23:11:20] [PASSED] DPI
[23:11:20] [PASSED] Writeback
[23:11:20] [PASSED] SPI
[23:11:20] [PASSED] USB
[23:11:20] ==== [PASSED] drm_test_connector_hdmi_init_type_invalid ====
[23:11:20] ============ [PASSED] drmm_connector_hdmi_init =============
[23:11:20] ============= drmm_connector_init (3 subtests) =============
[23:11:20] [PASSED] drm_test_drmm_connector_init
[23:11:20] [PASSED] drm_test_drmm_connector_init_null_ddc
[23:11:20] ========= drm_test_drmm_connector_init_type_valid =========
[23:11:20] [PASSED] Unknown
[23:11:20] [PASSED] VGA
[23:11:20] [PASSED] DVI-I
[23:11:20] [PASSED] DVI-D
[23:11:20] [PASSED] DVI-A
[23:11:20] [PASSED] Composite
[23:11:20] [PASSED] SVIDEO
[23:11:20] [PASSED] LVDS
[23:11:20] [PASSED] Component
[23:11:20] [PASSED] DIN
[23:11:20] [PASSED] DP
[23:11:20] [PASSED] HDMI-A
[23:11:20] [PASSED] HDMI-B
[23:11:20] [PASSED] TV
[23:11:20] [PASSED] eDP
[23:11:20] [PASSED] Virtual
[23:11:20] [PASSED] DSI
[23:11:20] [PASSED] DPI
[23:11:20] [PASSED] Writeback
[23:11:20] [PASSED] SPI
[23:11:20] [PASSED] USB
[23:11:20] ===== [PASSED] drm_test_drmm_connector_init_type_valid =====
[23:11:20] =============== [PASSED] drmm_connector_init ===============
[23:11:20] ========= drm_connector_dynamic_init (6 subtests) ==========
[23:11:20] [PASSED] drm_test_drm_connector_dynamic_init
[23:11:20] [PASSED] drm_test_drm_connector_dynamic_init_null_ddc
[23:11:20] [PASSED] drm_test_drm_connector_dynamic_init_not_added
[23:11:20] [PASSED] drm_test_drm_connector_dynamic_init_properties
[23:11:20] ===== drm_test_drm_connector_dynamic_init_type_valid ======
[23:11:20] [PASSED] Unknown
[23:11:20] [PASSED] VGA
[23:11:20] [PASSED] DVI-I
[23:11:20] [PASSED] DVI-D
[23:11:20] [PASSED] DVI-A
[23:11:20] [PASSED] Composite
[23:11:20] [PASSED] SVIDEO
[23:11:20] [PASSED] LVDS
[23:11:20] [PASSED] Component
[23:11:20] [PASSED] DIN
[23:11:20] [PASSED] DP
[23:11:20] [PASSED] HDMI-A
[23:11:20] [PASSED] HDMI-B
[23:11:20] [PASSED] TV
[23:11:20] [PASSED] eDP
[23:11:20] [PASSED] Virtual
[23:11:20] [PASSED] DSI
[23:11:20] [PASSED] DPI
[23:11:20] [PASSED] Writeback
[23:11:20] [PASSED] SPI
[23:11:20] [PASSED] USB
[23:11:20] = [PASSED] drm_test_drm_connector_dynamic_init_type_valid ==
[23:11:20] ======== drm_test_drm_connector_dynamic_init_name =========
[23:11:20] [PASSED] Unknown
[23:11:20] [PASSED] VGA
[23:11:20] [PASSED] DVI-I
[23:11:20] [PASSED] DVI-D
[23:11:20] [PASSED] DVI-A
[23:11:20] [PASSED] Composite
[23:11:20] [PASSED] SVIDEO
[23:11:20] [PASSED] LVDS
[23:11:20] [PASSED] Component
[23:11:20] [PASSED] DIN
[23:11:20] [PASSED] DP
[23:11:20] [PASSED] HDMI-A
[23:11:20] [PASSED] HDMI-B
[23:11:20] [PASSED] TV
[23:11:20] [PASSED] eDP
[23:11:20] [PASSED] Virtual
[23:11:20] [PASSED] DSI
[23:11:20] [PASSED] DPI
[23:11:20] [PASSED] Writeback
[23:11:20] [PASSED] SPI
[23:11:20] [PASSED] USB
[23:11:20] ==== [PASSED] drm_test_drm_connector_dynamic_init_name =====
[23:11:20] =========== [PASSED] drm_connector_dynamic_init ============
[23:11:20] ==== drm_connector_dynamic_register_early (4 subtests) =====
[23:11:20] [PASSED] drm_test_drm_connector_dynamic_register_early_on_list
[23:11:20] [PASSED] drm_test_drm_connector_dynamic_register_early_defer
[23:11:20] [PASSED] drm_test_drm_connector_dynamic_register_early_no_init
[23:11:20] [PASSED] drm_test_drm_connector_dynamic_register_early_no_mode_object
[23:11:20] ====== [PASSED] drm_connector_dynamic_register_early =======
[23:11:20] ======= drm_connector_dynamic_register (7 subtests) ========
[23:11:20] [PASSED] drm_test_drm_connector_dynamic_register_on_list
[23:11:20] [PASSED] drm_test_drm_connector_dynamic_register_no_defer
[23:11:20] [PASSED] drm_test_drm_connector_dynamic_register_no_init
[23:11:20] [PASSED] drm_test_drm_connector_dynamic_register_mode_object
[23:11:20] [PASSED] drm_test_drm_connector_dynamic_register_sysfs
[23:11:20] [PASSED] drm_test_drm_connector_dynamic_register_sysfs_name
[23:11:20] [PASSED] drm_test_drm_connector_dynamic_register_debugfs
[23:11:20] ========= [PASSED] drm_connector_dynamic_register ==========
[23:11:20] = drm_connector_attach_broadcast_rgb_property (2 subtests) =
[23:11:20] [PASSED] drm_test_drm_connector_attach_broadcast_rgb_property
[23:11:20] [PASSED] drm_test_drm_connector_attach_broadcast_rgb_property_hdmi_connector
[23:11:20] === [PASSED] drm_connector_attach_broadcast_rgb_property ===
[23:11:20] ========== drm_get_tv_mode_from_name (2 subtests) ==========
[23:11:20] ========== drm_test_get_tv_mode_from_name_valid ===========
[23:11:20] [PASSED] NTSC
[23:11:20] [PASSED] NTSC-443
[23:11:20] [PASSED] NTSC-J
[23:11:20] [PASSED] PAL
[23:11:20] [PASSED] PAL-M
[23:11:20] [PASSED] PAL-N
[23:11:20] [PASSED] SECAM
[23:11:20] [PASSED] Mono
[23:11:20] ====== [PASSED] drm_test_get_tv_mode_from_name_valid =======
[23:11:20] [PASSED] drm_test_get_tv_mode_from_name_truncated
[23:11:20] ============ [PASSED] drm_get_tv_mode_from_name ============
[23:11:20] = drm_test_connector_hdmi_compute_mode_clock (12 subtests) =
[23:11:20] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb
[23:11:20] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc
[23:11:20] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_10bpc_vic_1
[23:11:20] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc
[23:11:20] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_12bpc_vic_1
[23:11:20] [PASSED] drm_test_drm_hdmi_compute_mode_clock_rgb_double
[23:11:20] = drm_test_connector_hdmi_compute_mode_clock_yuv420_valid =
[23:11:20] [PASSED] VIC 96
[23:11:20] [PASSED] VIC 97
[23:11:20] [PASSED] VIC 101
[23:11:20] [PASSED] VIC 102
[23:11:20] [PASSED] VIC 106
[23:11:20] [PASSED] VIC 107
[23:11:20] === [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_valid ===
[23:11:20] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_10_bpc
[23:11:20] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv420_12_bpc
[23:11:20] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_8_bpc
[23:11:20] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_10_bpc
[23:11:20] [PASSED] drm_test_connector_hdmi_compute_mode_clock_yuv422_12_bpc
[23:11:20] === [PASSED] drm_test_connector_hdmi_compute_mode_clock ====
[23:11:20] == drm_hdmi_connector_get_broadcast_rgb_name (2 subtests) ==
[23:11:20] === drm_test_drm_hdmi_connector_get_broadcast_rgb_name ====
[23:11:20] [PASSED] Automatic
[23:11:20] [PASSED] Full
[23:11:20] [PASSED] Limited 16:235
[23:11:20] === [PASSED] drm_test_drm_hdmi_connector_get_broadcast_rgb_name ===
[23:11:20] [PASSED] drm_test_drm_hdmi_connector_get_broadcast_rgb_name_invalid
[23:11:20] ==== [PASSED] drm_hdmi_connector_get_broadcast_rgb_name ====
[23:11:20] == drm_hdmi_connector_get_output_format_name (2 subtests) ==
[23:11:20] === drm_test_drm_hdmi_connector_get_output_format_name ====
[23:11:20] [PASSED] RGB
[23:11:20] [PASSED] YUV 4:2:0
[23:11:20] [PASSED] YUV 4:2:2
[23:11:20] [PASSED] YUV 4:4:4
[23:11:20] === [PASSED] drm_test_drm_hdmi_connector_get_output_format_name ===
[23:11:20] [PASSED] drm_test_drm_hdmi_connector_get_output_format_name_invalid
[23:11:20] ==== [PASSED] drm_hdmi_connector_get_output_format_name ====
[23:11:20] ============= drm_damage_helper (21 subtests) ==============
[23:11:20] [PASSED] drm_test_damage_iter_no_damage
[23:11:20] [PASSED] drm_test_damage_iter_no_damage_fractional_src
[23:11:20] [PASSED] drm_test_damage_iter_no_damage_src_moved
[23:11:20] [PASSED] drm_test_damage_iter_no_damage_fractional_src_moved
[23:11:20] [PASSED] drm_test_damage_iter_no_damage_not_visible
[23:11:20] [PASSED] drm_test_damage_iter_no_damage_no_crtc
[23:11:20] [PASSED] drm_test_damage_iter_no_damage_no_fb
[23:11:20] [PASSED] drm_test_damage_iter_simple_damage
[23:11:20] [PASSED] drm_test_damage_iter_single_damage
[23:11:20] [PASSED] drm_test_damage_iter_single_damage_intersect_src
[23:11:20] [PASSED] drm_test_damage_iter_single_damage_outside_src
[23:11:20] [PASSED] drm_test_damage_iter_single_damage_fractional_src
[23:11:20] [PASSED] drm_test_damage_iter_single_damage_intersect_fractional_src
[23:11:20] [PASSED] drm_test_damage_iter_single_damage_outside_fractional_src
[23:11:20] [PASSED] drm_test_damage_iter_single_damage_src_moved
[23:11:20] [PASSED] drm_test_damage_iter_single_damage_fractional_src_moved
[23:11:20] [PASSED] drm_test_damage_iter_damage
[23:11:20] [PASSED] drm_test_damage_iter_damage_one_intersect
[23:11:20] [PASSED] drm_test_damage_iter_damage_one_outside
[23:11:20] [PASSED] drm_test_damage_iter_damage_src_moved
[23:11:20] [PASSED] drm_test_damage_iter_damage_not_visible
[23:11:20] ================ [PASSED] drm_damage_helper ================
[23:11:20] ============== drm_dp_mst_helper (3 subtests) ==============
[23:11:20] ============== drm_test_dp_mst_calc_pbn_mode ==============
[23:11:20] [PASSED] Clock 154000 BPP 30 DSC disabled
[23:11:20] [PASSED] Clock 234000 BPP 30 DSC disabled
[23:11:20] [PASSED] Clock 297000 BPP 24 DSC disabled
[23:11:20] [PASSED] Clock 332880 BPP 24 DSC enabled
[23:11:20] [PASSED] Clock 324540 BPP 24 DSC enabled
[23:11:20] ========== [PASSED] drm_test_dp_mst_calc_pbn_mode ==========
[23:11:20] ============== drm_test_dp_mst_calc_pbn_div ===============
[23:11:20] [PASSED] Link rate 2000000 lane count 4
[23:11:20] [PASSED] Link rate 2000000 lane count 2
[23:11:20] [PASSED] Link rate 2000000 lane count 1
[23:11:20] [PASSED] Link rate 1350000 lane count 4
[23:11:20] [PASSED] Link rate 1350000 lane count 2
[23:11:20] [PASSED] Link rate 1350000 lane count 1
[23:11:20] [PASSED] Link rate 1000000 lane count 4
[23:11:20] [PASSED] Link rate 1000000 lane count 2
[23:11:20] [PASSED] Link rate 1000000 lane count 1
[23:11:20] [PASSED] Link rate 810000 lane count 4
[23:11:20] [PASSED] Link rate 810000 lane count 2
[23:11:20] [PASSED] Link rate 810000 lane count 1
[23:11:20] [PASSED] Link rate 540000 lane count 4
[23:11:20] [PASSED] Link rate 540000 lane count 2
[23:11:20] [PASSED] Link rate 540000 lane count 1
[23:11:20] [PASSED] Link rate 270000 lane count 4
[23:11:20] [PASSED] Link rate 270000 lane count 2
[23:11:20] [PASSED] Link rate 270000 lane count 1
[23:11:20] [PASSED] Link rate 162000 lane count 4
[23:11:20] [PASSED] Link rate 162000 lane count 2
[23:11:20] [PASSED] Link rate 162000 lane count 1
[23:11:20] ========== [PASSED] drm_test_dp_mst_calc_pbn_div ===========
[23:11:20] ========= drm_test_dp_mst_sideband_msg_req_decode =========
[23:11:20] [PASSED] DP_ENUM_PATH_RESOURCES with port number
[23:11:20] [PASSED] DP_POWER_UP_PHY with port number
[23:11:20] [PASSED] DP_POWER_DOWN_PHY with port number
[23:11:20] [PASSED] DP_ALLOCATE_PAYLOAD with SDP stream sinks
[23:11:20] [PASSED] DP_ALLOCATE_PAYLOAD with port number
[23:11:20] [PASSED] DP_ALLOCATE_PAYLOAD with VCPI
[23:11:20] [PASSED] DP_ALLOCATE_PAYLOAD with PBN
[23:11:20] [PASSED] DP_QUERY_PAYLOAD with port number
[23:11:20] [PASSED] DP_QUERY_PAYLOAD with VCPI
[23:11:20] [PASSED] DP_REMOTE_DPCD_READ with port number
[23:11:20] [PASSED] DP_REMOTE_DPCD_READ with DPCD address
[23:11:20] [PASSED] DP_REMOTE_DPCD_READ with max number of bytes
[23:11:20] [PASSED] DP_REMOTE_DPCD_WRITE with port number
[23:11:20] [PASSED] DP_REMOTE_DPCD_WRITE with DPCD address
[23:11:20] [PASSED] DP_REMOTE_DPCD_WRITE with data array
[23:11:20] [PASSED] DP_REMOTE_I2C_READ with port number
[23:11:20] [PASSED] DP_REMOTE_I2C_READ with I2C device ID
[23:11:20] [PASSED] DP_REMOTE_I2C_READ with transactions array
[23:11:20] [PASSED] DP_REMOTE_I2C_WRITE with port number
[23:11:20] [PASSED] DP_REMOTE_I2C_WRITE with I2C device ID
[23:11:20] [PASSED] DP_REMOTE_I2C_WRITE with data array
[23:11:20] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream ID
[23:11:20] [PASSED] DP_QUERY_STREAM_ENC_STATUS with client ID
[23:11:20] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream event
[23:11:20] [PASSED] DP_QUERY_STREAM_ENC_STATUS with valid stream event
[23:11:20] [PASSED] DP_QUERY_STREAM_ENC_STATUS with stream behavior
[23:11:20] [PASSED] DP_QUERY_STREAM_ENC_STATUS with a valid stream behavior
[23:11:20] ===== [PASSED] drm_test_dp_mst_sideband_msg_req_decode =====
[23:11:20] ================ [PASSED] drm_dp_mst_helper ================
[23:11:20] ================== drm_exec (7 subtests) ===================
[23:11:20] [PASSED] sanitycheck
[23:11:20] [PASSED] test_lock
[23:11:20] [PASSED] test_lock_unlock
[23:11:20] [PASSED] test_duplicates
[23:11:20] [PASSED] test_prepare
[23:11:20] [PASSED] test_prepare_array
[23:11:20] [PASSED] test_multiple_loops
[23:11:20] ==================== [PASSED] drm_exec =====================
[23:11:20] =========== drm_format_helper_test (17 subtests) ===========
[23:11:20] ============== drm_test_fb_xrgb8888_to_gray8 ==============
[23:11:20] [PASSED] single_pixel_source_buffer
[23:11:20] [PASSED] single_pixel_clip_rectangle
[23:11:20] [PASSED] well_known_colors
[23:11:20] [PASSED] destination_pitch
[23:11:20] ========== [PASSED] drm_test_fb_xrgb8888_to_gray8 ==========
[23:11:20] ============= drm_test_fb_xrgb8888_to_rgb332 ==============
[23:11:20] [PASSED] single_pixel_source_buffer
[23:11:20] [PASSED] single_pixel_clip_rectangle
[23:11:20] [PASSED] well_known_colors
[23:11:20] [PASSED] destination_pitch
[23:11:20] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb332 ==========
[23:11:20] ============= drm_test_fb_xrgb8888_to_rgb565 ==============
[23:11:20] [PASSED] single_pixel_source_buffer
[23:11:20] [PASSED] single_pixel_clip_rectangle
[23:11:20] [PASSED] well_known_colors
[23:11:20] [PASSED] destination_pitch
[23:11:20] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb565 ==========
[23:11:20] ============ drm_test_fb_xrgb8888_to_xrgb1555 =============
[23:11:20] [PASSED] single_pixel_source_buffer
[23:11:20] [PASSED] single_pixel_clip_rectangle
[23:11:20] [PASSED] well_known_colors
[23:11:20] [PASSED] destination_pitch
[23:11:20] ======== [PASSED] drm_test_fb_xrgb8888_to_xrgb1555 =========
[23:11:20] ============ drm_test_fb_xrgb8888_to_argb1555 =============
[23:11:20] [PASSED] single_pixel_source_buffer
[23:11:20] [PASSED] single_pixel_clip_rectangle
[23:11:20] [PASSED] well_known_colors
[23:11:20] [PASSED] destination_pitch
[23:11:20] ======== [PASSED] drm_test_fb_xrgb8888_to_argb1555 =========
[23:11:20] ============ drm_test_fb_xrgb8888_to_rgba5551 =============
[23:11:20] [PASSED] single_pixel_source_buffer
[23:11:20] [PASSED] single_pixel_clip_rectangle
[23:11:20] [PASSED] well_known_colors
[23:11:20] [PASSED] destination_pitch
[23:11:20] ======== [PASSED] drm_test_fb_xrgb8888_to_rgba5551 =========
[23:11:20] ============= drm_test_fb_xrgb8888_to_rgb888 ==============
[23:11:20] [PASSED] single_pixel_source_buffer
[23:11:20] [PASSED] single_pixel_clip_rectangle
[23:11:20] [PASSED] well_known_colors
[23:11:20] [PASSED] destination_pitch
[23:11:20] ========= [PASSED] drm_test_fb_xrgb8888_to_rgb888 ==========
[23:11:20] ============= drm_test_fb_xrgb8888_to_bgr888 ==============
[23:11:20] [PASSED] single_pixel_source_buffer
[23:11:20] [PASSED] single_pixel_clip_rectangle
[23:11:20] [PASSED] well_known_colors
[23:11:20] [PASSED] destination_pitch
[23:11:20] ========= [PASSED] drm_test_fb_xrgb8888_to_bgr888 ==========
[23:11:20] ============ drm_test_fb_xrgb8888_to_argb8888 =============
[23:11:20] [PASSED] single_pixel_source_buffer
[23:11:20] [PASSED] single_pixel_clip_rectangle
[23:11:20] [PASSED] well_known_colors
[23:11:20] [PASSED] destination_pitch
[23:11:20] ======== [PASSED] drm_test_fb_xrgb8888_to_argb8888 =========
[23:11:20] =========== drm_test_fb_xrgb8888_to_xrgb2101010 ===========
[23:11:20] [PASSED] single_pixel_source_buffer
[23:11:20] [PASSED] single_pixel_clip_rectangle
[23:11:20] [PASSED] well_known_colors
[23:11:20] [PASSED] destination_pitch
[23:11:20] ======= [PASSED] drm_test_fb_xrgb8888_to_xrgb2101010 =======
[23:11:20] =========== drm_test_fb_xrgb8888_to_argb2101010 ===========
[23:11:20] [PASSED] single_pixel_source_buffer
[23:11:20] [PASSED] single_pixel_clip_rectangle
[23:11:20] [PASSED] well_known_colors
[23:11:20] [PASSED] destination_pitch
[23:11:20] ======= [PASSED] drm_test_fb_xrgb8888_to_argb2101010 =======
[23:11:20] ============== drm_test_fb_xrgb8888_to_mono ===============
[23:11:20] [PASSED] single_pixel_source_buffer
[23:11:20] [PASSED] single_pixel_clip_rectangle
[23:11:20] [PASSED] well_known_colors
[23:11:20] [PASSED] destination_pitch
[23:11:20] ========== [PASSED] drm_test_fb_xrgb8888_to_mono ===========
[23:11:20] ==================== drm_test_fb_swab =====================
[23:11:20] [PASSED] single_pixel_source_buffer
[23:11:20] [PASSED] single_pixel_clip_rectangle
[23:11:20] [PASSED] well_known_colors
[23:11:20] [PASSED] destination_pitch
[23:11:20] ================ [PASSED] drm_test_fb_swab =================
[23:11:20] ============ drm_test_fb_xrgb8888_to_xbgr8888 =============
[23:11:20] [PASSED] single_pixel_source_buffer
[23:11:20] [PASSED] single_pixel_clip_rectangle
[23:11:20] [PASSED] well_known_colors
[23:11:20] [PASSED] destination_pitch
[23:11:20] ======== [PASSED] drm_test_fb_xrgb8888_to_xbgr8888 =========
[23:11:20] ============ drm_test_fb_xrgb8888_to_abgr8888 =============
[23:11:20] [PASSED] single_pixel_source_buffer
[23:11:20] [PASSED] single_pixel_clip_rectangle
[23:11:20] [PASSED] well_known_colors
[23:11:20] [PASSED] destination_pitch
[23:11:20] ======== [PASSED] drm_test_fb_xrgb8888_to_abgr8888 =========
[23:11:20] ================= drm_test_fb_clip_offset =================
[23:11:20] [PASSED] pass through
[23:11:20] [PASSED] horizontal offset
[23:11:20] [PASSED] vertical offset
[23:11:20] [PASSED] horizontal and vertical offset
[23:11:20] [PASSED] horizontal offset (custom pitch)
[23:11:20] [PASSED] vertical offset (custom pitch)
[23:11:20] [PASSED] horizontal and vertical offset (custom pitch)
[23:11:20] ============= [PASSED] drm_test_fb_clip_offset =============
[23:11:20] =================== drm_test_fb_memcpy ====================
[23:11:20] [PASSED] single_pixel_source_buffer: XR24 little-endian (0x34325258)
[23:11:20] [PASSED] single_pixel_source_buffer: XRA8 little-endian (0x38415258)
[23:11:20] [PASSED] single_pixel_source_buffer: YU24 little-endian (0x34325559)
[23:11:20] [PASSED] single_pixel_clip_rectangle: XB24 little-endian (0x34324258)
[23:11:20] [PASSED] single_pixel_clip_rectangle: XRA8 little-endian (0x38415258)
[23:11:20] [PASSED] single_pixel_clip_rectangle: YU24 little-endian (0x34325559)
[23:11:20] [PASSED] well_known_colors: XB24 little-endian (0x34324258)
[23:11:20] [PASSED] well_known_colors: XRA8 little-endian (0x38415258)
[23:11:20] [PASSED] well_known_colors: YU24 little-endian (0x34325559)
[23:11:20] [PASSED] destination_pitch: XB24 little-endian (0x34324258)
[23:11:20] [PASSED] destination_pitch: XRA8 little-endian (0x38415258)
[23:11:20] [PASSED] destination_pitch: YU24 little-endian (0x34325559)
[23:11:20] =============== [PASSED] drm_test_fb_memcpy ================
[23:11:20] ============= [PASSED] drm_format_helper_test ==============
[23:11:20] ================= drm_format (18 subtests) =================
[23:11:20] [PASSED] drm_test_format_block_width_invalid
[23:11:20] [PASSED] drm_test_format_block_width_one_plane
[23:11:20] [PASSED] drm_test_format_block_width_two_plane
[23:11:20] [PASSED] drm_test_format_block_width_three_plane
[23:11:20] [PASSED] drm_test_format_block_width_tiled
[23:11:20] [PASSED] drm_test_format_block_height_invalid
[23:11:20] [PASSED] drm_test_format_block_height_one_plane
[23:11:20] [PASSED] drm_test_format_block_height_two_plane
[23:11:20] [PASSED] drm_test_format_block_height_three_plane
[23:11:20] [PASSED] drm_test_format_block_height_tiled
[23:11:20] [PASSED] drm_test_format_min_pitch_invalid
[23:11:20] [PASSED] drm_test_format_min_pitch_one_plane_8bpp
[23:11:20] [PASSED] drm_test_format_min_pitch_one_plane_16bpp
[23:11:20] [PASSED] drm_test_format_min_pitch_one_plane_24bpp
[23:11:20] [PASSED] drm_test_format_min_pitch_one_plane_32bpp
[23:11:20] [PASSED] drm_test_format_min_pitch_two_plane
[23:11:20] [PASSED] drm_test_format_min_pitch_three_plane_8bpp
[23:11:20] [PASSED] drm_test_format_min_pitch_tiled
[23:11:20] =================== [PASSED] drm_format ====================
[23:11:20] ============== drm_framebuffer (10 subtests) ===============
[23:11:20] ========== drm_test_framebuffer_check_src_coords ==========
[23:11:20] [PASSED] Success: source fits into fb
[23:11:20] [PASSED] Fail: overflowing fb with x-axis coordinate
[23:11:20] [PASSED] Fail: overflowing fb with y-axis coordinate
[23:11:20] [PASSED] Fail: overflowing fb with source width
[23:11:20] [PASSED] Fail: overflowing fb with source height
[23:11:20] ====== [PASSED] drm_test_framebuffer_check_src_coords ======
[23:11:20] [PASSED] drm_test_framebuffer_cleanup
[23:11:20] =============== drm_test_framebuffer_create ===============
[23:11:20] [PASSED] ABGR8888 normal sizes
[23:11:20] [PASSED] ABGR8888 max sizes
[23:11:20] [PASSED] ABGR8888 pitch greater than min required
[23:11:20] [PASSED] ABGR8888 pitch less than min required
[23:11:20] [PASSED] ABGR8888 Invalid width
[23:11:20] [PASSED] ABGR8888 Invalid buffer handle
[23:11:20] [PASSED] No pixel format
[23:11:20] [PASSED] ABGR8888 Width 0
[23:11:20] [PASSED] ABGR8888 Height 0
[23:11:20] [PASSED] ABGR8888 Out of bound height * pitch combination
[23:11:20] [PASSED] ABGR8888 Large buffer offset
[23:11:20] [PASSED] ABGR8888 Buffer offset for inexistent plane
[23:11:20] [PASSED] ABGR8888 Invalid flag
[23:11:20] [PASSED] ABGR8888 Set DRM_MODE_FB_MODIFIERS without modifiers
[23:11:20] [PASSED] ABGR8888 Valid buffer modifier
[23:11:20] [PASSED] ABGR8888 Invalid buffer modifier(DRM_FORMAT_MOD_SAMSUNG_64_32_TILE)
[23:11:20] [PASSED] ABGR8888 Extra pitches without DRM_MODE_FB_MODIFIERS
[23:11:20] [PASSED] ABGR8888 Extra pitches with DRM_MODE_FB_MODIFIERS
[23:11:20] [PASSED] NV12 Normal sizes
[23:11:20] [PASSED] NV12 Max sizes
[23:11:20] [PASSED] NV12 Invalid pitch
[23:11:20] [PASSED] NV12 Invalid modifier/missing DRM_MODE_FB_MODIFIERS flag
[23:11:20] [PASSED] NV12 different modifier per-plane
[23:11:20] [PASSED] NV12 with DRM_FORMAT_MOD_SAMSUNG_64_32_TILE
[23:11:20] [PASSED] NV12 Valid modifiers without DRM_MODE_FB_MODIFIERS
[23:11:20] [PASSED] NV12 Modifier for inexistent plane
[23:11:20] [PASSED] NV12 Handle for inexistent plane
[23:11:20] [PASSED] NV12 Handle for inexistent plane without DRM_MODE_FB_MODIFIERS
[23:11:20] [PASSED] YVU420 DRM_MODE_FB_MODIFIERS set without modifier
[23:11:20] [PASSED] YVU420 Normal sizes
[23:11:20] [PASSED] YVU420 Max sizes
[23:11:20] [PASSED] YVU420 Invalid pitch
[23:11:20] [PASSED] YVU420 Different pitches
[23:11:20] [PASSED] YVU420 Different buffer offsets/pitches
[23:11:20] [PASSED] YVU420 Modifier set just for plane 0, without DRM_MODE_FB_MODIFIERS
[23:11:20] [PASSED] YVU420 Modifier set just for planes 0, 1, without DRM_MODE_FB_MODIFIERS
[23:11:20] [PASSED] YVU420 Modifier set just for plane 0, 1, with DRM_MODE_FB_MODIFIERS
[23:11:20] [PASSED] YVU420 Valid modifier
[23:11:20] [PASSED] YVU420 Different modifiers per plane
[23:11:20] [PASSED] YVU420 Modifier for inexistent plane
[23:11:20] [PASSED] YUV420_10BIT Invalid modifier(DRM_FORMAT_MOD_LINEAR)
[23:11:20] [PASSED] X0L2 Normal sizes
[23:11:20] [PASSED] X0L2 Max sizes
[23:11:20] [PASSED] X0L2 Invalid pitch
[23:11:20] [PASSED] X0L2 Pitch greater than minimum required
[23:11:20] [PASSED] X0L2 Handle for inexistent plane
[23:11:20] [PASSED] X0L2 Offset for inexistent plane, without DRM_MODE_FB_MODIFIERS set
[23:11:20] [PASSED] X0L2 Modifier without DRM_MODE_FB_MODIFIERS set
[23:11:20] [PASSED] X0L2 Valid modifier
[23:11:20] [PASSED] X0L2 Modifier for inexistent plane
[23:11:20] =========== [PASSED] drm_test_framebuffer_create ===========
[23:11:20] [PASSED] drm_test_framebuffer_free
[23:11:20] [PASSED] drm_test_framebuffer_init
[23:11:20] [PASSED] drm_test_framebuffer_init_bad_format
[23:11:20] [PASSED] drm_test_framebuffer_init_dev_mismatch
[23:11:20] [PASSED] drm_test_framebuffer_lookup
[23:11:20] [PASSED] drm_test_framebuffer_lookup_inexistent
[23:11:20] [PASSED] drm_test_framebuffer_modifiers_not_supported
[23:11:20] ================= [PASSED] drm_framebuffer =================
[23:11:20] ================ drm_gem_shmem (8 subtests) ================
[23:11:20] [PASSED] drm_gem_shmem_test_obj_create
[23:11:20] [PASSED] drm_gem_shmem_test_obj_create_private
[23:11:20] [PASSED] drm_gem_shmem_test_pin_pages
[23:11:20] [PASSED] drm_gem_shmem_test_vmap
[23:11:20] [PASSED] drm_gem_shmem_test_get_pages_sgt
[23:11:20] [PASSED] drm_gem_shmem_test_get_sg_table
[23:11:20] [PASSED] drm_gem_shmem_test_madvise
[23:11:20] [PASSED] drm_gem_shmem_test_purge
[23:11:20] ================== [PASSED] drm_gem_shmem ==================
[23:11:20] === drm_atomic_helper_connector_hdmi_check (27 subtests) ===
[23:11:20] [PASSED] drm_test_check_broadcast_rgb_auto_cea_mode
[23:11:20] [PASSED] drm_test_check_broadcast_rgb_auto_cea_mode_vic_1
[23:11:20] [PASSED] drm_test_check_broadcast_rgb_full_cea_mode
[23:11:20] [PASSED] drm_test_check_broadcast_rgb_full_cea_mode_vic_1
[23:11:20] [PASSED] drm_test_check_broadcast_rgb_limited_cea_mode
[23:11:20] [PASSED] drm_test_check_broadcast_rgb_limited_cea_mode_vic_1
[23:11:20] ====== drm_test_check_broadcast_rgb_cea_mode_yuv420 =======
[23:11:20] [PASSED] Automatic
[23:11:20] [PASSED] Full
[23:11:20] [PASSED] Limited 16:235
[23:11:20] == [PASSED] drm_test_check_broadcast_rgb_cea_mode_yuv420 ===
[23:11:20] [PASSED] drm_test_check_broadcast_rgb_crtc_mode_changed
[23:11:20] [PASSED] drm_test_check_broadcast_rgb_crtc_mode_not_changed
[23:11:20] [PASSED] drm_test_check_disable_connector
[23:11:20] [PASSED] drm_test_check_hdmi_funcs_reject_rate
[23:11:20] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_rgb
[23:11:20] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_yuv420
[23:11:20] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_ignore_yuv422
[23:11:20] [PASSED] drm_test_check_max_tmds_rate_bpc_fallback_ignore_yuv420
[23:11:20] [PASSED] drm_test_check_driver_unsupported_fallback_yuv420
[23:11:20] [PASSED] drm_test_check_output_bpc_crtc_mode_changed
[23:11:20] [PASSED] drm_test_check_output_bpc_crtc_mode_not_changed
[23:11:20] [PASSED] drm_test_check_output_bpc_dvi
[23:11:20] [PASSED] drm_test_check_output_bpc_format_vic_1
[23:11:20] [PASSED] drm_test_check_output_bpc_format_display_8bpc_only
[23:11:20] [PASSED] drm_test_check_output_bpc_format_display_rgb_only
[23:11:20] [PASSED] drm_test_check_output_bpc_format_driver_8bpc_only
[23:11:20] [PASSED] drm_test_check_output_bpc_format_driver_rgb_only
[23:11:20] [PASSED] drm_test_check_tmds_char_rate_rgb_8bpc
[23:11:20] [PASSED] drm_test_check_tmds_char_rate_rgb_10bpc
[23:11:20] [PASSED] drm_test_check_tmds_char_rate_rgb_12bpc
[23:11:20] ===== [PASSED] drm_atomic_helper_connector_hdmi_check ======
[23:11:20] === drm_atomic_helper_connector_hdmi_reset (6 subtests) ====
[23:11:20] [PASSED] drm_test_check_broadcast_rgb_value
[23:11:20] [PASSED] drm_test_check_bpc_8_value
[23:11:20] [PASSED] drm_test_check_bpc_10_value
[23:11:20] [PASSED] drm_test_check_bpc_12_value
[23:11:20] [PASSED] drm_test_check_format_value
[23:11:20] [PASSED] drm_test_check_tmds_char_value
[23:11:20] ===== [PASSED] drm_atomic_helper_connector_hdmi_reset ======
[23:11:20] = drm_atomic_helper_connector_hdmi_mode_valid (4 subtests) =
[23:11:20] [PASSED] drm_test_check_mode_valid
[23:11:20] [PASSED] drm_test_check_mode_valid_reject
[23:11:20] [PASSED] drm_test_check_mode_valid_reject_rate
[23:11:20] [PASSED] drm_test_check_mode_valid_reject_max_clock
[23:11:20] === [PASSED] drm_atomic_helper_connector_hdmi_mode_valid ===
[23:11:20] ================= drm_managed (2 subtests) =================
[23:11:20] [PASSED] drm_test_managed_release_action
[23:11:20] [PASSED] drm_test_managed_run_action
[23:11:20] =================== [PASSED] drm_managed ===================
[23:11:20] =================== drm_mm (6 subtests) ====================
[23:11:20] [PASSED] drm_test_mm_init
[23:11:20] [PASSED] drm_test_mm_debug
[23:11:20] [PASSED] drm_test_mm_align32
[23:11:20] [PASSED] drm_test_mm_align64
[23:11:20] [PASSED] drm_test_mm_lowest
[23:11:20] [PASSED] drm_test_mm_highest
[23:11:20] ===================== [PASSED] drm_mm ======================
[23:11:20] ============= drm_modes_analog_tv (5 subtests) =============
[23:11:20] [PASSED] drm_test_modes_analog_tv_mono_576i
[23:11:20] [PASSED] drm_test_modes_analog_tv_ntsc_480i
[23:11:20] [PASSED] drm_test_modes_analog_tv_ntsc_480i_inlined
[23:11:20] [PASSED] drm_test_modes_analog_tv_pal_576i
[23:11:20] [PASSED] drm_test_modes_analog_tv_pal_576i_inlined
[23:11:20] =============== [PASSED] drm_modes_analog_tv ===============
[23:11:20] ============== drm_plane_helper (2 subtests) ===============
[23:11:20] =============== drm_test_check_plane_state ================
[23:11:20] [PASSED] clipping_simple
[23:11:20] [PASSED] clipping_rotate_reflect
[23:11:20] [PASSED] positioning_simple
[23:11:20] [PASSED] upscaling
[23:11:20] [PASSED] downscaling
[23:11:20] [PASSED] rounding1
[23:11:20] [PASSED] rounding2
[23:11:20] [PASSED] rounding3
[23:11:20] [PASSED] rounding4
[23:11:20] =========== [PASSED] drm_test_check_plane_state ============
[23:11:20] =========== drm_test_check_invalid_plane_state ============
[23:11:20] [PASSED] positioning_invalid
[23:11:20] [PASSED] upscaling_invalid
[23:11:20] [PASSED] downscaling_invalid
[23:11:20] ======= [PASSED] drm_test_check_invalid_plane_state ========
[23:11:20] ================ [PASSED] drm_plane_helper =================
[23:11:20] ====== drm_connector_helper_tv_get_modes (1 subtest) =======
[23:11:20] ====== drm_test_connector_helper_tv_get_modes_check =======
[23:11:20] [PASSED] None
[23:11:20] [PASSED] PAL
[23:11:20] [PASSED] NTSC
[23:11:20] [PASSED] Both, NTSC Default
[23:11:20] [PASSED] Both, PAL Default
[23:11:20] [PASSED] Both, NTSC Default, with PAL on command-line
[23:11:20] [PASSED] Both, PAL Default, with NTSC on command-line
[23:11:20] == [PASSED] drm_test_connector_helper_tv_get_modes_check ===
[23:11:20] ======== [PASSED] drm_connector_helper_tv_get_modes ========
[23:11:20] ================== drm_rect (9 subtests) ===================
[23:11:20] [PASSED] drm_test_rect_clip_scaled_div_by_zero
[23:11:20] [PASSED] drm_test_rect_clip_scaled_not_clipped
[23:11:20] [PASSED] drm_test_rect_clip_scaled_clipped
[23:11:20] [PASSED] drm_test_rect_clip_scaled_signed_vs_unsigned
[23:11:20] ================= drm_test_rect_intersect =================
[23:11:20] [PASSED] top-left x bottom-right: 2x2+1+1 x 2x2+0+0
[23:11:20] [PASSED] top-right x bottom-left: 2x2+0+0 x 2x2+1-1
[23:11:20] [PASSED] bottom-left x top-right: 2x2+1-1 x 2x2+0+0
[23:11:20] [PASSED] bottom-right x top-left: 2x2+0+0 x 2x2+1+1
[23:11:20] [PASSED] right x left: 2x1+0+0 x 3x1+1+0
[23:11:20] [PASSED] left x right: 3x1+1+0 x 2x1+0+0
[23:11:20] [PASSED] up x bottom: 1x2+0+0 x 1x3+0-1
[23:11:20] [PASSED] bottom x up: 1x3+0-1 x 1x2+0+0
[23:11:20] [PASSED] touching corner: 1x1+0+0 x 2x2+1+1
[23:11:20] [PASSED] touching side: 1x1+0+0 x 1x1+1+0
[23:11:20] [PASSED] equal rects: 2x2+0+0 x 2x2+0+0
[23:11:20] [PASSED] inside another: 2x2+0+0 x 1x1+1+1
[23:11:20] [PASSED] far away: 1x1+0+0 x 1x1+3+6
[23:11:20] [PASSED] points intersecting: 0x0+5+10 x 0x0+5+10
[23:11:20] [PASSED] points not intersecting: 0x0+0+0 x 0x0+5+10
[23:11:20] ============= [PASSED] drm_test_rect_intersect =============
[23:11:20] ================ drm_test_rect_calc_hscale ================
[23:11:20] [PASSED] normal use
[23:11:20] [PASSED] out of max range
[23:11:20] [PASSED] out of min range
[23:11:20] [PASSED] zero dst
[23:11:20] [PASSED] negative src
[23:11:20] [PASSED] negative dst
[23:11:20] ============ [PASSED] drm_test_rect_calc_hscale ============
[23:11:20] ================ drm_test_rect_calc_vscale ================
[23:11:20] [PASSED] normal use
[23:11:20] [PASSED] out of max range
[23:11:20] [PASSED] out of min range
[23:11:20] [PASSED] zero dst
[23:11:20] [PASSED] negative src
[23:11:20] [PASSED] negative dst
[23:11:20] ============ [PASSED] drm_test_rect_calc_vscale ============
[23:11:20] ================== drm_test_rect_rotate ===================
[23:11:20] [PASSED] reflect-x
[23:11:20] [PASSED] reflect-y
[23:11:20] [PASSED] rotate-0
[23:11:20] [PASSED] rotate-90
[23:11:20] [PASSED] rotate-180
[23:11:20] [PASSED] rotate-270
[23:11:20] ============== [PASSED] drm_test_rect_rotate ===============
[23:11:20] ================ drm_test_rect_rotate_inv =================
[23:11:20] [PASSED] reflect-x
[23:11:20] [PASSED] reflect-y
[23:11:20] [PASSED] rotate-0
[23:11:20] [PASSED] rotate-90
[23:11:20] [PASSED] rotate-180
[23:11:20] [PASSED] rotate-270
[23:11:20] ============ [PASSED] drm_test_rect_rotate_inv =============
[23:11:20] ==================== [PASSED] drm_rect =====================
[23:11:20] ============ drm_sysfb_modeset_test (1 subtest) ============
[23:11:20] ============ drm_test_sysfb_build_fourcc_list =============
[23:11:20] [PASSED] no native formats
[23:11:20] [PASSED] XRGB8888 as native format
[23:11:20] [PASSED] remove duplicates
[23:11:20] [PASSED] convert alpha formats
[23:11:20] [PASSED] random formats
[23:11:20] ======== [PASSED] drm_test_sysfb_build_fourcc_list =========
[23:11:20] ============= [PASSED] drm_sysfb_modeset_test ==============
[23:11:20] ================== drm_fixp (2 subtests) ===================
[23:11:20] [PASSED] drm_test_int2fixp
[23:11:20] [PASSED] drm_test_sm2fixp
[23:11:20] ==================== [PASSED] drm_fixp =====================
[23:11:20] ============================================================
[23:11:20] Testing complete. Ran 624 tests: passed: 624
[23:11:20] Elapsed time: 26.810s total, 1.657s configuring, 24.735s building, 0.393s running
+ /kernel/tools/testing/kunit/kunit.py run --kunitconfig /kernel/drivers/gpu/drm/ttm/tests/.kunitconfig
[23:11:20] Configuring KUnit Kernel ...
Regenerating .config ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
[23:11:22] Building KUnit Kernel ...
Populating config with:
$ make ARCH=um O=.kunit olddefconfig
Building with:
$ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=48
[23:11:31] Starting KUnit Kernel (1/1)...
[23:11:31] ============================================================
Running tests with:
$ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt
[23:11:31] ================= ttm_device (5 subtests) ==================
[23:11:31] [PASSED] ttm_device_init_basic
[23:11:31] [PASSED] ttm_device_init_multiple
[23:11:31] [PASSED] ttm_device_fini_basic
[23:11:31] [PASSED] ttm_device_init_no_vma_man
[23:11:31] ================== ttm_device_init_pools ==================
[23:11:31] [PASSED] No DMA allocations, no DMA32 required
[23:11:31] [PASSED] DMA allocations, DMA32 required
[23:11:31] [PASSED] No DMA allocations, DMA32 required
[23:11:31] [PASSED] DMA allocations, no DMA32 required
[23:11:31] ============== [PASSED] ttm_device_init_pools ==============
[23:11:31] =================== [PASSED] ttm_device ====================
[23:11:31] ================== ttm_pool (8 subtests) ===================
[23:11:31] ================== ttm_pool_alloc_basic ===================
[23:11:31] [PASSED] One page
[23:11:31] [PASSED] More than one page
[23:11:31] [PASSED] Above the allocation limit
[23:11:31] [PASSED] One page, with coherent DMA mappings enabled
[23:11:31] [PASSED] Above the allocation limit, with coherent DMA mappings enabled
[23:11:31] ============== [PASSED] ttm_pool_alloc_basic ===============
[23:11:31] ============== ttm_pool_alloc_basic_dma_addr ==============
[23:11:31] [PASSED] One page
[23:11:31] [PASSED] More than one page
[23:11:31] [PASSED] Above the allocation limit
[23:11:31] [PASSED] One page, with coherent DMA mappings enabled
[23:11:31] [PASSED] Above the allocation limit, with coherent DMA mappings enabled
[23:11:31] ========== [PASSED] ttm_pool_alloc_basic_dma_addr ==========
[23:11:31] [PASSED] ttm_pool_alloc_order_caching_match
[23:11:31] [PASSED] ttm_pool_alloc_caching_mismatch
[23:11:31] [PASSED] ttm_pool_alloc_order_mismatch
[23:11:31] [PASSED] ttm_pool_free_dma_alloc
[23:11:31] [PASSED] ttm_pool_free_no_dma_alloc
[23:11:31] [PASSED] ttm_pool_fini_basic
[23:11:31] ==================== [PASSED] ttm_pool =====================
[23:11:31] ================ ttm_resource (8 subtests) =================
[23:11:31] ================= ttm_resource_init_basic =================
[23:11:31] [PASSED] Init resource in TTM_PL_SYSTEM
[23:11:31] [PASSED] Init resource in TTM_PL_VRAM
[23:11:31] [PASSED] Init resource in a private placement
[23:11:31] [PASSED] Init resource in TTM_PL_SYSTEM, set placement flags
[23:11:31] ============= [PASSED] ttm_resource_init_basic =============
[23:11:31] [PASSED] ttm_resource_init_pinned
[23:11:31] [PASSED] ttm_resource_fini_basic
[23:11:31] [PASSED] ttm_resource_manager_init_basic
[23:11:31] [PASSED] ttm_resource_manager_usage_basic
[23:11:31] [PASSED] ttm_resource_manager_set_used_basic
[23:11:31] [PASSED] ttm_sys_man_alloc_basic
[23:11:31] [PASSED] ttm_sys_man_free_basic
[23:11:31] ================== [PASSED] ttm_resource ===================
[23:11:31] =================== ttm_tt (15 subtests) ===================
[23:11:31] ==================== ttm_tt_init_basic ====================
[23:11:31] [PASSED] Page-aligned size
[23:11:31] [PASSED] Extra pages requested
[23:11:31] ================ [PASSED] ttm_tt_init_basic ================
[23:11:31] [PASSED] ttm_tt_init_misaligned
[23:11:31] [PASSED] ttm_tt_fini_basic
[23:11:31] [PASSED] ttm_tt_fini_sg
[23:11:31] [PASSED] ttm_tt_fini_shmem
[23:11:31] [PASSED] ttm_tt_create_basic
[23:11:31] [PASSED] ttm_tt_create_invalid_bo_type
[23:11:31] [PASSED] ttm_tt_create_ttm_exists
[23:11:31] [PASSED] ttm_tt_create_failed
[23:11:31] [PASSED] ttm_tt_destroy_basic
[23:11:31] [PASSED] ttm_tt_populate_null_ttm
[23:11:31] [PASSED] ttm_tt_populate_populated_ttm
[23:11:31] [PASSED] ttm_tt_unpopulate_basic
[23:11:31] [PASSED] ttm_tt_unpopulate_empty_ttm
[23:11:31] [PASSED] ttm_tt_swapin_basic
[23:11:31] ===================== [PASSED] ttm_tt ======================
[23:11:31] =================== ttm_bo (14 subtests) ===================
[23:11:31] =========== ttm_bo_reserve_optimistic_no_ticket ===========
[23:11:31] [PASSED] Cannot be interrupted and sleeps
[23:11:31] [PASSED] Cannot be interrupted, locks straight away
[23:11:31] [PASSED] Can be interrupted, sleeps
[23:11:31] ======= [PASSED] ttm_bo_reserve_optimistic_no_ticket =======
[23:11:31] [PASSED] ttm_bo_reserve_locked_no_sleep
[23:11:31] [PASSED] ttm_bo_reserve_no_wait_ticket
[23:11:31] [PASSED] ttm_bo_reserve_double_resv
[23:11:31] [PASSED] ttm_bo_reserve_interrupted
[23:11:31] [PASSED] ttm_bo_reserve_deadlock
[23:11:31] [PASSED] ttm_bo_unreserve_basic
[23:11:31] [PASSED] ttm_bo_unreserve_pinned
[23:11:31] [PASSED] ttm_bo_unreserve_bulk
[23:11:31] [PASSED] ttm_bo_fini_basic
[23:11:31] [PASSED] ttm_bo_fini_shared_resv
[23:11:31] [PASSED] ttm_bo_pin_basic
[23:11:31] [PASSED] ttm_bo_pin_unpin_resource
[23:11:31] [PASSED] ttm_bo_multiple_pin_one_unpin
[23:11:31] ===================== [PASSED] ttm_bo ======================
[23:11:31] ============== ttm_bo_validate (21 subtests) ===============
[23:11:31] ============== ttm_bo_init_reserved_sys_man ===============
[23:11:31] [PASSED] Buffer object for userspace
[23:11:31] [PASSED] Kernel buffer object
[23:11:31] [PASSED] Shared buffer object
[23:11:31] ========== [PASSED] ttm_bo_init_reserved_sys_man ===========
[23:11:31] ============== ttm_bo_init_reserved_mock_man ==============
[23:11:31] [PASSED] Buffer object for userspace
[23:11:31] [PASSED] Kernel buffer object
[23:11:31] [PASSED] Shared buffer object
[23:11:31] ========== [PASSED] ttm_bo_init_reserved_mock_man ==========
[23:11:31] [PASSED] ttm_bo_init_reserved_resv
[23:11:31] ================== ttm_bo_validate_basic ==================
[23:11:31] [PASSED] Buffer object for userspace
[23:11:31] [PASSED] Kernel buffer object
[23:11:31] [PASSED] Shared buffer object
[23:11:31] ============== [PASSED] ttm_bo_validate_basic ==============
[23:11:31] [PASSED] ttm_bo_validate_invalid_placement
[23:11:31] ============= ttm_bo_validate_same_placement ==============
[23:11:31] [PASSED] System manager
[23:11:31] [PASSED] VRAM manager
[23:11:31] ========= [PASSED] ttm_bo_validate_same_placement ==========
[23:11:31] [PASSED] ttm_bo_validate_failed_alloc
[23:11:31] [PASSED] ttm_bo_validate_pinned
[23:11:31] [PASSED] ttm_bo_validate_busy_placement
[23:11:31] ================ ttm_bo_validate_multihop =================
[23:11:31] [PASSED] Buffer object for userspace
[23:11:31] [PASSED] Kernel buffer object
[23:11:31] [PASSED] Shared buffer object
[23:11:31] ============ [PASSED] ttm_bo_validate_multihop =============
[23:11:31] ========== ttm_bo_validate_no_placement_signaled ==========
[23:11:31] [PASSED] Buffer object in system domain, no page vector
[23:11:31] [PASSED] Buffer object in system domain with an existing page vector
[23:11:31] ====== [PASSED] ttm_bo_validate_no_placement_signaled ======
[23:11:31] ======== ttm_bo_validate_no_placement_not_signaled ========
[23:11:31] [PASSED] Buffer object for userspace
[23:11:31] [PASSED] Kernel buffer object
[23:11:31] [PASSED] Shared buffer object
[23:11:31] ==== [PASSED] ttm_bo_validate_no_placement_not_signaled ====
[23:11:31] [PASSED] ttm_bo_validate_move_fence_signaled
[23:11:31] ========= ttm_bo_validate_move_fence_not_signaled =========
[23:11:31] [PASSED] Waits for GPU
[23:11:31] [PASSED] Tries to lock straight away
[23:11:31] ===== [PASSED] ttm_bo_validate_move_fence_not_signaled =====
[23:11:31] [PASSED] ttm_bo_validate_happy_evict
[23:11:31] [PASSED] ttm_bo_validate_all_pinned_evict
[23:11:31] [PASSED] ttm_bo_validate_allowed_only_evict
[23:11:31] [PASSED] ttm_bo_validate_deleted_evict
[23:11:31] [PASSED] ttm_bo_validate_busy_domain_evict
[23:11:31] [PASSED] ttm_bo_validate_evict_gutting
[23:11:31] [PASSED] ttm_bo_validate_recrusive_evict
[23:11:31] ================= [PASSED] ttm_bo_validate =================
[23:11:31] ============================================================
[23:11:31] Testing complete. Ran 101 tests: passed: 101
[23:11:31] Elapsed time: 10.921s total, 1.636s configuring, 9.069s building, 0.178s running
+ cleanup
++ stat -c %u:%g /kernel
+ chown -R 1003:1003 /kernel
* Re: [PATCH v2 01/11] drm/xe/gt: Add engine masks for each class
2025-12-06 23:03 ` [PATCH v2 01/11] drm/xe/gt: Add engine masks for each class Daniele Ceraolo Spurio
@ 2025-12-07 15:35 ` Michal Wajdeczko
0 siblings, 0 replies; 30+ messages in thread
From: Michal Wajdeczko @ 2025-12-07 15:35 UTC (permalink / raw)
To: Daniele Ceraolo Spurio, intel-xe
On 12/7/2025 12:03 AM, Daniele Ceraolo Spurio wrote:
> Follow up patches will need the engine masks for VCS and VECS engines.
> Since we already have a macro for the CCS engines, just extend the same
> approach to all classes.
>
> To avoid confusion with the XE_HW_ENGINE_*_MASK masks, the new macros
> use the _INSTANCES suffix instead. For consistency, rename CCS_MASK to
> CCS_INSTANCES as well.
>
> v2: Use _INSTANCES suffix (Michal)
>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt.h | 9 ++++++++-
> drivers/gpu/drm/xe/xe_gt_ccs_mode.c | 8 ++++----
> drivers/gpu/drm/xe/xe_gt_ccs_mode.h | 2 +-
> drivers/gpu/drm/xe/xe_guc.c | 2 +-
> drivers/gpu/drm/xe/xe_guc_submit.c | 2 +-
> 5 files changed, 15 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
> index 94969ddd9d88..2eeeeeb6b912 100644
> --- a/drivers/gpu/drm/xe/xe_gt.h
> +++ b/drivers/gpu/drm/xe/xe_gt.h
> @@ -20,7 +20,14 @@
> for_each_if(((hwe__) = (gt__)->hw_engines + (id__)) && \
> xe_hw_engine_is_valid((hwe__)))
>
> -#define CCS_MASK(gt) (((gt)->info.engine_mask & XE_HW_ENGINE_CCS_MASK) >> XE_HW_ENGINE_CCS0)
> +#define __ENGINE_INSTANCES_MASK(gt, name) \
nit: the MASK suffix here still suggests it returns some kind of mask
if you still want to keep the MASK tag here, then maybe name the macro:
#define XE_ENGINE_MASK_TO_INSTANCES(gt, ENGINE)
or
#define XE_ENGINE_INSTANCES_FROM_MASK(gt, ENGINE)
but final macros look fine, so with improved name of helper macro,
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
> + (((gt)->info.engine_mask & XE_HW_ENGINE_##name##_MASK) >> XE_HW_ENGINE_##name##0)
> +
> +#define RCS_INSTANCES(gt) __ENGINE_INSTANCES_MASK(gt, RCS)
> +#define VCS_INSTANCES(gt) __ENGINE_INSTANCES_MASK(gt, VCS)
> +#define VECS_INSTANCES(gt) __ENGINE_INSTANCES_MASK(gt, VECS)
> +#define CCS_INSTANCES(gt) __ENGINE_INSTANCES_MASK(gt, CCS)
> +#define GSCCS_INSTANCES(gt) __ENGINE_INSTANCES_MASK(gt, GSCCS)
>
> #define GT_VER(gt) ({ \
> typeof(gt) gt_ = (gt); \
> diff --git a/drivers/gpu/drm/xe/xe_gt_ccs_mode.c b/drivers/gpu/drm/xe/xe_gt_ccs_mode.c
> index 50fffc9ebf62..91ac22ef5703 100644
> --- a/drivers/gpu/drm/xe/xe_gt_ccs_mode.c
> +++ b/drivers/gpu/drm/xe/xe_gt_ccs_mode.c
> @@ -17,7 +17,7 @@
> static void __xe_gt_apply_ccs_mode(struct xe_gt *gt, u32 num_engines)
> {
> u32 mode = CCS_MODE_CSLICE_0_3_MASK; /* disable all by default */
> - int num_slices = hweight32(CCS_MASK(gt));
> + int num_slices = hweight32(CCS_INSTANCES(gt));
> struct xe_device *xe = gt_to_xe(gt);
> int width, cslice = 0;
> u32 config = 0;
> @@ -59,7 +59,7 @@ static void __xe_gt_apply_ccs_mode(struct xe_gt *gt, u32 num_engines)
> config |= BIT(hwe->instance) << XE_HW_ENGINE_CCS0;
>
> /* If a slice is fused off, leave disabled */
> - while ((CCS_MASK(gt) & BIT(cslice)) == 0)
> + while ((CCS_INSTANCES(gt) & BIT(cslice)) == 0)
> cslice++;
>
> mode &= ~CCS_MODE_CSLICE(cslice, CCS_MODE_CSLICE_MASK);
> @@ -94,7 +94,7 @@ num_cslices_show(struct device *kdev,
> {
> struct xe_gt *gt = kobj_to_gt(&kdev->kobj);
>
> - return sysfs_emit(buf, "%u\n", hweight32(CCS_MASK(gt)));
> + return sysfs_emit(buf, "%u\n", hweight32(CCS_INSTANCES(gt)));
> }
>
> static DEVICE_ATTR_RO(num_cslices);
> @@ -131,7 +131,7 @@ ccs_mode_store(struct device *kdev, struct device_attribute *attr,
> * Ensure number of engines specified is valid and there is an
> * exact multiple of engines for slices.
> */
> - num_slices = hweight32(CCS_MASK(gt));
> + num_slices = hweight32(CCS_INSTANCES(gt));
> if (!num_engines || num_engines > num_slices || num_slices % num_engines) {
> xe_gt_dbg(gt, "Invalid compute config, %d engines %d slices\n",
> num_engines, num_slices);
> diff --git a/drivers/gpu/drm/xe/xe_gt_ccs_mode.h b/drivers/gpu/drm/xe/xe_gt_ccs_mode.h
> index f8779852cf0d..ef3b853f5c8c 100644
> --- a/drivers/gpu/drm/xe/xe_gt_ccs_mode.h
> +++ b/drivers/gpu/drm/xe/xe_gt_ccs_mode.h
> @@ -17,7 +17,7 @@ int xe_gt_ccs_mode_sysfs_init(struct xe_gt *gt);
> static inline bool xe_gt_ccs_mode_enabled(const struct xe_gt *gt)
> {
> /* Check if there are more than one compute engines available */
> - return hweight32(CCS_MASK(gt)) > 1;
> + return hweight32(CCS_INSTANCES(gt)) > 1;
> }
>
> #endif
> diff --git a/drivers/gpu/drm/xe/xe_guc.c b/drivers/gpu/drm/xe/xe_guc.c
> index 88376bc2a483..ccc914563ca0 100644
> --- a/drivers/gpu/drm/xe/xe_guc.c
> +++ b/drivers/gpu/drm/xe/xe_guc.c
> @@ -175,7 +175,7 @@ static bool needs_wa_dual_queue(struct xe_gt *gt)
> * the DUAL_QUEUE_WA on all newer platforms on GTs that have CCS engines
> * to move management back to the GuC.
> */
> - if (CCS_MASK(gt) && GRAPHICS_VERx100(gt_to_xe(gt)) >= 1270)
> + if (CCS_INSTANCES(gt) && GRAPHICS_VERx100(gt_to_xe(gt)) >= 1270)
> return true;
>
> return false;
> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> index 3ca2558c8c96..af43acf7baae 100644
> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> @@ -388,7 +388,7 @@ static int guc_init_global_schedule_policy(struct xe_guc *guc)
>
> *emit++ = XE_GUC_ACTION_UPDATE_SCHEDULING_POLICIES_KLV;
>
> - if (CCS_MASK(guc_to_gt(guc)))
> + if (CCS_INSTANCES(guc_to_gt(guc)))
> emit = emit_render_compute_yield_klv(emit);
>
> count = emit - data;
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 02/11] drm/xe/sriov: Initialize scheduler groups
2025-12-06 23:03 ` [PATCH v2 02/11] drm/xe/sriov: Initialize scheduler groups Daniele Ceraolo Spurio
@ 2025-12-07 21:57 ` Michal Wajdeczko
2025-12-08 17:36 ` Daniele Ceraolo Spurio
0 siblings, 1 reply; 30+ messages in thread
From: Michal Wajdeczko @ 2025-12-07 21:57 UTC (permalink / raw)
To: Daniele Ceraolo Spurio, intel-xe
On 12/7/2025 12:03 AM, Daniele Ceraolo Spurio wrote:
> Scheduler groups (a.k.a. Engine Group Scheduling, or EGS) are a GuC
> feature that allows the driver to define groups of engines that are
> independently scheduled across VFs, which allows different VFs to be
> active on the HW at the same time on different groups. The feature is
> available for BMG and newer HW starting from GuC 70.53.0, but some
> required fixes have been added in GuC 70.55.1.
>
> This is intended for specific scenarios where the admin knows that the
> VFs are not going to fully utilize the HW and therefore assigning all of
> it to a single VF would lead to part of it being permanently idle.
> We do not allow the admin to decide how to divide the engines across
> groups, but we instead support specific configurations that are designed
> for specific use-cases. During PF initialization we detect which
> configurations are possible on a given GT and create the relevant
> groups. Since the GuC expects a mask for each class for each group, that
> is what we save when we init the configs.
>
> Right now we only have one use-case on the media GT. If the VFs are
> running a frame render + encoding at a not-too-high resolution (e.g.
> 1080p@30fps) the render can produce frames faster than the video engine
> can encode them, which means that the maximum number of parallel VFs is
> limited by the VCS bandwidth. Since our products can have multiple VCS
> engines, allowing multiple VFs to be active on the different VCS engines
> at the same time allows us to run more parallel VFs on the same HW.
> Given that engines in the same media slice share some resources (e.g.
> SFC), we assign each media slice to a different scheduling group. We
> refer to this configuration as "media_slices", given that each slice
> gets its own group.
>
> Note that while the GuC interface supports a maximum of 8 groups, the
> actual number of groups that can be enabled can be lower than that and
> can be different on different devices. For now, all devices support up
> to 2 groups.
>
> v2: Use asserts for coding errors, code cleanups, better docs (Michal),
> limit groups to 2, limit to BMG and newer, bump required GuC to
> 70.55.1.
>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt.h | 6 +
> drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 3 +
> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c | 142 ++++++++++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h | 1 +
> .../gpu/drm/xe/xe_gt_sriov_pf_policy_types.h | 29 ++++
> 5 files changed, 181 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
> index 2eeeeeb6b912..0a34e862406e 100644
> --- a/drivers/gpu/drm/xe/xe_gt.h
> +++ b/drivers/gpu/drm/xe/xe_gt.h
> @@ -29,6 +29,12 @@
> #define CCS_INSTANCES(gt) __ENGINE_INSTANCES_MASK(gt, CCS)
> #define GSCCS_INSTANCES(gt) __ENGINE_INSTANCES_MASK(gt, GSCCS)
>
> +/*
> + * Each media slice has 1x VECS, so the max number of VECS instances gives us
> + * the max number of slices that a GT can have.
> + */
> +#define MAX_MEDIA_SLICES hweight32(XE_HW_ENGINE_VECS_MASK)
isn't xe_hw_engine_types.h file a better fit for this macro?
> +
> #define GT_VER(gt) ({ \
> typeof(gt) gt_ = (gt); \
> struct xe_device *xe = gt_to_xe(gt_); \
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> index 0714c758b9c1..0d97a823e702 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> @@ -14,6 +14,7 @@
> #include "xe_gt_sriov_pf_control.h"
> #include "xe_gt_sriov_pf_helpers.h"
> #include "xe_gt_sriov_pf_migration.h"
> +#include "xe_gt_sriov_pf_policy.h"
> #include "xe_gt_sriov_pf_service.h"
> #include "xe_gt_sriov_printk.h"
> #include "xe_guc_submit.h"
> @@ -123,6 +124,8 @@ int xe_gt_sriov_pf_init(struct xe_gt *gt)
> if (err)
> return err;
>
> + xe_gt_sriov_pf_policy_init(gt);
> +
> err = xe_gt_sriov_pf_migration_init(gt);
> if (err)
> return err;
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> index 4445f660e6d1..158d68aff4b7 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> @@ -3,6 +3,8 @@
> * Copyright © 2023-2024 Intel Corporation
> */
>
> +#include <drm/drm_managed.h>
> +
> #include "abi/guc_actions_sriov_abi.h"
>
> #include "xe_bo.h"
> @@ -10,6 +12,7 @@
> #include "xe_gt_sriov_pf_helpers.h"
> #include "xe_gt_sriov_pf_policy.h"
> #include "xe_gt_sriov_printk.h"
> +#include "xe_guc.h"
> #include "xe_guc_buf.h"
> #include "xe_guc_ct.h"
> #include "xe_guc_klv_helpers.h"
> @@ -351,6 +354,133 @@ u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt)
> return value;
> }
>
> +static void pf_sched_group_media_slices(struct xe_gt *gt, u32 **masks, u32 *num_masks)
> +{
> + u8 slice_to_group[MAX_MEDIA_SLICES];
> + u32 vecs_mask = VECS_INSTANCES(gt);
> + u32 gsc_mask = GSCCS_INSTANCES(gt);
> + u32 vcs_mask = VCS_INSTANCES(gt);
> + struct xe_hw_engine *hwe;
> + enum xe_hw_engine_id id;
> + int groups = 0;
> + u32 *values;
> + int slice;
> +
> + xe_gt_assert(gt, xe_gt_is_media_type(gt));
> +
> + /* A media slice has 2 VCS and a VECS. We bundle the GSC with the first slice */
> + for (slice = 0; slice < MAX_MEDIA_SLICES; slice++) {
> + if ((vcs_mask & 0x3) || (vecs_mask & 0x1) || (gsc_mask & 0x1))
> + slice_to_group[slice] = groups++;
> +
> + vcs_mask >>= 2;
> + vecs_mask >>= 1;
> + gsc_mask >>= 1;
> + }
> +
> + xe_gt_assert(gt, !vcs_mask);
> + xe_gt_assert(gt, !vecs_mask);
> + xe_gt_assert(gt, !gsc_mask);
> +
> + /* We need at least 2 slices to split them up */
> + if (groups < 2)
> + return;
> +
> + if (groups > gt->sriov.pf.policy.guc.sched_groups.max_num_of_groups) {
should we care about this limitation here in this function?
we will allocate groups here as required; only the GuC might not handle that correctly,
but that's mainly a GuC problem and we could just report it when sending the provisioning to the GuC
> + xe_gt_sriov_notice(gt, "media_slice mode has too many groups: %u vs %u\n",
> + groups,
> + gt->sriov.pf.policy.guc.sched_groups.max_num_of_groups);
> + return;
> + }
> +
> + /*
> + * The GuC expects an array with GUC_MAX_ENGINE_CLASSES entries for
> + * each group.
> + */
> + values = drmm_kzalloc(>_to_xe(gt)->drm,
drmm_kcalloc ?
> + GUC_MAX_ENGINE_CLASSES * groups * sizeof(u32),
> + GFP_KERNEL);
> + if (!values)
> + return;
> +
> + for_each_hw_engine(hwe, gt, id) {
> + u8 guc_class = xe_engine_class_to_guc_class(hwe->class);
> + u8 entry;
> +
> + switch (hwe->class) {
> + case XE_ENGINE_CLASS_VIDEO_DECODE:
> + slice = hwe->instance / 2;
> + break;
> + case XE_ENGINE_CLASS_VIDEO_ENHANCE:
> + slice = hwe->instance;
> + break;
> + case XE_ENGINE_CLASS_OTHER:
> + slice = 0;
> + break;
> + default:
> + xe_gt_assert_msg(gt, false,
> + "unknown media gt class %u (%s) during EGS setup\n",
> + hwe->class, hwe->name);
> + drmm_kfree(>_to_xe(gt)->drm, values);
hmm, do we really need to abort here?
maybe assert and then still assign this unknown engine to slice 0 (i.e. fall back to class OTHER)
> + return;
> + }
> +
> + entry = slice_to_group[slice] * GUC_MAX_ENGINE_CLASSES + guc_class;
> + values[entry] |= BIT(hwe->logical_instance);
> + }
> +
> + *masks = values;
> + *num_masks = GUC_MAX_ENGINE_CLASSES * groups;
IMO we should store number of groups only
we know that each group is just set of GUC_MAX_ENGINE_CLASSES instances
maybe even we should have below struct defined somewhere:
struct guc_sched_group {
u32 engines[GUC_MAX_ENGINE_CLASSES];
} __packed;
> +}
> +
> +static void pf_init_sched_groups(struct xe_gt *gt)
> +{
> + int m;
> +
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> +
> + /*
> + * The GuC supports scheduler groups from v70.53.0, but a fix for it has
> + * been merged in v70.55.1, so we require the latter. The feature is
> + * also only enabled on BMG and newer FW.
> + */
> + if (GUC_FIRMWARE_VER(>->uc.guc) < MAKE_GUC_VER(70, 55, 1) ||
> + gt_to_xe(gt)->info.platform < XE_BATTLEMAGE)
> + return;
> +
> + /*
> + * The GuC interface supports up to 8 groups. However, the GuC only
> + * fully allocates resources for a subset of groups, based on the number
> + * of engines and expected usage. The plan is for this to become
> + * queryable via H2G, but for now GuC FW for all devices supports a
> + * maximum of 2 groups so we can just hardcode that.
> + */
> + gt->sriov.pf.policy.guc.sched_groups.max_num_of_groups = 2;
maybe this limitation should be introduced in next patch 3/11 where you are
actually trying to provision sched_groups within GuC?
> +
> + for (m = 0; m < XE_SRIOV_SCHED_GROUPS_MODES_COUNT; m++) {
> + u32 *masks = NULL;
> + u32 num_masks = 0;
maybe initialize them as pointers instead:
u32 *num_masks = >->sriov.pf.policy.guc.sched_groups.modes[m].num_masks;
u32 **masks = >->sriov.pf.policy.guc.sched_groups.modes[m].masks;
> +
> + switch (m) {
> + case XE_SRIOV_SCHED_GROUPS_NONE:
I'm still not convinced that we need to waste an array index for the NONE mode
can't we just loop over known modes (for now it's MEDIA slices only)
and if NONE is selected just implicitly assume .num_masks is zero?
then the array will hold only potentially valid group definitions
> + break;
> + case XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES:
> + /* this mode only has groups on the media GT */
> + if (xe_gt_is_media_type(gt))
> + pf_sched_group_media_slices(gt, &masks, &num_masks);
> + break;
> + default:
> + xe_gt_assert_msg(gt, false, "unknown sched group mode %u\n", m);
> + return;
> + }
> +
> + xe_gt_assert(gt, (num_masks % GUC_MAX_ENGINE_CLASSES) == 0);
> +
> + gt->sriov.pf.policy.guc.sched_groups.modes[m].masks = masks;
> + gt->sriov.pf.policy.guc.sched_groups.modes[m].num_masks = num_masks;
> + }
> +}
> +
> static void pf_sanitize_guc_policies(struct xe_gt *gt)
> {
> pf_sanitize_sched_if_idle(gt);
> @@ -401,6 +531,18 @@ int xe_gt_sriov_pf_policy_reprovision(struct xe_gt *gt, bool reset)
> return err ? -ENXIO : 0;
> }
>
> +/**
> + * xe_gt_sriov_pf_policy_init - Initializes the SW state of the PF policies.
* xe_gt_sriov_pf_policy_init() - Initializes ...
> + * @gt: the &xe_gt
> + *
> + * This function can only be called on PF. This function does not touch the HW,
> + * but must be called after the engines have been initialized.
> + */
> +void xe_gt_sriov_pf_policy_init(struct xe_gt *gt)
> +{
> + pf_init_sched_groups(gt);
> +}
> +
> static void print_guc_policies(struct drm_printer *p, struct xe_gt_sriov_guc_policies *policy)
> {
> drm_printf(p, "%s:\t%s\n",
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> index 2a5dc33dc6d7..52312d24d527 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> @@ -18,6 +18,7 @@ bool xe_gt_sriov_pf_policy_get_reset_engine(struct xe_gt *gt);
> int xe_gt_sriov_pf_policy_set_sample_period(struct xe_gt *gt, u32 value);
> u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt);
>
> +void xe_gt_sriov_pf_policy_init(struct xe_gt *gt);
> void xe_gt_sriov_pf_policy_sanitize(struct xe_gt *gt);
> int xe_gt_sriov_pf_policy_reprovision(struct xe_gt *gt, bool reset);
> int xe_gt_sriov_pf_policy_print(struct xe_gt *gt, struct drm_printer *p);
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
> index 4de532af135e..1d4cdc87e069 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
> @@ -8,16 +8,45 @@
>
> #include <linux/types.h>
>
> +/**
> + * enum xe_sriov_sched_group_modes - list of possible scheduler group modes
> + * @XE_SRIOV_SCHED_GROUPS_NONE - no separate groups (i.e., all engines in group 0)
> + * @XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES - separate groups for each media slice
> + * @XE_SRIOV_SCHED_GROUPS_MODES_COUNT - number of valid modes
> + */
> +enum xe_sriov_sched_group_modes {
> + XE_SRIOV_SCHED_GROUPS_NONE = 0,
> + XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES,
hmm, I was assuming that enum NONE is just an alias for 0,
and that individual supported modes are defined as BITs,
XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES = BIT(0),
> + XE_SRIOV_SCHED_GROUPS_MODES_COUNT
> +};
> +
> +/**
> + * struct xe_gt_sriov_scheduler_groups - Scheduler groups policy info
> + * @max_num_of_groups: number of groups supported by the GuC for the platform
nit: just @max_groups ?
> + * @modes: array of masks and their number for each mode
@modes: array of defined scheduling group modes
> + * @modes.masks: array of masks for a given mode
hmm, 'mask' alone is not the best name here, maybe:
@modes.groups: array of engine instance groups in given mode,
or NULL if mode is not supported.
each group consists of a set of
GUC_MAX_ENGINE_CLASSES engine instance masks
> + * @modes.num_masks: number of masks in the array
@modes.num_groups: number of groups in given mode,
or zero if mode is not supported
> + */
> +struct xe_gt_sriov_scheduler_groups {
> + u8 max_num_of_groups;
> + struct {
> + u32 *masks;
> + u32 num_masks;
struct guc_sched_group *groups;
u32 num_groups;
> + } modes[XE_SRIOV_SCHED_GROUPS_MODES_COUNT];
> +};
> +
> /**
> * struct xe_gt_sriov_guc_policies - GuC SR-IOV policies.
> * @sched_if_idle: controls strict scheduling policy.
> * @reset_engine: controls engines reset on VF switch policy.
> * @sample_period: adverse events sampling period (in milliseconds).
> + * @sched_groups: available scheduling group configurations.
> */
> struct xe_gt_sriov_guc_policies {
> bool sched_if_idle;
> bool reset_engine;
> u32 sample_period;
> + struct xe_gt_sriov_scheduler_groups sched_groups;
> };
>
> /**
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 03/11] drm/xe/sriov: Add support for enabling scheduler groups
2025-12-06 23:03 ` [PATCH v2 03/11] drm/xe/sriov: Add support for enabling " Daniele Ceraolo Spurio
@ 2025-12-07 21:57 ` Michal Wajdeczko
2025-12-08 17:41 ` Daniele Ceraolo Spurio
0 siblings, 1 reply; 30+ messages in thread
From: Michal Wajdeczko @ 2025-12-07 21:57 UTC (permalink / raw)
To: Daniele Ceraolo Spurio, intel-xe
On 12/7/2025 12:03 AM, Daniele Ceraolo Spurio wrote:
> Scheduler groups are enabled by sending a specific policy configuration
> KLV to the GuC. We don't allow changing this policy if there are VFs
> active, since the expectation is that the VF will only check if the
> feature is enabled during driver initialization.
>
> The functions added by this patch will be used by sysfs/debugfs, coming
> in follow up patches.
>
> v2: code improvements, add GUC_MAX_SCHED_GROUPS define, don't add
> XE_SRIOV_SCHED_GROUPS_NONE to supported_modes (Michal)
>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> ---
> drivers/gpu/drm/xe/abi/guc_klvs_abi.h | 17 +++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c | 136 ++++++++++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h | 3 +
> .../gpu/drm/xe/xe_gt_sriov_pf_policy_types.h | 4 +
> drivers/gpu/drm/xe/xe_guc_fwif.h | 2 +
> drivers/gpu/drm/xe/xe_guc_klv_helpers.c | 2 +
> 6 files changed, 164 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
> index 265a135e7061..45733a87183a 100644
> --- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
> +++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
> @@ -200,6 +200,20 @@ enum {
> * :0: adverse events are not counted (default)
> * :n: sample period in milliseconds
> *
> + * _`GUC_KLV_VGT_POLICY_ENGINE_GROUP_CONFIG` : 0x8004
> + * This config allows the PF to split the engines across scheduling groups.
> + * Each group is independently timesliced across VFs, allowing different
> + * VFs to be active on the HW at the same time. When enabling this feature,
> + * all engines must be assigned to a group (and only one group), or they
> + * will be excluded from scheduling after this KLV is sent. To enable
> + * the groups, the driver must provide a masks array with
> + * GUC_MAX_ENGINE_CLASSES entries for each group, with each mask indicating
> + * which logical instances of that class belong to the group. Therefore,
> + * the length of this KLV when enabling groups is
> + * num_groups * GUC_MAX_ENGINE_CLASSES. To disable the groups, the driver
> + * must send the KLV without any payload (i.e. len = 0). The maximum
> + * number of groups is 8.
> + *
> * _`GUC_KLV_VGT_POLICY_RESET_AFTER_VF_SWITCH` : 0x8D00
> * This enum is to reset utilized HW engine after VF Switch (i.e to clean
> * up Stale HW register left behind by previous VF)
> @@ -214,6 +228,9 @@ enum {
> #define GUC_KLV_VGT_POLICY_ADVERSE_SAMPLE_PERIOD_KEY 0x8002
> #define GUC_KLV_VGT_POLICY_ADVERSE_SAMPLE_PERIOD_LEN 1u
>
> +#define GUC_KLV_VGT_POLICY_ENGINE_GROUP_CONFIG_KEY 0x8004
> +#define GUC_KLV_VGT_POLICY_ENGINE_GROUP_MAX_COUNT 8u
> +
> #define GUC_KLV_VGT_POLICY_RESET_AFTER_VF_SWITCH_KEY 0x8D00
> #define GUC_KLV_VGT_POLICY_RESET_AFTER_VF_SWITCH_LEN 1u
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> index 158d68aff4b7..1109fec99fc3 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> @@ -97,6 +97,23 @@ static int pf_push_policy_u32(struct xe_gt *gt, u16 key, u32 value)
> return pf_push_policy_klvs(gt, 1, klv, ARRAY_SIZE(klv));
> }
>
> +static int pf_push_policy_payload(struct xe_gt *gt, u16 key, u32 *payload, u32 num_dwords)
> +{
> + CLASS(xe_guc_buf, buf)(>->uc.guc.buf, GUC_KLV_LEN_MIN + num_dwords);
> + u32 *klv;
> +
> + if (!xe_guc_buf_is_valid(buf))
> + return -ENOBUFS;
> +
> + klv = xe_guc_buf_cpu_ptr(buf);
> +
> + klv[0] = PREP_GUC_KLV(key, num_dwords);
> + if (num_dwords)
> + memcpy(&klv[1], payload, num_dwords * sizeof(u32));
> +
> + return pf_push_policy_buf_klvs(gt, 1, buf, GUC_KLV_LEN_MIN + num_dwords);
> +}
> +
> static int pf_update_policy_bool(struct xe_gt *gt, u16 key, bool *policy, bool value)
> {
> int err;
> @@ -476,16 +493,134 @@ static void pf_init_sched_groups(struct xe_gt *gt)
>
> xe_gt_assert(gt, (num_masks % GUC_MAX_ENGINE_CLASSES) == 0);
>
please keep asserts together
> + xe_gt_assert(gt, num_masks / GUC_MAX_ENGINE_CLASSES < GUC_MAX_SCHED_GROUPS);
> +
> + if (num_masks)
> + gt->sriov.pf.policy.guc.sched_groups.supported_modes |= BIT(m);
> +
> gt->sriov.pf.policy.guc.sched_groups.modes[m].masks = masks;
> gt->sriov.pf.policy.guc.sched_groups.modes[m].num_masks = num_masks;
> }
> }
>
> +/**
> + * xe_sriov_gt_pf_policy_has_multi_group_modes() - check whether the GT supports
> + * any scheduler modes that have multiple groups
> + * @gt: the &xe_gt to check
> + *
> + * This function can only be called on PF.
> + *
> + * Return: true if the GT supports modes with multiple groups, false otherwise.
> + */
> +bool xe_sriov_gt_pf_policy_has_multi_group_modes(struct xe_gt *gt)
> +{
> + return gt->sriov.pf.policy.guc.sched_groups.supported_modes;
> +}
> +
> +/**
> + * xe_sriov_gt_pf_policy_has_sched_group_mode() - check whether the GT supports
> + * a specific scheduler group mode
> + * @gt: the &xe_gt to check
> + * @mode: the mode to check
> + *
> + * This function can only be called on PF.
> + *
> + * Return: true if the GT supports the specified mode, false otherwise.
> + */
> +bool xe_sriov_gt_pf_policy_has_sched_group_mode(struct xe_gt *gt, u32 mode)
> +{
> + if (mode == XE_SRIOV_SCHED_GROUPS_NONE)
> + return true;
> +
> + return gt->sriov.pf.policy.guc.sched_groups.supported_modes & BIT(mode);
> +}
> +
> +static int __pf_provision_sched_groups(struct xe_gt *gt, u32 mode)
> +{
> + u32 *masks = gt->sriov.pf.policy.guc.sched_groups.modes[mode].masks;
> + u32 num_masks = gt->sriov.pf.policy.guc.sched_groups.modes[mode].num_masks;
> +
> + xe_gt_assert(gt, (num_masks % GUC_MAX_ENGINE_CLASSES) == 0);
> +
> + return pf_push_policy_payload(gt, GUC_KLV_VGT_POLICY_ENGINE_GROUP_CONFIG_KEY,
> + masks, num_masks);
> +}
> +
> +static int pf_provision_sched_groups(struct xe_gt *gt, u32 mode)
> +{
> + int err;
> +
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> + lockdep_assert_held(xe_gt_sriov_pf_master_mutex(gt));
> +
> + if (!xe_sriov_gt_pf_policy_has_sched_group_mode(gt, mode))
> + return -EINVAL;
> +
> + /* already in the desired mode */
> + if (gt->sriov.pf.policy.guc.sched_groups.current_mode == mode)
> + return 0;
> +
> + /*
> + * We don't allow changing this with VFs active since it is hard for
> + * VFs to check.
> + */
> + if (xe_sriov_pf_num_vfs(gt_to_xe(gt)))
> + return -EBUSY;
> +
> + err = __pf_provision_sched_groups(gt, mode);
> + if (err)
> + return err;
> +
> + gt->sriov.pf.policy.guc.sched_groups.current_mode = mode;
> +
> + return 0;
> +}
> +
> +static int pf_reprovision_sched_groups(struct xe_gt *gt)
> +{
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> + lockdep_assert_held(xe_gt_sriov_pf_master_mutex(gt));
> +
> + /* We only have something to provision if we have possible groups */
> + if (!xe_sriov_gt_pf_policy_has_multi_group_modes(gt))
> + return 0;
> +
> + return __pf_provision_sched_groups(gt, gt->sriov.pf.policy.guc.sched_groups.current_mode);
> +}
> +
> +static void pf_sanitize_sched_groups(struct xe_gt *gt)
> +{
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> + lockdep_assert_held(xe_gt_sriov_pf_master_mutex(gt));
> +
> + gt->sriov.pf.policy.guc.sched_groups.current_mode = XE_SRIOV_SCHED_GROUPS_NONE;
> +}
> +
> +/**
> + * xe_gt_sriov_pf_policy_set_sched_groups_mode() - Control the 'sched_groups' policy.
> + * @gt: the &xe_gt where to apply the policy
> + * @value: the sched_group mode to be activated
> + *
> + * This function can only be called on PF.
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_policy_set_sched_groups_mode(struct xe_gt *gt,
> + enum xe_sriov_sched_group_modes value)
> +{
> + if (!xe_sriov_gt_pf_policy_has_multi_group_modes(gt))
> + return -ENODEV;
> +
> + guard(mutex)(xe_gt_sriov_pf_master_mutex(gt));
> + return pf_provision_sched_groups(gt, value);
> +}
> +
> static void pf_sanitize_guc_policies(struct xe_gt *gt)
> {
> pf_sanitize_sched_if_idle(gt);
> pf_sanitize_reset_engine(gt);
> pf_sanitize_sample_period(gt);
> + pf_sanitize_sched_groups(gt);
> }
>
> /**
> @@ -524,6 +659,7 @@ int xe_gt_sriov_pf_policy_reprovision(struct xe_gt *gt, bool reset)
> err |= pf_reprovision_sched_if_idle(gt);
> err |= pf_reprovision_reset_engine(gt);
> err |= pf_reprovision_sample_period(gt);
> + err |= pf_reprovision_sched_groups(gt);
> mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
>
> xe_pm_runtime_put(gt_to_xe(gt));
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> index 52312d24d527..6b3e294bc934 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> @@ -17,6 +17,9 @@ int xe_gt_sriov_pf_policy_set_reset_engine(struct xe_gt *gt, bool enable);
> bool xe_gt_sriov_pf_policy_get_reset_engine(struct xe_gt *gt);
> int xe_gt_sriov_pf_policy_set_sample_period(struct xe_gt *gt, u32 value);
> u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt);
> +bool xe_sriov_gt_pf_policy_has_multi_group_modes(struct xe_gt *gt);
> +bool xe_sriov_gt_pf_policy_has_sched_group_mode(struct xe_gt *gt, u32 mode);
> +int xe_gt_sriov_pf_policy_set_sched_groups_mode(struct xe_gt *gt, u32 value);
>
> void xe_gt_sriov_pf_policy_init(struct xe_gt *gt);
> void xe_gt_sriov_pf_policy_sanitize(struct xe_gt *gt);
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
> index 1d4cdc87e069..d9928c200d72 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
> @@ -23,12 +23,16 @@ enum xe_sriov_sched_group_modes {
> /**
> * struct xe_gt_sriov_scheduler_groups - Scheduler groups policy info
> * @max_num_of_groups: number of groups supported by the GuC for the platform
> + * @supported_modes: mask of supported modes
> + * @current_mode: active scheduler groups mode
> * @modes: array of masks and their number for each mode
> * @modes.masks: array of masks for a given mode
> * @modes.num_masks: number of masks in the array
> */
> struct xe_gt_sriov_scheduler_groups {
> u8 max_num_of_groups;
> + u32 supported_modes;
> + enum xe_sriov_sched_group_modes current_mode;
> struct {
> u32 *masks;
> u32 num_masks;
> diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
> index 7d93c2749485..c2e0a2dae586 100644
> --- a/drivers/gpu/drm/xe/xe_guc_fwif.h
> +++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
> @@ -46,6 +46,8 @@
> #define GUC_MAX_ENGINE_CLASSES 16
> #define GUC_MAX_INSTANCES_PER_CLASS 32
>
> +#define GUC_MAX_SCHED_GROUPS GUC_KLV_VGT_POLICY_ENGINE_GROUP_MAX_COUNT
actually my idea was to have here:
#define GUC_MAX_SCHED_GROUPS 8
and then in the klv abi header:
#define GUC_KLV_VGT_POLICY_ENGINE_GROUP_MAX_COUNT GUC_MAX_SCHED_GROUPS
as IMO the KLV definition follows FW capability, not the other way around
> +
> #define GUC_CONTEXT_NORMAL 0
> #define GUC_CONTEXT_COMPRESSION_SAVE 1
> #define GUC_CONTEXT_COMPRESSION_RESTORE 2
> diff --git a/drivers/gpu/drm/xe/xe_guc_klv_helpers.c b/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
> index 146a6eda9e06..1b08b443606e 100644
> --- a/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
> +++ b/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
> @@ -26,6 +26,8 @@ const char *xe_guc_klv_key_to_string(u16 key)
> return "sched_if_idle";
> case GUC_KLV_VGT_POLICY_ADVERSE_SAMPLE_PERIOD_KEY:
> return "sample_period";
> + case GUC_KLV_VGT_POLICY_ENGINE_GROUP_CONFIG_KEY:
> + return "engine_group_config";
> case GUC_KLV_VGT_POLICY_RESET_AFTER_VF_SWITCH_KEY:
> return "reset_engine";
> /* VF CFG keys */
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 04/11] drm/xe/sriov: Scheduler groups are incompatible with multi-lrc
2025-12-06 23:04 ` [PATCH v2 04/11] drm/xe/sriov: Scheduler groups are incompatible with multi-lrc Daniele Ceraolo Spurio
@ 2025-12-07 21:58 ` Michal Wajdeczko
2025-12-08 17:48 ` Daniele Ceraolo Spurio
0 siblings, 1 reply; 30+ messages in thread
From: Michal Wajdeczko @ 2025-12-07 21:58 UTC (permalink / raw)
To: Daniele Ceraolo Spurio, intel-xe
On 12/7/2025 12:04 AM, Daniele Ceraolo Spurio wrote:
> Since engines in the same class can be divided across multiple groups,
> the GuC does not allow scheduler groups to be active if there are
> multi-lrc contexts. This means that:
>
> 1) if a MLRC context is registered when we enable scheduler groups, the
> GuC will silently ignore the configuration
> 2) if a MLRC context is registered after scheduler groups are enabled,
> the GuC will disable the groups and generate an adverse event.
>
> The expectation is that the admin will ensure that all apps that use
> MLRC on PF have been terminated before scheduler groups are created. A
> check on PF is added anyway to make sure we don't still have contexts
> waiting to be cleaned up lying around.
> On both PF and VF we block creation of new MLRC queues once scheduler
> groups have been enabled.
>
> v2: move threshold handling to its own patch, move MLRC check to
> guc_submit.c, hide SRIOV interals from exec_queue creation code,
> better comments/docs (Michal)
>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> ---
> drivers/gpu/drm/xe/abi/guc_klvs_abi.h | 7 +++
> drivers/gpu/drm/xe/xe_exec_queue.c | 19 +++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 17 ++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf.h | 8 +++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c | 28 ++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h | 1 +
> drivers/gpu/drm/xe/xe_gt_sriov_vf.c | 60 ++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_vf.h | 1 +
> drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h | 2 +
> drivers/gpu/drm/xe/xe_guc_klv_helpers.c | 3 ++
> drivers/gpu/drm/xe/xe_guc_submit.c | 21 ++++++++
> drivers/gpu/drm/xe/xe_guc_submit.h | 2 +
> 12 files changed, 169 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
> index 45733a87183a..edb0546fb163 100644
> --- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
> +++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
> @@ -46,11 +46,18 @@
> * Refers to 32 bit architecture version as reported by the HW IP.
> * This key is supported on MTL+ platforms only.
> * Requires GuC ABI 1.2+.
> + *
> + * _`GUC_KLV_GLOBAL_CFG_GROUP_SCHEDULING_AVAILABLE` : 0x3001
> + * Tells the driver whether scheduler groups are enabled or not.
> + * Requires GuC ABI 1.26+
> */
>
> #define GUC_KLV_GLOBAL_CFG_GMD_ID_KEY 0x3000u
> #define GUC_KLV_GLOBAL_CFG_GMD_ID_LEN 1u
>
> +#define GUC_KLV_GLOBAL_CFG_GROUP_SCHEDULING_AVAILABLE_KEY 0x3001u
> +#define GUC_KLV_GLOBAL_CFG_GROUP_SCHEDULING_AVAILABLE_LEN 1u
> +
> /**
> * DOC: GuC Self Config KLVs
> *
> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
> index 226d07a3d852..df01c0664965 100644
> --- a/drivers/gpu/drm/xe/xe_exec_queue.c
> +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
> @@ -16,6 +16,7 @@
> #include "xe_dep_scheduler.h"
> #include "xe_device.h"
> #include "xe_gt.h"
> +#include "xe_gt_sriov_pf.h"
> #include "xe_gt_sriov_vf.h"
> #include "xe_hw_engine_class_sysfs.h"
> #include "xe_hw_engine_group.h"
> @@ -718,6 +719,17 @@ static u32 calc_validate_logical_mask(struct xe_device *xe,
> return return_mask;
> }
>
> +static bool has_sched_groups(struct xe_gt *gt)
> +{
> + if (IS_SRIOV_PF(gt_to_xe(gt)) && xe_gt_sriov_pf_sched_groups_enabled(gt))
> + return true;
> +
> + if (IS_SRIOV_VF(gt_to_xe(gt)) && xe_gt_sriov_vf_sched_groups_enabled(gt))
> + return true;
> +
> + return false;
> +}
> +
> int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
> struct drm_file *file)
> {
> @@ -810,6 +822,13 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
> return -ENOENT;
> }
>
> + /* SRIOV sched groups are not compatible with multi-lrc */
> + if (XE_IOCTL_DBG(xe, args->width > 1 && has_sched_groups(hwe->gt))) {
> + up_read(&vm->lock);
> + xe_vm_put(vm);
> + return -EINVAL;
> + }
> +
> q = xe_exec_queue_create(xe, vm, logical_mask,
> args->width, hwe, flags,
> args->extensions);
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> index 0d97a823e702..fb5c9101e275 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> @@ -284,3 +284,20 @@ int xe_gt_sriov_pf_wait_ready(struct xe_gt *gt)
> pf_flush_restart(gt);
> return 0;
> }
> +
> +/**
> + * xe_gt_sriov_pf_sched_groups_enabled - Check if multiple scheduler groups are
> + * enabled
> + * @gt: the &xe_gt
> + *
> + * This function is for PF use only.
> + *
> + * Return: true if sched groups are enabled, false otherwise.
> + */
> +bool xe_gt_sriov_pf_sched_groups_enabled(struct xe_gt *gt)
> +{
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> +
> + return xe_gt_sriov_pf_policy_sched_groups_enabled(gt);
> +}
> +
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
> index e7fde3f9937a..1ccfc7137b98 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
> @@ -6,6 +6,8 @@
> #ifndef _XE_GT_SRIOV_PF_H_
> #define _XE_GT_SRIOV_PF_H_
>
> +#include <linux/types.h>
> +
> struct xe_gt;
>
> #ifdef CONFIG_PCI_IOV
> @@ -16,6 +18,7 @@ void xe_gt_sriov_pf_init_hw(struct xe_gt *gt);
> void xe_gt_sriov_pf_sanitize_hw(struct xe_gt *gt, unsigned int vfid);
> void xe_gt_sriov_pf_stop_prepare(struct xe_gt *gt);
> void xe_gt_sriov_pf_restart(struct xe_gt *gt);
> +bool xe_gt_sriov_pf_sched_groups_enabled(struct xe_gt *gt);
> #else
> static inline int xe_gt_sriov_pf_init_early(struct xe_gt *gt)
> {
> @@ -38,6 +41,11 @@ static inline void xe_gt_sriov_pf_stop_prepare(struct xe_gt *gt)
> static inline void xe_gt_sriov_pf_restart(struct xe_gt *gt)
> {
> }
> +
> +static inline bool xe_gt_sriov_pf_sched_groups_enabled(struct xe_gt *gt)
> +{
> + return false;
> +}
> #endif
>
> #endif
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> index 1109fec99fc3..6a682d788b02 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> @@ -16,6 +16,7 @@
> #include "xe_guc_buf.h"
> #include "xe_guc_ct.h"
> #include "xe_guc_klv_helpers.h"
> +#include "xe_guc_submit.h"
> #include "xe_pm.h"
>
> /*
> @@ -567,6 +568,19 @@ static int pf_provision_sched_groups(struct xe_gt *gt, u32 mode)
> if (xe_sriov_pf_num_vfs(gt_to_xe(gt)))
> return -EBUSY;
>
> + /*
> + * The GuC silently ignores the setting if any MLRC contexts are
> + * registered. We expect the admin to make sure that all apps that use
> + * MLRC are terminated before scheduler groups are enabled, so this
> + * check is just to make sure that the exec_queue destruction has been
> + * completed.
> + */
> + if (mode != XE_SRIOV_SCHED_GROUPS_NONE &&
> + xe_guc_has_registered_mlrc_queues(>->uc.guc)) {
> + xe_gt_sriov_notice(gt, "can't enable sched groups with active mlrc queues\n");
s/mlrc/MLRC
> + return -EPERM;
> + }
> +
> err = __pf_provision_sched_groups(gt, mode);
> if (err)
> return err;
> @@ -615,6 +629,20 @@ int xe_gt_sriov_pf_policy_set_sched_groups_mode(struct xe_gt *gt,
> return pf_provision_sched_groups(gt, value);
> }
>
> +/**
> + * xe_gt_sriov_pf_policy_sched_groups_enabled() - check whether the GT has
> + * multiple scheduler groups enabled
> + * @gt: the &xe_gt to check
> + *
> + * This function can only be called on PF.
> + *
> + * Return: true if the GT has multiple groups enabled, false otherwise.
> + */
> +bool xe_gt_sriov_pf_policy_sched_groups_enabled(struct xe_gt *gt)
> +{
> + return gt->sriov.pf.policy.guc.sched_groups.current_mode != XE_SRIOV_SCHED_GROUPS_NONE;
> +}
> +
> static void pf_sanitize_guc_policies(struct xe_gt *gt)
> {
> pf_sanitize_sched_if_idle(gt);
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> index 6b3e294bc934..ceaf797ca21b 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> @@ -20,6 +20,7 @@ u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt);
> bool xe_sriov_gt_pf_policy_has_multi_group_modes(struct xe_gt *gt);
> bool xe_sriov_gt_pf_policy_has_sched_group_mode(struct xe_gt *gt, u32 mode);
> int xe_gt_sriov_pf_policy_set_sched_groups_mode(struct xe_gt *gt, u32 value);
> +bool xe_gt_sriov_pf_policy_sched_groups_enabled(struct xe_gt *gt);
>
> void xe_gt_sriov_pf_policy_init(struct xe_gt *gt);
> void xe_gt_sriov_pf_policy_sanitize(struct xe_gt *gt);
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
> index 97c29c55f885..48e11c1a2d08 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
> @@ -438,6 +438,30 @@ u32 xe_gt_sriov_vf_gmdid(struct xe_gt *gt)
> return value;
> }
>
> +static int query_vf_sched_groups(struct xe_gt *gt)
s/query_vf_sched_groups/vf_query_sched_groups
and keep it closer to vf_cache_sched_groups_status
> +{
> + struct xe_guc *guc = >->uc.guc;
> + u32 value = 0;
> + int err;
> +
> + xe_gt_assert(gt, IS_SRIOV_VF(gt_to_xe(gt)));
> +
> + if (MAKE_GUC_VER_STRUCT(gt->sriov.vf.guc_version) < MAKE_GUC_VER(1, 26, 0))
> + return 0;
nit: maybe we can split above 'check' code from rest of 'query' code?
and as we have more and more cases where a version check is needed, maybe it's also time to add a helper like:
bool vf_runs_on_guc(gt, MAKE_GUC_VER)
> +
> + err = guc_action_query_single_klv32(guc,
> + GUC_KLV_GLOBAL_CFG_GROUP_SCHEDULING_AVAILABLE_KEY,
> + &value);
> + if (unlikely(err)) {
> + xe_gt_sriov_err(gt, "Failed to obtain sched groups status (%pe)\n",
> + ERR_PTR(err));
> + return err;
> + }
> +
> + xe_gt_sriov_dbg(gt, "sched groups %s\n", str_enabled_disabled(value));
> + return value;
> +}
> +
> static int vf_get_ggtt_info(struct xe_gt *gt)
> {
> struct xe_tile *tile = gt_to_tile(gt);
> @@ -564,6 +588,21 @@ static void vf_cache_gmdid(struct xe_gt *gt)
> gt->sriov.vf.runtime.gmdid = xe_gt_sriov_vf_gmdid(gt);
> }
>
> +static int vf_cache_sched_groups_status(struct xe_gt *gt)
> +{
> + int ret;
> +
> + xe_gt_assert(gt, IS_SRIOV_VF(gt_to_xe(gt)));
> +
> + ret = query_vf_sched_groups(gt);
> + if (ret < 0)
> + return ret;
> +
> + gt->sriov.vf.runtime.uses_sched_groups = ret;
> +
> + return 0;
> +}
> +
> /**
> * xe_gt_sriov_vf_query_config - Query SR-IOV config data over MMIO.
> * @gt: the &xe_gt
> @@ -593,12 +632,33 @@ int xe_gt_sriov_vf_query_config(struct xe_gt *gt)
> if (unlikely(err))
> return err;
>
> + err = vf_cache_sched_groups_status(gt);
> + if (unlikely(err))
> + return err;
> +
> if (has_gmdid(xe))
> vf_cache_gmdid(gt);
>
> return 0;
> }
>
> +/**
> + * xe_gt_sriov_vf_sched_groups_enabled() - Check if PF has enabled multiple
> + * scheduler groups
> + * @gt: the &xe_gt
> + *
> + * This function is for VF use only.
> + *
> + * Return: true if sched groups are enabled, false otherwise.
> + */
> +bool xe_gt_sriov_vf_sched_groups_enabled(struct xe_gt *gt)
> +{
> + xe_gt_assert(gt, IS_SRIOV_VF(gt_to_xe(gt)));
> + xe_gt_assert(gt, gt->sriov.vf.guc_version.major);
> +
> + return gt->sriov.vf.runtime.uses_sched_groups;
> +}
> +
> /**
> * xe_gt_sriov_vf_guc_ids - VF GuC context IDs configuration.
> * @gt: the &xe_gt
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.h b/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
> index af40276790fa..7d97189c2d3d 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
> @@ -30,6 +30,7 @@ bool xe_gt_sriov_vf_recovery_pending(struct xe_gt *gt);
> u32 xe_gt_sriov_vf_gmdid(struct xe_gt *gt);
> u16 xe_gt_sriov_vf_guc_ids(struct xe_gt *gt);
> u64 xe_gt_sriov_vf_lmem(struct xe_gt *gt);
> +bool xe_gt_sriov_vf_sched_groups_enabled(struct xe_gt *gt);
>
> u32 xe_gt_sriov_vf_read32(struct xe_gt *gt, struct xe_reg reg);
> void xe_gt_sriov_vf_write32(struct xe_gt *gt, struct xe_reg reg, u32 val);
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h
> index 420b0e6089de..5267c097ecd0 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h
> @@ -27,6 +27,8 @@ struct xe_gt_sriov_vf_selfconfig {
> struct xe_gt_sriov_vf_runtime {
> /** @gmdid: cached value of the GDMID register. */
> u32 gmdid;
> + /** @uses_sched_groups: whether PF enabled sched groups or not. */
> + bool uses_sched_groups;
> /** @regs_size: size of runtime register array. */
> u32 regs_size;
> /** @num_regs: number of runtime registers in the array. */
> diff --git a/drivers/gpu/drm/xe/xe_guc_klv_helpers.c b/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
> index 1b08b443606e..dd504b77cb17 100644
> --- a/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
> +++ b/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
> @@ -21,6 +21,9 @@
> const char *xe_guc_klv_key_to_string(u16 key)
> {
> switch (key) {
> + /* GuC Global Config KLVs */
> + case GUC_KLV_GLOBAL_CFG_GROUP_SCHEDULING_AVAILABLE_KEY:
> + return "group_scheduling_available";
> /* VGT POLICY keys */
> case GUC_KLV_VGT_POLICY_SCHED_IF_IDLE_KEY:
> return "sched_if_idle";
> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
> index af43acf7baae..e8921219ac4e 100644
> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
> @@ -2985,6 +2985,27 @@ void xe_guc_submit_print(struct xe_guc *guc, struct drm_printer *p)
> mutex_unlock(&guc->submission_state.lock);
> }
>
> +/**
> + * xe_guc_has_registered_mlrc_queues - check whether there are any MLRC queues
> + * registered with the GuC
> + * @guc: GuC.
> + *
> + * Return: true if any MLRC queue is registered with the GuC, false otherwise.
> + */
> +bool xe_guc_has_registered_mlrc_queues(struct xe_guc *guc)
> +{
> + struct xe_exec_queue *q;
> + unsigned long index;
> +
> + guard(mutex)(&guc->submission_state.lock);
> +
> + xa_for_each(&guc->submission_state.exec_queue_lookup, index, q)
> + if (q->width > 1)
> + return true;
> +
> + return false;
> +}
> +
> /**
> * xe_guc_contexts_hwsp_rebase - Re-compute GGTT references within all
> * exec queues registered to given GuC.
> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.h b/drivers/gpu/drm/xe/xe_guc_submit.h
> index 100a7891b918..49e608500a4e 100644
> --- a/drivers/gpu/drm/xe/xe_guc_submit.h
> +++ b/drivers/gpu/drm/xe/xe_guc_submit.h
> @@ -49,6 +49,8 @@ xe_guc_exec_queue_snapshot_free(struct xe_guc_submit_exec_queue_snapshot *snapsh
> void xe_guc_submit_print(struct xe_guc *guc, struct drm_printer *p);
> void xe_guc_register_vf_exec_queue(struct xe_exec_queue *q, int ctx_type);
>
> +bool xe_guc_has_registered_mlrc_queues(struct xe_guc *guc);
> +
> int xe_guc_contexts_hwsp_rebase(struct xe_guc *guc, void *scratch);
>
> #endif
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 05/11] drm/xe/sriov: Add handling for MLRC adverse event threshold
2025-12-06 23:04 ` [PATCH v2 05/11] drm/xe/sriov: Add handling for MLRC adverse event threshold Daniele Ceraolo Spurio
@ 2025-12-07 22:03 ` Michal Wajdeczko
2025-12-08 17:52 ` Daniele Ceraolo Spurio
0 siblings, 1 reply; 30+ messages in thread
From: Michal Wajdeczko @ 2025-12-07 22:03 UTC (permalink / raw)
To: Daniele Ceraolo Spurio, intel-xe
On 12/7/2025 12:04 AM, Daniele Ceraolo Spurio wrote:
> Since it is illegal to register a MLRC context when scheduler groups are
> enabled, the GuC considers a VF doing so as an adverse event. Like for
> other adverse events, there is a threshold for how many times the event
> can happen before the GuC throws an error, which we need to add support
> for.
>
> Since this is the first threshold we have with a minimum GuC version
> requirement, support for checking that has been added to the generic
> threshold handling. As part of this, some of the version code has been
> moved to its own file and, while at it, some SRIOV documentation has
> been added.
>
> v2: split from previous patch, add GuC version checking
>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> ---
> drivers/gpu/drm/xe/abi/guc_klvs_abi.h | 9 +++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 19 ++++++----
> drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 9 +++--
> drivers/gpu/drm/xe/xe_guc.h | 7 +---
> .../drm/xe/xe_guc_klv_thresholds_set_types.h | 18 +++++-----
> drivers/gpu/drm/xe/xe_guc_version.h | 36 +++++++++++++++++++
> 6 files changed, 74 insertions(+), 24 deletions(-)
> create mode 100644 drivers/gpu/drm/xe/xe_guc_version.h
>
> diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
> index edb0546fb163..30a051a0b4ee 100644
> --- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
> +++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
> @@ -376,6 +376,12 @@ enum {
> * :1: NORMAL = schedule VF always, irrespective of whether it has work or not
> * :2: HIGH = schedule VF in the next time-slice after current active
> * time-slice completes if it has active work
> + *
> + * _`GUC_KLV_VF_CFG_THRESHOLD_MULTI_LRC_COUNT` : 0x8A0D
> + * Given that multi-LRC contexts are incompatible with SRIOV scheduler
> + * groups and cause the latter to be turned off when registered with the
> + * GuC, this config allows the PF to set a threshold for multi-LRC context
> + * registrations by VFs to monitor their behavior.
> */
>
> #define GUC_KLV_VF_CFG_GGTT_START_KEY 0x0001
> @@ -434,6 +440,9 @@ enum {
> #define GUC_SCHED_PRIORITY_NORMAL 1u
> #define GUC_SCHED_PRIORITY_HIGH 2u
>
> +#define GUC_KLV_VF_CFG_THRESHOLD_MULTI_LRC_COUNT_KEY 0x8a0d
> +#define GUC_KLV_VF_CFG_THRESHOLD_MULTI_LRC_COUNT_LEN 1u
> +
> /*
> * Workaround keys:
> */
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> index 59c5c6b4d994..dda671d05b89 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
> @@ -269,7 +269,8 @@ static u32 encode_config_ggtt(u32 *cfg, const struct xe_gt_sriov_config *config,
> }
>
> /* Return: number of configuration dwords written */
> -static u32 encode_config(u32 *cfg, const struct xe_gt_sriov_config *config, bool details)
> +static u32 encode_config(struct xe_gt *gt, u32 *cfg,
> + const struct xe_gt_sriov_config *config, bool details)
> {
> u32 n = 0;
>
> @@ -303,9 +304,11 @@ static u32 encode_config(u32 *cfg, const struct xe_gt_sriov_config *config, bool
> cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_PREEMPT_TIMEOUT);
> cfg[n++] = config->preempt_timeout;
>
> -#define encode_threshold_config(TAG, ...) ({ \
> - cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_THRESHOLD_##TAG); \
> - cfg[n++] = config->thresholds[MAKE_XE_GUC_KLV_THRESHOLD_INDEX(TAG)]; \
> +#define encode_threshold_config(TAG, NAME, MIN_GUC_VER) ({ \
> + if (!MIN_GUC_VER || GUC_FIRMWARE_VER(>->uc.guc) >= MIN_GUC_VER) { \
> + cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_THRESHOLD_##TAG); \
> + cfg[n++] = config->thresholds[MAKE_XE_GUC_KLV_THRESHOLD_INDEX(TAG)]; \
> + } \
> });
>
> MAKE_XE_GUC_KLV_THRESHOLDS_SET(encode_threshold_config);
> @@ -328,7 +331,7 @@ static int pf_push_full_vf_config(struct xe_gt *gt, unsigned int vfid)
> return -ENOBUFS;
>
> cfg = xe_guc_buf_cpu_ptr(buf);
> - num_dwords = encode_config(cfg, config, true);
> + num_dwords = encode_config(gt, cfg, config, true);
> xe_gt_assert(gt, num_dwords <= max_cfg_dwords);
>
> if (xe_gt_is_media_type(gt)) {
> @@ -2518,7 +2521,7 @@ ssize_t xe_gt_sriov_pf_config_save(struct xe_gt *gt, unsigned int vfid, void *bu
> ret = -ENOBUFS;
> } else {
> config = pf_pick_vf_config(gt, vfid);
> - ret = encode_config(buf, config, false) * sizeof(u32);
> + ret = encode_config(gt, buf, config, false) * sizeof(u32);
> }
> }
> mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
> @@ -2551,9 +2554,11 @@ static int pf_restore_vf_config_klv(struct xe_gt *gt, unsigned int vfid,
> return pf_provision_preempt_timeout(gt, vfid, value[0]);
>
> /* auto-generate case statements */
> -#define define_threshold_key_to_provision_case(TAG, ...) \
> +#define define_threshold_key_to_provision_case(TAG, NAME, MIN_GUC_VER) \
> case MAKE_GUC_KLV_VF_CFG_THRESHOLD_KEY(TAG): \
> BUILD_BUG_ON(MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG) != 1u); \
> + if (MIN_GUC_VER && GUC_FIRMWARE_VER(>->uc.guc) < MIN_GUC_VER) \
> + return -ENOKEY; \
> if (len != MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG)) \
> return -EBADMSG; \
> return pf_provision_threshold(gt, vfid, \
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
> index 0fd863609848..5123ff1fb116 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
> @@ -21,6 +21,7 @@
> #include "xe_gt_sriov_pf_monitor.h"
> #include "xe_gt_sriov_pf_policy.h"
> #include "xe_gt_sriov_pf_service.h"
> +#include "xe_guc.h"
> #include "xe_pm.h"
> #include "xe_sriov_pf.h"
> #include "xe_sriov_pf_provision.h"
> @@ -301,9 +302,11 @@ static void pf_add_config_attrs(struct xe_gt *gt, struct dentry *parent, unsigne
> &sched_priority_fops);
>
> /* register all threshold attributes */
> -#define register_threshold_attribute(TAG, NAME, ...) \
> - debugfs_create_file_unsafe("threshold_" #NAME, 0644, parent, parent, \
> - &NAME##_fops);
> +#define register_threshold_attribute(TAG, NAME, MIN_GUC_VER) ({ \
> + if (!MIN_GUC_VER || GUC_FIRMWARE_VER(>->uc.guc) >= MIN_GUC_VER) \
> + debugfs_create_file_unsafe("threshold_" #NAME, 0644, parent, parent, \
> + &NAME##_fops); \
> +});
> MAKE_XE_GUC_KLV_THRESHOLDS_SET(register_threshold_attribute)
> #undef register_threshold_attribute
> }
> diff --git a/drivers/gpu/drm/xe/xe_guc.h b/drivers/gpu/drm/xe/xe_guc.h
> index fdb08658d05a..9028718189ed 100644
> --- a/drivers/gpu/drm/xe/xe_guc.h
> +++ b/drivers/gpu/drm/xe/xe_guc.h
> @@ -8,15 +8,10 @@
>
> #include "xe_gt.h"
> #include "xe_guc_types.h"
> +#include "xe_guc_version.h"
> #include "xe_hw_engine_types.h"
> #include "xe_macros.h"
>
> -/*
> - * GuC version number components are defined to be only 8-bit size,
> - * so converting to a 32bit 8.8.8 integer allows simple (and safe)
> - * numerical comparisons.
> - */
> -#define MAKE_GUC_VER(maj, min, pat) (((maj) << 16) | ((min) << 8) | (pat))
> #define MAKE_GUC_VER_STRUCT(ver) MAKE_GUC_VER((ver).major, (ver).minor, (ver).patch)
I guess this macro can also be moved
> #define GUC_SUBMIT_VER(guc) \
> MAKE_GUC_VER_STRUCT((guc)->fw.versions.found[XE_UC_FW_VER_COMPATIBILITY])
> diff --git a/drivers/gpu/drm/xe/xe_guc_klv_thresholds_set_types.h b/drivers/gpu/drm/xe/xe_guc_klv_thresholds_set_types.h
> index 0a028c94756d..f7ed32244c6b 100644
> --- a/drivers/gpu/drm/xe/xe_guc_klv_thresholds_set_types.h
> +++ b/drivers/gpu/drm/xe/xe_guc_klv_thresholds_set_types.h
> @@ -7,6 +7,7 @@
> #define _XE_GUC_KLV_THRESHOLDS_SET_TYPES_H_
>
> #include "xe_args.h"
> +#include "xe_guc_version.h"
>
> /**
> * MAKE_XE_GUC_KLV_THRESHOLDS_SET - Generate various GuC thresholds definitions.
> @@ -23,15 +24,16 @@
> * with the &TAG, that corresponds to the GuC threshold KLV key name defined by
> * ABI and the associated &NAME, that may be used in code or debugfs/sysfs::
> *
> - * define(TAG, NAME)
> + * define(TAG, NAME, MIN_GUC_VER)
> */
> -#define MAKE_XE_GUC_KLV_THRESHOLDS_SET(define) \
> - define(CAT_ERR, cat_error_count) \
> - define(ENGINE_RESET, engine_reset_count) \
> - define(PAGE_FAULT, page_fault_count) \
> - define(H2G_STORM, guc_time_us) \
> - define(IRQ_STORM, irq_time_us) \
> - define(DOORBELL_STORM, doorbell_time_us) \
> +#define MAKE_XE_GUC_KLV_THRESHOLDS_SET(define) \
> + define(CAT_ERR, cat_error_count, 0) \
> + define(ENGINE_RESET, engine_reset_count, 0) \
> + define(PAGE_FAULT, page_fault_count, 0) \
> + define(H2G_STORM, guc_time_us, 0) \
> + define(IRQ_STORM, irq_time_us, 0) \
> + define(DOORBELL_STORM, doorbell_time_us, 0) \
> + define(MULTI_LRC_COUNT, multi_lrc_count, MAKE_GUC_VER(70, 53, 0)) \
> /* end */
>
> /**
> diff --git a/drivers/gpu/drm/xe/xe_guc_version.h b/drivers/gpu/drm/xe/xe_guc_version.h
> new file mode 100644
introduction of this new ver.h file is self-contained so maybe it should be in its own patch?
> index 000000000000..e6f80abd2f05
> --- /dev/null
> +++ b/drivers/gpu/drm/xe/xe_guc_version.h
> @@ -0,0 +1,36 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2025 Intel Corporation
> + */
> +
> +#ifndef _XE_GUC_VERSION_H_
> +#define _XE_GUC_VERSION_H_
> +
> +/*
this should be regular kernel-doc
> + * GuC version number components are defined to be only 8-bit size,
> + * so converting to a 32bit 8.8.8 integer allows simple (and safe)
> + * numerical comparisons.
> + */
> +#define MAKE_GUC_VER(maj, min, pat) (((maj) << 16) | ((min) << 8) | (pat))
> +
> +/**
> + * DOC: SRIOV-changes
DOC: SR-IOV Changes
> + *
> + * We record SRIOV-specific changes here as those need to be tracked carefully.
> + *
what about 1.23.0 (CCS) ?
> + * GuC 70.53.0 (VF interface 1.26.0):
> + *
> + * Added support for EGS. See:
probably we need extra line here to render correctly
> + * * GUC_KLV_VGT_POLICY_ENGINE_GROUP_CONFIG
> + * * GUC_KLV_VF_CFG_THRESHOLD_MULTI_LRC_COUNT
> + *
> + * GuC 70.54.0 (VF interface 1.27.0):
> + *
> + * Updated VF migration support. See RESFIX actions
maybe we can list those actions:
* VF2GUC_RESFIX_START
* VF2GUC_RESFIX_DONE
> + *
> + * GuC 70.55.1 (VF interface 1.28.1):
> + *
> + * Fixes for EGS.
> + */
> +
> +#endif
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 02/11] drm/xe/sriov: Initialize scheduler groups
2025-12-07 21:57 ` Michal Wajdeczko
@ 2025-12-08 17:36 ` Daniele Ceraolo Spurio
0 siblings, 0 replies; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-08 17:36 UTC (permalink / raw)
To: Michal Wajdeczko, intel-xe
On 12/7/2025 1:57 PM, Michal Wajdeczko wrote:
>
> On 12/7/2025 12:03 AM, Daniele Ceraolo Spurio wrote:
>> Scheduler groups (a.k.a. Engine Groups Scheduling, or EGS) is a GuC
>> feature that allows the driver to define groups of engines that are
>> independently scheduled across VFs, which allows different VFs to be
>> active on the HW at the same time on different groups. The feature is
>> available for BMG and newer HW starting from GuC 70.53.0, but some
>> required fixes were only added in GuC 70.55.1.
>>
>> This is intended for specific scenarios where the admin knows that the
>> VFs are not going to fully utilize the HW and therefore assigning all of
>> it to a single VF would lead to part of it being permanently idle.
>> We do not allow the admin to decide how to divide the engines across
>> groups, but we instead support specific configurations that are designed
>> for specific use-cases. During PF initialization we detect which
>> configurations are possible on a given GT and create the relevant
>> groups. Since the GuC expects a mask for each class in each group, that
>> is what we save when we init the configs.
>>
>> Right now we only have one use-case on the media GT. If the VFs are
>> running a frame render + encoding at a not-too-high resolution (e.g.
>> 1080@30fps) the render can produce frames faster than the video engine
>> can encode them, which means that the maximum number of parallel VFs is
>> limited by the VCS bandwidth. Since our products can have multiple VCS
>> engines, allowing multiple VFs to be active on the different VCS engines
>> at the same time allows us to run more parallel VFs on the same HW.
>> Given that engines in the same media slice share some resources (e.g.
>> SFC), we assign each media slice to a different scheduling group. We
>> refer to this configuration as "media_slices", given that each slice
>> gets its own group.
>>
>> Note that while the GuC interface supports a maximum of 8 groups, the
>> actual number of groups that can be enabled can be lower than that and
>> can be different on different devices. For now, all devices support up
>> to 2 groups.
>>
>> v2: Use asserts for coding errors, code cleanups, better docs (Michal),
>> limit groups to 2, limit to BMG and newer, bump required GuC to
>> 70.55.1.
>>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_gt.h | 6 +
>> drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 3 +
>> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c | 142 ++++++++++++++++++
>> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h | 1 +
>> .../gpu/drm/xe/xe_gt_sriov_pf_policy_types.h | 29 ++++
>> 5 files changed, 181 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
>> index 2eeeeeb6b912..0a34e862406e 100644
>> --- a/drivers/gpu/drm/xe/xe_gt.h
>> +++ b/drivers/gpu/drm/xe/xe_gt.h
>> @@ -29,6 +29,12 @@
>> #define CCS_INSTANCES(gt) __ENGINE_INSTANCES_MASK(gt, CCS)
>> #define GSCCS_INSTANCES(gt) __ENGINE_INSTANCES_MASK(gt, GSCCS)
>>
>> +/*
>> + * Each media slice has 1x VECS, so the max number of VECS instances gives us
>> + * the max number of slices that a GT can have.
>> + */
>> +#define MAX_MEDIA_SLICES hweight32(XE_HW_ENGINE_VECS_MASK)
> isn't xe_hw_engine_types.h file a better fit for this macro?
I thought having it with the _INSTANCES macros was better, because this
is extrapolated data the same way the instances lists are. Also slices
are not an engine-level concept, as they include stuff outside the
engine as well (e.g. SFC).
>
>> +
>> #define GT_VER(gt) ({ \
>> typeof(gt) gt_ = (gt); \
>> struct xe_device *xe = gt_to_xe(gt_); \
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
>> index 0714c758b9c1..0d97a823e702 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
>> @@ -14,6 +14,7 @@
>> #include "xe_gt_sriov_pf_control.h"
>> #include "xe_gt_sriov_pf_helpers.h"
>> #include "xe_gt_sriov_pf_migration.h"
>> +#include "xe_gt_sriov_pf_policy.h"
>> #include "xe_gt_sriov_pf_service.h"
>> #include "xe_gt_sriov_printk.h"
>> #include "xe_guc_submit.h"
>> @@ -123,6 +124,8 @@ int xe_gt_sriov_pf_init(struct xe_gt *gt)
>> if (err)
>> return err;
>>
>> + xe_gt_sriov_pf_policy_init(gt);
>> +
>> err = xe_gt_sriov_pf_migration_init(gt);
>> if (err)
>> return err;
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
>> index 4445f660e6d1..158d68aff4b7 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
>> @@ -3,6 +3,8 @@
>> * Copyright © 2023-2024 Intel Corporation
>> */
>>
>> +#include <drm/drm_managed.h>
>> +
>> #include "abi/guc_actions_sriov_abi.h"
>>
>> #include "xe_bo.h"
>> @@ -10,6 +12,7 @@
>> #include "xe_gt_sriov_pf_helpers.h"
>> #include "xe_gt_sriov_pf_policy.h"
>> #include "xe_gt_sriov_printk.h"
>> +#include "xe_guc.h"
>> #include "xe_guc_buf.h"
>> #include "xe_guc_ct.h"
>> #include "xe_guc_klv_helpers.h"
>> @@ -351,6 +354,133 @@ u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt)
>> return value;
>> }
>>
>> +static void pf_sched_group_media_slices(struct xe_gt *gt, u32 **masks, u32 *num_masks)
>> +{
>> + u8 slice_to_group[MAX_MEDIA_SLICES];
>> + u32 vecs_mask = VECS_INSTANCES(gt);
>> + u32 gsc_mask = GSCCS_INSTANCES(gt);
>> + u32 vcs_mask = VCS_INSTANCES(gt);
>> + struct xe_hw_engine *hwe;
>> + enum xe_hw_engine_id id;
>> + int groups = 0;
>> + u32 *values;
>> + int slice;
>> +
>> + xe_gt_assert(gt, xe_gt_is_media_type(gt));
>> +
>> + /* A media slice has 2 VCS and a VECS. We bundle the GSC with the first slice */
>> + for (slice = 0; slice < MAX_MEDIA_SLICES; slice++) {
>> + if ((vcs_mask & 0x3) || (vecs_mask & 0x1) || (gsc_mask & 0x1))
>> + slice_to_group[slice] = groups++;
>> +
>> + vcs_mask >>= 2;
>> + vecs_mask >>= 1;
>> + gsc_mask >>= 1;
>> + }
>> +
>> + xe_gt_assert(gt, !vcs_mask);
>> + xe_gt_assert(gt, !vecs_mask);
>> + xe_gt_assert(gt, !gsc_mask);
>> +
>> + /* We need at least 2 slices to split them up */
>> + if (groups < 2)
>> + return;
>> +
>> + if (groups > gt->sriov.pf.policy.guc.sched_groups.max_num_of_groups) {
> should we care about this limitation here in this function?
>
> we will allocate groups here as required, only GuC might not correctly handle that
> but that's mainly a GuC's problem and we may just report that during sending provisioning to GuC
I want to avoid exposing the mode entirely if there are too many groups,
not give the admin the impression we support it and then fail when they
try to turn it on.
>
>> + xe_gt_sriov_notice(gt, "media_slice mode has too many groups: %u vs %u\n",
>> + groups,
>> + gt->sriov.pf.policy.guc.sched_groups.max_num_of_groups);
>> + return;
>> + }
>> +
>> + /*
>> + * The GuC expects an array with GUC_MAX_ENGINE_CLASSES entries for
>> + * each group.
>> + */
>> + values = drmm_kzalloc(>_to_xe(gt)->drm,
> drmm_kcalloc ?
ok
>
>> + GUC_MAX_ENGINE_CLASSES * groups * sizeof(u32),
>> + GFP_KERNEL);
>> + if (!values)
>> + return;
>> +
>> + for_each_hw_engine(hwe, gt, id) {
>> + u8 guc_class = xe_engine_class_to_guc_class(hwe->class);
>> + u8 entry;
>> +
>> + switch (hwe->class) {
>> + case XE_ENGINE_CLASS_VIDEO_DECODE:
>> + slice = hwe->instance / 2;
>> + break;
>> + case XE_ENGINE_CLASS_VIDEO_ENHANCE:
>> + slice = hwe->instance;
>> + break;
>> + case XE_ENGINE_CLASS_OTHER:
>> + slice = 0;
>> + break;
>> + default:
>> + xe_gt_assert_msg(gt, false,
>> + "unknown media gt class %u (%s) during EGS setup\n",
>> + hwe->class, hwe->name);
>> + drmm_kfree(&gt_to_xe(gt)->drm, values);
> hmm, do we really need to abort here?
> maybe assert and then still assign this unk engine to slice 0 (like fallback to class_other)
ok
>
>> + return;
>> + }
>> +
>> + entry = slice_to_group[slice] * GUC_MAX_ENGINE_CLASSES + guc_class;
>> + values[entry] |= BIT(hwe->logical_instance);
>> + }
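The per-class slice lookup in that switch reduces to the following standalone sketch (the class names are illustrative stand-ins for the xe engine-class enums):

```c
#include <assert.h>

enum engine_class { CLASS_VCS, CLASS_VECS, CLASS_OTHER };

/*
 * Two VCS per media slice, one VECS per slice, and the GSC (CLASS_OTHER)
 * bundled with slice 0 - matching the quoted switch statement, with
 * unknown classes falling back to slice 0 as suggested in review.
 */
static unsigned int engine_to_slice(enum engine_class class, unsigned int instance)
{
	switch (class) {
	case CLASS_VCS:
		return instance / 2;
	case CLASS_VECS:
		return instance;
	default:	/* CLASS_OTHER and anything unexpected */
		return 0;
	}
}
```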
>> +
>> + *masks = values;
>> + *num_masks = GUC_MAX_ENGINE_CLASSES * groups;
> IMO we should store number of groups only
>
> we know that each group is just set of GUC_MAX_ENGINE_CLASSES instances
It's the same for me, I can flip it
>
> maybe even we should have below struct defined somewhere:
>
> struct guc_sched_group {
> u32 engines[GUC_MAX_ENGINE_CLASSES];
> } __packed;
I'll see if it makes things clearer.
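The two layouts being weighed are equivalent; a minimal sketch of both (GUC_MAX_ENGINE_CLASSES taken from the quoted fwif header, the struct variant being Michal's proposal rather than existing code):

```c
#include <assert.h>
#include <stdint.h>

#define GUC_MAX_ENGINE_CLASSES 16

/* Flat layout: GUC_MAX_ENGINE_CLASSES dwords per group, one per class. */
static void flat_set(uint32_t *values, unsigned int group, unsigned int guc_class,
		     unsigned int logical_instance)
{
	values[group * GUC_MAX_ENGINE_CLASSES + guc_class] |= 1u << logical_instance;
}

/* Equivalent struct-based layout, as suggested in review. */
struct guc_sched_group {
	uint32_t engines[GUC_MAX_ENGINE_CLASSES];
};

static void group_set(struct guc_sched_group *groups, unsigned int group,
		      unsigned int guc_class, unsigned int logical_instance)
{
	groups[group].engines[guc_class] |= 1u << logical_instance;
}
```

Both produce the same memory contents, so the choice is purely about readability of the indexing.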
>
>> +}
>> +
>> +static void pf_init_sched_groups(struct xe_gt *gt)
>> +{
>> + int m;
>> +
>> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
>> +
>> + /*
>> + * The GuC supports scheduler groups from v70.53.0, but a fix for it has
>> + * been merged in v70.55.1, so we require the latter. The feature is
>> + * also only enabled on BMG and newer FW.
>> + */
>> + if (GUC_FIRMWARE_VER(&gt->uc.guc) < MAKE_GUC_VER(70, 55, 1) ||
>> + gt_to_xe(gt)->info.platform < XE_BATTLEMAGE)
>> + return;
>> +
>> + /*
>> + * The GuC interface supports up to 8 groups. However, the GuC only
>> + * fully allocates resources for a subset of groups, based on the number
>> + * of engines and expected usage. The plan is for this to become
>> + * queryable via H2G, but for now GuC FW for all devices supports a
>> + * maximum of 2 groups so we can just hardcode that.
>> + */
>> + gt->sriov.pf.policy.guc.sched_groups.max_num_of_groups = 2;
> maybe this limitation should be introduced in next patch 3/11 where you are
> actually trying to provision sched_groups within GuC?
ok
>
>> +
>> + for (m = 0; m < XE_SRIOV_SCHED_GROUPS_MODES_COUNT; m++) {
>> + u32 *masks = NULL;
>> + u32 num_masks = 0;
> maybe initialize them as pointers instead:
>
> u32 *num_masks = &gt->sriov.pf.policy.guc.sched_groups.modes[m].num_masks;
> u32 **masks = &gt->sriov.pf.policy.guc.sched_groups.modes[m].masks;
>
>> +
>> + switch (m) {
>> + case XE_SRIOV_SCHED_GROUPS_NONE:
> I'm still not convinced that we need to waste an array index for the NONE mode
>
> can't we just loop over known modes (for now its MEDIA slices only)
> and if NONE is selected just implicitly assume .num_masks is zero?
>
> then array will hold only potentially valid group definitions
As mentioned in the previous rev, having the NONE as part of the array
makes things simpler in the interfaces, because we don't have to special
case it, I can just map the "disabled" case directly to
XE_SRIOV_SCHED_GROUPS_NONE and it all just works.
Also, I thought we agreed to go with this approach for now and rework it
later if necessary?
>
>> + break;
>> + case XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES:
>> + /* this mode only has groups on the media GT */
>> + if (xe_gt_is_media_type(gt))
>> + pf_sched_group_media_slices(gt, &masks, &num_masks);
>> + break;
>> + default:
>> + xe_gt_assert_msg(gt, false, "unknown sched group mode %u\n", m);
>> + return;
>> + }
>> +
>> + xe_gt_assert(gt, (num_masks % GUC_MAX_ENGINE_CLASSES) == 0);
>> +
>> + gt->sriov.pf.policy.guc.sched_groups.modes[m].masks = masks;
>> + gt->sriov.pf.policy.guc.sched_groups.modes[m].num_masks = num_masks;
>> + }
>> +}
>> +
>> static void pf_sanitize_guc_policies(struct xe_gt *gt)
>> {
>> pf_sanitize_sched_if_idle(gt);
>> @@ -401,6 +531,18 @@ int xe_gt_sriov_pf_policy_reprovision(struct xe_gt *gt, bool reset)
>> return err ? -ENXIO : 0;
>> }
>>
>> +/**
>> + * xe_gt_sriov_pf_policy_init - Initializes the SW state of the PF policies.
> * xe_gt_sriov_pf_policy_init() - Initializes ...
>
>> + * @gt: the &xe_gt
>> + *
>> + * This function can only be called on PF. This function does not touch the HW,
>> + * but must be called after the engines have been initialized.
>> + */
>> +void xe_gt_sriov_pf_policy_init(struct xe_gt *gt)
>> +{
>> + pf_init_sched_groups(gt);
>> +}
>> +
>> static void print_guc_policies(struct drm_printer *p, struct xe_gt_sriov_guc_policies *policy)
>> {
>> drm_printf(p, "%s:\t%s\n",
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
>> index 2a5dc33dc6d7..52312d24d527 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
>> @@ -18,6 +18,7 @@ bool xe_gt_sriov_pf_policy_get_reset_engine(struct xe_gt *gt);
>> int xe_gt_sriov_pf_policy_set_sample_period(struct xe_gt *gt, u32 value);
>> u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt);
>>
>> +void xe_gt_sriov_pf_policy_init(struct xe_gt *gt);
>> void xe_gt_sriov_pf_policy_sanitize(struct xe_gt *gt);
>> int xe_gt_sriov_pf_policy_reprovision(struct xe_gt *gt, bool reset);
>> int xe_gt_sriov_pf_policy_print(struct xe_gt *gt, struct drm_printer *p);
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
>> index 4de532af135e..1d4cdc87e069 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
>> @@ -8,16 +8,45 @@
>>
>> #include <linux/types.h>
>>
>> +/**
>> + * enum xe_sriov_sched_group_modes - list of possible scheduler group modes
>> + * @XE_SRIOV_SCHED_GROUPS_NONE: no separate groups (i.e., all engines in group 0)
>> + * @XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES: separate groups for each media slice
>> + * @XE_SRIOV_SCHED_GROUPS_MODES_COUNT: number of valid modes
>> + */
>> +enum xe_sriov_sched_group_modes {
>> + XE_SRIOV_SCHED_GROUPS_NONE = 0,
>> + XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES,
> hmm, I was assuming that enum NONE is just an alias for 0,
> and that individual supported modes are defined as BITs,
>
> XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES = BIT(0),
That only makes sense if you don't want NONE to be a valid recorded
group, because otherwise it just makes looping through the modes more
complicated.
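With the sequential-enum approach, NONE occupies slot 0 of the per-mode array and support for the other modes lives in a separate bitmask (as patch 3 does with supported_modes). A standalone sketch of that check, with names mirroring but not copied from the patch:

```c
#include <assert.h>
#include <stdint.h>

enum sched_group_modes {
	SCHED_GROUPS_NONE = 0,
	SCHED_GROUPS_MEDIA_SLICES,
	SCHED_GROUPS_MODES_COUNT
};

/*
 * NONE is always a valid mode; everything else must be flagged in the
 * supported-modes bitmask, indexed by the sequential enum value.
 */
static int mode_supported(uint32_t supported_modes, unsigned int mode)
{
	if (mode == SCHED_GROUPS_NONE)
		return 1;

	return !!(supported_modes & (1u << mode));
}
```

This keeps looping over all modes trivial (`for (m = 0; m < SCHED_GROUPS_MODES_COUNT; m++)`) while still excluding NONE from the supported-modes mask.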
>
>> + XE_SRIOV_SCHED_GROUPS_MODES_COUNT
>> +};
>> +
>> +/**
>> + * struct xe_gt_sriov_scheduler_groups - Scheduler groups policy info
>> + * @max_num_of_groups: number of groups supported by the GuC for the platform
> nit: just @max_groups ?
>
>> + * @modes: array of masks and their number for each mode
> @modes: array of defined scheduling group modes
>
>> + * @modes.masks: array of masks for a given mode
> hmm, 'mask' alone is not the best name here, maybe:
>
> @modes.groups: array of engine instance groups in given mode,
> or NULL if mode is not supported.
> each group consists of set of
> GUC_MAX_ENGINE_CLASSES of engine instances mask
>
>> + * @modes.num_masks: number of masks in the array
> @modes.num_groups: number of groups in given mode,
> or zero if mode is not supported
>
>> + */
>> +struct xe_gt_sriov_scheduler_groups {
>> + u8 max_num_of_groups;
>> + struct {
>> + u32 *masks;
>> + u32 num_masks;
> struct guc_sched_group *groups;
> u32 num_groups;
ok to those style changes.
Daniele
>
>> + } modes[XE_SRIOV_SCHED_GROUPS_MODES_COUNT];
>> +};
>> +
>> /**
>> * struct xe_gt_sriov_guc_policies - GuC SR-IOV policies.
>> * @sched_if_idle: controls strict scheduling policy.
>> * @reset_engine: controls engines reset on VF switch policy.
>> * @sample_period: adverse events sampling period (in milliseconds).
>> + * @sched_groups: available scheduling group configurations.
>> */
>> struct xe_gt_sriov_guc_policies {
>> bool sched_if_idle;
>> bool reset_engine;
>> u32 sample_period;
>> + struct xe_gt_sriov_scheduler_groups sched_groups;
>> };
>>
>> /**
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 03/11] drm/xe/sriov: Add support for enabling scheduler groups
2025-12-07 21:57 ` Michal Wajdeczko
@ 2025-12-08 17:41 ` Daniele Ceraolo Spurio
0 siblings, 0 replies; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-08 17:41 UTC (permalink / raw)
To: Michal Wajdeczko, intel-xe
On 12/7/2025 1:57 PM, Michal Wajdeczko wrote:
>
> On 12/7/2025 12:03 AM, Daniele Ceraolo Spurio wrote:
>> Scheduler groups are enabled by sending a specific policy configuration
>> KLV to the GuC. We don't allow changing this policy if there are VF
>> active, since the expectation is that the VF will only check if the
>> feature is enabled during driver initialization.
>>
>> The functions added by this patch will be used by sysfs/debugfs, coming
>> in follow up patches.
>>
>> v2: code improvements, add GUC_MAX_SCHED_GROUPS define, don't add
>> XE_SRIOV_SCHED_GROUPS_NONE to supported_modes (Michal)
>>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> ---
>> drivers/gpu/drm/xe/abi/guc_klvs_abi.h | 17 +++
>> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c | 136 ++++++++++++++++++
>> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h | 3 +
>> .../gpu/drm/xe/xe_gt_sriov_pf_policy_types.h | 4 +
>> drivers/gpu/drm/xe/xe_guc_fwif.h | 2 +
>> drivers/gpu/drm/xe/xe_guc_klv_helpers.c | 2 +
>> 6 files changed, 164 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
>> index 265a135e7061..45733a87183a 100644
>> --- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
>> +++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
>> @@ -200,6 +200,20 @@ enum {
>> * :0: adverse events are not counted (default)
>> * :n: sample period in milliseconds
>> *
>> + * _`GUC_KLV_VGT_POLICY_ENGINE_GROUP_CONFIG` : 0x8004
>> + * This config allows the PF to split the engines across scheduling groups.
>> + * Each group is independently timesliced across VFs, allowing different
>> + * VFs to be active on the HW at the same time. When enabling this feature,
>> + * all engines must be assigned to a group (and only one group), or they
>> + * will be excluded from scheduling after this KLV is sent. To enable
>> + * the groups, the driver must provide a masks array with
>> + * GUC_MAX_ENGINE_CLASSES entries for each group, with each mask indicating
>> + * which logical instances of that class belong to the group. Therefore,
>> + * the length of this KLV when enabling groups is
>> + * num_groups * GUC_MAX_ENGINE_CLASSES. To disable the groups, the driver
>> + * must send the KLV without any payload (i.e. len = 0). The maximum
>> + * number of groups is 8.
>> + *
>> * _`GUC_KLV_VGT_POLICY_RESET_AFTER_VF_SWITCH` : 0x8D00
>> * This enum is to reset utilized HW engine after VF Switch (i.e to clean
>> * up Stale HW register left behind by previous VF)
>> @@ -214,6 +228,9 @@ enum {
>> #define GUC_KLV_VGT_POLICY_ADVERSE_SAMPLE_PERIOD_KEY 0x8002
>> #define GUC_KLV_VGT_POLICY_ADVERSE_SAMPLE_PERIOD_LEN 1u
>>
>> +#define GUC_KLV_VGT_POLICY_ENGINE_GROUP_CONFIG_KEY 0x8004
>> +#define GUC_KLV_VGT_POLICY_ENGINE_GROUP_MAX_COUNT 8u
>> +
>> #define GUC_KLV_VGT_POLICY_RESET_AFTER_VF_SWITCH_KEY 0x8D00
>> #define GUC_KLV_VGT_POLICY_RESET_AFTER_VF_SWITCH_LEN 1u
>>
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
>> index 158d68aff4b7..1109fec99fc3 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
>> @@ -97,6 +97,23 @@ static int pf_push_policy_u32(struct xe_gt *gt, u16 key, u32 value)
>> return pf_push_policy_klvs(gt, 1, klv, ARRAY_SIZE(klv));
>> }
>>
>> +static int pf_push_policy_payload(struct xe_gt *gt, u16 key, u32 *payload, u32 num_dwords)
>> +{
>> + CLASS(xe_guc_buf, buf)(&gt->uc.guc.buf, GUC_KLV_LEN_MIN + num_dwords);
>> + u32 *klv;
>> +
>> + if (!xe_guc_buf_is_valid(buf))
>> + return -ENOBUFS;
>> +
>> + klv = xe_guc_buf_cpu_ptr(buf);
>> +
>> + klv[0] = PREP_GUC_KLV(key, num_dwords);
>> + if (num_dwords)
>> + memcpy(&klv[1], payload, num_dwords * sizeof(u32));
>> +
>> + return pf_push_policy_buf_klvs(gt, 1, buf, GUC_KLV_LEN_MIN + num_dwords);
>> +}
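For reference, the KLV framing used above can be sketched standalone (the 16-bit key / 16-bit dword-length split matches the documented GuC KLV layout; PREP_GUC_KLV here is a simplified stand-in for the driver macro):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in: key in the upper 16 bits, dword length in the lower 16. */
#define PREP_GUC_KLV(key, len)	(((uint32_t)(key) << 16) | ((uint32_t)(len) & 0xffff))

/*
 * Build a KLV into 'out': one header dword followed by 'num_dwords' of
 * payload. A zero-length payload disables the policy, as the quoted KLV
 * documentation describes for the engine-group config.
 */
static size_t build_klv(uint32_t *out, uint16_t key,
			const uint32_t *payload, uint32_t num_dwords)
{
	out[0] = PREP_GUC_KLV(key, num_dwords);
	if (num_dwords)
		memcpy(&out[1], payload, num_dwords * sizeof(uint32_t));

	return 1 + num_dwords;	/* total dwords written */
}
```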
>> +
>> static int pf_update_policy_bool(struct xe_gt *gt, u16 key, bool *policy, bool value)
>> {
>> int err;
>> @@ -476,16 +493,134 @@ static void pf_init_sched_groups(struct xe_gt *gt)
>>
>> xe_gt_assert(gt, (num_masks % GUC_MAX_ENGINE_CLASSES) == 0);
>>
> please keep asserts together
ok
>
>> + xe_gt_assert(gt, num_masks / GUC_MAX_ENGINE_CLASSES < GUC_MAX_SCHED_GROUPS);
>> +
>> + if (num_masks)
>> + gt->sriov.pf.policy.guc.sched_groups.supported_modes |= BIT(m);
>> +
>> gt->sriov.pf.policy.guc.sched_groups.modes[m].masks = masks;
>> gt->sriov.pf.policy.guc.sched_groups.modes[m].num_masks = num_masks;
>> }
>> }
>>
>> +/**
>> + * xe_sriov_gt_pf_policy_has_multi_group_modes() - check whether the GT supports
>> + * any scheduler modes that have multiple groups
>> + * @gt: the &xe_gt to check
>> + *
>> + * This function can only be called on PF.
>> + *
>> + * Return: true if the GT supports modes with multiple groups, false otherwise.
>> + */
>> +bool xe_sriov_gt_pf_policy_has_multi_group_modes(struct xe_gt *gt)
>> +{
>> + return gt->sriov.pf.policy.guc.sched_groups.supported_modes;
>> +}
>> +
>> +/**
>> + * xe_sriov_gt_pf_policy_has_sched_group_mode() - check whether the GT supports
>> + * a specific scheduler group mode
>> + * @gt: the &xe_gt to check
>> + * @mode: the mode to check
>> + *
>> + * This function can only be called on PF.
>> + *
>> + * Return: true if the GT supports the specified mode, false otherwise.
>> + */
>> +bool xe_sriov_gt_pf_policy_has_sched_group_mode(struct xe_gt *gt, u32 mode)
>> +{
>> + if (mode == XE_SRIOV_SCHED_GROUPS_NONE)
>> + return true;
>> +
>> + return gt->sriov.pf.policy.guc.sched_groups.supported_modes & BIT(mode);
>> +}
>> +
>> +static int __pf_provision_sched_groups(struct xe_gt *gt, u32 mode)
>> +{
>> + u32 *masks = gt->sriov.pf.policy.guc.sched_groups.modes[mode].masks;
>> + u32 num_masks = gt->sriov.pf.policy.guc.sched_groups.modes[mode].num_masks;
>> +
>> + xe_gt_assert(gt, (num_masks % GUC_MAX_ENGINE_CLASSES) == 0);
>> +
>> + return pf_push_policy_payload(gt, GUC_KLV_VGT_POLICY_ENGINE_GROUP_CONFIG_KEY,
>> + masks, num_masks);
>> +}
>> +
>> +static int pf_provision_sched_groups(struct xe_gt *gt, u32 mode)
>> +{
>> + int err;
>> +
>> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
>> + lockdep_assert_held(xe_gt_sriov_pf_master_mutex(gt));
>> +
>> + if (!xe_sriov_gt_pf_policy_has_sched_group_mode(gt, mode))
>> + return -EINVAL;
>> +
>> + /* already in the desired mode */
>> + if (gt->sriov.pf.policy.guc.sched_groups.current_mode == mode)
>> + return 0;
>> +
>> + /*
>> + * We don't allow changing this with VFs active since it is hard for
>> + * VFs to check.
>> + */
>> + if (xe_sriov_pf_num_vfs(gt_to_xe(gt)))
>> + return -EBUSY;
>> +
>> + err = __pf_provision_sched_groups(gt, mode);
>> + if (err)
>> + return err;
>> +
>> + gt->sriov.pf.policy.guc.sched_groups.current_mode = mode;
>> +
>> + return 0;
>> +}
>> +
>> +static int pf_reprovision_sched_groups(struct xe_gt *gt)
>> +{
>> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
>> + lockdep_assert_held(xe_gt_sriov_pf_master_mutex(gt));
>> +
>> + /* We only have something to provision if we have possible groups */
>> + if (!xe_sriov_gt_pf_policy_has_multi_group_modes(gt))
>> + return 0;
>> +
>> + return __pf_provision_sched_groups(gt, gt->sriov.pf.policy.guc.sched_groups.current_mode);
>> +}
>> +
>> +static void pf_sanitize_sched_groups(struct xe_gt *gt)
>> +{
>> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
>> + lockdep_assert_held(xe_gt_sriov_pf_master_mutex(gt));
>> +
>> + gt->sriov.pf.policy.guc.sched_groups.current_mode = XE_SRIOV_SCHED_GROUPS_NONE;
>> +}
>> +
>> +/**
>> + * xe_gt_sriov_pf_policy_set_sched_groups_mode() - Control the 'sched_groups' policy.
>> + * @gt: the &xe_gt where to apply the policy
>> + * @value: the sched_group mode to be activated
>> + *
>> + * This function can only be called on PF.
>> + *
>> + * Return: 0 on success or a negative error code on failure.
>> + */
>> +int xe_gt_sriov_pf_policy_set_sched_groups_mode(struct xe_gt *gt,
>> + enum xe_sriov_sched_group_modes value)
>> +{
>> + if (!xe_sriov_gt_pf_policy_has_multi_group_modes(gt))
>> + return -ENODEV;
>> +
>> + guard(mutex)(xe_gt_sriov_pf_master_mutex(gt));
>> + return pf_provision_sched_groups(gt, value);
>> +}
>> +
>> static void pf_sanitize_guc_policies(struct xe_gt *gt)
>> {
>> pf_sanitize_sched_if_idle(gt);
>> pf_sanitize_reset_engine(gt);
>> pf_sanitize_sample_period(gt);
>> + pf_sanitize_sched_groups(gt);
>> }
>>
>> /**
>> @@ -524,6 +659,7 @@ int xe_gt_sriov_pf_policy_reprovision(struct xe_gt *gt, bool reset)
>> err |= pf_reprovision_sched_if_idle(gt);
>> err |= pf_reprovision_reset_engine(gt);
>> err |= pf_reprovision_sample_period(gt);
>> + err |= pf_reprovision_sched_groups(gt);
>> mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
>>
>> xe_pm_runtime_put(gt_to_xe(gt));
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
>> index 52312d24d527..6b3e294bc934 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
>> @@ -17,6 +17,9 @@ int xe_gt_sriov_pf_policy_set_reset_engine(struct xe_gt *gt, bool enable);
>> bool xe_gt_sriov_pf_policy_get_reset_engine(struct xe_gt *gt);
>> int xe_gt_sriov_pf_policy_set_sample_period(struct xe_gt *gt, u32 value);
>> u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt);
>> +bool xe_sriov_gt_pf_policy_has_multi_group_modes(struct xe_gt *gt);
>> +bool xe_sriov_gt_pf_policy_has_sched_group_mode(struct xe_gt *gt, u32 mode);
>> +int xe_gt_sriov_pf_policy_set_sched_groups_mode(struct xe_gt *gt, u32 value);
>>
>> void xe_gt_sriov_pf_policy_init(struct xe_gt *gt);
>> void xe_gt_sriov_pf_policy_sanitize(struct xe_gt *gt);
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
>> index 1d4cdc87e069..d9928c200d72 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
>> @@ -23,12 +23,16 @@ enum xe_sriov_sched_group_modes {
>> /**
>> * struct xe_gt_sriov_scheduler_groups - Scheduler groups policy info
>> * @max_num_of_groups: number of groups supported by the GuC for the platform
>> + * @supported_modes: mask of supported modes
>> + * @current_mode: active scheduler groups mode
>> * @modes: array of masks and their number for each mode
>> * @modes.masks: array of masks for a given mode
>> * @modes.num_masks: number of masks in the array
>> */
>> struct xe_gt_sriov_scheduler_groups {
>> u8 max_num_of_groups;
>> + u32 supported_modes;
>> + enum xe_sriov_sched_group_modes current_mode;
>> struct {
>> u32 *masks;
>> u32 num_masks;
>> diff --git a/drivers/gpu/drm/xe/xe_guc_fwif.h b/drivers/gpu/drm/xe/xe_guc_fwif.h
>> index 7d93c2749485..c2e0a2dae586 100644
>> --- a/drivers/gpu/drm/xe/xe_guc_fwif.h
>> +++ b/drivers/gpu/drm/xe/xe_guc_fwif.h
>> @@ -46,6 +46,8 @@
>> #define GUC_MAX_ENGINE_CLASSES 16
>> #define GUC_MAX_INSTANCES_PER_CLASS 32
>>
>> +#define GUC_MAX_SCHED_GROUPS GUC_KLV_VGT_POLICY_ENGINE_GROUP_MAX_COUNT
> actually my idea was to have here:
>
> #define GUC_MAX_SCHED_GROUPS 8
>
> and then in the klv abi header:
>
> #define GUC_KLV_VGT_POLICY_ENGINE_GROUP_MAX_COUNT GUC_MAX_SCHED_GROUPS
>
> as IMO the KLV definition follows FW capability, not the other way around
That would require a rework of the header hierarchy, because guc_fwif.h
includes guc_klvs_abi.h, not the other way around. IMO that does not
belong in this series. I can just define it as 8 in both places for now
if you think that's better, and later on if we rework the include
hierarchy we can have the KLV define based on the one in fwif.h
Daniele
>
>> +
>> #define GUC_CONTEXT_NORMAL 0
>> #define GUC_CONTEXT_COMPRESSION_SAVE 1
>> #define GUC_CONTEXT_COMPRESSION_RESTORE 2
>> diff --git a/drivers/gpu/drm/xe/xe_guc_klv_helpers.c b/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
>> index 146a6eda9e06..1b08b443606e 100644
>> --- a/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
>> +++ b/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
>> @@ -26,6 +26,8 @@ const char *xe_guc_klv_key_to_string(u16 key)
>> return "sched_if_idle";
>> case GUC_KLV_VGT_POLICY_ADVERSE_SAMPLE_PERIOD_KEY:
>> return "sample_period";
>> + case GUC_KLV_VGT_POLICY_ENGINE_GROUP_CONFIG_KEY:
>> + return "engine_group_config";
>> case GUC_KLV_VGT_POLICY_RESET_AFTER_VF_SWITCH_KEY:
>> return "reset_engine";
>> /* VF CFG keys */
* Re: [PATCH v2 04/11] drm/xe/sriov: Scheduler groups are incompatible with multi-lrc
2025-12-07 21:58 ` Michal Wajdeczko
@ 2025-12-08 17:48 ` Daniele Ceraolo Spurio
0 siblings, 0 replies; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-08 17:48 UTC (permalink / raw)
To: Michal Wajdeczko, intel-xe
On 12/7/2025 1:58 PM, Michal Wajdeczko wrote:
>
> On 12/7/2025 12:04 AM, Daniele Ceraolo Spurio wrote:
>> Since engines in the same class can be divided across multiple groups,
>> the GuC does not allow scheduler groups to be active if there are
>> multi-lrc contexts. This means that:
>>
>> 1) if a MLRC context is registered when we enable scheduler groups, the
>> GuC will silently ignore the configuration
>> 2) if a MLRC context is registered after scheduler groups are enabled,
>> the GuC will disable the groups and generate an adverse event.
>>
>> The expectation is that the admin will ensure that all apps that use
>> MLRC on PF have been terminated before scheduler groups are created. A
>> check on PF is added anyway to make sure we don't still have contexts
>> waiting to be cleaned up lying around.
>> On both PF and VF we block creation of new MLRC queues once scheduler
>> groups have been enabled.
>>
>> v2: move threshold handling to its own patch, move MLRC check to
>> guc_submit.c, hide SRIOV interals from exec_queue creation code,
>> better comments/docs (Michal)
>>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> ---
>> drivers/gpu/drm/xe/abi/guc_klvs_abi.h | 7 +++
>> drivers/gpu/drm/xe/xe_exec_queue.c | 19 +++++++
>> drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 17 ++++++
>> drivers/gpu/drm/xe/xe_gt_sriov_pf.h | 8 +++
>> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c | 28 ++++++++++
>> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h | 1 +
>> drivers/gpu/drm/xe/xe_gt_sriov_vf.c | 60 ++++++++++++++++++++++
>> drivers/gpu/drm/xe/xe_gt_sriov_vf.h | 1 +
>> drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h | 2 +
>> drivers/gpu/drm/xe/xe_guc_klv_helpers.c | 3 ++
>> drivers/gpu/drm/xe/xe_guc_submit.c | 21 ++++++++
>> drivers/gpu/drm/xe/xe_guc_submit.h | 2 +
>> 12 files changed, 169 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
>> index 45733a87183a..edb0546fb163 100644
>> --- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
>> +++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
>> @@ -46,11 +46,18 @@
>> * Refers to 32 bit architecture version as reported by the HW IP.
>> * This key is supported on MTL+ platforms only.
>> * Requires GuC ABI 1.2+.
>> + *
>> + * _`GUC_KLV_GLOBAL_CFG_GROUP_SCHEDULING_AVAILABLE` : 0x3001
>> + * Tells the driver whether scheduler groups are enabled or not.
>> + * Requires GuC ABI 1.26+
>> */
>>
>> #define GUC_KLV_GLOBAL_CFG_GMD_ID_KEY 0x3000u
>> #define GUC_KLV_GLOBAL_CFG_GMD_ID_LEN 1u
>>
>> +#define GUC_KLV_GLOBAL_CFG_GROUP_SCHEDULING_AVAILABLE_KEY 0x3001u
>> +#define GUC_KLV_GLOBAL_CFG_GROUP_SCHEDULING_AVAILABLE_LEN 1u
>> +
>> /**
>> * DOC: GuC Self Config KLVs
>> *
>> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
>> index 226d07a3d852..df01c0664965 100644
>> --- a/drivers/gpu/drm/xe/xe_exec_queue.c
>> +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
>> @@ -16,6 +16,7 @@
>> #include "xe_dep_scheduler.h"
>> #include "xe_device.h"
>> #include "xe_gt.h"
>> +#include "xe_gt_sriov_pf.h"
>> #include "xe_gt_sriov_vf.h"
>> #include "xe_hw_engine_class_sysfs.h"
>> #include "xe_hw_engine_group.h"
>> @@ -718,6 +719,17 @@ static u32 calc_validate_logical_mask(struct xe_device *xe,
>> return return_mask;
>> }
>>
>> +static bool has_sched_groups(struct xe_gt *gt)
>> +{
>> + if (IS_SRIOV_PF(gt_to_xe(gt)) && xe_gt_sriov_pf_sched_groups_enabled(gt))
>> + return true;
>> +
>> + if (IS_SRIOV_VF(gt_to_xe(gt)) && xe_gt_sriov_vf_sched_groups_enabled(gt))
>> + return true;
>> +
>> + return false;
>> +}
>> +
>> int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
>> struct drm_file *file)
>> {
>> @@ -810,6 +822,13 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
>> return -ENOENT;
>> }
>>
>> + /* SRIOV sched groups are not compatible with multi-lrc */
>> + if (XE_IOCTL_DBG(xe, args->width > 1 && has_sched_groups(hwe->gt))) {
>> + up_read(&vm->lock);
>> + xe_vm_put(vm);
>> + return -EINVAL;
>> + }
>> +
>> q = xe_exec_queue_create(xe, vm, logical_mask,
>> args->width, hwe, flags,
>> args->extensions);
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
>> index 0d97a823e702..fb5c9101e275 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
>> @@ -284,3 +284,20 @@ int xe_gt_sriov_pf_wait_ready(struct xe_gt *gt)
>> pf_flush_restart(gt);
>> return 0;
>> }
>> +
>> +/**
>> + * xe_gt_sriov_pf_sched_groups_enabled - Check if multiple scheduler groups are
>> + * enabled
>> + * @gt: the &xe_gt
>> + *
>> + * This function is for PF use only.
>> + *
>> + * Return: true if sched groups were enabled, false otherwise.
>> + */
>> +bool xe_gt_sriov_pf_sched_groups_enabled(struct xe_gt *gt)
>> +{
>> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
>> +
>> + return xe_gt_sriov_pf_policy_sched_groups_enabled(gt);
>> +}
>> +
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
>> index e7fde3f9937a..1ccfc7137b98 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.h
>> @@ -6,6 +6,8 @@
>> #ifndef _XE_GT_SRIOV_PF_H_
>> #define _XE_GT_SRIOV_PF_H_
>>
>> +#include <linux/types.h>
>> +
>> struct xe_gt;
>>
>> #ifdef CONFIG_PCI_IOV
>> @@ -16,6 +18,7 @@ void xe_gt_sriov_pf_init_hw(struct xe_gt *gt);
>> void xe_gt_sriov_pf_sanitize_hw(struct xe_gt *gt, unsigned int vfid);
>> void xe_gt_sriov_pf_stop_prepare(struct xe_gt *gt);
>> void xe_gt_sriov_pf_restart(struct xe_gt *gt);
>> +bool xe_gt_sriov_pf_sched_groups_enabled(struct xe_gt *gt);
>> #else
>> static inline int xe_gt_sriov_pf_init_early(struct xe_gt *gt)
>> {
>> @@ -38,6 +41,11 @@ static inline void xe_gt_sriov_pf_stop_prepare(struct xe_gt *gt)
>> static inline void xe_gt_sriov_pf_restart(struct xe_gt *gt)
>> {
>> }
>> +
>> +static inline bool xe_gt_sriov_pf_sched_groups_enabled(struct xe_gt *gt)
>> +{
>> + return false;
>> +}
>> #endif
>>
>> #endif
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
>> index 1109fec99fc3..6a682d788b02 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
>> @@ -16,6 +16,7 @@
>> #include "xe_guc_buf.h"
>> #include "xe_guc_ct.h"
>> #include "xe_guc_klv_helpers.h"
>> +#include "xe_guc_submit.h"
>> #include "xe_pm.h"
>>
>> /*
>> @@ -567,6 +568,19 @@ static int pf_provision_sched_groups(struct xe_gt *gt, u32 mode)
>> if (xe_sriov_pf_num_vfs(gt_to_xe(gt)))
>> return -EBUSY;
>>
>> + /*
>> + * The GuC silently ignores the setting if any MLRC contexts are
>> + * registered. We expect the admin to make sure that all apps that use
>> + * MLRC are terminated before scheduler groups are enabled, so this
>> + * check is just to make sure that the exec_queue destruction has been
>> + * completed.
>> + */
>> + if (mode != XE_SRIOV_SCHED_GROUPS_NONE &&
>> + xe_guc_has_registered_mlrc_queues(&gt->uc.guc)) {
>> + xe_gt_sriov_notice(gt, "can't enable sched groups with active mlrc queues\n");
> s/mlrc/MLRC
>
>> + return -EPERM;
>> + }
>> +
>> err = __pf_provision_sched_groups(gt, mode);
>> if (err)
>> return err;
>> @@ -615,6 +629,20 @@ int xe_gt_sriov_pf_policy_set_sched_groups_mode(struct xe_gt *gt,
>> return pf_provision_sched_groups(gt, value);
>> }
>>
>> +/**
>> + * xe_gt_sriov_pf_policy_sched_groups_enabled() - check whether the GT has
>> + * multiple scheduler groups enabled
>> + * @gt: the &xe_gt to check
>> + *
>> + * This function can only be called on PF.
>> + *
>> + * Return: true if the GT has multiple groups enabled, false otherwise.
>> + */
>> +bool xe_gt_sriov_pf_policy_sched_groups_enabled(struct xe_gt *gt)
>> +{
>> + return gt->sriov.pf.policy.guc.sched_groups.current_mode != XE_SRIOV_SCHED_GROUPS_NONE;
>> +}
>> +
>> static void pf_sanitize_guc_policies(struct xe_gt *gt)
>> {
>> pf_sanitize_sched_if_idle(gt);
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
>> index 6b3e294bc934..ceaf797ca21b 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
>> @@ -20,6 +20,7 @@ u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt);
>> bool xe_sriov_gt_pf_policy_has_multi_group_modes(struct xe_gt *gt);
>> bool xe_sriov_gt_pf_policy_has_sched_group_mode(struct xe_gt *gt, u32 mode);
>> int xe_gt_sriov_pf_policy_set_sched_groups_mode(struct xe_gt *gt, u32 value);
>> +bool xe_gt_sriov_pf_policy_sched_groups_enabled(struct xe_gt *gt);
>>
>> void xe_gt_sriov_pf_policy_init(struct xe_gt *gt);
>> void xe_gt_sriov_pf_policy_sanitize(struct xe_gt *gt);
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
>> index 97c29c55f885..48e11c1a2d08 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c
>> @@ -438,6 +438,30 @@ u32 xe_gt_sriov_vf_gmdid(struct xe_gt *gt)
>> return value;
>> }
>>
>> +static int query_vf_sched_groups(struct xe_gt *gt)
> s/query_vf_sched_groups/vf_query_sched_groups
>
> and keep it closer to vf_cache_sched_groups_status
ok
>
>> +{
>> + struct xe_guc *guc = &gt->uc.guc;
>> + u32 value = 0;
>> + int err;
>> +
>> + xe_gt_assert(gt, IS_SRIOV_VF(gt_to_xe(gt)));
>> +
>> + if (MAKE_GUC_VER_STRUCT(gt->sriov.vf.guc_version) < MAKE_GUC_VER(1, 26, 0))
>> + return 0;
> nit: maybe we can split above 'check' code from rest of 'query' code?
>
> and as we have more and more cases where version check is needed, maybe it's also a time to add helper like:
>
> bool vf_runs_on_guc(gt, MAKE_GUC_VER)
As far as I can tell this is only the second similar check we do (with
the other one being the one in vf_migration_ccs_bb_support_check), so
IMO a bit early for a dedicated helper.
Daniele
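For context, version gates like the one above work because the components are packed into one integer that compares with plain '<'. A standalone sketch of that pattern (the 16/8-bit layout and the helper are illustrative, not copies of the driver's definitions):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative packing: major in bits 31:16, minor in 15:8, patch in 7:0. */
#define MAKE_GUC_VER(maj, min, pat) \
	(((uint32_t)(maj) << 16) | ((uint32_t)(min) << 8) | (uint32_t)(pat))

/* Hypothetical helper along the lines suggested in review. */
static int guc_ver_at_least(uint32_t running, uint32_t wanted)
{
	return running >= wanted;
}
```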
>
>> +
>> + err = guc_action_query_single_klv32(guc,
>> + GUC_KLV_GLOBAL_CFG_GROUP_SCHEDULING_AVAILABLE_KEY,
>> + &value);
>> + if (unlikely(err)) {
>> + xe_gt_sriov_err(gt, "Failed to obtain sched groups status (%pe)\n",
>> + ERR_PTR(err));
>> + return err;
>> + }
>> +
>> + xe_gt_sriov_dbg(gt, "sched groups %s\n", str_enabled_disabled(value));
>> + return value;
>> +}
>> +
>> static int vf_get_ggtt_info(struct xe_gt *gt)
>> {
>> struct xe_tile *tile = gt_to_tile(gt);
>> @@ -564,6 +588,21 @@ static void vf_cache_gmdid(struct xe_gt *gt)
>> gt->sriov.vf.runtime.gmdid = xe_gt_sriov_vf_gmdid(gt);
>> }
>>
>> +static int vf_cache_sched_groups_status(struct xe_gt *gt)
>> +{
>> + int ret;
>> +
>> + xe_gt_assert(gt, IS_SRIOV_VF(gt_to_xe(gt)));
>> +
>> + ret = query_vf_sched_groups(gt);
>> + if (ret < 0)
>> + return ret;
>> +
>> + gt->sriov.vf.runtime.uses_sched_groups = ret;
>> +
>> + return 0;
>> +}
>> +
>> /**
>> * xe_gt_sriov_vf_query_config - Query SR-IOV config data over MMIO.
>> * @gt: the &xe_gt
>> @@ -593,12 +632,33 @@ int xe_gt_sriov_vf_query_config(struct xe_gt *gt)
>> if (unlikely(err))
>> return err;
>>
>> + err = vf_cache_sched_groups_status(gt);
>> + if (unlikely(err))
>> + return err;
>> +
>> if (has_gmdid(xe))
>> vf_cache_gmdid(gt);
>>
>> return 0;
>> }
>>
>> +/**
>> + * xe_gt_sriov_vf_sched_groups_enabled() - Check if PF has enabled multiple
>> + * scheduler groups
>> + * @gt: the &xe_gt
>> + *
>> + * This function is for VF use only.
>> + *
>> + * Return: true if sched groups were enabled, false otherwise.
>> + */
>> +bool xe_gt_sriov_vf_sched_groups_enabled(struct xe_gt *gt)
>> +{
>> + xe_gt_assert(gt, IS_SRIOV_VF(gt_to_xe(gt)));
>> + xe_gt_assert(gt, gt->sriov.vf.guc_version.major);
>> +
>> + return gt->sriov.vf.runtime.uses_sched_groups;
>> +}
>> +
>> /**
>> * xe_gt_sriov_vf_guc_ids - VF GuC context IDs configuration.
>> * @gt: the &xe_gt
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.h b/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
>> index af40276790fa..7d97189c2d3d 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.h
>> @@ -30,6 +30,7 @@ bool xe_gt_sriov_vf_recovery_pending(struct xe_gt *gt);
>> u32 xe_gt_sriov_vf_gmdid(struct xe_gt *gt);
>> u16 xe_gt_sriov_vf_guc_ids(struct xe_gt *gt);
>> u64 xe_gt_sriov_vf_lmem(struct xe_gt *gt);
>> +bool xe_gt_sriov_vf_sched_groups_enabled(struct xe_gt *gt);
>>
>> u32 xe_gt_sriov_vf_read32(struct xe_gt *gt, struct xe_reg reg);
>> void xe_gt_sriov_vf_write32(struct xe_gt *gt, struct xe_reg reg, u32 val);
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h
>> index 420b0e6089de..5267c097ecd0 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h
>> @@ -27,6 +27,8 @@ struct xe_gt_sriov_vf_selfconfig {
>> struct xe_gt_sriov_vf_runtime {
>> /** @gmdid: cached value of the GDMID register. */
>> u32 gmdid;
>> + /** @uses_sched_groups: whether PF enabled sched groups or not. */
>> + bool uses_sched_groups;
>> /** @regs_size: size of runtime register array. */
>> u32 regs_size;
>> /** @num_regs: number of runtime registers in the array. */
>> diff --git a/drivers/gpu/drm/xe/xe_guc_klv_helpers.c b/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
>> index 1b08b443606e..dd504b77cb17 100644
>> --- a/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
>> +++ b/drivers/gpu/drm/xe/xe_guc_klv_helpers.c
>> @@ -21,6 +21,9 @@
>> const char *xe_guc_klv_key_to_string(u16 key)
>> {
>> switch (key) {
>> + /* GuC Global Config KLVs */
>> + case GUC_KLV_GLOBAL_CFG_GROUP_SCHEDULING_AVAILABLE_KEY:
>> + return "group_scheduling_available";
>> /* VGT POLICY keys */
>> case GUC_KLV_VGT_POLICY_SCHED_IF_IDLE_KEY:
>> return "sched_if_idle";
>> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c
>> index af43acf7baae..e8921219ac4e 100644
>> --- a/drivers/gpu/drm/xe/xe_guc_submit.c
>> +++ b/drivers/gpu/drm/xe/xe_guc_submit.c
>> @@ -2985,6 +2985,27 @@ void xe_guc_submit_print(struct xe_guc *guc, struct drm_printer *p)
>> mutex_unlock(&guc->submission_state.lock);
>> }
>>
>> +/**
>> + * xe_guc_has_registered_mlrc_queues - check whether there are any MLRC queues
>> + * registered with the GuC
>> + * @guc: GuC.
>> + *
>> + * Return: true if any MLRC queue is registered with the GuC, false otherwise.
>> + */
>> +bool xe_guc_has_registered_mlrc_queues(struct xe_guc *guc)
>> +{
>> + struct xe_exec_queue *q;
>> + unsigned long index;
>> +
>> + guard(mutex)(&guc->submission_state.lock);
>> +
>> + xa_for_each(&guc->submission_state.exec_queue_lookup, index, q)
>> + if (q->width > 1)
>> + return true;
>> +
>> + return false;
>> +}
>> +
>> /**
>> * xe_guc_contexts_hwsp_rebase - Re-compute GGTT references within all
>> * exec queues registered to given GuC.
>> diff --git a/drivers/gpu/drm/xe/xe_guc_submit.h b/drivers/gpu/drm/xe/xe_guc_submit.h
>> index 100a7891b918..49e608500a4e 100644
>> --- a/drivers/gpu/drm/xe/xe_guc_submit.h
>> +++ b/drivers/gpu/drm/xe/xe_guc_submit.h
>> @@ -49,6 +49,8 @@ xe_guc_exec_queue_snapshot_free(struct xe_guc_submit_exec_queue_snapshot *snapsh
>> void xe_guc_submit_print(struct xe_guc *guc, struct drm_printer *p);
>> void xe_guc_register_vf_exec_queue(struct xe_exec_queue *q, int ctx_type);
>>
>> +bool xe_guc_has_registered_mlrc_queues(struct xe_guc *guc);
>> +
>> int xe_guc_contexts_hwsp_rebase(struct xe_guc *guc, void *scratch);
>>
>> #endif
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 05/11] drm/xe/sriov: Add handling for MLRC adverse event threshold
2025-12-07 22:03 ` Michal Wajdeczko
@ 2025-12-08 17:52 ` Daniele Ceraolo Spurio
2025-12-08 18:27 ` Daniele Ceraolo Spurio
0 siblings, 1 reply; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-08 17:52 UTC (permalink / raw)
To: Michal Wajdeczko, intel-xe
On 12/7/2025 2:03 PM, Michal Wajdeczko wrote:
>
> On 12/7/2025 12:04 AM, Daniele Ceraolo Spurio wrote:
>> Since it is illegal to register an MLRC context when scheduler groups are
>> enabled, the GuC considers a VF doing so as an adverse event. Like for
>> other adverse events, there is a threshold for how many times the event
>> can happen before the GuC throws an error, which we need to add support
>> for.
>>
>> Since this is the first threshold with a minimum GuC version
>> requirement, support for checking that has been added to the generic
>> threshold handling. As part of it, some of the version code has been
>> moved to its own file and, while at it, some SRIOV documentation has
>> been added.
>>
>> v2: split from previous patch, add GuC version checking
>>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> ---
>> drivers/gpu/drm/xe/abi/guc_klvs_abi.h | 9 +++++
>> drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c | 19 ++++++----
>> drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 9 +++--
>> drivers/gpu/drm/xe/xe_guc.h | 7 +---
>> .../drm/xe/xe_guc_klv_thresholds_set_types.h | 18 +++++-----
>> drivers/gpu/drm/xe/xe_guc_version.h | 36 +++++++++++++++++++
>> 6 files changed, 74 insertions(+), 24 deletions(-)
>> create mode 100644 drivers/gpu/drm/xe/xe_guc_version.h
>>
>> diff --git a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
>> index edb0546fb163..30a051a0b4ee 100644
>> --- a/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
>> +++ b/drivers/gpu/drm/xe/abi/guc_klvs_abi.h
>> @@ -376,6 +376,12 @@ enum {
>> * :1: NORMAL = schedule VF always, irrespective of whether it has work or not
>> * :2: HIGH = schedule VF in the next time-slice after current active
>> * time-slice completes if it has active work
>> + *
>> + * _`GUC_KLV_VF_CFG_THRESHOLD_MULTI_LRC_COUNT` : 0x8A0D
>> + * Given that multi-LRC contexts are incompatible with SRIOV scheduler
>> + * groups and cause the latter to be turned off when registered with the
>> + * GuC, this config allows the PF to set a threshold for multi-LRC context
>> + * registrations by VFs to monitor their behavior.
>> */
>>
>> #define GUC_KLV_VF_CFG_GGTT_START_KEY 0x0001
>> @@ -434,6 +440,9 @@ enum {
>> #define GUC_SCHED_PRIORITY_NORMAL 1u
>> #define GUC_SCHED_PRIORITY_HIGH 2u
>>
>> +#define GUC_KLV_VF_CFG_THRESHOLD_MULTI_LRC_COUNT_KEY 0x8a0d
>> +#define GUC_KLV_VF_CFG_THRESHOLD_MULTI_LRC_COUNT_LEN 1u
>> +
>> /*
>> * Workaround keys:
>> */
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
>> index 59c5c6b4d994..dda671d05b89 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c
>> @@ -269,7 +269,8 @@ static u32 encode_config_ggtt(u32 *cfg, const struct xe_gt_sriov_config *config,
>> }
>>
>> /* Return: number of configuration dwords written */
>> -static u32 encode_config(u32 *cfg, const struct xe_gt_sriov_config *config, bool details)
>> +static u32 encode_config(struct xe_gt *gt, u32 *cfg,
>> + const struct xe_gt_sriov_config *config, bool details)
>> {
>> u32 n = 0;
>>
>> @@ -303,9 +304,11 @@ static u32 encode_config(u32 *cfg, const struct xe_gt_sriov_config *config, bool
>> cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_PREEMPT_TIMEOUT);
>> cfg[n++] = config->preempt_timeout;
>>
>> -#define encode_threshold_config(TAG, ...) ({ \
>> - cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_THRESHOLD_##TAG); \
>> - cfg[n++] = config->thresholds[MAKE_XE_GUC_KLV_THRESHOLD_INDEX(TAG)]; \
>> +#define encode_threshold_config(TAG, NAME, MIN_GUC_VER) ({ \
>> + if (!MIN_GUC_VER || GUC_FIRMWARE_VER(&gt->uc.guc) >= MIN_GUC_VER) { \
>> + cfg[n++] = PREP_GUC_KLV_TAG(VF_CFG_THRESHOLD_##TAG); \
>> + cfg[n++] = config->thresholds[MAKE_XE_GUC_KLV_THRESHOLD_INDEX(TAG)]; \
>> + } \
>> });
>>
>> MAKE_XE_GUC_KLV_THRESHOLDS_SET(encode_threshold_config);
>> @@ -328,7 +331,7 @@ static int pf_push_full_vf_config(struct xe_gt *gt, unsigned int vfid)
>> return -ENOBUFS;
>>
>> cfg = xe_guc_buf_cpu_ptr(buf);
>> - num_dwords = encode_config(cfg, config, true);
>> + num_dwords = encode_config(gt, cfg, config, true);
>> xe_gt_assert(gt, num_dwords <= max_cfg_dwords);
>>
>> if (xe_gt_is_media_type(gt)) {
>> @@ -2518,7 +2521,7 @@ ssize_t xe_gt_sriov_pf_config_save(struct xe_gt *gt, unsigned int vfid, void *bu
>> ret = -ENOBUFS;
>> } else {
>> config = pf_pick_vf_config(gt, vfid);
>> - ret = encode_config(buf, config, false) * sizeof(u32);
>> + ret = encode_config(gt, buf, config, false) * sizeof(u32);
>> }
>> }
>> mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
>> @@ -2551,9 +2554,11 @@ static int pf_restore_vf_config_klv(struct xe_gt *gt, unsigned int vfid,
>> return pf_provision_preempt_timeout(gt, vfid, value[0]);
>>
>> /* auto-generate case statements */
>> -#define define_threshold_key_to_provision_case(TAG, ...) \
>> +#define define_threshold_key_to_provision_case(TAG, NAME, MIN_GUC_VER) \
>> case MAKE_GUC_KLV_VF_CFG_THRESHOLD_KEY(TAG): \
>> BUILD_BUG_ON(MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG) != 1u); \
>> + if (MIN_GUC_VER && GUC_FIRMWARE_VER(&gt->uc.guc) < MIN_GUC_VER) \
>> + return -ENOKEY; \
>> if (len != MAKE_GUC_KLV_VF_CFG_THRESHOLD_LEN(TAG)) \
>> return -EBADMSG; \
>> return pf_provision_threshold(gt, vfid, \
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>> index 0fd863609848..5123ff1fb116 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>> @@ -21,6 +21,7 @@
>> #include "xe_gt_sriov_pf_monitor.h"
>> #include "xe_gt_sriov_pf_policy.h"
>> #include "xe_gt_sriov_pf_service.h"
>> +#include "xe_guc.h"
>> #include "xe_pm.h"
>> #include "xe_sriov_pf.h"
>> #include "xe_sriov_pf_provision.h"
>> @@ -301,9 +302,11 @@ static void pf_add_config_attrs(struct xe_gt *gt, struct dentry *parent, unsigne
>> &sched_priority_fops);
>>
>> /* register all threshold attributes */
>> -#define register_threshold_attribute(TAG, NAME, ...) \
>> - debugfs_create_file_unsafe("threshold_" #NAME, 0644, parent, parent, \
>> - &NAME##_fops);
>> +#define register_threshold_attribute(TAG, NAME, MIN_GUC_VER) ({ \
>> + if (!MIN_GUC_VER || GUC_FIRMWARE_VER(&gt->uc.guc) >= MIN_GUC_VER) \
>> + debugfs_create_file_unsafe("threshold_" #NAME, 0644, parent, parent, \
>> + &NAME##_fops); \
>> +});
>> MAKE_XE_GUC_KLV_THRESHOLDS_SET(register_threshold_attribute)
>> #undef register_threshold_attribute
>> }
>> diff --git a/drivers/gpu/drm/xe/xe_guc.h b/drivers/gpu/drm/xe/xe_guc.h
>> index fdb08658d05a..9028718189ed 100644
>> --- a/drivers/gpu/drm/xe/xe_guc.h
>> +++ b/drivers/gpu/drm/xe/xe_guc.h
>> @@ -8,15 +8,10 @@
>>
>> #include "xe_gt.h"
>> #include "xe_guc_types.h"
>> +#include "xe_guc_version.h"
>> #include "xe_hw_engine_types.h"
>> #include "xe_macros.h"
>>
>> -/*
>> - * GuC version number components are defined to be only 8-bit size,
>> - * so converting to a 32bit 8.8.8 integer allows simple (and safe)
>> - * numerical comparisons.
>> - */
>> -#define MAKE_GUC_VER(maj, min, pat) (((maj) << 16) | ((min) << 8) | (pat))
>> #define MAKE_GUC_VER_STRUCT(ver) MAKE_GUC_VER((ver).major, (ver).minor, (ver).patch)
> I guess this macro can also be moved
I purposely didn't move this as MAKE_GUC_VER_STRUCT is specific to how
we code xe_uc_fw_version, while MAKE_GUC_VER is based on what the GuC
interface defines.
>
>> #define GUC_SUBMIT_VER(guc) \
>> MAKE_GUC_VER_STRUCT((guc)->fw.versions.found[XE_UC_FW_VER_COMPATIBILITY])
>> diff --git a/drivers/gpu/drm/xe/xe_guc_klv_thresholds_set_types.h b/drivers/gpu/drm/xe/xe_guc_klv_thresholds_set_types.h
>> index 0a028c94756d..f7ed32244c6b 100644
>> --- a/drivers/gpu/drm/xe/xe_guc_klv_thresholds_set_types.h
>> +++ b/drivers/gpu/drm/xe/xe_guc_klv_thresholds_set_types.h
>> @@ -7,6 +7,7 @@
>> #define _XE_GUC_KLV_THRESHOLDS_SET_TYPES_H_
>>
>> #include "xe_args.h"
>> +#include "xe_guc_version.h"
>>
>> /**
>> * MAKE_XE_GUC_KLV_THRESHOLDS_SET - Generate various GuC thresholds definitions.
>> @@ -23,15 +24,16 @@
>> * with the &TAG, that corresponds to the GuC threshold KLV key name defined by
>> * ABI and the associated &NAME, that may be used in code or debugfs/sysfs::
>> *
>> - * define(TAG, NAME)
>> + * define(TAG, NAME, MIN_GUC_VER)
>> */
>> -#define MAKE_XE_GUC_KLV_THRESHOLDS_SET(define) \
>> - define(CAT_ERR, cat_error_count) \
>> - define(ENGINE_RESET, engine_reset_count) \
>> - define(PAGE_FAULT, page_fault_count) \
>> - define(H2G_STORM, guc_time_us) \
>> - define(IRQ_STORM, irq_time_us) \
>> - define(DOORBELL_STORM, doorbell_time_us) \
>> +#define MAKE_XE_GUC_KLV_THRESHOLDS_SET(define) \
>> + define(CAT_ERR, cat_error_count, 0) \
>> + define(ENGINE_RESET, engine_reset_count, 0) \
>> + define(PAGE_FAULT, page_fault_count, 0) \
>> + define(H2G_STORM, guc_time_us, 0) \
>> + define(IRQ_STORM, irq_time_us, 0) \
>> + define(DOORBELL_STORM, doorbell_time_us, 0) \
>> + define(MULTI_LRC_COUNT, multi_lrc_count, MAKE_GUC_VER(70, 53, 0)) \
>> /* end */
>>
>> /**
>> diff --git a/drivers/gpu/drm/xe/xe_guc_version.h b/drivers/gpu/drm/xe/xe_guc_version.h
>> new file mode 100644
> introduction of this new ver.h file is self-contained so maybe it should be in its own patch?
IMO it is simple enough to keep here, to avoid too many small patches in
the series.
>
>> index 000000000000..e6f80abd2f05
>> --- /dev/null
>> +++ b/drivers/gpu/drm/xe/xe_guc_version.h
>> @@ -0,0 +1,36 @@
>> +/* SPDX-License-Identifier: MIT */
>> +/*
>> + * Copyright © 2025 Intel Corporation
>> + */
>> +
>> +#ifndef _XE_GUC_VERSION_H_
>> +#define _XE_GUC_VERSION_H_
>> +
>> +/*
> this should be regular kernel-doc
>
>> + * GuC version number components are defined to be only 8-bit size,
>> + * so converting to a 32bit 8.8.8 integer allows simple (and safe)
>> + * numerical comparisons.
>> + */
>> +#define MAKE_GUC_VER(maj, min, pat) (((maj) << 16) | ((min) << 8) | (pat))
>> +
>> +/**
>> + * DOC: SRIOV-changes
> DOC: SR-IOV Changes
>
>> + *
>> + * We record SRIOV-specific changes here as those need to be tracked carefully.
>> + *
> what about 1.23.0 (CCS) ?
If you tell me exactly what to write I'll add it in, because I don't
know the specifics.
>
>> + * GuC 70.53.0 (VF interface 1.26.0):
>> + *
>> + * Added support for EGS. See:
> probably we need extra line here to render correctly
>
>> + * * GUC_KLV_VGT_POLICY_ENGINE_GROUP_CONFIG
>> + * * GUC_KLV_VF_CFG_THRESHOLD_MULTI_LRC_COUNT
>> + *
>> + * GuC 70.54.0 (VF interface 1.27.0):
>> + *
>> + * Updated VF migration support. See RESFIX actions
> maybe we can list those actions:
>
> * VF2GUC_RESFIX_START
> * VF2GUC_RESFIX_DONE
Those don't seem to be in the driver yet. Should I list them anyway?
Daniele
>> + *
>> + * GuC 70.55.1 (VF interface 1.28.1):
>> + *
>> + * Fixes for EGS.
>> + */
>> +
>> +#endif
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 05/11] drm/xe/sriov: Add handling for MLRC adverse event threshold
2025-12-08 17:52 ` Daniele Ceraolo Spurio
@ 2025-12-08 18:27 ` Daniele Ceraolo Spurio
0 siblings, 0 replies; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-08 18:27 UTC (permalink / raw)
To: Michal Wajdeczko, intel-xe
On 12/8/2025 9:52 AM, Daniele Ceraolo Spurio wrote:
>
>
> On 12/7/2025 2:03 PM, Michal Wajdeczko wrote:
>>
>> On 12/7/2025 12:04 AM, Daniele Ceraolo Spurio wrote:
>
> [snip]
>
>>> + * GuC 70.54.0 (VF interface 1.27.0):
>>> + *
>>> + * Updated VF migration support. See RESFIX actions
>> maybe we can list those actions:
>>
>> * VF2GUC_RESFIX_START
>> * VF2GUC_RESFIX_DONE
>
> Those don't seem to be in the driver yet. Should I list them anyway?
Ignore this question, I was looking in an older tree.
Daniele
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 06/11] drm/xe/sriov: Add debugfs to enable scheduler groups
2025-12-06 23:04 ` [PATCH v2 06/11] drm/xe/sriov: Add debugfs to enable scheduler groups Daniele Ceraolo Spurio
@ 2025-12-08 23:38 ` Michal Wajdeczko
2025-12-09 0:36 ` Daniele Ceraolo Spurio
0 siblings, 1 reply; 30+ messages in thread
From: Michal Wajdeczko @ 2025-12-08 23:38 UTC (permalink / raw)
To: Daniele Ceraolo Spurio, intel-xe
On 12/7/2025 12:04 AM, Daniele Ceraolo Spurio wrote:
> Reading the debugfs file lists the available configurations by name.
> Writing the name of a configuration to the file will enable it.
>
> v2: don't print anything if the feature is unsupported (Michal), add
> TODO for reworking init order to know if there are valid groups
> when we register debugfs, check for basic feature support.
btw, recently in Xe we started to follow the core kernel rule and put
the change log under the --- line
>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 126 ++++++++++++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c | 19 +--
> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h | 1 +
> 3 files changed, 139 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
> index 5123ff1fb116..1be23809e624 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
> @@ -156,6 +156,131 @@ static void pf_add_policy_attrs(struct xe_gt *gt, struct dentry *parent)
> debugfs_create_file_unsafe("sample_period_ms", 0644, parent, parent, &sample_period_fops);
> }
>
> +/*
> + * /sys/kernel/debug/dri/BDF/
> + * ├── sriov
> + * : ├── pf
> + * : ├── tile0
> + * : ├── gt0
> + * : ├── sched_groups_mode
> + */
> +
> +static const char *sched_group_mode_to_string(enum xe_sriov_sched_group_modes mode)
> +{
> + switch (mode) {
> + case XE_SRIOV_SCHED_GROUPS_NONE:
> + return "disabled";
maybe we should be consistent and use either:
"none" / XE_SRIOV_SCHED_GROUPS_NONE
or
"disabled" / XE_SRIOV_SCHED_GROUPS_DISABLED
> + case XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES:
> + return "media_slices";
> + default:
> + return "unknown";
> + }
> +}
> +
> +static int sched_groups_info(struct seq_file *m, void *data)
> +{
> + struct drm_printer p = drm_seq_file_printer(m);
> + struct xe_gt *gt = extract_gt(m->private);
> + u32 current_mode = gt->sriov.pf.policy.guc.sched_groups.current_mode;
> + int mode = 0;
> +
> + if (!xe_sriov_gt_pf_policy_has_multi_group_modes(gt))
> + return 0;
> +
> + for (mode = 0; mode < XE_SRIOV_SCHED_GROUPS_MODES_COUNT; mode++) {
> + if (!xe_sriov_gt_pf_policy_has_sched_group_mode(gt, mode))
> + continue;
> +
> + if (mode)
> + drm_printf(&p, " ");
nit: drm_puts() ?
> +
> + if (mode == current_mode)
> + drm_printf(&p, "[");
> +
> + drm_printf(&p, "%s", sched_group_mode_to_string(mode));
> +
> + if (mode == current_mode)
> + drm_printf(&p, "]");
> + }
> +
> + drm_printf(&p, "\n");
> +
> + return 0;
> +}
> +
> +static int sched_groups_open(struct inode *inode, struct file *file)
> +{
> + return single_open(file, sched_groups_info, inode->i_private);
> +}
> +
> +static ssize_t sched_groups_write(struct file *file, const char __user *ubuf,
> + size_t size, loff_t *pos)
> +{
> + struct xe_gt *gt = extract_gt(file_inode(file)->i_private);
> + char name[32];
> + int ret;
> + int m;
> +
> + if (*pos)
> + return -ESPIPE;
> +
> + if (!size)
> + return -ENODATA;
> +
> + if (!xe_sriov_gt_pf_policy_has_multi_group_modes(gt))
> + return -ENODEV;
maybe we should drop this condition?
if we don't have multi-groups, we will still display:
[disabled]
so for consistency we should allow:
echo disabled > sched_groups_mode
> +
> + if (size > sizeof(name) - 1)
> + return -EINVAL;
> +
> + ret = simple_write_to_buffer(name, sizeof(name) - 1, pos, ubuf, size);
> + if (ret < 0)
> + return ret;
> + name[ret] = '\0';
> +
> + for (m = 0; m < XE_SRIOV_SCHED_GROUPS_MODES_COUNT; m++)
> + if (sysfs_streq(name, sched_group_mode_to_string(m)))
> + break;
> +
> + if (m == XE_SRIOV_SCHED_GROUPS_MODES_COUNT)
> + return -EINVAL;
> +
> + guard(xe_pm_runtime)(gt_to_xe(gt));
> + ret = xe_gt_sriov_pf_policy_set_sched_groups_mode(gt, m);
> +
> + return (ret < 0) ? ret : size;
we can drop ( ) here
> +}
> +
> +static const struct file_operations sched_groups_fops = {
> + .owner = THIS_MODULE,
> + .open = sched_groups_open,
> + .read = seq_read,
> + .write = sched_groups_write,
> + .llseek = seq_lseek,
> + .release = single_release,
> +};
> +
> +static void pf_add_sched_groups(struct xe_gt *gt, struct dentry *parent)
> +{
> + xe_gt_assert(gt, gt == extract_gt(parent));
> + xe_gt_assert(gt, PFID == extract_vfid(parent));
> +
> + /*
> + * TODO: we currently call this function before we initialize scheduler
> + * groups, so at this point in time we don't know if there are any
> + * valid groups on the GT and we can't selectively register the debugfs
> + * only if there are any. Therefore, we always register the debugfs
> + * files if we're on a platform that has support for groups.
> + * We should rework the flow so that debugfs is registered after the
> + * policy init, so that we check if there are valid groups before
> + * adding the debugfs files.
> + */
> + if (!xe_sriov_gt_pf_policy_has_sched_groups_support(gt))
> + return;
> +
> + debugfs_create_file("sched_groups_mode", 0644, parent, parent, &sched_groups_fops);
> +}
> +
> /*
> * /sys/kernel/debug/dri/BDF/
> * ├── sriov
> @@ -531,6 +656,7 @@ static void pf_populate_gt(struct xe_gt *gt, struct dentry *dent, unsigned int v
> } else {
> pf_add_config_attrs(gt, dent, PFID);
> pf_add_policy_attrs(gt, dent);
> + pf_add_sched_groups(gt, dent);
>
> drm_debugfs_create_files(pf_info, ARRAY_SIZE(pf_info), dent, minor);
> }
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> index 6a682d788b02..2cafacac5d8e 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> @@ -451,19 +451,24 @@ static void pf_sched_group_media_slices(struct xe_gt *gt, u32 **masks, u32 *num_
> *num_masks = GUC_MAX_ENGINE_CLASSES * groups;
> }
>
> -static void pf_init_sched_groups(struct xe_gt *gt)
missing kernel doc
> +bool xe_sriov_gt_pf_policy_has_sched_groups_support(struct xe_gt *gt)
> {
> - int m;
> -
> - xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> -
> /*
> * The GuC supports scheduler groups from v70.53.0, but a fix for it has
> * been merged in v70.55.1, so we require the latter. The feature is
> * also only enabled on BMG and newer FW.
> */
> - if (GUC_FIRMWARE_VER(&gt->uc.guc) < MAKE_GUC_VER(70, 55, 1) ||
> - gt_to_xe(gt)->info.platform < XE_BATTLEMAGE)
> + return GUC_FIRMWARE_VER(&gt->uc.guc) >= MAKE_GUC_VER(70, 55, 1) &&
> + gt_to_xe(gt)->info.platform >= XE_BATTLEMAGE;
and maybe we can introduce this function in patch 2/11 to avoid this diff?
> +}
> +
> +static void pf_init_sched_groups(struct xe_gt *gt)
> +{
> + int m;
> +
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> +
> + if (!xe_sriov_gt_pf_policy_has_sched_groups_support(gt))
> return;
>
> /*
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> index ceaf797ca21b..f5ea44dcaf82 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> @@ -17,6 +17,7 @@ int xe_gt_sriov_pf_policy_set_reset_engine(struct xe_gt *gt, bool enable);
> bool xe_gt_sriov_pf_policy_get_reset_engine(struct xe_gt *gt);
> int xe_gt_sriov_pf_policy_set_sample_period(struct xe_gt *gt, u32 value);
> u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt);
> +bool xe_sriov_gt_pf_policy_has_sched_groups_support(struct xe_gt *gt);
> bool xe_sriov_gt_pf_policy_has_multi_group_modes(struct xe_gt *gt);
> bool xe_sriov_gt_pf_policy_has_sched_group_mode(struct xe_gt *gt, u32 mode);
> int xe_gt_sriov_pf_policy_set_sched_groups_mode(struct xe_gt *gt, u32 value);
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH v2 07/11] drm/xe/sriov: Add debugfs with scheduler groups information
2025-12-06 23:04 ` [PATCH v2 07/11] drm/xe/sriov: Add debugfs with scheduler groups information Daniele Ceraolo Spurio
@ 2025-12-09 0:08 ` Michal Wajdeczko
2025-12-09 0:23 ` Daniele Ceraolo Spurio
0 siblings, 1 reply; 30+ messages in thread
From: Michal Wajdeczko @ 2025-12-09 0:08 UTC (permalink / raw)
To: Daniele Ceraolo Spurio, intel-xe
On 12/7/2025 12:04 AM, Daniele Ceraolo Spurio wrote:
> Under a new subfolder, an entry is created for each group to list the
> engines assigned to them. We create enough entries for each possible
> group, with the disabled groups just returning an empty list.
>
> v2: drop subfolders, always register debugfs for all groups (Michal)
>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 70 +++++++++++++++++++++
> 1 file changed, 70 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
> index 1be23809e624..15f5f3a40471 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
> @@ -163,6 +163,10 @@ static void pf_add_policy_attrs(struct xe_gt *gt, struct dentry *parent)
> * : ├── tile0
> * : ├── gt0
> * : ├── sched_groups_mode
> + * ├── sched_groups
> + * : ├── group0
> + * :
> + * └── groupN
> */
>
> static const char *sched_group_mode_to_string(enum xe_sriov_sched_group_modes mode)
> @@ -260,8 +264,60 @@ static const struct file_operations sched_groups_fops = {
> .release = single_release,
> };
>
> +static ssize_t sched_group_engines_read(struct file *file, char __user *buf,
> + size_t count, loff_t *ppos)
> +{
> + struct dentry *dent = file_dentry(file);
> + struct xe_gt *gt = extract_gt(dent->d_parent);
> + struct xe_gt_sriov_scheduler_groups *groups = &gt->sriov.pf.policy.guc.sched_groups;
> + u32 num_masks = groups->modes[groups->current_mode].num_masks;
> + u32 *masks = groups->modes[groups->current_mode].masks;
> + unsigned int group = GUC_MAX_SCHED_GROUPS;
> + struct xe_hw_engine *hwe;
> + enum xe_hw_engine_id id;
> + char engines[128];
> + int ret;
> +
> + ret = sscanf(dent->d_name.name, "group%u", &group);
see below
> + xe_gt_assert(gt, ret == 1 && group < GUC_MAX_SCHED_GROUPS);
> +
> + engines[0] = '\0';
> +
> + /* If there are no masks it means that all the engines are in group 0 */
> + if (num_masks >= (group + 1) * GUC_MAX_ENGINE_CLASSES) {
> + for_each_hw_engine(hwe, gt, id) {
> + u8 guc_class = xe_engine_class_to_guc_class(hwe->class);
> + u32 mask = masks[group * GUC_MAX_ENGINE_CLASSES + guc_class];
> +
> + if (mask & BIT(hwe->logical_instance)) {
> + strlcat(engines, hwe->name, sizeof(engines));
> + strlcat(engines, " ", sizeof(engines));
> + }
> + }
> + strlcat(engines, "\n", sizeof(engines));
> + } else if (group == 0) {
> + for_each_hw_engine(hwe, gt, id) {
> + strlcat(engines, hwe->name, sizeof(engines));
> + strlcat(engines, " ", sizeof(engines));
> + }
> + strlcat(engines, "\n", sizeof(engines));
> + }
> +
> + return simple_read_from_buffer(buf, count, ppos, engines, strlen(engines));
> +}
> +
> +static const struct file_operations sched_group_engines_fops = {
> + .owner = THIS_MODULE,
> + .open = simple_open,
> + .read = sched_group_engines_read,
> + .llseek = default_llseek,
> +};
> +
> static void pf_add_sched_groups(struct xe_gt *gt, struct dentry *parent)
> {
> + struct dentry *groups;
> + u8 g;
no cryptic names, please ;)
> +
> xe_gt_assert(gt, gt == extract_gt(parent));
> xe_gt_assert(gt, PFID == extract_vfid(parent));
>
> @@ -274,11 +330,25 @@ static void pf_add_sched_groups(struct xe_gt *gt, struct dentry *parent)
> * We should rework the flow so that debugfs is registered after the
> * policy init, so that we check if there are valid groups before
> * adding the debugfs files.
> + * Similarly, instead of using GUC_MAX_SCHED_GROUPS we could use
> + * gt->sriov.pf.policy.guc.sched_groups.max_number_of_groups.
> */
> if (!xe_sriov_gt_pf_policy_has_sched_groups_support(gt))
> return;
>
> debugfs_create_file("sched_groups_mode", 0644, parent, parent, &sched_groups_fops);
> +
> + groups = debugfs_create_dir("sched_groups", parent);
> + if (IS_ERR(groups))
> + return;
> + groups->d_inode->i_private = gt;
no need to store gt here, you can grab it from
parent->d_parent
or
parent->d_parent->d_parent
> +
> + for (g = 0; g < GUC_MAX_SCHED_GROUPS; g++) {
> + char name[10];
> +
> + snprintf(name, sizeof(name), "group%u", g);
> + debugfs_create_file(name, 0644, groups, parent, &sched_group_engines_fops);
why storing 'parent' as private data?
better store group index there so you will not need to sscanf it back
and parent can always be retrieved from the dentry
> + }
> }
>
> /*
* Re: [PATCH v2 07/11] drm/xe/sriov: Add debugfs with scheduler groups information
2025-12-09 0:08 ` Michal Wajdeczko
@ 2025-12-09 0:23 ` Daniele Ceraolo Spurio
0 siblings, 0 replies; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-09 0:23 UTC (permalink / raw)
To: Michal Wajdeczko, intel-xe
On 12/8/2025 4:08 PM, Michal Wajdeczko wrote:
>
> On 12/7/2025 12:04 AM, Daniele Ceraolo Spurio wrote:
>> Under a new subfolder, an entry is created for each group to list the
>> engines assigned to them. We create enough entries for each possible
>> group, with the disabled groups just returning an empty list.
>>
>> v2: drop subfolders, always register debugfs for all groups (Michal)
>>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 70 +++++++++++++++++++++
>> 1 file changed, 70 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>> index 1be23809e624..15f5f3a40471 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>> @@ -163,6 +163,10 @@ static void pf_add_policy_attrs(struct xe_gt *gt, struct dentry *parent)
>> * : ├── tile0
>> * : ├── gt0
>> * : ├── sched_groups_mode
>> + * ├── sched_groups
>> + * : ├── group0
>> + * :
>> + * └── groupN
>> */
>>
>> static const char *sched_group_mode_to_string(enum xe_sriov_sched_group_modes mode)
>> @@ -260,8 +264,60 @@ static const struct file_operations sched_groups_fops = {
>> .release = single_release,
>> };
>>
>> +static ssize_t sched_group_engines_read(struct file *file, char __user *buf,
>> + size_t count, loff_t *ppos)
>> +{
>> + struct dentry *dent = file_dentry(file);
>> + struct xe_gt *gt = extract_gt(dent->d_parent);
>> + struct xe_gt_sriov_scheduler_groups *groups = &gt->sriov.pf.policy.guc.sched_groups;
>> + u32 num_masks = groups->modes[groups->current_mode].num_masks;
>> + u32 *masks = groups->modes[groups->current_mode].masks;
>> + unsigned int group = GUC_MAX_SCHED_GROUPS;
>> + struct xe_hw_engine *hwe;
>> + enum xe_hw_engine_id id;
>> + char engines[128];
>> + int ret;
>> +
>> + ret = sscanf(dent->d_name.name, "group%u", &group);
> see below
>
>> + xe_gt_assert(gt, ret == 1 && group < GUC_MAX_SCHED_GROUPS);
>> +
>> + engines[0] = '\0';
>> +
>> + /* If there are no masks it means that all the engines are in group 0 */
>> + if (num_masks >= (group + 1) * GUC_MAX_ENGINE_CLASSES) {
>> + for_each_hw_engine(hwe, gt, id) {
>> + u8 guc_class = xe_engine_class_to_guc_class(hwe->class);
>> + u32 mask = masks[group * GUC_MAX_ENGINE_CLASSES + guc_class];
>> +
>> + if (mask & BIT(hwe->logical_instance)) {
>> + strlcat(engines, hwe->name, sizeof(engines));
>> + strlcat(engines, " ", sizeof(engines));
>> + }
>> + }
>> + strlcat(engines, "\n", sizeof(engines));
>> + } else if (group == 0) {
>> + for_each_hw_engine(hwe, gt, id) {
>> + strlcat(engines, hwe->name, sizeof(engines));
>> + strlcat(engines, " ", sizeof(engines));
>> + }
>> + strlcat(engines, "\n", sizeof(engines));
>> + }
>> +
>> + return simple_read_from_buffer(buf, count, ppos, engines, strlen(engines));
>> +}
>> +
>> +static const struct file_operations sched_group_engines_fops = {
>> + .owner = THIS_MODULE,
>> + .open = simple_open,
>> + .read = sched_group_engines_read,
>> + .llseek = default_llseek,
>> +};
>> +
>> static void pf_add_sched_groups(struct xe_gt *gt, struct dentry *parent)
>> {
>> + struct dentry *groups;
>> + u8 g;
> no cryptic names, please ;)
>
>> +
>> xe_gt_assert(gt, gt == extract_gt(parent));
>> xe_gt_assert(gt, PFID == extract_vfid(parent));
>>
>> @@ -274,11 +330,25 @@ static void pf_add_sched_groups(struct xe_gt *gt, struct dentry *parent)
>> * We should rework the flow so that debugfs is registered after the
>> * policy init, so that we check if there are valid groups before
>> * adding the debugfs files.
>> + * Similarly, instead of using GUC_MAX_SCHED_GROUPS we could use
>> + * gt->sriov.pf.policy.guc.sched_groups.max_number_of_groups.
>> */
>> if (!xe_sriov_gt_pf_policy_has_sched_groups_support(gt))
>> return;
>>
>> debugfs_create_file("sched_groups_mode", 0644, parent, parent, &sched_groups_fops);
>> +
>> + groups = debugfs_create_dir("sched_groups", parent);
>> + if (IS_ERR(groups))
>> + return;
>> + groups->d_inode->i_private = gt;
> no need to store gt here, you can grab it from
>
> parent->d_parent
> or
> parent->d_parent->d_parent
>
>> +
>> + for (g = 0; g < GUC_MAX_SCHED_GROUPS; g++) {
>> + char name[10];
>> +
>> + snprintf(name, sizeof(name), "group%u", g);
>> + debugfs_create_file(name, 0644, groups, parent, &sched_group_engines_fops);
> why storing 'parent' as private data?
>
> better store group index there so you will not need to sscanf it back
>
> and parent can always be retrieved from the dentry
The issue with storing the index is that it would require allocating an
array for it, and I wanted to avoid a new allocation just for that when
it's easy to read the index back via sscanf.
Since I had nothing else to store there, and we already store the parent
in other cases, I just did the same here.
Daniele
>
>> + }
>> }
>>
>> /*
* Re: [PATCH v2 06/11] drm/xe/sriov: Add debugfs to enable scheduler groups
2025-12-08 23:38 ` Michal Wajdeczko
@ 2025-12-09 0:36 ` Daniele Ceraolo Spurio
2025-12-09 15:07 ` Michal Wajdeczko
0 siblings, 1 reply; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-09 0:36 UTC (permalink / raw)
To: Michal Wajdeczko, intel-xe
On 12/8/2025 3:38 PM, Michal Wajdeczko wrote:
>
> On 12/7/2025 12:04 AM, Daniele Ceraolo Spurio wrote:
>> Reading the debugfs file lists the available configurations by name.
>> Writing the name of a configuration to the file will enable it.
>>
>> v2: don't print anything if the feature is unsupported (Michal), add
>> TODO for reworking init order to know if there are valid groups
>> when we register debugfs, check for basic feature support.
> btw, recently in Xe we started to follow core kernel rule and put the
> change log under the --- line
ok
>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 126 ++++++++++++++++++++
>> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c | 19 +--
>> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h | 1 +
>> 3 files changed, 139 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>> index 5123ff1fb116..1be23809e624 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>> @@ -156,6 +156,131 @@ static void pf_add_policy_attrs(struct xe_gt *gt, struct dentry *parent)
>> debugfs_create_file_unsafe("sample_period_ms", 0644, parent, parent, &sample_period_fops);
>> }
>>
>> +/*
>> + * /sys/kernel/debug/dri/BDF/
>> + * ├── sriov
>> + * : ├── pf
>> + * : ├── tile0
>> + * : ├── gt0
>> + * : ├── sched_groups_mode
>> + */
>> +
>> +static const char *sched_group_mode_to_string(enum xe_sriov_sched_group_modes mode)
>> +{
>> + switch (mode) {
>> + case XE_SRIOV_SCHED_GROUPS_NONE:
>> + return "disabled";
> maybe we should be consistent and use either:
>
> "none" / XE_SRIOV_SCHED_GROUPS_NONE
> or
> "disabled" / XE_SRIOV_SCHED_GROUPS_DISABLED
I am ok with switching both to disabled. Or maybe we should just go with
"all_gt_engines", to indicate that all engines in one GT are in the same
group? Because AFAIU from the GuC POV you always have groups, the
question is if more than 1 group is actually active (and it also counts
as an actual mode instead of just being the disabled state)
>> + case XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES:
>> + return "media_slices";
>> + default:
>> + return "unknown";
>> + }
>> +}
>> +
>> +static int sched_groups_info(struct seq_file *m, void *data)
>> +{
>> + struct drm_printer p = drm_seq_file_printer(m);
>> + struct xe_gt *gt = extract_gt(m->private);
>> + u32 current_mode = gt->sriov.pf.policy.guc.sched_groups.current_mode;
>> + int mode = 0;
>> +
>> + if (!xe_sriov_gt_pf_policy_has_multi_group_modes(gt))
>> + return 0;
>> +
>> + for (mode = 0; mode < XE_SRIOV_SCHED_GROUPS_MODES_COUNT; mode++) {
>> + if (!xe_sriov_gt_pf_policy_has_sched_group_mode(gt, mode))
>> + continue;
>> +
>> + if (mode)
>> + drm_printf(&p, " ");
> nit: drm_puts() ?
I'd prefer not to add newlines here.
>
>> +
>> + if (mode == current_mode)
>> + drm_printf(&p, "[");
>> +
>> + drm_printf(&p, "%s", sched_group_mode_to_string(mode));
>> +
>> + if (mode == current_mode)
>> + drm_printf(&p, "]");
>> + }
>> +
>> + drm_printf(&p, "\n");
>> +
>> + return 0;
>> +}
>> +
>> +static int sched_groups_open(struct inode *inode, struct file *file)
>> +{
>> + return single_open(file, sched_groups_info, inode->i_private);
>> +}
>> +
>> +static ssize_t sched_groups_write(struct file *file, const char __user *ubuf,
>> + size_t size, loff_t *pos)
>> +{
>> + struct xe_gt *gt = extract_gt(file_inode(file)->i_private);
>> + char name[32];
>> + int ret;
>> + int m;
>> +
>> + if (*pos)
>> + return -ESPIPE;
>> +
>> + if (!size)
>> + return -ENODATA;
>> +
>> + if (!xe_sriov_gt_pf_policy_has_multi_group_modes(gt))
>> + return -ENODEV;
> maybe we should drop this condition?
>
> if we don't have multi-groups, we will still display:
>
> [disabled]
>
> so for consistency we should allow:
>
> echo disabled > sched_groups_mode
Sure, that makes things easier.
>
>> +
>> + if (size > sizeof(name) - 1)
>> + return -EINVAL;
>> +
>> + ret = simple_write_to_buffer(name, sizeof(name) - 1, pos, ubuf, size);
>> + if (ret < 0)
>> + return ret;
>> + name[ret] = '\0';
>> +
>> + for (m = 0; m < XE_SRIOV_SCHED_GROUPS_MODES_COUNT; m++)
>> + if (sysfs_streq(name, sched_group_mode_to_string(m)))
>> + break;
>> +
>> + if (m == XE_SRIOV_SCHED_GROUPS_MODES_COUNT)
>> + return -EINVAL;
>> +
>> + guard(xe_pm_runtime)(gt_to_xe(gt));
>> + ret = xe_gt_sriov_pf_policy_set_sched_groups_mode(gt, m);
>> +
>> + return (ret < 0) ? ret : size;
> we can drop ( ) here
ok
>
>> +}
>> +
>> +static const struct file_operations sched_groups_fops = {
>> + .owner = THIS_MODULE,
>> + .open = sched_groups_open,
>> + .read = seq_read,
>> + .write = sched_groups_write,
>> + .llseek = seq_lseek,
>> + .release = single_release,
>> +};
>> +
>> +static void pf_add_sched_groups(struct xe_gt *gt, struct dentry *parent)
>> +{
>> + xe_gt_assert(gt, gt == extract_gt(parent));
>> + xe_gt_assert(gt, PFID == extract_vfid(parent));
>> +
>> + /*
>> + * TODO: we currently call this function before we initialize scheduler
>> + * groups, so at this point in time we don't know if there are any
>> + * valid groups on the GT and we can't selectively register the debugfs
>> + * only if there are any. Therefore, we always register the debugfs
>> + * files if we're on a platform that has support for groups.
>> + * We should rework the flow so that debugfs is registered after the
>> + * policy init, so that we check if there are valid groups before
>> + * adding the debugfs files.
>> + */
>> + if (!xe_sriov_gt_pf_policy_has_sched_groups_support(gt))
>> + return;
>> +
>> + debugfs_create_file("sched_groups_mode", 0644, parent, parent, &sched_groups_fops);
>> +}
>> +
>> /*
>> * /sys/kernel/debug/dri/BDF/
>> * ├── sriov
>> @@ -531,6 +656,7 @@ static void pf_populate_gt(struct xe_gt *gt, struct dentry *dent, unsigned int v
>> } else {
>> pf_add_config_attrs(gt, dent, PFID);
>> pf_add_policy_attrs(gt, dent);
>> + pf_add_sched_groups(gt, dent);
>>
>> drm_debugfs_create_files(pf_info, ARRAY_SIZE(pf_info), dent, minor);
>> }
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
>> index 6a682d788b02..2cafacac5d8e 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
>> @@ -451,19 +451,24 @@ static void pf_sched_group_media_slices(struct xe_gt *gt, u32 **masks, u32 *num_
>> *num_masks = GUC_MAX_ENGINE_CLASSES * groups;
>> }
>>
>> -static void pf_init_sched_groups(struct xe_gt *gt)
> missing kernel doc
will add.
>
>> +bool xe_sriov_gt_pf_policy_has_sched_groups_support(struct xe_gt *gt)
>> {
>> - int m;
>> -
>> - xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
>> -
>> /*
>> * The GuC supports scheduler groups from v70.53.0, but a fix for it has
>> * been merged in v70.55.1, so we require the latter. The feature is
>> * also only enabled on BMG and newer FW.
>> */
>> - if (GUC_FIRMWARE_VER(&gt->uc.guc) < MAKE_GUC_VER(70, 55, 1) ||
>> - gt_to_xe(gt)->info.platform < XE_BATTLEMAGE)
>> + return GUC_FIRMWARE_VER(&gt->uc.guc) >= MAKE_GUC_VER(70, 55, 1) &&
>> + gt_to_xe(gt)->info.platform >= XE_BATTLEMAGE;
> and maybe we can introduce this function in patch 2/11 to avoid this diff?
will do.
Daniele
>
>> +}
>> +
>> +static void pf_init_sched_groups(struct xe_gt *gt)
>> +{
>> + int m;
>> +
>> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
>> +
>> + if (!xe_sriov_gt_pf_policy_has_sched_groups_support(gt))
>> return;
>>
>> /*
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
>> index ceaf797ca21b..f5ea44dcaf82 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
>> @@ -17,6 +17,7 @@ int xe_gt_sriov_pf_policy_set_reset_engine(struct xe_gt *gt, bool enable);
>> bool xe_gt_sriov_pf_policy_get_reset_engine(struct xe_gt *gt);
>> int xe_gt_sriov_pf_policy_set_sample_period(struct xe_gt *gt, u32 value);
>> u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt);
>> +bool xe_sriov_gt_pf_policy_has_sched_groups_support(struct xe_gt *gt);
>> bool xe_sriov_gt_pf_policy_has_multi_group_modes(struct xe_gt *gt);
>> bool xe_sriov_gt_pf_policy_has_sched_group_mode(struct xe_gt *gt, u32 mode);
>> int xe_gt_sriov_pf_policy_set_sched_groups_mode(struct xe_gt *gt, u32 value);
* Re: [PATCH v2 06/11] drm/xe/sriov: Add debugfs to enable scheduler groups
2025-12-09 0:36 ` Daniele Ceraolo Spurio
@ 2025-12-09 15:07 ` Michal Wajdeczko
2025-12-09 18:09 ` Daniele Ceraolo Spurio
0 siblings, 1 reply; 30+ messages in thread
From: Michal Wajdeczko @ 2025-12-09 15:07 UTC (permalink / raw)
To: Daniele Ceraolo Spurio, intel-xe
On 12/9/2025 1:36 AM, Daniele Ceraolo Spurio wrote:
>
>
> On 12/8/2025 3:38 PM, Michal Wajdeczko wrote:
>>
>> On 12/7/2025 12:04 AM, Daniele Ceraolo Spurio wrote:
>>> Reading the debugfs file lists the available configurations by name.
>>> Writing the name of a configuration to the file will enable it.
>>>
>>> v2: don't print anything if the feature is unsupported (Michal), add
>>> TODO for reworking init order to know if there are valid groups
>>> when we register debugfs, check for basic feature support.
>> btw, recently in Xe we started to follow core kernel rule and put the
>> change log under the --- line
>
> ok
>
>>
>>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>> ---
>>> drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 126 ++++++++++++++++++++
>>> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c | 19 +--
>>> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h | 1 +
>>> 3 files changed, 139 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>>> index 5123ff1fb116..1be23809e624 100644
>>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>>> @@ -156,6 +156,131 @@ static void pf_add_policy_attrs(struct xe_gt *gt, struct dentry *parent)
>>> debugfs_create_file_unsafe("sample_period_ms", 0644, parent, parent, &sample_period_fops);
>>> }
>>> +/*
>>> + * /sys/kernel/debug/dri/BDF/
>>> + * ├── sriov
>>> + * : ├── pf
>>> + * : ├── tile0
>>> + * : ├── gt0
>>> + * : ├── sched_groups_mode
>>> + */
>>> +
>>> +static const char *sched_group_mode_to_string(enum xe_sriov_sched_group_modes mode)
>>> +{
>>> + switch (mode) {
>>> + case XE_SRIOV_SCHED_GROUPS_NONE:
>>> + return "disabled";
>> maybe we should be consistent and use either:
>>
>> "none" / XE_SRIOV_SCHED_GROUPS_NONE
>> or
>> "disabled" / XE_SRIOV_SCHED_GROUPS_DISABLED
>
> I am ok with switching both to disabled. Or maybe we should just go with "all_gt_engines", to indicate that all engines in one GT are in the same group? Because AFAIU from the GuC POV you always have groups, the question is if more than 1 group is actually active (and it also counts as an actual mode instead of just being the disabled state)
I would rather keep debugfs as close as possible to the driver POV and the GuC ABI, rather than exposing the GuC POV
so IMO in case of no groups configured, we should show them as empty == not configured, not fully configured
but we may want to change the approach once we promote EGS to sysfs
where for consistency we should show "default group" with all engines, where EGS is N/A or disabled
>
>>> + case XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES:
>>> + return "media_slices";
>>> + default:
>>> + return "unknown";
>>> + }
>>> +}
>>> +
>>> +static int sched_groups_info(struct seq_file *m, void *data)
>>> +{
>>> + struct drm_printer p = drm_seq_file_printer(m);
>>> + struct xe_gt *gt = extract_gt(m->private);
>>> + u32 current_mode = gt->sriov.pf.policy.guc.sched_groups.current_mode;
>>> + int mode = 0;
>>> +
>>> + if (!xe_sriov_gt_pf_policy_has_multi_group_modes(gt))
>>> + return 0;
>>> +
>>> + for (mode = 0; mode < XE_SRIOV_SCHED_GROUPS_MODES_COUNT; mode++) {
>>> + if (!xe_sriov_gt_pf_policy_has_sched_group_mode(gt, mode))
>>> + continue;
>>> +
>>> + if (mode)
>>> + drm_printf(&p, " ");
>> nit: drm_puts() ?
>
> I'd prefer not to add newlines here.
it's not about newlines
/**
* drm_puts - print a const string to a &drm_printer stream
* @p: the &drm printer
* @str: const string
*
* Allow &drm_printer types that have a constant string
* option to use it.
*/
>
>>
>>> +
>>> + if (mode == current_mode)
>>> + drm_printf(&p, "[");
>>> +
>>> + drm_printf(&p, "%s", sched_group_mode_to_string(mode));
only here you need to use drm_printf, all else could be drm_puts
>>> +
>>> + if (mode == current_mode)
>>> + drm_printf(&p, "]");
>>> + }
>>> +
>>> + drm_printf(&p, "\n");
>>> +
>>> + return 0;
>>> +}
>>> +
* Re: [PATCH v2 06/11] drm/xe/sriov: Add debugfs to enable scheduler groups
2025-12-09 15:07 ` Michal Wajdeczko
@ 2025-12-09 18:09 ` Daniele Ceraolo Spurio
0 siblings, 0 replies; 30+ messages in thread
From: Daniele Ceraolo Spurio @ 2025-12-09 18:09 UTC (permalink / raw)
To: Michal Wajdeczko, intel-xe
On 12/9/2025 7:07 AM, Michal Wajdeczko wrote:
>
> On 12/9/2025 1:36 AM, Daniele Ceraolo Spurio wrote:
>>
>> On 12/8/2025 3:38 PM, Michal Wajdeczko wrote:
>>> On 12/7/2025 12:04 AM, Daniele Ceraolo Spurio wrote:
>>>> Reading the debugfs file lists the available configurations by name.
>>>> Writing the name of a configuration to the file will enable it.
>>>>
>>>> v2: don't print anything if the feature is unsupported (Michal), add
>>>> TODO for reworking init order to know if there are valid groups
>>>> when we register debugfs, check for basic feature support.
>>> btw, recently in Xe we started to follow core kernel rule and put the
>>> change log under the --- line
>> ok
>>
>>>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>>>> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>>>> ---
>>>> drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 126 ++++++++++++++++++++
>>>> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c | 19 +--
>>>> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h | 1 +
>>>> 3 files changed, 139 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>>>> index 5123ff1fb116..1be23809e624 100644
>>>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>>>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>>>> @@ -156,6 +156,131 @@ static void pf_add_policy_attrs(struct xe_gt *gt, struct dentry *parent)
>>>> debugfs_create_file_unsafe("sample_period_ms", 0644, parent, parent, &sample_period_fops);
>>>> }
>>>> +/*
>>>> + * /sys/kernel/debug/dri/BDF/
>>>> + * ├── sriov
>>>> + * : ├── pf
>>>> + * : ├── tile0
>>>> + * : ├── gt0
>>>> + * : ├── sched_groups_mode
>>>> + */
>>>> +
>>>> +static const char *sched_group_mode_to_string(enum xe_sriov_sched_group_modes mode)
>>>> +{
>>>> + switch (mode) {
>>>> + case XE_SRIOV_SCHED_GROUPS_NONE:
>>>> + return "disabled";
>>> maybe we should be consistent and use either:
>>>
>>> "none" / XE_SRIOV_SCHED_GROUPS_NONE
>>> or
>>> "disabled" / XE_SRIOV_SCHED_GROUPS_DISABLED
>> I am ok with switching both to disabled. Or maybe we should just go with "all_gt_engines", to indicate that all engines in one GT are in the same group? Because AFAIU from the GuC POV you always have groups, the question is if more than 1 group is actually active (and it also counts as an actual mode instead of just being the disabled state)
> I would rather keep debugfs as close as possible to the driver POV and the GuC ABI, rather than exposing the GuC POV
> so IMO in case of no groups configured, we should show them as empty == not configured, not fully configured
>
> but we may want to change the approach once we promote EGS to sysfs
> where for consistency we should show "default group" with all engines, where EGS is N/A or disabled
My idea came from the fact that even with EGS off in sysfs we
technically still have:
group0: rcs0, ccs0, bcs0, bcs9
group1: vcs0, vcs2, vecs0, vecs1
So the default state is 2 groups. But as you mentioned we can re-discuss
when moving to sysfs.
>
>>>> + case XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES:
>>>> + return "media_slices";
>>>> + default:
>>>> + return "unknown";
>>>> + }
>>>> +}
>>>> +
>>>> +static int sched_groups_info(struct seq_file *m, void *data)
>>>> +{
>>>> + struct drm_printer p = drm_seq_file_printer(m);
>>>> + struct xe_gt *gt = extract_gt(m->private);
>>>> + u32 current_mode = gt->sriov.pf.policy.guc.sched_groups.current_mode;
>>>> + int mode = 0;
>>>> +
>>>> + if (!xe_sriov_gt_pf_policy_has_multi_group_modes(gt))
>>>> + return 0;
>>>> +
>>>> + for (mode = 0; mode < XE_SRIOV_SCHED_GROUPS_MODES_COUNT; mode++) {
>>>> + if (!xe_sriov_gt_pf_policy_has_sched_group_mode(gt, mode))
>>>> + continue;
>>>> +
>>>> + if (mode)
>>>> + drm_printf(&p, " ");
>>> nit: drm_puts() ?
>> I'd prefer not to add newlines here.
> it's not about newlines
>
> /**
> * drm_puts - print a const string to a &drm_printer stream
> * @p: the &drm printer
> * @str: const string
> *
> * Allow &drm_printer types that have a constant string
> * option to use it.
> */
Sorry, I just assumed this would add a newline like the normal puts to
stdout. Will switch to using it.
Daniele
>>>> +
>>>> + if (mode == current_mode)
>>>> + drm_printf(&p, "[");
>>>> +
>>>> + drm_printf(&p, "%s", sched_group_mode_to_string(mode));
> only here you need to use drm_printf, all else could be drm_puts
>
>>>> +
>>>> + if (mode == current_mode)
>>>> + drm_printf(&p, "]");
>>>> + }
>>>> +
>>>> + drm_printf(&p, "\n");
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
end of thread, other threads:[~2025-12-09 18:10 UTC | newest]
Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
2025-12-06 23:03 [PATCH v2 00/11] Introduce SRIOV scheduler groups Daniele Ceraolo Spurio
2025-12-06 23:03 ` [PATCH v2 01/11] drm/xe/gt: Add engine masks for each class Daniele Ceraolo Spurio
2025-12-07 15:35 ` Michal Wajdeczko
2025-12-06 23:03 ` [PATCH v2 02/11] drm/xe/sriov: Initialize scheduler groups Daniele Ceraolo Spurio
2025-12-07 21:57 ` Michal Wajdeczko
2025-12-08 17:36 ` Daniele Ceraolo Spurio
2025-12-06 23:03 ` [PATCH v2 03/11] drm/xe/sriov: Add support for enabling " Daniele Ceraolo Spurio
2025-12-07 21:57 ` Michal Wajdeczko
2025-12-08 17:41 ` Daniele Ceraolo Spurio
2025-12-06 23:04 ` [PATCH v2 04/11] drm/xe/sriov: Scheduler groups are incompatible with multi-lrc Daniele Ceraolo Spurio
2025-12-07 21:58 ` Michal Wajdeczko
2025-12-08 17:48 ` Daniele Ceraolo Spurio
2025-12-06 23:04 ` [PATCH v2 05/11] drm/xe/sriov: Add handling for MLRC adverse event threshold Daniele Ceraolo Spurio
2025-12-07 22:03 ` Michal Wajdeczko
2025-12-08 17:52 ` Daniele Ceraolo Spurio
2025-12-08 18:27 ` Daniele Ceraolo Spurio
2025-12-06 23:04 ` [PATCH v2 06/11] drm/xe/sriov: Add debugfs to enable scheduler groups Daniele Ceraolo Spurio
2025-12-08 23:38 ` Michal Wajdeczko
2025-12-09 0:36 ` Daniele Ceraolo Spurio
2025-12-09 15:07 ` Michal Wajdeczko
2025-12-09 18:09 ` Daniele Ceraolo Spurio
2025-12-06 23:04 ` [PATCH v2 07/11] drm/xe/sriov: Add debugfs with scheduler groups information Daniele Ceraolo Spurio
2025-12-09 0:08 ` Michal Wajdeczko
2025-12-09 0:23 ` Daniele Ceraolo Spurio
2025-12-06 23:04 ` [PATCH v2 08/11] drm/xe/sriov: Prep for multiple exec quantums and preemption timeouts Daniele Ceraolo Spurio
2025-12-06 23:04 ` [PATCH v2 09/11] drm/xe/sriov: Add functions to set exec quantums for each group Daniele Ceraolo Spurio
2025-12-06 23:04 ` [PATCH v2 10/11] drm/xe/sriov: Add functions to set preempt timeouts " Daniele Ceraolo Spurio
2025-12-06 23:04 ` [PATCH v2 11/11] drm/xe/sriov: Add debugfs to set EQ and PT for scheduler groups Daniele Ceraolo Spurio
2025-12-06 23:10 ` ✗ CI.checkpatch: warning for Introduce SRIOV scheduler groups (rev2) Patchwork
2025-12-06 23:11 ` ✓ CI.KUnit: success " Patchwork