Intel-XE Archive on lore.kernel.org
From: Michal Wajdeczko <michal.wajdeczko@intel.com>
To: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>,
	<intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH v3 03/12] drm/xe/sriov: Initialize scheduler groups
Date: Thu, 11 Dec 2025 19:52:36 +0100	[thread overview]
Message-ID: <143fc553-229e-4bf8-8494-abb06cf7ade2@intel.com> (raw)
In-Reply-To: <20251211015700.34266-17-daniele.ceraolospurio@intel.com>



On 12/11/2025 2:57 AM, Daniele Ceraolo Spurio wrote:
> Scheduler groups (a.k.a. Engine Groups Scheduling, or EGS) is a GuC
> feature that allows the driver to define groups of engines that are
> independently scheduled across VFs, so that different VFs can be
> active on the HW at the same time on different groups. The feature is
> available on BMG and newer HW starting with GuC 70.53.0, but some
> required fixes were only added in GuC 70.55.1.
> 
> This is intended for specific scenarios where the admin knows that the
> VFs are not going to fully utilize the HW and therefore assigning all of
> it to a single VF would lead to part of it being permanently idle.
> We do not allow the admin to decide how to divide the engines across
> groups, but we instead support specific configurations that are designed
> for specific use-cases. During PF initialization we detect which
> configurations are possible on a given GT and create the relevant
> groups. Since the GuC expects a mask for each class in each group, that
> is what we save when we initialize the configs.
> 
> Right now we only have one use-case on the media GT. If the VFs are
> running a frame render + encoding at a not-too-high resolution (e.g.
> 1080@30fps) the render can produce frames faster than the video engine
> can encode them, which means that the maximum number of parallel VFs is
> limited by the VCS bandwidth. Since our products can have multiple VCS
> engines, allowing multiple VFs to be active on the different VCS engines
> at the same time allows us to run more parallel VFs on the same HW.
> Given that engines in the same media slice share some resources (e.g.
> SFC), we assign each media slice to a different scheduling group. We
> refer to this configuration as "media_slices", given that each slice
> gets its own group. Since upcoming products have a different number of
> video engines per slice, for now we limit the media_slices mode to BMG,
> but we expect to add support for newer HW soon.
> 
> Note that while the GuC interface supports a maximum of 8 groups, the
> actual number of groups that can be enabled can be lower than that and
> can be different on different devices. For now, all devices support up
> to 2 groups.
> 
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> ---
> v2: Use asserts for coding errors, code cleanups, better docs (Michal),
>     limit groups to 2, limit to BMG and newer, bump required GuC to
>     70.55.1.
> v3: Use a struct sched_groups array instead of an array of u32 masks,
>     move the max_group check to the next patch, rename NONE to
>     DISABLED (Michal), limit the media_slices mode to BMG.
> ---
>  drivers/gpu/drm/xe/abi/guc_scheduler_abi.h    |   9 ++
>  drivers/gpu/drm/xe/xe_gt.h                    |   3 +
>  drivers/gpu/drm/xe/xe_gt_sriov_pf.c           |   3 +
>  drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c    | 142 ++++++++++++++++++
>  drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h    |   2 +
>  .../gpu/drm/xe/xe_gt_sriov_pf_policy_types.h  |  33 ++++
>  6 files changed, 192 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/abi/guc_scheduler_abi.h b/drivers/gpu/drm/xe/abi/guc_scheduler_abi.h
> index db9c171f8b64..513b22a87428 100644
> --- a/drivers/gpu/drm/xe/abi/guc_scheduler_abi.h
> +++ b/drivers/gpu/drm/xe/abi/guc_scheduler_abi.h
> @@ -6,6 +6,8 @@
>  #ifndef _ABI_GUC_SCHEDULER_ABI_H
>  #define _ABI_GUC_SCHEDULER_ABI_H
>  
> +#include <linux/types.h>
> +
>  /**
>   * Generic defines required for registration with and submissions to the GuC
>   * scheduler. Includes engine class/instance defines and context attributes
> @@ -45,4 +47,11 @@
>  #define GUC_CONTEXT_DISABLE		0
>  #define GUC_CONTEXT_ENABLE		1
>  
> +/* scheduler groups */
> +#define GUC_MAX_SCHED_GROUPS		8
> +
> +struct guc_sched_group {
> +	u32 engines[GUC_MAX_ENGINE_CLASSES];
> +} __packed;
> +
>  #endif
> diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
> index a2ba80c954a6..de7e47763411 100644
> --- a/drivers/gpu/drm/xe/xe_gt.h
> +++ b/drivers/gpu/drm/xe/xe_gt.h
> @@ -29,6 +29,9 @@
>  #define CCS_INSTANCES(gt) XE_ENGINE_INSTANCES_FROM_MASK(gt, CCS)
>  #define GSCCS_INSTANCES(gt) XE_ENGINE_INSTANCES_FROM_MASK(gt, GSCCS)
>  
> +/* Our devices have up to 4 media slices */
> +#define MAX_MEDIA_SLICES 4
> +
>  #define GT_VER(gt) ({ \
>  	typeof(gt) gt_ = (gt); \
>  	struct xe_device *xe = gt_to_xe(gt_); \
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> index 0714c758b9c1..0d97a823e702 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> @@ -14,6 +14,7 @@
>  #include "xe_gt_sriov_pf_control.h"
>  #include "xe_gt_sriov_pf_helpers.h"
>  #include "xe_gt_sriov_pf_migration.h"
> +#include "xe_gt_sriov_pf_policy.h"
>  #include "xe_gt_sriov_pf_service.h"
>  #include "xe_gt_sriov_printk.h"
>  #include "xe_guc_submit.h"
> @@ -123,6 +124,8 @@ int xe_gt_sriov_pf_init(struct xe_gt *gt)
>  	if (err)
>  		return err;
>  
> +	xe_gt_sriov_pf_policy_init(gt);
> +
>  	err = xe_gt_sriov_pf_migration_init(gt);
>  	if (err)
>  		return err;
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> index 4445f660e6d1..003860661687 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> @@ -3,6 +3,8 @@
>   * Copyright © 2023-2024 Intel Corporation
>   */
>  
> +#include <drm/drm_managed.h>
> +
>  #include "abi/guc_actions_sriov_abi.h"
>  
>  #include "xe_bo.h"
> @@ -10,6 +12,7 @@
>  #include "xe_gt_sriov_pf_helpers.h"
>  #include "xe_gt_sriov_pf_policy.h"
>  #include "xe_gt_sriov_printk.h"
> +#include "xe_guc.h"
>  #include "xe_guc_buf.h"
>  #include "xe_guc_ct.h"
>  #include "xe_guc_klv_helpers.h"
> @@ -351,6 +354,133 @@ u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt)
>  	return value;
>  }
>  
> +static void pf_sched_group_media_slices(struct xe_gt *gt, struct guc_sched_group **groups,
> +					u32 *num_groups)
> +{
> +	u8 slice_to_group[MAX_MEDIA_SLICES];
> +	u32 vecs_mask = VECS_INSTANCES(gt);
> +	u32 gsc_mask = GSCCS_INSTANCES(gt);
> +	u32 vcs_mask = VCS_INSTANCES(gt);
> +	struct guc_sched_group *values;
> +	struct xe_hw_engine *hwe;
> +	enum xe_hw_engine_id id;
> +	int group = 0;
> +	int slice;
> +
> +	xe_gt_assert(gt, xe_gt_is_media_type(gt));
> +
> +	/*
> +	 * Post-BMG the matching of video engines to slices changes, so for now
> +	 * we don't allow this mode on those platforms.
> +	 */
> +	if (gt_to_xe(gt)->info.platform > XE_BATTLEMAGE)
> +		return;
> +
> +	/*
> +	 * On BMG and older platforms a media slice has 2 VCS and a VECS. We
> +	 * bundle the GSC with the first slice.
> +	 */
> +	for (slice = 0; slice < MAX_MEDIA_SLICES; slice++) {
> +		if ((vcs_mask & 0x3) || (vecs_mask & 0x1) || (gsc_mask & 0x1))
> +			slice_to_group[slice] = group++;
> +
> +		vcs_mask >>= 2;
> +		vecs_mask >>= 1;
> +		gsc_mask >>= 1;
> +	}
> +
> +	xe_gt_assert(gt, !vcs_mask);
> +	xe_gt_assert(gt, !vecs_mask);
> +	xe_gt_assert(gt, !gsc_mask);
> +
> +	/* We need at least 2 slices to split them up */
> +	if (group < 2)
> +		return;
> +
> +	/* The GuC expects an array with a guc_sched_group entry for each group */
> +	values = drmm_kcalloc(&gt_to_xe(gt)->drm, group, sizeof(struct guc_sched_group),
> +			      GFP_KERNEL);
> +	if (!values)
> +		return;
> +
> +	for_each_hw_engine(hwe, gt, id) {
> +		u8 guc_class = xe_engine_class_to_guc_class(hwe->class);
> +
> +		switch (hwe->class) {
> +		case XE_ENGINE_CLASS_VIDEO_DECODE:
> +			slice = hwe->instance / 2;
> +			break;
> +		case XE_ENGINE_CLASS_VIDEO_ENHANCE:
> +			slice = hwe->instance;
> +			break;
> +		case XE_ENGINE_CLASS_OTHER:
> +			slice = 0;
> +			break;
> +		default:
> +			xe_gt_assert_msg(gt, false,
> +					 "unknown media gt class %u (%s) during EGS setup\n",
> +					 hwe->class, hwe->name);
> +			slice = 0;
> +		}
> +
> +		values[slice_to_group[slice]].engines[guc_class] |= BIT(hwe->logical_instance);
> +	}
> +
> +	*groups = values;
> +	*num_groups = group;
> +}
> +
> +/**
> + * xe_sriov_gt_pf_policy_has_sched_groups_support() - Checks whether scheduler
> + * groups are supported.
> + * @gt: the &xe_gt
> + *
> + * This function can only be called on PF.
> + *
> + * Return: true if scheduler groups are supported, false otherwise.
> + */
> +bool xe_sriov_gt_pf_policy_has_sched_groups_support(struct xe_gt *gt)
> +{
> +	xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> +
> +	/*
> +	 * The GuC supports scheduler groups from v70.53.0, but a fix for it
> +	 * was only merged in v70.55.1, so we require the latter. The feature
> +	 * is also only enabled on BMG and newer HW.
> +	 */
> +	return GUC_FIRMWARE_VER(&gt->uc.guc) >= MAKE_GUC_VER(70, 55, 1) &&
> +	       gt_to_xe(gt)->info.platform >= XE_BATTLEMAGE;
> +}
> +
> +static void pf_init_sched_groups(struct xe_gt *gt)
> +{
> +	int m;
> +
> +	if (!xe_sriov_gt_pf_policy_has_sched_groups_support(gt))
> +		return;
> +
> +	for (m = 0; m < XE_SRIOV_SCHED_GROUPS_MODES_COUNT; m++) {
> +		u32 *num_groups = &gt->sriov.pf.policy.guc.sched_groups.modes[m].num_groups;
> +		struct guc_sched_group **groups =
> +			&gt->sriov.pf.policy.guc.sched_groups.modes[m].groups;
> +
> +		switch (m) {
> +		case XE_SRIOV_SCHED_GROUPS_DISABLED:
> +			break;

nit: since we know that for DISABLED we have nothing to do in this loop,
maybe we should not start with it?

	for (m = XE_SRIOV_SCHED_GROUPS_DISABLED + 1, ...
	for (m = XE_SRIOV_SCHED_GROUPS_DISABLED; ++m < ...
	

> +		case XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES:
> +			/* this mode only has groups on the media GT */
> +			if (xe_gt_is_media_type(gt))
> +				pf_sched_group_media_slices(gt, groups, num_groups);
> +			break;
> +		default:

nit: since the XE_SRIOV_SCHED_GROUPS modes are defined as an enum, if we declare 'm'
as that enum then instead of this 'default' case we could just have a dummy case for
COUNT and the compiler would help us catch any new missed GROUP mode

> +			xe_gt_assert_msg(gt, false, "unknown sched group mode %u\n", m);
> +			return;

hmm, IIRC those modes are supposed to be per device, not per GT,
so there is a good chance that in this loop we will have to handle non-media modes,
so maybe this assert and early return are too much?

> +		}
> +
> +		xe_gt_assert(gt, *num_groups < GUC_MAX_SCHED_GROUPS);
> +	}
> +}
> +
>  static void pf_sanitize_guc_policies(struct xe_gt *gt)
>  {
>  	pf_sanitize_sched_if_idle(gt);
> @@ -401,6 +531,18 @@ int xe_gt_sriov_pf_policy_reprovision(struct xe_gt *gt, bool reset)
>  	return err ? -ENXIO : 0;
>  }
>  
> +/**
> + * xe_gt_sriov_pf_policy_init() - Initializes the SW state of the PF policies.
> + * @gt: the &xe_gt
> + *
> + * This function can only be called on PF. This function does not touch the HW,
> + * but must be called after the engines have been initialized.
> + */
> +void xe_gt_sriov_pf_policy_init(struct xe_gt *gt)
> +{
> +	pf_init_sched_groups(gt);
> +}
> +
>  static void print_guc_policies(struct drm_printer *p, struct xe_gt_sriov_guc_policies *policy)
>  {
>  	drm_printf(p, "%s:\t%s\n",
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> index 2a5dc33dc6d7..f5e3b2595063 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> @@ -17,7 +17,9 @@ int xe_gt_sriov_pf_policy_set_reset_engine(struct xe_gt *gt, bool enable);
>  bool xe_gt_sriov_pf_policy_get_reset_engine(struct xe_gt *gt);
>  int xe_gt_sriov_pf_policy_set_sample_period(struct xe_gt *gt, u32 value);
>  u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt);
> +bool xe_sriov_gt_pf_policy_has_sched_groups_support(struct xe_gt *gt);
>  
> +void xe_gt_sriov_pf_policy_init(struct xe_gt *gt);
>  void xe_gt_sriov_pf_policy_sanitize(struct xe_gt *gt);
>  int xe_gt_sriov_pf_policy_reprovision(struct xe_gt *gt, bool reset);
>  int xe_gt_sriov_pf_policy_print(struct xe_gt *gt, struct drm_printer *p);
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
> index 4de532af135e..d228cadcd8b0 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
> @@ -8,16 +8,49 @@
>  
>  #include <linux/types.h>
>  
> +#include "abi/guc_scheduler_abi.h"
> +
> +/**
> + * enum xe_sriov_sched_group_modes - list of possible scheduler group modes
> + * @XE_SRIOV_SCHED_GROUPS_DISABLED: no separate groups (i.e., all engines in group 0)
> + * @XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES: separate groups for each media slice
> + * @XE_SRIOV_SCHED_GROUPS_MODES_COUNT: number of valid modes
> + */
> +enum xe_sriov_sched_group_modes {
> +	XE_SRIOV_SCHED_GROUPS_DISABLED = 0,
> +	XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES,
> +	XE_SRIOV_SCHED_GROUPS_MODES_COUNT

nit: maybe, to emphasize that the COUNT enumerator is not a real group mode, prefix it with an underscore?

	__XE_SRIOV_SCHED_GROUPS_MODES_COUNT /* must be last */

> +};
> +
> +/**
> + * struct xe_gt_sriov_scheduler_groups - Scheduler groups policy info
> + * @modes: array of masks and their number for each mode
> + * @modes.groups: array of engine instance groups in given mode, with each group
> + *                consisting of GUC_MAX_ENGINE_CLASSES engine instance masks. A
> + *                NULL value indicates that all the engines are in the same
> + *                group for this mode on this GT.
> + * @modes.num_groups: number of groups in given mode, zero if all the engines
> + *                    are in the same group.
> + */
> +struct xe_gt_sriov_scheduler_groups {
> +	struct {
> +		struct guc_sched_group *groups;
> +		u32 num_groups;
> +	} modes[XE_SRIOV_SCHED_GROUPS_MODES_COUNT];
> +};
> +
>  /**
>   * struct xe_gt_sriov_guc_policies - GuC SR-IOV policies.
>   * @sched_if_idle: controls strict scheduling policy.
>   * @reset_engine: controls engines reset on VF switch policy.
>   * @sample_period: adverse events sampling period (in milliseconds).
> + * @sched_groups: available scheduling group configurations.
>   */
>  struct xe_gt_sriov_guc_policies {
>  	bool sched_if_idle;
>  	bool reset_engine;
>  	u32 sample_period;
> +	struct xe_gt_sriov_scheduler_groups sched_groups;
>  };
>  
>  /**

mostly nits, so

Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>



