From: Michal Wajdeczko <michal.wajdeczko@intel.com>
To: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>,
<intel-xe@lists.freedesktop.org>
Subject: Re: [PATCH 02/10] drm/xe/sriov: Initialize scheduler groups
Date: Mon, 1 Dec 2025 23:37:12 +0100 [thread overview]
Message-ID: <0254302e-35cf-4721-84d8-d0c95c2e6d62@intel.com> (raw)
In-Reply-To: <20251127014507.2323746-14-daniele.ceraolospurio@intel.com>
Hi Daniele,
some initial comments to keep you busy ;)
On 11/27/2025 2:45 AM, Daniele Ceraolo Spurio wrote:
> Scheduler groups (a.k.a. Engine Groups Scheduling, or EGS) is a GuC
> feature that allows the driver to define groups of engines that are
> independently scheduled across VFs, which allows different VFs to be
> active on the HW at the same time on different groups.
>
> This is intended for specific scenarios where the admin knows that the
> VFs are not going to fully utilize the HW and therefore assigning all of
> it to a single VF would lead to part of it being permanently idle.
> We do not allow the admin to decide how to divide the engines across
> groups, but we instead support specific configurations that are designed
> for specific use-cases. During PF initialization we detect which
> configurations are possible on a given GT and create the relevant
> groups. Since the GuC expect a mask for each class for each group, that
> is what we save when we init the configs.
>
> Right now we only have one use-case on the media GT. If the VFs are
> running a frame render + encoding at a not-too-high resolution (e.g.
> 1080@30fps) the render can produce frames faster than the video engine
> can encode them, which means that the maximum number of parallel VFs is
> limited by the VCS bandwidth. Since our products can have multiple VCS
> engines, allowing multiple VFs to be active on the different VCS engines
> at the same time allows us to run more parallel VFs on the same HW.
> Given that engines in the same media slice share some resources (e.g.
> SFC), we assign each media slice to a different scheduling group. We
> refer to this configuration as "media_slices", given that each slice
> gets its own group.
>
> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
> ---
> drivers/gpu/drm/xe/xe_gt_sriov_pf.c | 5 +
> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c | 135 ++++++++++++++++++
> drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h | 1 +
> .../gpu/drm/xe/xe_gt_sriov_pf_policy_types.h | 14 ++
> 4 files changed, 155 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> index 0714c758b9c1..62dda1c24e77 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf.c
> @@ -14,6 +14,7 @@
> #include "xe_gt_sriov_pf_control.h"
> #include "xe_gt_sriov_pf_helpers.h"
> #include "xe_gt_sriov_pf_migration.h"
> +#include "xe_gt_sriov_pf_policy.h"
> #include "xe_gt_sriov_pf_service.h"
> #include "xe_gt_sriov_printk.h"
> #include "xe_guc_submit.h"
> @@ -123,6 +124,10 @@ int xe_gt_sriov_pf_init(struct xe_gt *gt)
> if (err)
> return err;
>
> + err = xe_gt_sriov_pf_policy_init(gt);
> + if (err)
> + return err;
> +
> err = xe_gt_sriov_pf_migration_init(gt);
> if (err)
> return err;
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> index 4445f660e6d1..9b878578ea90 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c
> @@ -5,11 +5,14 @@
>
> #include "abi/guc_actions_sriov_abi.h"
>
> +#include <drm/drm_managed.h>
system headers shall be included before our local headers
> +
> #include "xe_bo.h"
> #include "xe_gt.h"
> #include "xe_gt_sriov_pf_helpers.h"
> #include "xe_gt_sriov_pf_policy.h"
> #include "xe_gt_sriov_printk.h"
> +#include "xe_guc.h"
> #include "xe_guc_buf.h"
> #include "xe_guc_ct.h"
> #include "xe_guc_klv_helpers.h"
> @@ -351,6 +354,125 @@ u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt)
> return value;
> }
>
> +#define MAX_MEDIA_SLICES (hweight32(XE_HW_ENGINE_VECS_MASK))
redundant ( )
and maybe such macro should be placed in xe_hw_engine_types.h
with proper comment why it works:
/*
* Each media slice has 1x VECS
* Max number of VECS instances gives us a max number of slices
*/
> +static int pf_sched_group_media_slices(struct xe_gt *gt, u32 **masks, u32 *num_masks)
> +{
> + u8 slice_to_group[MAX_MEDIA_SLICES];
> + struct xe_hw_engine *hwe = NULL;
do we need to initialize this here?
it will be initialized by for_each_hw_engine, no?
> + enum xe_hw_engine_id id;
> + u32 vcs_mask = VCS_MASK(gt);
> + u32 vecs_mask = VECS_MASK(gt);
> + u32 gsc_mask = GSCCS_MASK(gt);
we try to define vars in rev-xmas-tree order
> + u32 *values;
> + u8 slice;
> + u8 groups;
I guess for those two generic counters we don't need to use explicitly sized ints
> +
> + xe_gt_assert(gt, xe_gt_is_media_type(gt));
> +
> + /* A media slice has 2 VCS and a VECS. We bundle the GSC with the first slice */
> + for (slice = 0, groups = 0;
> + gsc_mask || vcs_mask || vecs_mask;
> + slice++, gsc_mask = 0, vcs_mask >>= 2, vecs_mask >>= 1) {
> + if (unlikely(slice >= MAX_MEDIA_SLICES)) {
maybe this 'for' loop should be just for 'slice = [0, MAX)'
> + xe_gt_sriov_err(gt, "Too many media slices (%u) during EGS setup\n",
> + slice);
then after this loop we can have asserts that no engine instances are left behind
> + return -EINVAL;
> + }
> +
> + if ((vcs_mask & 0x3) || (vecs_mask & 0x1) || (gsc_mask & 0x1))
> + slice_to_group[slice] = groups++;
> + }
like this:
	for (slice = 0; slice < MAX_MEDIA_SLICES; slice++) {
		if ((vcs_mask & 0x3) || (vecs_mask & 0x1) || (gsc_mask & 0x1))
			slice_to_group[slice] = groups++;
		vcs_mask >>= 2;
		vecs_mask >>= 1;
		gsc_mask >>= 1;
	}
	xe_gt_assert(gt, !vcs_mask);
	xe_gt_assert(gt, !vecs_mask);
	xe_gt_assert(gt, !gsc_mask);
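fwiw, that shape of the loop is easy to sanity-check standalone; a minimal
userspace sketch (plain ints and a hardcoded MAX_MEDIA_SLICES standing in
for the xe types, so all names here are just for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* hypothetical stand-in for hweight32(XE_HW_ENGINE_VECS_MASK) */
#define MAX_MEDIA_SLICES 4

/*
 * Walk the per-slice engine masks: each media slice owns 2x VCS and 1x VECS,
 * and the GSC rides along with slice 0. Every slice with at least one engine
 * present gets a compacted group id; returns the number of groups.
 */
static int map_slices(uint32_t vcs_mask, uint32_t vecs_mask, uint32_t gsc_mask,
		      uint8_t slice_to_group[MAX_MEDIA_SLICES])
{
	int slice, groups = 0;

	for (slice = 0; slice < MAX_MEDIA_SLICES; slice++) {
		if ((vcs_mask & 0x3) || (vecs_mask & 0x1) || (gsc_mask & 0x1))
			slice_to_group[slice] = groups++;
		vcs_mask >>= 2;
		vecs_mask >>= 1;
		gsc_mask >>= 1;
	}
	/* nothing may be left behind after MAX_MEDIA_SLICES iterations */
	assert(!vcs_mask && !vecs_mask && !gsc_mask);
	return groups;
}
```

and the trailing asserts then replace the in-loop overflow check.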
> +
> + /* We need at least 2 slices to split them up */
shouldn't we also check that each group has both VCS and VECS?
> + if (groups < 2) {
> + *masks = NULL;
> + *num_masks = 0;
> + return 0;
> + }
> +
> + /*
> + * The GuC expects and array with GUC_MAX_ENGINE_CLASSES entries for
typo: an array ?
> + * each group.
> + */
> + values = drmm_kzalloc(>_to_xe(gt)->drm,
> + GUC_MAX_ENGINE_CLASSES * groups * sizeof(u32),
> + GFP_KERNEL);
> + if (!values)
> + return -ENOMEM;
> +
> + for_each_hw_engine(hwe, gt, id) {
> + u8 guc_class = xe_engine_class_to_guc_class(hwe->class);
> + u8 entry;
> +
> + switch (hwe->class) {
> + case XE_ENGINE_CLASS_VIDEO_DECODE:
> + slice = hwe->instance / 2;
hmm, this seems to duplicate previous loop where we had:
vcs_mask >>= 2;
maybe these two loops can be combined?
> + break;
> + case XE_ENGINE_CLASS_VIDEO_ENHANCE:
> + slice = hwe->instance;
> + break;
> + case XE_ENGINE_CLASS_OTHER:
> + slice = 0;
> + break;
> + default:
> + xe_gt_sriov_err(gt, "unknown media gt class %u (%s) during EGS setup\n",
> + hwe->class, hwe->name);
this also seems like an attempt to catch our coding error,
so plain assert should be sufficient
> + drmm_kfree(>_to_xe(gt)->drm, values);
do we need to bother with this kfree?
it's managed allocation and any error from here will lead to abort the probe anyway
> + return -EINVAL;
> + }
> +
> + entry = (slice_to_group[slice] * GUC_MAX_ENGINE_CLASSES) + guc_class;
redundant ( )
> + values[entry] |= BIT(hwe->logical_instance);
> + }
> +
> + *masks = values;
> + *num_masks = GUC_MAX_ENGINE_CLASSES * groups;
> +
> + return 0;
> +}
> +
> +static int pf_init_sched_groups(struct xe_gt *gt)
> +{
> + int err;
> + int m;
> +
> + xe_gt_assert(gt, IS_SRIOV_PF(gt_to_xe(gt)));
> +
> + if (GUC_SUBMIT_VER(>->uc.guc) < MAKE_GUC_VER(1, 26, 0))
> + return 0;
> +
> + for (m = 0; m < XE_SRIOV_SCHED_GROUPS_MODES_COUNT; m++) {
> + u32 *masks = NULL;
> + u32 num_masks = 0;
> +
> + switch (m) {
> + case XE_SRIOV_SCHED_GROUPS_NONE:
> + break;
> + case XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES:
> + /* this mode only has groups on the media GT */
then having an array of all modes on every GT seems non-optimal
> + if (xe_gt_is_media_type(gt)) {
> + err = pf_sched_group_media_slices(gt, &masks, &num_masks);
> + if (err)
> + return err;
> + }
> + break;
> + default:
> + xe_gt_sriov_err(gt, "unknown sched group mode %u\n", m);
this looks more like a coding error, so maybe use just assert?
> + return -EINVAL;
> + }
> +
> + xe_gt_assert(gt, (num_masks % GUC_MAX_ENGINE_CLASSES) == 0);
> +
> + if ((m == XE_SRIOV_SCHED_GROUPS_NONE) || num_masks)
> + gt->sriov.pf.policy.guc.sched_groups.supported_modes |= BIT(m);
> +
> + gt->sriov.pf.policy.guc.sched_groups.modes[m].masks = masks;
> + gt->sriov.pf.policy.guc.sched_groups.modes[m].num_masks = num_masks;
> + }
> +
> + return 0;
> +}
> +
> static void pf_sanitize_guc_policies(struct xe_gt *gt)
> {
> pf_sanitize_sched_if_idle(gt);
> @@ -401,6 +523,19 @@ int xe_gt_sriov_pf_policy_reprovision(struct xe_gt *gt, bool reset)
> return err ? -ENXIO : 0;
> }
>
> +/**
> + * xe_gt_sriov_pf_policy_init - Initializes the SW state of the PF policies.
> + * @gt: the &xe_gt
> + *
> + * This function can only be called on PF. This function does not touch the HW.
but must be called after checking HW engine fuses, right?
> + *
> + * Return: 0 on success or a negative error code on failure.
> + */
> +int xe_gt_sriov_pf_policy_init(struct xe_gt *gt)
> +{
> + return pf_init_sched_groups(gt);
> +}
> +
> static void print_guc_policies(struct drm_printer *p, struct xe_gt_sriov_guc_policies *policy)
> {
> drm_printf(p, "%s:\t%s\n",
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> index 2a5dc33dc6d7..c9c04d1b7f50 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h
> @@ -18,6 +18,7 @@ bool xe_gt_sriov_pf_policy_get_reset_engine(struct xe_gt *gt);
> int xe_gt_sriov_pf_policy_set_sample_period(struct xe_gt *gt, u32 value);
> u32 xe_gt_sriov_pf_policy_get_sample_period(struct xe_gt *gt);
>
> +int xe_gt_sriov_pf_policy_init(struct xe_gt *gt);
> void xe_gt_sriov_pf_policy_sanitize(struct xe_gt *gt);
> int xe_gt_sriov_pf_policy_reprovision(struct xe_gt *gt, bool reset);
> int xe_gt_sriov_pf_policy_print(struct xe_gt *gt, struct drm_printer *p);
> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
> index 4de532af135e..3b915801c01b 100644
> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_policy_types.h
> @@ -8,16 +8,30 @@
>
> #include <linux/types.h>
>
we expect all enums (and enumerators) to have proper kernel-doc
> +enum xe_sriov_sched_group_modes {
> + XE_SRIOV_SCHED_GROUPS_NONE = 0, /* disabled */
do we really need this?
> + XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES, /* separate groups for each media slice */
> + XE_SRIOV_SCHED_GROUPS_MODES_COUNT
while it looks like a handy way to count the enumerators,
COUNT now pretends to be a valid mode enumerator while it is not;
maybe just provide a separate #define for it?
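i.e. something along these lines (names are just a sketch):

```c
#include <assert.h>

enum xe_sriov_sched_group_modes {
	XE_SRIOV_SCHED_GROUPS_NONE = 0,		/* no grouping */
	XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES,	/* one group per media slice */
};

/* keep the count outside the enum so it can't be passed as a mode */
#define XE_SRIOV_SCHED_GROUPS_NUM_MODES \
	(XE_SRIOV_SCHED_GROUPS_MEDIA_SLICES + 1)
```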
> +};
> +
> /**
> * struct xe_gt_sriov_guc_policies - GuC SR-IOV policies.
> * @sched_if_idle: controls strict scheduling policy.
> * @reset_engine: controls engines reset on VF switch policy.
> * @sample_period: adverse events sampling period (in milliseconds).
> + * @sched_groups: available scheduling group configurations and current mode.
don't forget to describe inner members
and there is no 'current' now
> */
> struct xe_gt_sriov_guc_policies {
> bool sched_if_idle;
> bool reset_engine;
> u32 sample_period;
> + struct {
> + u32 supported_modes;
> + struct {
> + u32 *masks;
> + u32 num_masks;
> + } modes[XE_SRIOV_SCHED_GROUPS_MODES_COUNT];
hmm, for the NONE mode, all groups must be 0, so why bother with .masks/.num fields for it?
maybe all we need is:
struct {
const char *name;
u32 *config;
} modes[];
then:
supported_modes := iterate over modes array and look for non-null config/name
none := supported_modes is empty
current := pointer to modes[] or NULL
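or, as a compile-checked userspace sketch of that idea (all names here are
hypothetical), with supported_modes derived from the table instead of being
stored separately:

```c
#include <stddef.h>
#include <stdint.h>

/* hypothetical mode descriptor: an unsupported mode simply has no config */
struct sched_group_mode {
	const char *name;
	uint32_t *config;	/* per-group, per-class engine masks */
};

/* derive the supported-modes bitmask by scanning for a non-NULL config */
static uint32_t supported_modes(const struct sched_group_mode *modes, size_t n)
{
	uint32_t bits = 0;

	for (size_t i = 0; i < n; i++)
		if (modes[i].config)
			bits |= 1u << i;
	return bits;
}
```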
> + } sched_groups;
> };
>
> /**