Intel-XE Archive on lore.kernel.org
* [PATCH v3 00/12] Introduce SRIOV scheduler groups
@ 2025-12-11  1:56 Daniele Ceraolo Spurio
From: Daniele Ceraolo Spurio @ 2025-12-11  1:56 UTC
  To: intel-xe; +Cc: Daniele Ceraolo Spurio, Michal Wajdeczko

The normal SRIOV setup timeslices the whole GT across VFs. While this is
fine in the great majority of cases, in some cases the admin knows that
a VF is not going to use all the GT HW and that some engines are going
to be permanently idle.
To increase HW utilization in such a scenario, starting from v70.53.0 the
GuC supports scheduler groups (a.k.a. Engine Group Scheduling or EGS);
this feature allows the driver to subdivide a GT into groups of engines,
which the GuC will then independently timeslice across VFs, thus allowing
multiple VFs to access the HW at the same time. Given that each group is
independently scheduled, execution quantums and preemption timeouts are
settable per-group-per-VF. Note that while the GuC supports the feature
from v70.53.0, some fixes for it were merged in v70.55.1, so we require
the latter version in the driver.
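
As an illustration of the version gating described above, here is a
minimal user-space sketch. The helper names and the packing scheme are
hypothetical (they are not taken from the series) and the packing
assumes that minor and patch each fit in 8 bits:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: pack major.minor.patch into one integer so that
 * versions can be compared with a plain integer comparison. Assumes
 * minor and patch are both below 256.
 */
static uint32_t pack_guc_ver(uint32_t major, uint32_t minor, uint32_t patch)
{
	return (major << 16) | (minor << 8) | patch;
}

/*
 * The feature exists from v70.53.0, but the driver requires v70.55.1
 * because of the fixes merged in that release.
 */
static bool sched_groups_fw_ok(uint32_t major, uint32_t minor, uint32_t patch)
{
	return pack_guc_ver(major, minor, patch) >= pack_guc_ver(70, 55, 1);
}
```

With this check, v70.53.0 and v70.54.0 are rejected even though the
former already exposes the feature, while v70.55.1 and v70.55.3 are
accepted.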

While the GuC supports any group assignment (as long as each engine
belongs to only one group), we only allow the admin to select specific,
tested configurations that are tailored to specific use-cases. This
series introduces one of those use cases: if each VF is doing frame
rendering + encoding at a not-too-high resolution (e.g. 1080p@30fps),
as happens e.g. with a simple remote desktop, the render engine
can produce frames faster than the video engine can encode them.
However, our HW can have multiple video engines, so while one of them is
encoding a frame for a VF the other ones can be used for encoding frames
for other VFs. Given that media slices share some resources (e.g. SFC),
to obtain this parallel execution without impacting VF isolation we can
simply assign each media slice to a different group.
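
The "each engine belongs to only one group" constraint can be sketched
as a check on per-group engine bitmasks. This is only an illustration:
the function name and the idea of one 32-bit mask per group are
assumptions made for the example, not the representation used by the
series (which uses a guc_sched_group structure):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check: each entry of masks[] is the set of engines in
 * one group, one bit per engine. The assignment is valid only if no
 * engine bit appears in more than one group.
 */
static bool groups_are_disjoint(const uint32_t *masks, size_t ngroups)
{
	uint32_t seen = 0;
	size_t i;

	for (i = 0; i < ngroups; i++) {
		if (masks[i] & seen)
			return false;	/* engine already claimed by an earlier group */
		seen |= masks[i];
	}

	return true;
}
```

For example, with made-up bit positions, the masks 0x03, 0x0c and 0x30
form a valid assignment, while 0x03 and 0x06 do not because bit 1 is
shared.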

This series only allows enabling/disabling of this feature via debugfs
for now (like several other SRIOV features). Sysfs will be implemented
as a follow up, after the review of this series and the proposed
interface is complete.

The feature is enabled and disabled via the sched_groups_mode PF debugfs
file. If any configs are supported on the GT, reading this file will dump
the available configs and which one is selected, e.g.:

# cat sriov/pf/tile0/gt1/sched_groups_mode
[disabled] media_slices
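
A tool reading this file could pick out the selected config by looking
for the bracketed name. The sketch below is a hypothetical user-space
helper, not part of the series; it assumes that exactly one name is
wrapped in square brackets, as in the output above:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical parser: copy the name found between the first '[' and
 * the following ']' into out. Returns 0 on success, -1 if no bracketed
 * name is found or if out is too small to hold it.
 */
static int selected_mode(const char *line, char *out, size_t outlen)
{
	const char *open = strchr(line, '[');
	const char *close = open ? strchr(open, ']') : NULL;
	size_t len;

	if (!close)
		return -1;

	len = (size_t)(close - open - 1);
	if (len + 1 > outlen)
		return -1;

	memcpy(out, open + 1, len);
	out[len] = '\0';
	return 0;
}
```

Given the line shown above, this yields "disabled". If the brackets
move to media_slices after that config is written to the file (an
assumption about the output format), it would yield "media_slices".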

Writing the config name to the file will enable that configuration.
Debugfs files are also available to set the per-group exec_quantum and
preempt_timeout, while a series of files under the sched_groups folder
lists the engines belonging to each group. Overall, the tree looks like
the following:

        /sys/kernel/debug/dri/BDF/
        ├── sriov
        :   ├── pf
            :   ├── tile0
                :   ├── gt0
                    :   ├── sched_groups_mode
                        ├── sched_groups_exec_quantums_ms
                        ├── sched_groups_preempt_timeout_us
                        ├── sched_groups
                        :   ├── group0
                            :
                :           └── groupN
                ├── vf1
                :   ├── tile0
                    :   ├── gt0
                        :   ├── sched_groups_exec_quantums_ms
                            ├── sched_groups_preempt_timeout_us
                            :

IMPORTANT NOTE: this series now requires GuC 70.55.1 or newer, while in
linux-firmware we're still on 70.54.0. The update to GuC 70.55.3 is in
progress, but this series can't be merged until that FW reaches
linux-firmware. We can however still continue the review and get the
patches ready for when the GuC FW is merged. Once that happens I'll also
re-submit the series for CI testing.

v2: several fixes, flow and style improvements, drop per-group EQ/PT
settings, allocate debugfs files statically, bump requirements to BMG+
and GuC v70.55.1.

v3: split out the guc scheduler ABI to use it with the KLV defines,
use the guc_sched_group structure instead of a raw array of masks, don't
split out the guc version macro, limit the media_slice mode to BMG for
now.

Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>

Daniele Ceraolo Spurio (12):
  drm/xe/gt: Add engine masks for each class
  drm/gt/guc: extract scheduler-related defines from guc_fwif.h
  drm/xe/sriov: Initialize scheduler groups
  drm/xe/sriov: Add support for enabling scheduler groups
  drm/xe/sriov: Scheduler groups are incompatible with multi-lrc
  drm/xe/sriov: Add handling for MLRC adverse event threshold
  drm/xe/sriov: Add debugfs to enable scheduler groups
  drm/xe/sriov: Add debugfs with scheduler groups information
  drm/xe/sriov: Prep for multiple exec quantums and preemption timeouts
  drm/xe/sriov: Add functions to set exec quantums for each group
  drm/xe/sriov: Add functions to set preempt timeouts for each group
  drm/xe/sriov: Add debugfs to set EQ and PT for scheduler groups

 drivers/gpu/drm/xe/abi/guc_klvs_abi.h         |  70 ++++
 drivers/gpu/drm/xe/abi/guc_scheduler_abi.h    |  57 ++++
 drivers/gpu/drm/xe/xe_exec_queue.c            |  19 ++
 drivers/gpu/drm/xe/xe_gt.h                    |  12 +-
 drivers/gpu/drm/xe/xe_gt_ccs_mode.c           |   8 +-
 drivers/gpu/drm/xe/xe_gt_ccs_mode.h           |   2 +-
 drivers/gpu/drm/xe/xe_gt_sriov_pf.c           |  20 ++
 drivers/gpu/drm/xe/xe_gt_sriov_pf.h           |   8 +
 drivers/gpu/drm/xe/xe_gt_sriov_pf_config.c    | 298 +++++++++++++++-
 drivers/gpu/drm/xe/xe_gt_sriov_pf_config.h    |  10 +
 .../gpu/drm/xe/xe_gt_sriov_pf_config_types.h  |   5 +-
 drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c   | 315 ++++++++++++++++-
 drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.c    | 321 ++++++++++++++++++
 drivers/gpu/drm/xe/xe_gt_sriov_pf_policy.h    |   6 +
 .../gpu/drm/xe/xe_gt_sriov_pf_policy_types.h  |  39 +++
 drivers/gpu/drm/xe/xe_gt_sriov_vf.c           |  61 ++++
 drivers/gpu/drm/xe/xe_gt_sriov_vf.h           |   1 +
 drivers/gpu/drm/xe/xe_gt_sriov_vf_types.h     |   2 +
 drivers/gpu/drm/xe/xe_guc.c                   |   2 +-
 drivers/gpu/drm/xe/xe_guc_capture.h           |   2 +-
 drivers/gpu/drm/xe/xe_guc_fwif.h              |  52 +--
 drivers/gpu/drm/xe/xe_guc_klv_helpers.c       |   9 +
 .../drm/xe/xe_guc_klv_thresholds_set_types.h  |  17 +-
 drivers/gpu/drm/xe/xe_guc_submit.c            |  38 ++-
 drivers/gpu/drm/xe/xe_guc_submit.h            |   2 +
 25 files changed, 1285 insertions(+), 91 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/abi/guc_scheduler_abi.h

-- 
2.43.0


Thread overview: 32+ messages
2025-12-11  1:56 [PATCH v3 00/12] Introduce SRIOV scheduler groups Daniele Ceraolo Spurio
2025-12-11  1:57 ` [PATCH v3 01/12] drm/xe/gt: Add engine masks for each class Daniele Ceraolo Spurio
2025-12-11 18:19   ` Michal Wajdeczko
2025-12-11  1:57 ` [PATCH v3 02/12] drm/gt/guc: extract scheduler-related defines from guc_fwif.h Daniele Ceraolo Spurio
2025-12-11 18:20   ` Michal Wajdeczko
2025-12-11  1:57 ` [PATCH v3 03/12] drm/xe/sriov: Initialize scheduler groups Daniele Ceraolo Spurio
2025-12-11 18:52   ` Michal Wajdeczko
2025-12-11 22:55     ` Daniele Ceraolo Spurio
2025-12-11  1:57 ` [PATCH v3 04/12] drm/xe/sriov: Add support for enabling " Daniele Ceraolo Spurio
2025-12-11 18:59   ` Michal Wajdeczko
2025-12-11 23:00     ` Daniele Ceraolo Spurio
2025-12-11  1:57 ` [PATCH v3 05/12] drm/xe/sriov: Scheduler groups are incompatible with multi-lrc Daniele Ceraolo Spurio
2025-12-11 19:05   ` Michal Wajdeczko
2025-12-11  1:57 ` [PATCH v3 06/12] drm/xe/sriov: Add handling for MLRC adverse event threshold Daniele Ceraolo Spurio
2025-12-11 23:19   ` Michal Wajdeczko
2025-12-11  1:57 ` [PATCH v3 07/12] drm/xe/sriov: Add debugfs to enable scheduler groups Daniele Ceraolo Spurio
2025-12-11 21:07   ` Michal Wajdeczko
2025-12-11  1:57 ` [PATCH v3 08/12] drm/xe/sriov: Add debugfs with scheduler groups information Daniele Ceraolo Spurio
2025-12-11 22:40   ` Michal Wajdeczko
2025-12-11 22:44     ` Daniele Ceraolo Spurio
2025-12-11  1:57 ` [PATCH v3 09/12] drm/xe/sriov: Prep for multiple exec quantums and preemption timeouts Daniele Ceraolo Spurio
2025-12-11 22:41   ` Michal Wajdeczko
2025-12-11  1:57 ` [PATCH v3 10/12] drm/xe/sriov: Add functions to set exec quantums for each group Daniele Ceraolo Spurio
2025-12-11 22:47   ` Michal Wajdeczko
2025-12-11  1:57 ` [PATCH v3 11/12] drm/xe/sriov: Add functions to set preempt timeouts " Daniele Ceraolo Spurio
2025-12-11 22:49   ` Michal Wajdeczko
2025-12-11  1:57 ` [PATCH v3 12/12] drm/xe/sriov: Add debugfs to set EQ and PT for scheduler groups Daniele Ceraolo Spurio
2025-12-11 23:07   ` Michal Wajdeczko
2025-12-11  2:31 ` ✗ CI.checkpatch: warning for Introduce SRIOV scheduler groups (rev3) Patchwork
2025-12-11  2:32 ` ✓ CI.KUnit: success " Patchwork
2025-12-11  3:34 ` ✓ Xe.CI.BAT: " Patchwork
2025-12-11 10:47 ` ✗ Xe.CI.Full: failure " Patchwork
