From: Michal Wajdeczko <michal.wajdeczko@intel.com>
To: Matt Roper <matthew.d.roper@intel.com>,
<intel-xe@lists.freedesktop.org>,
Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Subject: Re: [PATCH v4 22/23] drm/xe/configfs: Add attribute to disable GT types
Date: Wed, 8 Oct 2025 12:12:37 +0200
Message-ID: <c45b55d0-697d-4e3b-82d3-5cecda6362ff@intel.com>
In-Reply-To: <20251007204829.1468209-47-matthew.d.roper@intel.com>
On 10/7/2025 10:48 PM, Matt Roper wrote:
> Preventing the driver from initializing GTs of specific type(s) can be
> useful for debugging and early hardware bringup. Add a configfs
> attribute to allow this kind of control for debugging.
>
> With today's platforms and software design, this configuration setting
> is only effective for disabling the media GT since the driver currently
> requires that there always be a primary GT to probe the device. However
> this might change in the future --- in theory it should be possible
> (with some additional driver work) to allow an igpu device to come up
> with only the media GT and no primary GT. Or to allow an igpu device to
> come up with no GTs at all (for display-only usage). A primary GT will
> likely always be required on dgpu platforms because we rely on the BCS
> engines inside the primary GT for various vram operations.
>
> v2:
> - Expand/clarify kerneldoc for configfs attribute. (Gustavo)
> - Tighten type usage in gt_types[] structure. (Gustavo)
> - Adjust string parsing/name matching to match exact GT names and not
> accept partial names. (Gustavo)
>
> v3:
> - Switch to scope-based cleanup in gt_types_allowed_store() to fix a
> leak if the device is already bound. (Gustavo)
> - Switch configfs lookup interface to two boolean functions that
> specify whether primary/media are supported rather than one function
> that returns a mask. This is simpler to use and understand.
>
> Cc: Gustavo Sousa <gustavo.sousa@intel.com>
> Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
> ---
> drivers/gpu/drm/xe/xe_configfs.c | 145 +++++++++++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_configfs.h | 4 +
> drivers/gpu/drm/xe/xe_pci.c | 22 +++++
> 3 files changed, 171 insertions(+)
>
> diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
> index 139663423185..e36cc5e1bc8f 100644
> --- a/drivers/gpu/drm/xe/xe_configfs.c
> +++ b/drivers/gpu/drm/xe/xe_configfs.c
> @@ -15,6 +15,7 @@
>
> #include "instructions/xe_mi_commands.h"
> #include "xe_configfs.h"
> +#include "xe_gt_types.h"
> #include "xe_hw_engine_types.h"
> #include "xe_module.h"
> #include "xe_pci_types.h"
> @@ -56,6 +57,7 @@
> * :
> * └── 0000:03:00.0
> * ├── survivability_mode
> + * ├── gt_types_allowed
I'm wondering whether we want to keep such advanced knobs at the same level as the others?
Maybe create a sub-group for them, like I did for sriov?
└── tweaks
├── gt_types_allowed
├── engines_allowed
> * ├── engines_allowed
> * └── enable_psmi
oops, it looks like I forgot to update this part of the doc when adding max_vfs with:
└── sriov
├── max_vfs
> *
> @@ -79,6 +81,44 @@
> *
> * This attribute can only be set before binding to the device.
> *
> + * Allowed GT types:
> + * -----------------
> + *
> + * Allow only specific types of GTs to be detected and initialized by the
> + * driver. Any combination of GT types can be enabled/disabled, although
> + * some settings will cause the device to fail to probe.
> + *
> + * Writes support both comma- and newline-separated input format. Reads
> + * will always return one GT type per line. "primary" and "media" are the
> + * GT type names supported by this interface.
> + *
> + * This attribute can only be set before binding to the device.
> + *
> + * Examples:
> + *
> + * Allow both primary and media GTs to be initialized and used. This matches
> + * the driver's default behavior::
> + *
> + * # echo 'primary,media' > /sys/kernel/config/xe/0000:03:00.0/gt_types_allowed
maybe "all" as an alias?
> + *
> + * Allow only the primary GT of each tile to be initialized and used,
> + * effectively disabling the media GT if it exists on the platform::
> + *
> + * # echo 'primary' > /sys/kernel/config/xe/0000:03:00.0/gt_types_allowed
> + *
> + * Allow only the media GT of each tile to be initialized and used,
> + * effectively disabling the primary GT. **This configuration will cause
> + * device probe failure on all current platforms, but may be allowed on
> + * igpu platforms in the future**::
> + *
> + * # echo 'media' > /sys/kernel/config/xe/0000:03:00.0/gt_types_allowed
> + *
> + * Disable all GTs. Only other GPU IP (such as display) is potentially usable.
> + * **This configuration will cause device probe failure on all current
> + * platforms, but may be allowed on igpu platforms in the future**::
> + *
> + * # echo '' > /sys/kernel/config/xe/0000:03:00.0/gt_types_allowed
maybe "none" as an alias?
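To make the alias idea concrete, here is a minimal userspace sketch, not the patch's code: it treats "all" and "none" as extra table entries next to "primary" and "media". The enum values, the `gt_names[]` table and `parse_gt_types()` are hypothetical names used only for illustration, and the tokenizer uses `strcspn()` rather than the `strsep()` loop from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for XE_GT_TYPE_MAIN / XE_GT_TYPE_MEDIA. */
enum gt_type { GT_TYPE_MAIN = 1, GT_TYPE_MEDIA = 2 };

#define GT_BIT(t) (1ULL << (t))

/* Name table; "all" and "none" are the proposed aliases (assumption). */
static const struct {
	const char *name;
	uint64_t mask;
} gt_names[] = {
	{ "primary", GT_BIT(GT_TYPE_MAIN) },
	{ "media",   GT_BIT(GT_TYPE_MEDIA) },
	{ "all",     GT_BIT(GT_TYPE_MAIN) | GT_BIT(GT_TYPE_MEDIA) },
	{ "none",    0 },
};

/*
 * Parse comma- or newline-separated names into a mask.
 * Returns 0 on success, -1 if any non-empty token is not an exact match.
 */
int parse_gt_types(const char *input, uint64_t *out)
{
	uint64_t mask = 0;
	const char *p = input;

	while (*p) {
		size_t n = strcspn(p, ",\n");	/* length of the next token */
		bool matched = (n == 0);	/* empty tokens are skipped */

		for (size_t i = 0; !matched && i < sizeof(gt_names) / sizeof(gt_names[0]); i++) {
			if (strlen(gt_names[i].name) == n &&
			    strncmp(p, gt_names[i].name, n) == 0) {
				mask |= gt_names[i].mask;
				matched = true;
			}
		}
		if (!matched)
			return -1;

		p += n;
		if (*p)
			p++;	/* step over the separator */
	}

	*out = mask;
	return 0;
}
```

With this table, "none" contributes no bits, so it behaves like the empty string, and "all" is equivalent to "primary,media". Whether "none" combined with other names should be rejected is left open here.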
> + *
> * Allowed engines:
> * ----------------
> *
> @@ -187,6 +227,7 @@ struct xe_config_group_device {
> struct config_group group;
>
> struct xe_config_device {
> + u64 gt_types_allowed;
> u64 engines_allowed;
> struct wa_bb ctx_restore_post_bb[XE_ENGINE_CLASS_MAX];
> struct wa_bb ctx_restore_mid_bb[XE_ENGINE_CLASS_MAX];
> @@ -201,6 +242,7 @@ struct xe_config_group_device {
> };
>
> static const struct xe_config_device device_defaults = {
> + .gt_types_allowed = U64_MAX,
> .engines_allowed = U64_MAX,
> .survivability_mode = false,
> .enable_psmi = false,
> @@ -220,6 +262,7 @@ struct engine_info {
> /* Some helpful macros to aid on the sizing of buffer allocation when parsing */
> #define MAX_ENGINE_CLASS_CHARS 5
> #define MAX_ENGINE_INSTANCE_CHARS 2
> +#define MAX_GT_TYPE_CHARS 7
>
> static const struct engine_info engine_info[] = {
> { .cls = "rcs", .mask = XE_HW_ENGINE_RCS_MASK, .engine_class = XE_ENGINE_CLASS_RENDER },
> @@ -230,6 +273,14 @@ static const struct engine_info engine_info[] = {
> { .cls = "gsccs", .mask = XE_HW_ENGINE_GSCCS_MASK, .engine_class = XE_ENGINE_CLASS_OTHER },
> };
>
> +static const struct {
> + const char name[MAX_GT_TYPE_CHARS + 1];
> + enum xe_gt_type type;
> +} gt_types[] = {
> + { .name = "primary", .type = XE_GT_TYPE_MAIN },
> + { .name = "media", .type = XE_GT_TYPE_MEDIA },
> +};
> +
> static struct xe_config_group_device *to_xe_config_group_device(struct config_item *item)
> {
> return container_of(to_config_group(item), struct xe_config_group_device, group);
> @@ -292,6 +343,58 @@ static ssize_t survivability_mode_store(struct config_item *item, const char *pa
> return len;
> }
>
> +static ssize_t gt_types_allowed_show(struct config_item *item, char *page)
> +{
> + struct xe_config_device *dev = to_xe_config_device(item);
> + char *p = page;
> +
> + for (size_t i = 0; i < ARRAY_SIZE(gt_types); i++)
> + if (dev->gt_types_allowed & BIT_ULL(gt_types[i].type))
> + p += sprintf(p, "%s\n", gt_types[i].name);
> +
> + return p - page;
> +}
> +
> +static ssize_t gt_types_allowed_store(struct config_item *item, const char *page,
> + size_t len)
> +{
> + struct xe_config_group_device *dev = to_xe_config_group_device(item);
> + char *buf __free(kfree) = kstrdup(page, GFP_KERNEL);
> + char *p = buf;
> + u64 typemask = 0;
> +
> + if (!buf)
> + return -ENOMEM;
> +
> + while (p) {
> + char *typename = strsep(&p, ",\n");
> + bool matched = false;
> +
> + if (typename[0] == '\0')
> + continue;
> +
> + for (size_t i = 0; i < ARRAY_SIZE(gt_types); i++) {
> + if (strcmp(typename, gt_types[i].name) == 0) {
> + typemask |= BIT(gt_types[i].type);
> + matched = true;
> + break;
> + }
> + }
> +
> + if (!matched)
> + return -EINVAL;
> + }
> +
> + scoped_guard(mutex, &dev->lock) {
probably plain guard(mutex) will work here too
> + if (is_bound(dev))
> + return -EBUSY;
then we can take the lock and return early, before parsing the input
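A rough userspace sketch of that ordering, under stated assumptions: a pthread mutex stands in for the kernel mutex, `struct cfg_dev`, `parse_single_type()` and `gt_types_store()` are hypothetical names, and the parser is reduced to a single exact name to keep the example short. In the kernel, `guard(mutex)(&dev->lock)` would release the lock on every return path, so the `goto` below would not be needed.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-in for the configfs device state. */
struct cfg_dev {
	pthread_mutex_t lock;
	bool bound;
	uint64_t gt_types_allowed;
};

/* Simplified parser (assumption): accepts exactly one known name. */
int parse_single_type(const char *page, uint64_t *mask)
{
	if (strcmp(page, "primary") == 0) {
		*mask = 1ULL << 1;
		return 0;
	}
	if (strcmp(page, "media") == 0) {
		*mask = 1ULL << 2;
		return 0;
	}
	return -1;
}

/* Take the lock first, bail out if bound, and only then parse the input. */
long gt_types_store(struct cfg_dev *dev, const char *page, size_t len)
{
	uint64_t mask;
	long ret = (long)len;

	pthread_mutex_lock(&dev->lock);

	if (dev->bound) {
		ret = -EBUSY;	/* no parsing work done for a bound device */
		goto out;
	}

	if (parse_single_type(page, &mask)) {
		ret = -EINVAL;
		goto out;
	}

	dev->gt_types_allowed = mask;
out:
	pthread_mutex_unlock(&dev->lock);
	return ret;
}
```

One visible consequence of this ordering is that a write to a bound device reports -EBUSY even when the input is malformed, whereas the posted patch would report -EINVAL first. The lock is also held while parsing, which should be harmless for input this small.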
> +
> + dev->config.gt_types_allowed = typemask;
> + }
> +
> + return len;
> +}
> +
> static ssize_t engines_allowed_show(struct config_item *item, char *page)
> {
> struct xe_config_device *dev = to_xe_config_device(item);
> @@ -672,6 +775,7 @@ CONFIGFS_ATTR(, ctx_restore_mid_bb);
> CONFIGFS_ATTR(, ctx_restore_post_bb);
> CONFIGFS_ATTR(, enable_psmi);
> CONFIGFS_ATTR(, engines_allowed);
> +CONFIGFS_ATTR(, gt_types_allowed);
> CONFIGFS_ATTR(, survivability_mode);
>
> static struct configfs_attribute *xe_config_device_attrs[] = {
> @@ -679,6 +783,7 @@ static struct configfs_attribute *xe_config_device_attrs[] = {
> &attr_ctx_restore_post_bb,
> &attr_enable_psmi,
> &attr_engines_allowed,
> + &attr_gt_types_allowed,
> &attr_survivability_mode,
> NULL,
> };
> @@ -846,6 +951,7 @@ static void dump_custom_dev_config(struct pci_dev *pdev,
> dev->config.attr_); \
> } while (0)
>
> + PRI_CUSTOM_ATTR("%llx", gt_types_allowed);
> PRI_CUSTOM_ATTR("%llx", engines_allowed);
> PRI_CUSTOM_ATTR("%d", enable_psmi);
> PRI_CUSTOM_ATTR("%d", survivability_mode);
> @@ -896,6 +1002,45 @@ bool xe_configfs_get_survivability_mode(struct pci_dev *pdev)
> return mode;
> }
>
> +static u64 get_gt_types_allowed(struct xe_device *xe)
> +{
> + struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
> + struct xe_config_group_device *dev = find_xe_config_group_device(pdev);
> + u64 mask;
> +
> + if (!dev)
> + return device_defaults.gt_types_allowed;
> +
> + mask = dev->config.gt_types_allowed;
btw, as we are using guard during write, shouldn't we also guard during read?
> + config_group_put(&dev->group);
> +
> + return mask;
> +}
> +
> +/**
> + * xe_configfs_primary_gt_supported - determine whether primary GTs are supported
> + * @xe: xe device
> + *
> + * Return: True if primary GTs are enabled, false if they have been disabled via
> + * configfs.
> + */
> +bool xe_configfs_primary_gt_supported(struct xe_device *xe)
> +{
> + return (get_gt_types_allowed(xe) & BIT_ULL(XE_GT_TYPE_MAIN)) != 0;
can't we just rely on the implicit conversion to bool?
return get_gt_types_allowed(xe) & BIT_ULL(XE_GT_TYPE_MAIN);
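For reference, a small userspace sketch of why the shorter form should be safe: converting a scalar to `bool` in C yields true for any nonzero value, regardless of which bit is set. `BIT_ULL()` here is a local stand-in for the kernel macro, and `has_bit_bool()` / `has_bit_int()` are hypothetical helpers for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT_ULL() macro. */
#define BIT_ULL(n) (1ULL << (n))

/* Returning a u64 expression from a bool function: nonzero becomes true. */
bool has_bit_bool(uint64_t mask, unsigned int bit)
{
	return mask & BIT_ULL(bit);
}

/*
 * Contrast: with an int return type the masked value is converted to int,
 * so the caller gets the raw bit value rather than 0 or 1, and bits above
 * the int range do not survive in a well-defined way.
 */
int has_bit_int(uint64_t mask, unsigned int bit)
{
	return mask & BIT_ULL(bit);
}
```

So the explicit `!= 0` only matters when the return type is not `bool`; with `bool`, as in the two helpers in this patch, both spellings give the same result.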
> +}
> +
> +/**
> + * xe_configfs_media_gt_supported - determine whether media GTs are supported
> + * @xe: xe device
> + *
> + * Return: True if the media GTs are enabled, false if they have been disabled
> + * via configfs.
> + */
> +bool xe_configfs_media_gt_supported(struct xe_device *xe)
> +{
> + return (get_gt_types_allowed(xe) & BIT_ULL(XE_GT_TYPE_MEDIA)) != 0;
> +}
> +
> /**
> * xe_configfs_get_engines_allowed - get engine allowed mask from configfs
> * @pdev: pci device
> diff --git a/drivers/gpu/drm/xe/xe_configfs.h b/drivers/gpu/drm/xe/xe_configfs.h
> index c61e0e47ed94..5624e965b911 100644
> --- a/drivers/gpu/drm/xe/xe_configfs.h
> +++ b/drivers/gpu/drm/xe/xe_configfs.h
> @@ -17,6 +17,8 @@ int xe_configfs_init(void);
> void xe_configfs_exit(void);
> void xe_configfs_check_device(struct pci_dev *pdev);
> bool xe_configfs_get_survivability_mode(struct pci_dev *pdev);
> +bool xe_configfs_primary_gt_supported(struct xe_device *xe);
> +bool xe_configfs_media_gt_supported(struct xe_device *xe);
I guess we need to decide now whether we want to continue to pass pdev or switch to xe as the argument for all xe_configfs functions
> u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev);
> bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev);
> u32 xe_configfs_get_ctx_restore_mid_bb(struct pci_dev *pdev, enum xe_engine_class,
> @@ -28,6 +30,8 @@ static inline int xe_configfs_init(void) { return 0; }
> static inline void xe_configfs_exit(void) { }
> static inline void xe_configfs_check_device(struct pci_dev *pdev) { }
> static inline bool xe_configfs_get_survivability_mode(struct pci_dev *pdev) { return false; }
> +static inline bool xe_configfs_primary_gt_supported(struct xe_device *xe) { return true; }
> +static inline bool xe_configfs_media_gt_supported(struct xe_device *xe) { return true; }
> static inline u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev) { return U64_MAX; }
> static inline bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev) { return false; }
> static inline u32 xe_configfs_get_ctx_restore_mid_bb(struct pci_dev *pdev, enum xe_engine_class,
> diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
> index a5932e4f4a23..9c8ab2b41737 100644
> --- a/drivers/gpu/drm/xe/xe_pci.c
> +++ b/drivers/gpu/drm/xe/xe_pci.c
> @@ -695,6 +695,11 @@ static struct xe_gt *alloc_primary_gt(struct xe_tile *tile,
> struct xe_device *xe = tile_to_xe(tile);
> struct xe_gt *gt;
>
> + if (!xe_configfs_primary_gt_supported(xe)) {
> + drm_info(&xe->drm, "Primary GT disabled via configfs\n");
nit: you can use xe_info(xe, "...") now
> + return NULL;
> + }
> +
> gt = xe_gt_alloc(tile);
> if (IS_ERR(gt))
> return gt;
> @@ -720,6 +725,11 @@ static struct xe_gt *alloc_media_gt(struct xe_tile *tile,
> struct xe_device *xe = tile_to_xe(tile);
> struct xe_gt *gt;
>
> + if (!xe_configfs_media_gt_supported(xe)) {
> + drm_info(&xe->drm, "Media GT disabled via configfs\n");
> + return NULL;
> + }
> +
> if (MEDIA_VER(xe) < 13 || !media_desc)
> return NULL;
>
> @@ -829,6 +839,18 @@ static int xe_info_init(struct xe_device *xe,
> if (IS_ERR(tile->primary_gt))
> return PTR_ERR(tile->primary_gt);
>
> + /*
> + * It's not currently possible to probe a device with the
> + * primary GT disabled. With some work, this may be future in
> + * the possible for igpu platforms (although probably not for
> + * dgpu's since access to the primary GT's BCS engines is
> + * required for VRAM management).
> + */
> + if (!tile->primary_gt) {
> + drm_err(&xe->drm, "Cannot probe device with without a primary GT\n");
> + return -ENODEV;
> + }
> +
> tile->media_gt = alloc_media_gt(tile, media_desc);
> if (IS_ERR(tile->media_gt))
> return PTR_ERR(tile->media_gt);