Intel-XE Archive on lore.kernel.org
From: Gustavo Sousa <gustavo.sousa@intel.com>
To: Matt Roper <matthew.d.roper@intel.com>, <intel-xe@lists.freedesktop.org>
Cc: <matthew.d.roper@intel.com>
Subject: Re: [PATCH v4 22/23] drm/xe/configfs: Add attribute to disable GT types
Date: Wed, 8 Oct 2025 11:06:14 -0300
Message-ID: <175993237464.201562.8329727487094559284@intel.com>
In-Reply-To: <20251007204829.1468209-47-matthew.d.roper@intel.com>

Quoting Matt Roper (2025-10-07 17:48:52-03:00)
>Preventing the driver from initializing GTs of specific type(s) can be
>useful for debugging and early hardware bringup.  Add a configfs
>attribute to allow this kind of control for debugging.
>
>With today's platforms and software design, this configuration setting
>is only effective for disabling the media GT since the driver currently
>requires that there always be a primary GT to probe the device.  However
>this might change in the future ---  in theory it should be possible
>(with some additional driver work) to allow an igpu device to come up
>with only the media GT and no primary GT.  Or to allow an igpu device to
>come up with no GTs at all (for display-only usage).  A primary GT will
>likely always be required on dgpu platforms because we rely on the BCS
>engines inside the primary GT for various vram operations.
>
>v2:
> - Expand/clarify kerneldoc for configfs attribute.  (Gustavo)
> - Tighten type usage in gt_types[] structure.  (Gustavo)
> - Adjust string parsing/name matching to match exact GT names and not
>   accept partial names.  (Gustavo)
>
>v3:
> - Switch to scope-based cleanup in gt_types_allowed_store() to fix a
>   leak if the device is already bound.  (Gustavo)
> - Switch configfs lookup interface to two boolean functions that
>   specify whether primary/media are supported rather than one function
>   that returns a mask.  This is simpler to use and understand.
>
>Cc: Gustavo Sousa <gustavo.sousa@intel.com>
>Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
>---
> drivers/gpu/drm/xe/xe_configfs.c | 145 +++++++++++++++++++++++++++++++
> drivers/gpu/drm/xe/xe_configfs.h |   4 +
> drivers/gpu/drm/xe/xe_pci.c      |  22 +++++
> 3 files changed, 171 insertions(+)
>
>diff --git a/drivers/gpu/drm/xe/xe_configfs.c b/drivers/gpu/drm/xe/xe_configfs.c
>index 139663423185..e36cc5e1bc8f 100644
>--- a/drivers/gpu/drm/xe/xe_configfs.c
>+++ b/drivers/gpu/drm/xe/xe_configfs.c
>@@ -15,6 +15,7 @@
> 
> #include "instructions/xe_mi_commands.h"
> #include "xe_configfs.h"
>+#include "xe_gt_types.h"
> #include "xe_hw_engine_types.h"
> #include "xe_module.h"
> #include "xe_pci_types.h"
>@@ -56,6 +57,7 @@
>  *        :
>  *        └── 0000:03:00.0
>  *            ├── survivability_mode
>+ *            ├── gt_types_allowed
>  *            ├── engines_allowed
>  *            └── enable_psmi
>  *
>@@ -79,6 +81,44 @@
>  *
>  * This attribute can only be set before binding to the device.
>  *
>+ * Allowed GT types:
>+ * -----------------
>+ *
>+ * Allow only specific types of GTs to be detected and initialized by the
>+ * driver.  Any combination of GT types can be enabled/disabled, although
>+ * some settings will cause the device to fail to probe.
>+ *
>+ * Writes support both comma- and newline-separated input format. Reads
>+ * will always return one GT type per line. "primary" and "media" are the
>+ * GT type names supported by this interface.
>+ *
>+ * This attribute can only be set before binding to the device.
>+ *
>+ * Examples:
>+ *
>+ * Allow both primary and media GTs to be initialized and used.  This matches
>+ * the driver's default behavior::
>+ *
>+ *        # echo 'primary,media' > /sys/kernel/config/xe/0000:03:00.0/gt_types_allowed
>+ *
>+ * Allow only the primary GT of each tile to be initialized and used,
>+ * effectively disabling the media GT if it exists on the platform::
>+ *
>+ *        # echo 'primary' > /sys/kernel/config/xe/0000:03:00.0/gt_types_allowed
>+ *
>+ * Allow only the media GT of each tile to be initialized and used,
>+ * effectively disabling the primary GT.  **This configuration will cause
>+ * device probe failure on all current platforms, but may be allowed on
>+ * igpu platforms in the future**::
>+ *
>+ *        # echo 'media' > /sys/kernel/config/xe/0000:03:00.0/gt_types_allowed
>+ *
>+ * Disable all GTs.  Only other GPU IP (such as display) is potentially usable.
>+ * **This configuration will cause device probe failure on all current
>+ * platforms, but may be allowed on igpu platforms in the future**::
>+ *
>+ *        # echo '' > /sys/kernel/config/xe/0000:03:00.0/gt_types_allowed
>+ *
>  * Allowed engines:
>  * ----------------
>  *
>@@ -187,6 +227,7 @@ struct xe_config_group_device {
>         struct config_group group;
> 
>         struct xe_config_device {
>+                u64 gt_types_allowed;
>                 u64 engines_allowed;
>                 struct wa_bb ctx_restore_post_bb[XE_ENGINE_CLASS_MAX];
>                 struct wa_bb ctx_restore_mid_bb[XE_ENGINE_CLASS_MAX];
>@@ -201,6 +242,7 @@ struct xe_config_group_device {
> };
> 
> static const struct xe_config_device device_defaults = {
>+        .gt_types_allowed = U64_MAX,
>         .engines_allowed = U64_MAX,
>         .survivability_mode = false,
>         .enable_psmi = false,
>@@ -220,6 +262,7 @@ struct engine_info {
> /* Some helpful macros to aid on the sizing of buffer allocation when parsing */
> #define MAX_ENGINE_CLASS_CHARS 5
> #define MAX_ENGINE_INSTANCE_CHARS 2
>+#define MAX_GT_TYPE_CHARS 7
> 
> static const struct engine_info engine_info[] = {
>         { .cls = "rcs", .mask = XE_HW_ENGINE_RCS_MASK, .engine_class = XE_ENGINE_CLASS_RENDER },
>@@ -230,6 +273,14 @@ static const struct engine_info engine_info[] = {
>         { .cls = "gsccs", .mask = XE_HW_ENGINE_GSCCS_MASK, .engine_class = XE_ENGINE_CLASS_OTHER },
> };
> 
>+static const struct {
>+        const char name[MAX_GT_TYPE_CHARS + 1];
>+        enum xe_gt_type type;
>+} gt_types[] = {
>+        { .name = "primary", .type = XE_GT_TYPE_MAIN },
>+        { .name = "media", .type = XE_GT_TYPE_MEDIA },
>+};
>+
> static struct xe_config_group_device *to_xe_config_group_device(struct config_item *item)
> {
>         return container_of(to_config_group(item), struct xe_config_group_device, group);
>@@ -292,6 +343,58 @@ static ssize_t survivability_mode_store(struct config_item *item, const char *pa
>         return len;
> }
> 
>+static ssize_t gt_types_allowed_show(struct config_item *item, char *page)
>+{
>+        struct xe_config_device *dev = to_xe_config_device(item);
>+        char *p = page;
>+
>+        for (size_t i = 0; i < ARRAY_SIZE(gt_types); i++)
>+                if (dev->gt_types_allowed & BIT_ULL(gt_types[i].type))
>+                        p += sprintf(p, "%s\n", gt_types[i].name);
>+
>+        return p - page;
>+}
>+
>+static ssize_t gt_types_allowed_store(struct config_item *item, const char *page,
>+                                      size_t len)
>+{
>+        struct xe_config_group_device *dev = to_xe_config_group_device(item);
>+        char *buf __free(kfree) = kstrdup(page, GFP_KERNEL);
>+        char *p = buf;
>+        u64 typemask = 0;
>+
>+        if (!buf)
>+                return -ENOMEM;
>+
>+        while (p) {
>+                char *typename = strsep(&p, ",\n");
>+                bool matched = false;
>+
>+                if (typename[0] == '\0')
>+                        continue;
>+
>+                for (size_t i = 0; i < ARRAY_SIZE(gt_types); i++) {
>+                        if (strcmp(typename, gt_types[i].name) == 0) {
>+                                typemask |= BIT_ULL(gt_types[i].type);
>+                                matched = true;
>+                                break;
>+                        }
>+                }
>+
>+                if (!matched)
>+                        return -EINVAL;
>+        }
>+
>+        scoped_guard(mutex, &dev->lock) {
>+                if (is_bound(dev))
>+                        return -EBUSY;
>+
>+                dev->config.gt_types_allowed = typemask;
>+        }
>+
>+        return len;
>+}
>+
> static ssize_t engines_allowed_show(struct config_item *item, char *page)
> {
>         struct xe_config_device *dev = to_xe_config_device(item);
>@@ -672,6 +775,7 @@ CONFIGFS_ATTR(, ctx_restore_mid_bb);
> CONFIGFS_ATTR(, ctx_restore_post_bb);
> CONFIGFS_ATTR(, enable_psmi);
> CONFIGFS_ATTR(, engines_allowed);
>+CONFIGFS_ATTR(, gt_types_allowed);
> CONFIGFS_ATTR(, survivability_mode);
> 
> static struct configfs_attribute *xe_config_device_attrs[] = {
>@@ -679,6 +783,7 @@ static struct configfs_attribute *xe_config_device_attrs[] = {
>         &attr_ctx_restore_post_bb,
>         &attr_enable_psmi,
>         &attr_engines_allowed,
>+        &attr_gt_types_allowed,
>         &attr_survivability_mode,
>         NULL,
> };
>@@ -846,6 +951,7 @@ static void dump_custom_dev_config(struct pci_dev *pdev,
>                                  dev->config.attr_); \
>         } while (0)
> 
>+        PRI_CUSTOM_ATTR("%llx", gt_types_allowed);
>         PRI_CUSTOM_ATTR("%llx", engines_allowed);
>         PRI_CUSTOM_ATTR("%d", enable_psmi);
>         PRI_CUSTOM_ATTR("%d", survivability_mode);
>@@ -896,6 +1002,45 @@ bool xe_configfs_get_survivability_mode(struct pci_dev *pdev)
>         return mode;
> }
> 
>+static u64 get_gt_types_allowed(struct xe_device *xe)
>+{
>+        struct pci_dev *pdev = to_pci_dev(xe->drm.dev);
>+        struct xe_config_group_device *dev = find_xe_config_group_device(pdev);
>+        u64 mask;
>+
>+        if (!dev)
>+                return device_defaults.gt_types_allowed;
>+
>+        mask = dev->config.gt_types_allowed;
>+        config_group_put(&dev->group);
>+
>+        return mask;
>+}
>+
>+/**
>+ * xe_configfs_primary_gt_supported - determine whether primary GTs are supported
>+ * @xe: xe device
>+ *
>+ * Return: True if primary GTs are enabled, false if they have been disabled via
>+ *     configfs.
>+ */
>+bool xe_configfs_primary_gt_supported(struct xe_device *xe)

Nitpick: I think it would be more precise if we used _allowed instead of
_supported here...
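
To make the naming point concrete, here is a minimal userspace sketch of
how the two helpers might read with an _allowed suffix.  The function
names, the gt_config stand-in struct, and the enum values are assumptions
for illustration only; they are not taken from the patch or from any
later revision.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for kernel definitions so the sketch builds in userspace. */
#define BIT_ULL(n) (1ULL << (n))

enum xe_gt_type {
	XE_GT_TYPE_MAIN = 1,	/* numeric values are assumed for the sketch */
	XE_GT_TYPE_MEDIA = 2,
};

/* Hypothetical stand-in for the per-device configfs state. */
struct gt_config {
	uint64_t gt_types_allowed;
};

/*
 * Hypothetical "_allowed" spelling of xe_configfs_primary_gt_supported():
 * the name states what the user permitted, not what the hardware supports.
 */
bool primary_gt_allowed(const struct gt_config *cfg)
{
	return (cfg->gt_types_allowed & BIT_ULL(XE_GT_TYPE_MAIN)) != 0;
}

/* Same idea for the media GT. */
bool media_gt_allowed(const struct gt_config *cfg)
{
	return (cfg->gt_types_allowed & BIT_ULL(XE_GT_TYPE_MEDIA)) != 0;
}
```

With the mask left at its all-ones default both helpers return true;
clearing the media bit makes only media_gt_allowed() return false.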

I see there is some feedback from Lucas and Michal that could be
incorporated.  That said, the patch in its current state already looks
good to me, so:

Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>

Feel free to keep the r-b when applying any of their suggestions.

--
Gustavo Sousa

>+{
>+        return (get_gt_types_allowed(xe) & BIT_ULL(XE_GT_TYPE_MAIN)) != 0;
>+}
>+
>+/**
>+ * xe_configfs_media_gt_supported - determine whether media GTs are supported
>+ * @xe: xe device
>+ *
>+ * Return: True if the media GTs are enabled, false if they have been disabled
>+ *     via configfs.
>+ */
>+bool xe_configfs_media_gt_supported(struct xe_device *xe)
>+{
>+        return (get_gt_types_allowed(xe) & BIT_ULL(XE_GT_TYPE_MEDIA)) != 0;
>+}
>+
> /**
>  * xe_configfs_get_engines_allowed - get engine allowed mask from configfs
>  * @pdev: pci device
>diff --git a/drivers/gpu/drm/xe/xe_configfs.h b/drivers/gpu/drm/xe/xe_configfs.h
>index c61e0e47ed94..5624e965b911 100644
>--- a/drivers/gpu/drm/xe/xe_configfs.h
>+++ b/drivers/gpu/drm/xe/xe_configfs.h
>@@ -17,6 +17,8 @@ int xe_configfs_init(void);
> void xe_configfs_exit(void);
> void xe_configfs_check_device(struct pci_dev *pdev);
> bool xe_configfs_get_survivability_mode(struct pci_dev *pdev);
>+bool xe_configfs_primary_gt_supported(struct xe_device *xe);
>+bool xe_configfs_media_gt_supported(struct xe_device *xe);
> u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev);
> bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev);
> u32 xe_configfs_get_ctx_restore_mid_bb(struct pci_dev *pdev, enum xe_engine_class,
>@@ -28,6 +30,8 @@ static inline int xe_configfs_init(void) { return 0; }
> static inline void xe_configfs_exit(void) { }
> static inline void xe_configfs_check_device(struct pci_dev *pdev) { }
> static inline bool xe_configfs_get_survivability_mode(struct pci_dev *pdev) { return false; }
>+static inline bool xe_configfs_primary_gt_supported(struct xe_device *xe) { return true; }
>+static inline bool xe_configfs_media_gt_supported(struct xe_device *xe) { return true; }
> static inline u64 xe_configfs_get_engines_allowed(struct pci_dev *pdev) { return U64_MAX; }
> static inline bool xe_configfs_get_psmi_enabled(struct pci_dev *pdev) { return false; }
> static inline u32 xe_configfs_get_ctx_restore_mid_bb(struct pci_dev *pdev, enum xe_engine_class,
>diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
>index a5932e4f4a23..9c8ab2b41737 100644
>--- a/drivers/gpu/drm/xe/xe_pci.c
>+++ b/drivers/gpu/drm/xe/xe_pci.c
>@@ -695,6 +695,11 @@ static struct xe_gt *alloc_primary_gt(struct xe_tile *tile,
>         struct xe_device *xe = tile_to_xe(tile);
>         struct xe_gt *gt;
> 
>+        if (!xe_configfs_primary_gt_supported(xe)) {
>+                drm_info(&xe->drm, "Primary GT disabled via configfs\n");
>+                return NULL;
>+        }
>+
>         gt = xe_gt_alloc(tile);
>         if (IS_ERR(gt))
>                 return gt;
>@@ -720,6 +725,11 @@ static struct xe_gt *alloc_media_gt(struct xe_tile *tile,
>         struct xe_device *xe = tile_to_xe(tile);
>         struct xe_gt *gt;
> 
>+        if (!xe_configfs_media_gt_supported(xe)) {
>+                drm_info(&xe->drm, "Media GT disabled via configfs\n");
>+                return NULL;
>+        }
>+
>         if (MEDIA_VER(xe) < 13 || !media_desc)
>                 return NULL;
> 
>@@ -829,6 +839,18 @@ static int xe_info_init(struct xe_device *xe,
>                 if (IS_ERR(tile->primary_gt))
>                         return PTR_ERR(tile->primary_gt);
> 
>+                /*
>+                 * It's not currently possible to probe a device with the
>+                 * primary GT disabled.  With some work, this may be possible in
>+                 * the future for igpu platforms (although probably not for
>+                 * dgpu's since access to the primary GT's BCS engines is
>+                 * required for VRAM management).
>+                 */
>+                if (!tile->primary_gt) {
>+                        drm_err(&xe->drm, "Cannot probe device without a primary GT\n");
>+                        return -ENODEV;
>+                }
>+
>                 tile->media_gt = alloc_media_gt(tile, media_desc);
>                 if (IS_ERR(tile->media_gt))
>                         return PTR_ERR(tile->media_gt);
>-- 
>2.51.0
>
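
For readers who want to try the input rules outside the kernel, the
following is a rough userspace sketch of the parsing behaviour described
in the kerneldoc above: comma- or newline-separated names, exact matches
only, empty tokens ignored.  It is not the patch's code: it scans with
strcspn() instead of copying the buffer and calling strsep(), and the
parse_gt_types() name, the enum values, and the -1 error return are
assumptions made for the sketch.

```c
#include <stdint.h>
#include <string.h>

#define BIT_ULL(n) (1ULL << (n))

/* Illustrative stand-in for enum xe_gt_type; values are assumed. */
enum gt_type {
	GT_TYPE_MAIN = 1,
	GT_TYPE_MEDIA = 2,
};

static const struct {
	const char *name;
	enum gt_type type;
} gt_types[] = {
	{ "primary", GT_TYPE_MAIN },
	{ "media", GT_TYPE_MEDIA },
};

#define NUM_GT_TYPES (sizeof(gt_types) / sizeof(gt_types[0]))

/*
 * Parse a comma- or newline-separated list of GT type names into a mask.
 * Returns 0 and fills *mask_out on success, -1 on an unrecognized name.
 */
int parse_gt_types(const char *input, uint64_t *mask_out)
{
	const char *p = input;
	uint64_t mask = 0;

	while (*p) {
		size_t len = strcspn(p, ",\n");	/* length of the next token */
		size_t i;

		if (len > 0) {
			for (i = 0; i < NUM_GT_TYPES; i++) {
				/* Exact match only: lengths must agree too. */
				if (strlen(gt_types[i].name) == len &&
				    strncmp(p, gt_types[i].name, len) == 0) {
					mask |= BIT_ULL(gt_types[i].type);
					break;
				}
			}
			if (i == NUM_GT_TYPES)
				return -1;	/* unknown or partial name */
		}

		p += len;
		if (*p)
			p++;	/* step over the separator */
	}

	*mask_out = mask;
	return 0;
}
```

In this sketch "primary,media\n" yields both bits, an empty write yields
an empty mask, and a partial name such as "prim" is rejected.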

Thread overview: 45+ messages
2025-10-07 20:48 [PATCH v4 00/23] Allow configfs to disable specific GT type(s) Matt Roper
2025-10-07 20:48 ` [PATCH v4 01/23] drm/xe/huc: Adjust HuC check on primary GT Matt Roper
2025-10-07 20:48 ` [PATCH v4 02/23] drm/xe: Drop GT parameter to xe_display_irq_postinstall() Matt Roper
2025-10-07 20:48 ` [PATCH v4 03/23] drm/xe: Move 'va_bits' flag back to platform descriptor Matt Roper
2025-10-07 22:02   ` Lucas De Marchi
2025-10-07 22:44     ` Matt Roper
2025-10-07 20:48 ` [PATCH v4 04/23] drm/xe: Move 'vm_max_level' " Matt Roper
2025-10-07 21:54   ` Lucas De Marchi
2025-10-08 13:28   ` Gustavo Sousa
2025-10-07 20:48 ` [PATCH v4 05/23] drm/xe: Move 'vram_flags' " Matt Roper
2025-10-07 20:48 ` [PATCH v4 06/23] drm/xe: Move 'has_flatccs' " Matt Roper
2025-10-10 10:50   ` Jani Nikula
2025-10-13 16:42     ` Matt Roper
2025-10-07 20:48 ` [PATCH v4 07/23] drm/xe: Read VF GMD_ID with a specifically-allocated dummy GT Matt Roper
2025-10-08  3:06   ` Lucas De Marchi
2025-10-07 20:48 ` [PATCH v4 08/23] drm/xe: Move primary GT allocation from xe_tile_init_early to xe_tile_init Matt Roper
2025-10-07 20:48 ` [PATCH v4 09/23] drm/xe: Skip L2 / TDF cache flushes if primary GT is disabled Matt Roper
2025-10-07 20:48 ` [PATCH v4 10/23] drm/xe/query: Report hwconfig size as 0 " Matt Roper
2025-10-07 20:48 ` [PATCH v4 11/23] drm/xe/pmu: Initialize PMU event types based on first available GT Matt Roper
2025-10-07 20:48 ` [PATCH v4 12/23] drm/xe: Check for primary GT before looking up Wa_22019338487 Matt Roper
2025-10-08 13:30   ` Gustavo Sousa
2025-10-07 20:48 ` [PATCH v4 13/23] drm/xe: Make display part of Wa_22019338487 a device workaround Matt Roper
2025-10-07 20:48 ` [PATCH v4 14/23] drm/xe/irq: Don't try to lookup engine masks for non-existent primary GT Matt Roper
2025-10-07 20:48 ` [PATCH v4 15/23] drm/xe: Handle Wa_22010954014 and Wa_14022085890 as device workarounds Matt Roper
2025-10-07 20:48 ` [PATCH v4 16/23] drm/xe/rtp: Pass xe_device parameter to FUNC matches Matt Roper
2025-10-07 20:48 ` [PATCH v4 17/23] drm/xe: Bypass Wa_14018094691 when primary GT is disabled Matt Roper
2025-10-07 20:48 ` [PATCH v4 18/23] drm/xe: Correct lineage for Wa_22014953428 and only check with valid GT Matt Roper
2025-10-07 20:48 ` [PATCH v4 19/23] drm/xe: Check that GT is not NULL before testing Wa_16023588340 Matt Roper
2025-10-07 20:48 ` [PATCH v4 20/23] drm/xe: Don't check BIOS-disabled FlatCCS if primary GT is disabled Matt Roper
2025-10-07 20:48 ` [PATCH v4 21/23] drm/xe: Break GT setup out of xe_info_init() Matt Roper
2025-10-08  3:15   ` Lucas De Marchi
2025-10-08 13:39   ` Gustavo Sousa
2025-10-07 20:48 ` [PATCH v4 22/23] drm/xe/configfs: Add attribute to disable GT types Matt Roper
2025-10-08  3:37   ` Lucas De Marchi
2025-10-08 19:10     ` Matt Roper
2025-10-08 19:22       ` Lucas De Marchi
2025-10-08 10:12   ` Michal Wajdeczko
2025-10-08 20:08     ` Matt Roper
2025-10-08 21:10       ` Lucas De Marchi
2025-10-08 14:06   ` Gustavo Sousa [this message]
2025-10-07 20:48 ` [PATCH v4 23/23] drm/xe/sriov: Disable SR-IOV if primary GT is disabled via configfs Matt Roper
2025-10-07 20:56 ` ✗ CI.checkpatch: warning for Allow configfs to disable specific GT type(s) (rev4) Patchwork
2025-10-07 20:57 ` ✓ CI.KUnit: success " Patchwork
2025-10-07 21:49 ` ✓ Xe.CI.BAT: " Patchwork
2025-10-07 23:22 ` ✗ Xe.CI.Full: failure " Patchwork
