From: Matt Roper <matthew.d.roper@intel.com>
To: intel-xe@lists.freedesktop.org
Cc: matthew.d.roper@intel.com,
Michal Wajdeczko <michal.wajdeczko@intel.com>,
Lucas De Marchi <lucas.demarchi@intel.com>
Subject: [PATCH v5 07/23] drm/xe: Read VF GMD_ID with a specifically-allocated dummy GT
Date: Mon, 13 Oct 2025 13:09:50 -0700 [thread overview]
Message-ID: <20251013200944.2499947-32-matthew.d.roper@intel.com> (raw)
In-Reply-To: <20251013200944.2499947-25-matthew.d.roper@intel.com>
SRIOV VF initialization has a bit of a chicken and egg design problem.
Determining the IP version of the graphics and media IPs can't be done
via direct register reads as it is on PF or native and instead requires
querying the GuC. However initialization of the GT, including its GuC,
needs to wait until after we know the IP versions so that the proper
initialization steps for the platform/IP are followed.
Currently the (somewhat hacky) solution is to manually fill out just
enough fields in tile 0's primary GT structure to make it look as if the
GT has been initialized so that the GuC can be partially initialized and
queried to obtain the GMD_ID values. When the GT gets properly
initialized during the regular flows, the hacked-up values will get
overwritten as part of the general initialization flows.
Rather than using tile 0's primary GT structure to hold the hacked up
values for querying every GT on every tile, instead allocate a dedicated
dummy structure. This will allow us to move the tile->primary_gt's
allocation to a more consistent place later in the initialization flow
in future patches (i.e., we shouldn't even allocate this GT structure if
the GT is disabled/unavailable). It also helps ensure there can't be
any accidental leakage of initialization or state between the dummy
initialization for GMD_ID and the real driver initialization of the GT.
v2:
- Initialize gt->tile for temporary GT. (CI, Michal)
- Use scope-based cleanup handler to free temp GT. (Michal)
- Propagate actual error code from xe_gt_sriov_vf_bootstrap() rather
than just setting IP version to 0.0 now that read_gmdid() can return
an error. (Michal)
v3:
- Explicitly initialize gt to NULL, just in case something else gets
inserted before the kzalloc() in the future. (Lucas)
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
drivers/gpu/drm/xe/tests/xe_pci.c | 6 ++-
drivers/gpu/drm/xe/xe_pci.c | 70 ++++++++++++++++++-------------
2 files changed, 45 insertions(+), 31 deletions(-)
diff --git a/drivers/gpu/drm/xe/tests/xe_pci.c b/drivers/gpu/drm/xe/tests/xe_pci.c
index 69e2840c7ef0..0b1401b88cb8 100644
--- a/drivers/gpu/drm/xe/tests/xe_pci.c
+++ b/drivers/gpu/drm/xe/tests/xe_pci.c
@@ -307,8 +307,8 @@ const void *xe_pci_id_gen_param(struct kunit *test, const void *prev, char *desc
}
EXPORT_SYMBOL_IF_KUNIT(xe_pci_id_gen_param);
-static void fake_read_gmdid(struct xe_device *xe, enum xe_gmdid_type type,
- u32 *ver, u32 *revid)
+static int fake_read_gmdid(struct xe_device *xe, enum xe_gmdid_type type,
+ u32 *ver, u32 *revid)
{
struct kunit *test = kunit_get_current_test();
struct xe_pci_fake_data *data = test->priv;
@@ -320,6 +320,8 @@ static void fake_read_gmdid(struct xe_device *xe, enum xe_gmdid_type type,
*ver = data->graphics_verx100;
*revid = xe_step_to_gmdid(data->step.graphics);
}
+
+ return 0;
}
static void fake_xe_info_probe_tile_count(struct xe_device *xe)
diff --git a/drivers/gpu/drm/xe/xe_pci.c b/drivers/gpu/drm/xe/xe_pci.c
index 076d944cce6a..7fb960b61bf4 100644
--- a/drivers/gpu/drm/xe/xe_pci.c
+++ b/drivers/gpu/drm/xe/xe_pci.c
@@ -466,7 +466,7 @@ enum xe_gmdid_type {
GMDID_MEDIA
};
-static void read_gmdid(struct xe_device *xe, enum xe_gmdid_type type, u32 *ver, u32 *revid)
+static int read_gmdid(struct xe_device *xe, enum xe_gmdid_type type, u32 *ver, u32 *revid)
{
struct xe_mmio *mmio = xe_root_tile_mmio(xe);
struct xe_reg gmdid_reg = GMD_ID;
@@ -475,22 +475,24 @@ static void read_gmdid(struct xe_device *xe, enum xe_gmdid_type type, u32 *ver,
KUNIT_STATIC_STUB_REDIRECT(read_gmdid, xe, type, ver, revid);
if (IS_SRIOV_VF(xe)) {
- struct xe_gt *gt = xe_root_mmio_gt(xe);
-
/*
* To get the value of the GMDID register, VFs must obtain it
* from the GuC using MMIO communication.
*
- * Note that at this point the xe_gt is not fully uninitialized
- * and only basic access to MMIO registers is possible. To use
- * our existing GuC communication functions we must perform at
- * least basic xe_gt and xe_guc initialization.
- *
- * Since to obtain the value of GMDID_MEDIA we need to use the
- * media GuC, temporarily tweak the gt type.
+ * Note that at this point the GTs are not initialized and only
+ * tile-level access to MMIO registers is possible. To use our
+ * existing GuC communication functions we must create a dummy
+ * GT structure and perform at least basic xe_gt and xe_guc
+ * initialization.
*/
- xe_gt_assert(gt, gt->info.type == XE_GT_TYPE_UNINITIALIZED);
+ struct xe_gt *gt __free(kfree) = NULL;
+ int err;
+ gt = kzalloc(sizeof(*gt), GFP_KERNEL);
+ if (!gt)
+ return -ENOMEM;
+
+ gt->tile = &xe->tiles[0];
if (type == GMDID_MEDIA) {
gt->info.id = 1;
gt->info.type = XE_GT_TYPE_MEDIA;
@@ -502,15 +504,11 @@ static void read_gmdid(struct xe_device *xe, enum xe_gmdid_type type, u32 *ver,
xe_gt_mmio_init(gt);
xe_guc_comm_init_early(>->uc.guc);
- /* Don't bother with GMDID if failed to negotiate the GuC ABI */
- val = xe_gt_sriov_vf_bootstrap(gt) ? 0 : xe_gt_sriov_vf_gmdid(gt);
+ err = xe_gt_sriov_vf_bootstrap(gt);
+ if (err)
+ return err;
- /*
- * Only undo xe_gt.info here, the remaining changes made above
- * will be overwritten as part of the regular initialization.
- */
- gt->info.id = 0;
- gt->info.type = XE_GT_TYPE_UNINITIALIZED;
+ val = xe_gt_sriov_vf_gmdid(gt);
} else {
/*
* GMD_ID is a GT register, but at this point in the driver
@@ -528,6 +526,8 @@ static void read_gmdid(struct xe_device *xe, enum xe_gmdid_type type, u32 *ver,
*ver = REG_FIELD_GET(GMD_ID_ARCH_MASK, val) * 100 + REG_FIELD_GET(GMD_ID_RELEASE_MASK, val);
*revid = REG_FIELD_GET(GMD_ID_REVID, val);
+
+ return 0;
}
static const struct xe_ip *find_graphics_ip(unsigned int verx100)
@@ -554,18 +554,21 @@ static const struct xe_ip *find_media_ip(unsigned int verx100)
* Read IP version from hardware and select graphics/media IP descriptors
* based on the result.
*/
-static void handle_gmdid(struct xe_device *xe,
- const struct xe_ip **graphics_ip,
- const struct xe_ip **media_ip,
- u32 *graphics_revid,
- u32 *media_revid)
+static int handle_gmdid(struct xe_device *xe,
+ const struct xe_ip **graphics_ip,
+ const struct xe_ip **media_ip,
+ u32 *graphics_revid,
+ u32 *media_revid)
{
u32 ver;
+ int ret;
*graphics_ip = NULL;
*media_ip = NULL;
- read_gmdid(xe, GMDID_GRAPHICS, &ver, graphics_revid);
+ ret = read_gmdid(xe, GMDID_GRAPHICS, &ver, graphics_revid);
+ if (ret)
+ return ret;
*graphics_ip = find_graphics_ip(ver);
if (!*graphics_ip) {
@@ -573,16 +576,21 @@ static void handle_gmdid(struct xe_device *xe,
ver / 100, ver % 100);
}
- read_gmdid(xe, GMDID_MEDIA, &ver, media_revid);
+ ret = read_gmdid(xe, GMDID_MEDIA, &ver, media_revid);
+ if (ret)
+ return ret;
+
/* Media may legitimately be fused off / not present */
if (ver == 0)
- return;
+ return 0;
*media_ip = find_media_ip(ver);
if (!*media_ip) {
drm_err(&xe->drm, "Hardware reports unknown media version %u.%02u\n",
ver / 100, ver % 100);
}
+
+ return 0;
}
/*
@@ -693,6 +701,7 @@ static int xe_info_init(struct xe_device *xe,
const struct xe_media_desc *media_desc;
struct xe_tile *tile;
struct xe_gt *gt;
+ int ret;
u8 id;
/*
@@ -708,8 +717,11 @@ static int xe_info_init(struct xe_device *xe,
xe->info.step = xe_step_pre_gmdid_get(xe);
} else {
xe_assert(xe, !desc->pre_gmdid_media_ip);
- handle_gmdid(xe, &graphics_ip, &media_ip,
- &graphics_gmdid_revid, &media_gmdid_revid);
+ ret = handle_gmdid(xe, &graphics_ip, &media_ip,
+ &graphics_gmdid_revid, &media_gmdid_revid);
+ if (ret)
+ return ret;
+
xe->info.step = xe_step_gmdid_get(xe,
graphics_gmdid_revid,
media_gmdid_revid);
--
2.51.0
next prev parent reply other threads:[~2025-10-13 20:10 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-13 20:09 [PATCH v5 00/23] Allow configfs to disable specific GT type(s) Matt Roper
2025-10-13 20:09 ` [PATCH v5 01/23] drm/xe/huc: Adjust HuC check on primary GT Matt Roper
2025-10-13 20:09 ` [PATCH v5 02/23] drm/xe: Drop GT parameter to xe_display_irq_postinstall() Matt Roper
2025-10-13 20:09 ` [PATCH v5 03/23] drm/xe: Move 'va_bits' flag back to platform descriptor Matt Roper
2025-10-13 20:09 ` [PATCH v5 04/23] drm/xe: Move 'vm_max_level' " Matt Roper
2025-10-13 20:09 ` [PATCH v5 05/23] drm/xe: Move 'vram_flags' " Matt Roper
2025-10-13 20:09 ` [PATCH v5 06/23] drm/xe: Move 'has_flatccs' " Matt Roper
2025-10-13 20:09 ` Matt Roper [this message]
2025-10-13 20:09 ` [PATCH v5 08/23] drm/xe: Move primary GT allocation from xe_tile_init_early to xe_tile_init Matt Roper
2025-10-13 20:09 ` [PATCH v5 09/23] drm/xe: Skip L2 / TDF cache flushes if primary GT is disabled Matt Roper
2025-10-13 20:09 ` [PATCH v5 10/23] drm/xe/query: Report hwconfig size as 0 " Matt Roper
2025-10-13 20:09 ` [PATCH v5 11/23] drm/xe/pmu: Initialize PMU event types based on first available GT Matt Roper
2025-10-13 20:09 ` [PATCH v5 12/23] drm/xe: Check for primary GT before looking up Wa_22019338487 Matt Roper
2025-10-13 20:09 ` [PATCH v5 13/23] drm/xe: Make display part of Wa_22019338487 a device workaround Matt Roper
2025-10-13 20:09 ` [PATCH v5 14/23] drm/xe/irq: Don't try to lookup engine masks for non-existent primary GT Matt Roper
2025-10-13 20:09 ` [PATCH v5 15/23] drm/xe: Handle Wa_22010954014 and Wa_14022085890 as device workarounds Matt Roper
2025-10-13 20:09 ` [PATCH v5 16/23] drm/xe/rtp: Pass xe_device parameter to FUNC matches Matt Roper
2025-10-13 20:10 ` [PATCH v5 17/23] drm/xe: Bypass Wa_14018094691 when primary GT is disabled Matt Roper
2025-10-13 20:10 ` [PATCH v5 18/23] drm/xe: Correct lineage for Wa_22014953428 and only check with valid GT Matt Roper
2025-10-13 20:10 ` [PATCH v5 19/23] drm/xe: Check that GT is not NULL before testing Wa_16023588340 Matt Roper
2025-10-13 20:10 ` [PATCH v5 20/23] drm/xe: Don't check BIOS-disabled FlatCCS if primary GT is disabled Matt Roper
2025-10-13 20:10 ` [PATCH v5 21/23] drm/xe: Break GT setup out of xe_info_init() Matt Roper
2025-10-13 20:10 ` [PATCH v5 22/23] drm/xe/configfs: Add attribute to disable GT types Matt Roper
2025-10-13 20:10 ` [PATCH v5 23/23] drm/xe/sriov: Disable SR-IOV if primary GT is disabled via configfs Matt Roper
2025-10-13 20:25 ` Michal Wajdeczko
2025-10-13 23:30 ` ✗ CI.checkpatch: warning for Allow configfs to disable specific GT type(s) (rev5) Patchwork
2025-10-13 23:31 ` ✓ CI.KUnit: success " Patchwork
2025-10-14 0:11 ` ✓ Xe.CI.BAT: " Patchwork
2025-10-14 8:08 ` ✓ Xe.CI.Full: " Patchwork
2025-10-14 15:10 ` Matt Roper
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251013200944.2499947-32-matthew.d.roper@intel.com \
--to=matthew.d.roper@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=lucas.demarchi@intel.com \
--cc=michal.wajdeczko@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox