From: "Wang, X" <x.wang@intel.com>
To: "Matt Roper" <matthew.d.roper@intel.com>,
"Zbigniew Kempczyński" <zbigniew.kempczynski@intel.com>
Cc: <igt-dev@lists.freedesktop.org>,
Kamil Konieczny <kamil.konieczny@linux.intel.com>,
Ravi Kumar V <ravi.kumar.vodapalli@intel.com>
Subject: Re: [PATCH v10 1/2] lib/xe/xe_query: Get runtime xe device graphics version from GMD_ID
Date: Sun, 18 Jan 2026 23:43:01 -0800
Message-ID: <ddf04df0-945c-4af1-b03a-0408410f30eb@intel.com>
In-Reply-To: <20260116213358.GJ458797@mdroper-desk1.amr.corp.intel.com>
On 1/16/2026 13:33, Matt Roper wrote:
> On Thu, Jan 15, 2026 at 09:00:53AM +0100, Zbigniew Kempczyński wrote:
>> On Mon, Jan 12, 2026 at 07:17:31PM +0000, Xin Wang wrote:
>>> This allows IGT to query the exact IP version for xe platforms.
>>>
>>> Key changes:
>>> - Add xe_device_ipver field to xe_device structure
>>> - Set the graphics versions based on the GMD_ID
>>> - Cache device ipver in global map indexed by devid for efficient lookup
>>> - Implement xe_ipver_cache_lookup() to retrieve cached ipver by devid
>>> - Clean up cached device ipver when xe_device is released
>>>
>>> V2:
>>> - add new struct xe_device_ipver to hold the ipver info
>>> - separate cache map to eliminate collision (Roper, Matthew D)
>>> - changed function name to xe_ipver_cache_lookup() to avoid
>>> confusion (Roper, Matthew D)
>>>
>>> V3:
>>> - optimize the coding style. (Summers, Stuart)
>>>
>>> Cc: Kamil Konieczny <kamil.konieczny@linux.intel.com>
>>> Cc: Matt Roper <matthew.d.roper@intel.com>
>>> Signed-off-by: Xin Wang <x.wang@intel.com>
>>> Reviewed-by: Ravi Kumar V <ravi.kumar.vodapalli@intel.com>
>>> ---
>>> lib/intel_chipset.h | 6 +++++
>>> lib/xe/xe_query.c | 54 ++++++++++++++++++++++++++++++++++++++++++++-
>>> lib/xe/xe_query.h | 4 ++++
>>> 3 files changed, 63 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/intel_chipset.h b/lib/intel_chipset.h
>>> index cc2225110..424811f7c 100644
>>> --- a/lib/intel_chipset.h
>>> +++ b/lib/intel_chipset.h
>>> @@ -100,6 +100,12 @@ struct intel_device_info {
>>> const char *codename;
>>> };
>>>
>>> +struct xe_device_ipver {
>>> + uint32_t devid;
>>> + uint16_t graphics_ver;
>>> + uint16_t graphics_rel;
>>> +};
>>> +
>>> const struct intel_device_info *intel_get_device_info(uint16_t devid) __attribute__((pure));
>>>
>>> const struct intel_cmds_info *intel_get_cmds_info(uint16_t devid) __attribute__((pure));
>>> diff --git a/lib/xe/xe_query.c b/lib/xe/xe_query.c
>>> index 981d76948..823a29f2d 100644
>>> --- a/lib/xe/xe_query.c
>>> +++ b/lib/xe/xe_query.c
>>> @@ -210,6 +210,22 @@ static struct xe_device_cache {
>>> struct igt_map *map;
>>> } cache;
>>>
>>> +static struct xe_ipver_cache {
>>> + pthread_mutex_t mutex;
>>> + struct igt_map *map;
>>> +} xe_ipver;
>>> +
>>> +struct xe_device_ipver *xe_ipver_cache_lookup(uint32_t devid)
>>> +{
>>> + struct xe_device_ipver *ipver;
>>> +
>>> + pthread_mutex_lock(&xe_ipver.mutex);
>>> + ipver = igt_map_search(xe_ipver.map, &devid);
>>> + pthread_mutex_unlock(&xe_ipver.mutex);
>>> +
>>> + return ipver;
>>> +}
>>> +
>> Please forgive me for sharing my thoughts only now.
>>
>> I wondered about this a bit and according to current code shape
>> I don't like this design - especially ip_ver cache in xe_query.
>> I mean I don't like to use external cache which might or might not
>> be filled at time of calling intel_get_device_info().
>>
>> At the moment I see the following options:
>>
>> 1. Provide update function intel_device_info_update(devid, ver, rel)
>> allowing to update rel field during xe_device_get(). This still
>> keeps intel_device_info.c independent but rel field might be
>> incorrect. BTW I would set it in the static definition to -1 (max
>> unsigned) to catch cases where it is not set/updated properly during
>> intel_get_device_info().
>>
>> 2. Add a new function intel_get_device_info_by_fd() which will support
>> passing an fd instead of a devid, and migrate code to use it. Most of our
>> code uses the fd to acquire the device id before calling
>> intel_get_device_info() [the exception is igt_device_scan, which uses the
>> pci devid directly]. This will break the independence of
>> intel_device_info.c because we start calling driver functions, but maybe
>> it is time to do this (GMD_ID).
>>
>> Digression: It seems INTEL_DEVID_OVERRIDE won't work properly then,
>> but I wouldn't care about it for now.
>>
>> 3. Mix 1 and 2 but build a hash map from intel_device_match[] in the
>> constructor and update it during xe_device_get(). The advantage of this
>> is that we could stop using 'static __thread' for caching the last info.
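
Option 1 with the "unset" sentinel could look roughly like the sketch
below. It is standalone and illustrative only: the sketch_* names, the
REL_UNSET macro, and the table contents are hypothetical and are not
existing IGT code; only the intel_device_info_update(devid, ver, rel)
shape comes from the proposal above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REL_UNSET UINT16_MAX	/* "-1 (max unsigned)": release not known yet */
#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the static per-devid table; values are made up. */
struct sketch_dev_info {
	uint16_t devid;
	uint16_t graphics_ver;
	uint16_t graphics_rel;
};

static struct sketch_dev_info sketch_table[] = {
	{ 0x1111, 12, 0 },		/* release known statically */
	{ 0x2222, 20, REL_UNSET },	/* release must come from GMD_ID */
};

static struct sketch_dev_info *sketch_find(uint16_t devid)
{
	for (size_t i = 0; i < SKETCH_ARRAY_SIZE(sketch_table); i++)
		if (sketch_table[i].devid == devid)
			return &sketch_table[i];
	return NULL;
}

/* Hypothetical counterpart of intel_device_info_update(devid, ver, rel). */
static bool sketch_update(uint16_t devid, uint16_t ver, uint16_t rel)
{
	struct sketch_dev_info *info = sketch_find(devid);

	assert(rel != REL_UNSET);	/* the sentinel is not a valid release */
	if (!info)
		return false;

	info->graphics_ver = ver;
	info->graphics_rel = rel;
	return true;
}

/* Lets callers detect a lookup that happened before the update. */
static bool sketch_rel_is_valid(const struct sketch_dev_info *info)
{
	return info && info->graphics_rel != REL_UNSET;
}
```

The sentinel only detects the ordering problem (lookup before
xe_device_get()); it does not address two devices sharing one PCI ID
but reporting different versions.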
> The problem is that "graphics version" and "media version" are not
> supposed to be a characteristic of a general device type anymore. An
> individual device will have some version number, but another device of
> the same "type" with the same PCI ID can potentially have a
> different IP version. We've been lucky enough so far to still have
> unique PCI IDs for cases where the version numbers are different (at
> least for production parts), but this is not something that we're
> supposed to be relying on since the IP disaggregation happened around
> the MTL timeframe. Personally I think that hacks that keep treating
> versions as something associated with a device type are just making the
> problem harder to untangle.
>
> For a proper solution, I think we really need to separate the concept of
> "traits associated with a type of device" (which is the stuff we have in
> the device info structure which can be accurately looked up by PCI ID)
> from "IP version associated with a specific device instance" (which on
> modern platforms can only be accurately looked up in the context of a
> specific device handle such as a FD).
>
> So aside from a few special cases, we want IGT to stop translating PCI
> IDs into version numbers and instead start translating fd's into version
> numbers. The special case exceptions I can think of that don't really
> have a way to do fd-based lookups are:
>
> * tools/tests that bypass the driver entirely (possibly running without
> the driver loaded) and just use libpciaccess for direct BAR access.
> Most of these either don't have version-based conditions or only have
> version conditions matching against very old platforms (where a
> PCIID-based lookup is "safe")
>
> * the old i915-era error dump decoder tool just reads its information
> (including PCI ID) from a dump file on disk. The tool might be
> executed on a completely different machine from where the dump was
> originally captured. I don't think this tool has been updated since
> gen8 (everyone switched over to using the Mesa tool instead), and I
> don't think it would function at all on any of our modern platforms.
> So if this tool is still useful at all, it would only be on ancient
> i915 platforms where a PCIID-based lookup is "safe."
>
>
> The solution I suggested during an earlier review of this series was:
>
> * Create separate "version from PCI ID" APIs that can be used in the
> few special cases listed above, and migrate those uses over.
>
> * Change the signature of intel_gen / intel_graphics_ver to take an FD
> rather than a PCI ID; when running on Xe this would do the lookup via
> the query ioctl to get an accurate value.
>
> * Update intel_gen / intel_graphics_ver:
> - If running on i915, fall back to the device_info lookup that
> we've always used. That's safe for all i915 platforms except
> MTL/ARL, and we were lucky enough that they never created any
> production MTL/ARL platforms where a single PCI ID had different
> possible versions, so it winds up being safe there too.
>
> - If running on Xe, use the query ioctl to obtain the version
> number. This will return the proper value for all platforms
> officially supported by Xe (i.e., Xe2 and later). It will return
> 0 on pre-MTL platforms (which aren't officially supported), but in
> that case IGT can fall back to the i915 approach above.
>
> I did a mockup of the first two parts several months ago:
>
> https://github.com/mattrope/intel-gpu-tools/commits/forupstream/version_query/
>
> Since the function interface is changing, it's a lot of churn, but most
> of it can be auto-converted quickly via Coccinelle.
>
>
> Matt

I created two series:

1. fd-based API:
   https://patchwork.freedesktop.org/series/160271/
2. Inject the graphics version from xe_device_get() and keep the API unchanged:
   https://patchwork.freedesktop.org/series/160239/
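
For reference, the fd-based lookup with the PCI ID fallback that Matt
outlines above can be sketched as below. This is a standalone
illustration, not code from either series: every name in it
(struct ip_version, fake_devices, ip_version_from_pciid(),
ip_version_from_fd()) is hypothetical, the device table and version
numbers are made up, and the "queried" field merely stands in for what
the Xe query ioctl would report for the main GT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct ip_version {
	uint16_t ver;
	uint16_t rel;
};

/* Static "traits by PCI ID" table; entries are made up. */
struct pciid_entry {
	uint16_t pci_id;
	struct ip_version static_ver;
};

static const struct pciid_entry pciid_table[] = {
	{ 0x1111, { 12, 0 } },
	{ 0x2222, { 20, 1 } },
};

/* Fake open devices; "queried.ver == 0" models a platform without GMD_ID. */
struct fake_device {
	int fd;
	bool is_xe;
	uint16_t pci_id;
	struct ip_version queried;
};

static const struct fake_device fake_devices[] = {
	{ .fd = 3, .is_xe = false, .pci_id = 0x1111, .queried = { 0, 0 } },
	{ .fd = 4, .is_xe = true,  .pci_id = 0x1111, .queried = { 0, 0 } },
	{ .fd = 5, .is_xe = true,  .pci_id = 0x2222, .queried = { 20, 4 } },
};

/* "Version from PCI ID": kept only for the special cases and as a fallback. */
static struct ip_version ip_version_from_pciid(uint16_t pci_id)
{
	struct ip_version none = { 0, 0 };

	for (size_t i = 0; i < FAKE_ARRAY_SIZE(pciid_table); i++)
		if (pciid_table[i].pci_id == pci_id)
			return pciid_table[i].static_ver;
	return none;
}

/* "Version from fd": prefer the per-device query, else fall back. */
static struct ip_version ip_version_from_fd(int fd)
{
	struct ip_version none = { 0, 0 };

	assert(fd >= 0);
	for (size_t i = 0; i < FAKE_ARRAY_SIZE(fake_devices); i++) {
		const struct fake_device *dev = &fake_devices[i];

		if (dev->fd != fd)
			continue;
		/* Xe with a nonzero queried version: trust the device. */
		if (dev->is_xe && dev->queried.ver)
			return dev->queried;
		/* i915, or Xe on a pre-GMD_ID platform: static table. */
		return ip_version_from_pciid(dev->pci_id);
	}
	return none;	/* unknown fd */
}
```

In this sketch fd 5 reports release 4 from the query even though the
static table says 1 for the same PCI ID, which is the case a devid-keyed
lookup cannot express.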
>> --
>> Zbigniew
>>
>>> static struct xe_device *find_in_cache_unlocked(int fd)
>>> {
>>> return igt_map_search(cache.map, &fd);
>>> @@ -270,6 +286,24 @@ struct xe_device *xe_device_get(int fd)
>>> for (int gt = 0; gt < xe_dev->gt_list->num_gt; gt++)
>>> xe_dev->gt_mask |= (1ull << xe_dev->gt_list->gt_list[gt].gt_id);
>>>
>>> + /*
>>> + * Set graphics_ver and graphics_rel based on the main GT's GMD_ID.
>>> + * We should use the hardcoded value for the non-GMD_ID platforms (ip_ver_major == 0)
>>> + */
>>> + xe_dev->ipver.devid = 0;
>>> + for (int gt = 0; gt < xe_dev->gt_list->num_gt; gt++) {
>>> + if (xe_dev->gt_list->gt_list[gt].type == DRM_XE_QUERY_GT_TYPE_MAIN &&
>>> + xe_dev->gt_list->gt_list[gt].ip_ver_major) {
>>> + igt_debug("Setting graphics_ver to %u and graphics_rel to %u\n",
>>> + xe_dev->gt_list->gt_list[gt].ip_ver_major,
>>> + xe_dev->gt_list->gt_list[gt].ip_ver_minor);
>>> + xe_dev->ipver.graphics_ver = xe_dev->gt_list->gt_list[gt].ip_ver_major;
>>> + xe_dev->ipver.graphics_rel = xe_dev->gt_list->gt_list[gt].ip_ver_minor;
>>> + xe_dev->ipver.devid = xe_dev->dev_id;
>>> + break;
>>> + }
>>> + }
>>> +
>>> /* Tile IDs may be non-consecutive; keep a mask of valid IDs */
>>> for (int gt = 0; gt < xe_dev->gt_list->num_gt; gt++)
>>> xe_dev->tile_mask |= (1ull << xe_dev->gt_list->gt_list[gt].tile_id);
>>> @@ -304,6 +338,11 @@ struct xe_device *xe_device_get(int fd)
>>> prev = find_in_cache_unlocked(fd);
>>> if (!prev) {
>>> igt_map_insert(cache.map, &xe_dev->fd, xe_dev);
>>> + if (xe_dev->ipver.devid) {
>>> + pthread_mutex_lock(&xe_ipver.mutex);
>>> + igt_map_insert(xe_ipver.map, &xe_dev->ipver.devid, &xe_dev->ipver);
>>> + pthread_mutex_unlock(&xe_ipver.mutex);
>>> + }
>>> } else {
>>> xe_device_free(xe_dev);
>>> xe_dev = prev;
>>> @@ -315,7 +354,15 @@ struct xe_device *xe_device_get(int fd)
>>>
>>> static void delete_in_cache(struct igt_map_entry *entry)
>>> {
>>> - xe_device_free((struct xe_device *)entry->data);
>>> + struct xe_device *xe_dev = (struct xe_device *)entry->data;
>>> +
>>> + if (xe_dev->ipver.devid) {
>>> + pthread_mutex_lock(&xe_ipver.mutex);
>>> + igt_map_remove(xe_ipver.map, &xe_dev->ipver.devid, NULL);
>>> + pthread_mutex_unlock(&xe_ipver.mutex);
>>> + }
>>> +
>>> + xe_device_free(xe_dev);
>>> }
>>>
>>> /**
>>> @@ -365,13 +412,18 @@ static void xe_device_destroy_cache(void)
>>> pthread_mutex_lock(&cache.cache_mutex);
>>> igt_map_destroy(cache.map, delete_in_cache);
>>> pthread_mutex_unlock(&cache.cache_mutex);
>>> + pthread_mutex_lock(&xe_ipver.mutex);
>>> + igt_map_destroy(xe_ipver.map, NULL);
>>> + pthread_mutex_unlock(&xe_ipver.mutex);
>>> }
>>>
>>> static void xe_device_cache_init(void)
>>> {
>>> pthread_mutex_init(&cache.cache_mutex, NULL);
>>> + pthread_mutex_init(&xe_ipver.mutex, NULL);
>>> xe_device_destroy_cache();
>>> cache.map = igt_map_create(igt_map_hash_32, igt_map_equal_32);
>>> + xe_ipver.map = igt_map_create(igt_map_hash_32, igt_map_equal_32);
>>> }
>>>
>>> #define xe_dev_FN(_NAME, _FIELD, _TYPE) \
>>> diff --git a/lib/xe/xe_query.h b/lib/xe/xe_query.h
>>> index d7a9f95f9..19690cff3 100644
>>> --- a/lib/xe/xe_query.h
>>> +++ b/lib/xe/xe_query.h
>>> @@ -74,6 +74,9 @@ struct xe_device {
>>>
>>> /** @dev_id: Device id of xe device */
>>> uint16_t dev_id;
>>> +
>>> + /** @ipver: Device ip version */
>>> + struct xe_device_ipver ipver;
>>> };
>>>
>>> #define xe_for_each_engine(__fd, __hwe) \
>>> @@ -181,6 +184,7 @@ static inline void *xe_query_device(int fd, uint32_t type, uint32_t *size)
>>> }
>>>
>>> struct xe_device *xe_device_get(int fd);
>>> +struct xe_device_ipver *xe_ipver_cache_lookup(uint32_t devid);
>>> void xe_device_put(int fd);
>>>
>>> int xe_query_eu_count(int fd, int gt);
>>> --
>>> 2.43.0
>>>