From: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
To: Matt Roper <matthew.d.roper@intel.com>
Cc: "Wang, X" <x.wang@intel.com>,
"igt-dev@lists.freedesktop.org" <igt-dev@lists.freedesktop.org>,
"Lin, Shuicheng" <shuicheng.lin@intel.com>,
"Nguyen, Brian3" <brian3.nguyen@intel.com>,
"Zuo, Alex" <alex.zuo@intel.com>,
"Goyal, Nakshtra" <nakshtra.goyal@intel.com>,
"Bhadane, Dnyaneshwar" <dnyaneshwar.bhadane@intel.com>,
"Sousa, Gustavo" <gustavo.sousa@intel.com>,
"Konieczny, Kamil" <kamil.konieczny@intel.com>
Subject: Re: [PATCH v3 1/6] lib: Add runtime device info query APIs for xe devices
Date: Sat, 11 Oct 2025 02:25:39 +0300 [thread overview]
Message-ID: <aOmV8yQHOB9Pa_1d@intel.com> (raw)
In-Reply-To: <20251009235740.GH1207432@mdroper-desk1.amr.corp.intel.com>
On Thu, Oct 09, 2025 at 04:57:40PM -0700, Matt Roper wrote:
> On Thu, Oct 09, 2025 at 11:00:57AM -0700, Wang, X wrote:
> >
> > > -----Original Message-----
> > > From: Roper, Matthew D <matthew.d.roper@intel.com>
> > > Sent: Wednesday, October 8, 2025 15:02
> > > To: Wang, X <x.wang@intel.com>
> > > Cc: igt-dev@lists.freedesktop.org; kamil.konieczny@linux.intel.com; Lin,
> > > Shuicheng <shuicheng.lin@intel.com>; Nguyen, Brian3
> > > <brian3.nguyen@intel.com>; Zuo, Alex <alex.zuo@intel.com>; Goyal, Nakshtra
> > > <nakshtra.goyal@intel.com>; Bhadane, Dnyaneshwar
> > > <dnyaneshwar.bhadane@intel.com>; Sousa, Gustavo <gustavo.sousa@intel.com>
> > > Subject: Re: [PATCH v3 1/6] lib: Add runtime device info query APIs for xe devices
> > >
> > > On Wed, Oct 08, 2025 at 09:02:31PM +0000, Xin Wang wrote:
> > > > Introduce new APIs to query device information at runtime for xe devices:
> > > > - intel_query_gen(int fd): Get graphics generation
> > > > - intel_query_graphics_ver(int fd): Get combined graphics version
> > > > - intel_query_device_info(int fd): Get device info structure
> > >
> > > High-level question: across our existing IGT codebase, how many places
> > > in the code truly need to lookup IP versions based solely on the PCI ID
> > > and don't have any access to an actual device (in the form of an fd)?
> > > From a quick skim, it seems like the vast majority of our code has some
> > > pattern along the lines of
> > >
> > > int fd = ...;
> > > uint16_t devid = intel_get_drm_devid(fd);
> > >
> > > if (intel_graphics_ver(devid) >= ...)
> > > ...
> > >
> > > Aside from a few exceptions[*] it seems like the path of least
> > > resistance would be to adjust the signature of our existing intel_gen()
> > > and intel_graphics_ver() functions to just take an fd instead of a
> > > devid. Initially those functions could call intel_get_drm_devid()
> > > internally to get a device ID and do a device info lookup, and then
> > > subsequent patches could add additional logic to do a proper ioctl query
> > > first, and then only fall back to the device info in cases where the
> > > query isn't possible (i.e., i915 driver and/or pre-GMD_ID platform).
> > >
> > > [*] From a quick skim of the code, the main exceptions that jumped out
> > > at me as truly needing a devid rather than an fd were:
> > >
> > > - The printing/dumping functions in lib/instdone.c that get called by
> > > tools/intel_error_decode.c. These wind up being a bit different
> > > because they're specifically used by a tool that analyzes error
> > > dumps; the device ID used during error dump processing either comes
> > > from the error dump file on disk or from the tool's command line.
> > > The device itself isn't opened and may not even be available since
> > > the error decoding can be executed on a different platform than the
> > > dump itself originally came from.
> > >
> > > - Tools like intel_reg, intel_gtt, etc. that scan the list of PCI
> > > devices and use libpciaccess to map+access BARs without ever
> > > opening the DRM file handle. In fact, these tools can potentially
> > > run without the driver loaded at all since all they care about is PCI
> > > device enumeration and raw access to the BARs.
> > >
> > > Since the exceptions noted above are all tools (and mostly old tools
> > > that nobody actively updates anymore), I'd be inclined to let the tests
> > > and other parts of the codebase that are actively developed continue
> > > using the existing function names (intel_gen, intel_graphics_ver, etc.),
> > > just with a signature change to take an fd instead of devid. Then We
> > > can create a separate set of dedicated functions (e.g.,
> > > "i915_graphics_ver_from_devid()") for the special cases where lookups by
> > > device ID are unavoidable. Since lookups by device ID simply aren't
> > > possible going forward, this will also help us figure out if there are
> > > cases where our current tools are simply going to stop working and need
> > >
> > >
> > > Matt
> > >
> > Project-Wide Replacement Analysis
> > If we search the entire project for intel_get_device_info|intel_gen|intel_graphics_ver, we find:
> >
> > 136 files
> > 483 occurrences
> >
> > Option 1: Keep Function Names, Change Parameters
> > If we choose to keep the original function names but change their parameters to use FD instead of devid, we would need to:
> >
> > Modify all 136 files
> > Update every usage
> > I understand that keeping the function names unchanged would be more convenient for developers.
> >
> > Option 2: Introduce New API Names for XE and make compatible with i915
> > If we instead introduce new API names - intel_query_device_info|intel_query_gen|intel_query_graphics_ver - and only modify files related to XE testing, then:
> >
> > We only need to modify 36 files
> > With 127 occurrences
> >
> > I suggest we should not modify the legacy i915 code, because our goal is solely to fix issues on XE hardware. The i915 code has been running stably for many years, and it's better to leave it untouched.
>
> I don't think the goal should be to minimize lines/files changed, but
> rather to do a full refactor that makes the codebase easy to
> understand/develop/maintain going forward. If people writing new IGT
> code have two different APIs that look very similar, but have
> hard-to-understand hardware and driver differences in when a specific
> API can be used, or when either one might work properly, then it's going
> to lead to a lot of confusion, bad copy-pastes, and bugs.
>
> In the end we're going to wind up with two API's no matter what --- one
> device-based, and one pciid-based. I think the device (FD) API is the
> one we want to use almost everywhere because it matches the semantics of
> what we're really trying to do (lookup the version associated with a
> specific device); we should discourage use of the PCI ID API anywhere
> that isn't one of the special-case tools that doesn't operate on a DRM
> fd. So that means one of two ways forward:
>
> - Update the signature of the existing functions, and keep them the
> preferred functions to use throughout most of IGT
>
> - Demote the the existing functions the tool-only API and switch all of
> the other current tools and tests that currently call them over to
> using a new FD-based API.
>
> Both of those are going to require widespread refactor across the IGT
> codebase. But I think either one is still preferable to trying to
> minimize the diff. If we just bolt on another API, but then don't
> actually use it in most of the places where it makes sense, I think it's
> going to cause a lot more confusion and bugs in the long term.
Agreed. People must keep refactoring the codebase, otherwise it
will bitrot into an unmaintanable mess.
In this case I think it could be done following this simple rule:
anything that has an fd already can use the fd based API, anything
that doesn't have an fd should stick to the devid based API. Should
probalby be able to do the bulk of that with cocci. Afterwards you
can evaluate on a case-by-case basis whether more conversions are
warranted.
But to be clear we do need the no-fd API for the tools. I've lost
count how many times I've fixed the mmio stuff to not need the fd
after my tools stopped working correctly.
I'm not sure anyone needs the no-fd API for non-display stuff though?
I suppose the GPU error decoder might need it? But that only works
correctly for old hardware, and for new stuff you just use the
decoder from mesa. So that one least shouldn't require any future
maintenance (adding new PCI IDs and such).
The other related mess is that intel_graphics_ver() returns the
full IP version, but intel_display_ver() hands out the major version
only. Someone should look into unifying all that. Probably we
should have foo_ver() and foo_ver_full() just like in kernel land.
--
Ville Syrjälä
Intel
next prev parent reply other threads:[~2025-10-10 23:25 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-07 5:05 [PATCH] lib/intel_device_info: get the xe .graphics_rel from GMD_ID Xin Wang
2025-10-07 9:34 ` Kamil Konieczny
2025-10-07 13:12 ` ✗ Xe.CI.BAT: failure for " Patchwork
2025-10-07 13:32 ` ✓ i915.CI.BAT: success " Patchwork
2025-10-07 16:10 ` ✗ Xe.CI.Full: failure " Patchwork
2025-10-07 16:42 ` [PATCH] " Lin, Shuicheng
2025-10-07 23:26 ` [PATCH v2 0/6] lib: Add runtime device info query APIs for xe devices Xin Wang
2025-10-07 23:26 ` [PATCH v2 1/6] " Xin Wang
2025-10-07 23:26 ` [PATCH v2 2/6] lib/xe: Use new APIs for xe device info queries Xin Wang
2025-10-07 23:26 ` [PATCH v2 3/6] lib: " Xin Wang
2025-10-07 23:26 ` [PATCH v2 4/6] tests/intel: " Xin Wang
2025-10-07 23:26 ` [PATCH v2 5/6] tools: " Xin Wang
2025-10-07 23:26 ` [PATCH v2 6/6] lib/intel_device_info: Remove hardcoded .graphics_rel values Xin Wang
2025-10-08 1:19 ` ✓ Xe.CI.BAT: success for lib/intel_device_info: get the xe .graphics_rel from GMD_ID (rev2) Patchwork
2025-10-08 3:07 ` ✗ Xe.CI.Full: failure " Patchwork
2025-10-08 5:04 ` ✓ i915.CI.BAT: success " Patchwork
2025-10-08 6:45 ` ✗ i915.CI.Full: failure for lib/intel_device_info: get the xe .graphics_rel from GMD_ID Patchwork
2025-10-08 12:14 ` ✗ i915.CI.Full: failure for lib/intel_device_info: get the xe .graphics_rel from GMD_ID (rev2) Patchwork
2025-10-08 21:02 ` [PATCH v3 0/6] lib/intel_device_info: get the xe .graphics_rel from GMD_ID Xin Wang
2025-10-08 21:02 ` [PATCH v3 1/6] lib: Add runtime device info query APIs for xe devices Xin Wang
2025-10-08 22:01 ` Matt Roper
2025-10-09 18:00 ` Wang, X
2025-10-09 23:57 ` Matt Roper
2025-10-10 23:25 ` Ville Syrjälä [this message]
2025-10-08 22:07 ` Lin, Shuicheng
2025-10-09 22:34 ` Wang, X
2025-10-09 16:42 ` Kamil Konieczny
2025-10-09 22:30 ` Wang, X
2025-10-08 21:02 ` [PATCH v3 2/6] lib/xe: Use new APIs for xe device info queries Xin Wang
2025-10-08 21:02 ` [PATCH v3 3/6] lib: " Xin Wang
2025-10-08 21:02 ` [PATCH v3 4/6] tests/intel: " Xin Wang
2025-10-08 21:02 ` [PATCH v3 5/6] tools: " Xin Wang
2025-10-08 21:02 ` [PATCH v3 6/6] lib/intel_device_info: Remove hardcoded .graphics_rel values Xin Wang
2025-10-08 21:46 ` ✗ Xe.CI.BAT: failure for lib/intel_device_info: get the xe .graphics_rel from GMD_ID (rev3) Patchwork
2025-10-08 22:04 ` ✓ i915.CI.BAT: success " Patchwork
2025-10-09 1:06 ` ✗ Xe.CI.Full: failure " Patchwork
2025-10-09 11:21 ` ✗ i915.CI.Full: " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aOmV8yQHOB9Pa_1d@intel.com \
--to=ville.syrjala@linux.intel.com \
--cc=alex.zuo@intel.com \
--cc=brian3.nguyen@intel.com \
--cc=dnyaneshwar.bhadane@intel.com \
--cc=gustavo.sousa@intel.com \
--cc=igt-dev@lists.freedesktop.org \
--cc=kamil.konieczny@intel.com \
--cc=matthew.d.roper@intel.com \
--cc=nakshtra.goyal@intel.com \
--cc=shuicheng.lin@intel.com \
--cc=x.wang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox