Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [i-g-t] lib/xe: set hwconfig NULL for unsupported platforms
@ 2024-12-09  8:57 Tejas Upadhyay
  2024-12-09 10:39 ` Hajda, Andrzej
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Tejas Upadhyay @ 2024-12-09  8:57 UTC (permalink / raw)
  To: igt-dev, intel-xe; +Cc: Tejas Upadhyay, Matt Roper, Jan Maslak

There are hardware platforms which are not supporting
hwconfig table, for example ADLS. Querying hwconfig
on unsupported platforms leads to assert and failure
for some of tests like,
./build/tests/xe_module_load --r load

https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3683

Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Jan Maslak <jan.maslak@intel.com>
Fixes: 37a230e50 ("lib/xe_query: add hwconfig to xe_device")
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
---
 lib/xe/xe_query.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/xe/xe_query.c b/lib/xe/xe_query.c
index f3731d9d3..8f0a5392c 100644
--- a/lib/xe/xe_query.c
+++ b/lib/xe/xe_query.c
@@ -51,7 +51,8 @@ static uint32_t *xe_query_hwconfig_new(int fd, uint32_t *hwconfig_size)
 
 	/* Perform the initial query to get the size */
 	igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query), 0);
-	igt_assert_neq(query.size, 0);
+	if (!query.size)
+		return NULL;
 
 	hwconfig = malloc(query.size);
 	igt_assert(hwconfig);
@@ -858,7 +859,8 @@ uint32_t *xe_hwconfig_lookup_value(int fd, enum intel_hwconfig attribute, uint32
 	igt_assert(xe_dev);
 
 	hwconfig = xe_dev->hwconfig;
-	igt_assert(hwconfig);
+	if (!hwconfig)
+		return NULL;
 
 	/* Extract the value from the hwconfig */
 	pos = 0;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [i-g-t] lib/xe: set hwconfig NULL for unsupported platforms
  2024-12-09  8:57 [i-g-t] lib/xe: set hwconfig NULL for unsupported platforms Tejas Upadhyay
@ 2024-12-09 10:39 ` Hajda, Andrzej
  2024-12-09 14:28 ` ✗ CI.Patch_applied: failure for " Patchwork
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Hajda, Andrzej @ 2024-12-09 10:39 UTC (permalink / raw)
  To: Tejas Upadhyay, igt-dev, intel-xe; +Cc: Matt Roper, Jan Maslak


W dniu 09.12.2024 o 09:57, Tejas Upadhyay pisze:
> There are hardware platforms which are not supporting
> hwconfig table, for example ADLS. Querying hwconfig
> on unsupported platforms leads to assert and failure
> for some of tests like,
> ./build/tests/xe_module_load --r load
>
> https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3683
>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Cc: Jan Maslak <jan.maslak@intel.com>
> Fixes: 37a230e50 ("lib/xe_query: add hwconfig to xe_device")
> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>



Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>

Regards
Andrzej

> ---
>   lib/xe/xe_query.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/lib/xe/xe_query.c b/lib/xe/xe_query.c
> index f3731d9d3..8f0a5392c 100644
> --- a/lib/xe/xe_query.c
> +++ b/lib/xe/xe_query.c
> @@ -51,7 +51,8 @@ static uint32_t *xe_query_hwconfig_new(int fd, uint32_t *hwconfig_size)
>   
>   	/* Perform the initial query to get the size */
>   	igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query), 0);
> -	igt_assert_neq(query.size, 0);
> +	if (!query.size)
> +		return NULL;
>   
>   	hwconfig = malloc(query.size);
>   	igt_assert(hwconfig);
> @@ -858,7 +859,8 @@ uint32_t *xe_hwconfig_lookup_value(int fd, enum intel_hwconfig attribute, uint32
>   	igt_assert(xe_dev);
>   
>   	hwconfig = xe_dev->hwconfig;
> -	igt_assert(hwconfig);
> +	if (!hwconfig)
> +		return NULL;
>   
>   	/* Extract the value from the hwconfig */
>   	pos = 0;

^ permalink raw reply	[flat|nested] 9+ messages in thread

* ✗ CI.Patch_applied: failure for lib/xe: set hwconfig NULL for unsupported platforms
  2024-12-09  8:57 [i-g-t] lib/xe: set hwconfig NULL for unsupported platforms Tejas Upadhyay
  2024-12-09 10:39 ` Hajda, Andrzej
@ 2024-12-09 14:28 ` Patchwork
  2024-12-09 16:04 ` [i-g-t] " Matt Roper
  2024-12-09 19:20 ` John Harrison
  3 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2024-12-09 14:28 UTC (permalink / raw)
  To: Tejas Upadhyay; +Cc: intel-xe

== Series Details ==

Series: lib/xe: set hwconfig NULL for unsupported platforms
URL   : https://patchwork.freedesktop.org/series/142281/
State : failure

== Summary ==

=== Applying kernel patches on branch 'drm-tip' with base: ===
Base commit: c266f2fbb9c0 drm-tip: 2024y-12m-09d-14h-13m-53s UTC integration manifest
=== git am output follows ===
error: lib/xe/xe_query.c: does not exist in index
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Applying: lib/xe: set hwconfig NULL for unsupported platforms
Patch failed at 0001 lib/xe: set hwconfig NULL for unsupported platforms
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [i-g-t] lib/xe: set hwconfig NULL for unsupported platforms
  2024-12-09  8:57 [i-g-t] lib/xe: set hwconfig NULL for unsupported platforms Tejas Upadhyay
  2024-12-09 10:39 ` Hajda, Andrzej
  2024-12-09 14:28 ` ✗ CI.Patch_applied: failure for " Patchwork
@ 2024-12-09 16:04 ` Matt Roper
  2024-12-09 19:20 ` John Harrison
  3 siblings, 0 replies; 9+ messages in thread
From: Matt Roper @ 2024-12-09 16:04 UTC (permalink / raw)
  To: Tejas Upadhyay; +Cc: igt-dev, intel-xe, Jan Maslak

On Mon, Dec 09, 2024 at 02:27:09PM +0530, Tejas Upadhyay wrote:
> There are hardware platforms which are not supporting
> hwconfig table, for example ADLS. Querying hwconfig
> on unsupported platforms leads to assert and failure
> for some of tests like,
> ./build/tests/xe_module_load --r load
> 
> https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3683

This should be provided as a git trailer:

   Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3683

Then gitlab will automatically update that ticket once this patch lands.


Matt

> 
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Cc: Jan Maslak <jan.maslak@intel.com>
> Fixes: 37a230e50 ("lib/xe_query: add hwconfig to xe_device")
> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
> ---
>  lib/xe/xe_query.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/xe/xe_query.c b/lib/xe/xe_query.c
> index f3731d9d3..8f0a5392c 100644
> --- a/lib/xe/xe_query.c
> +++ b/lib/xe/xe_query.c
> @@ -51,7 +51,8 @@ static uint32_t *xe_query_hwconfig_new(int fd, uint32_t *hwconfig_size)
>  
>  	/* Perform the initial query to get the size */
>  	igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query), 0);
> -	igt_assert_neq(query.size, 0);
> +	if (!query.size)
> +		return NULL;
>  
>  	hwconfig = malloc(query.size);
>  	igt_assert(hwconfig);
> @@ -858,7 +859,8 @@ uint32_t *xe_hwconfig_lookup_value(int fd, enum intel_hwconfig attribute, uint32
>  	igt_assert(xe_dev);
>  
>  	hwconfig = xe_dev->hwconfig;
> -	igt_assert(hwconfig);
> +	if (!hwconfig)
> +		return NULL;
>  
>  	/* Extract the value from the hwconfig */
>  	pos = 0;
> -- 
> 2.34.1
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [i-g-t] lib/xe: set hwconfig NULL for unsupported platforms
  2024-12-09  8:57 [i-g-t] lib/xe: set hwconfig NULL for unsupported platforms Tejas Upadhyay
                   ` (2 preceding siblings ...)
  2024-12-09 16:04 ` [i-g-t] " Matt Roper
@ 2024-12-09 19:20 ` John Harrison
  2024-12-09 20:38   ` Matt Roper
  3 siblings, 1 reply; 9+ messages in thread
From: John Harrison @ 2024-12-09 19:20 UTC (permalink / raw)
  To: Tejas Upadhyay, igt-dev, intel-xe; +Cc: Matt Roper, Jan Maslak

On 12/9/2024 00:57, Tejas Upadhyay wrote:
> There are hardware platforms which are not supporting
> hwconfig table, for example ADLS. Querying hwconfig
> on unsupported platforms leads to assert and failure
> for some of tests like,
> ./build/tests/xe_module_load --r load
The module reload test should not be querying the hwconfig table?!

But the hwconfig test itself should be failing on platforms which do not 
have a table. That is intentional. There is no platform which is POR for 
the Xe driver which does not support hwconfig tables. So the test is 
supposed to fail if it ever does not get a valid table.

John.

>
> https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3683
>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Cc: Jan Maslak <jan.maslak@intel.com>
> Fixes: 37a230e50 ("lib/xe_query: add hwconfig to xe_device")
> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
> ---
>   lib/xe/xe_query.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/lib/xe/xe_query.c b/lib/xe/xe_query.c
> index f3731d9d3..8f0a5392c 100644
> --- a/lib/xe/xe_query.c
> +++ b/lib/xe/xe_query.c
> @@ -51,7 +51,8 @@ static uint32_t *xe_query_hwconfig_new(int fd, uint32_t *hwconfig_size)
>   
>   	/* Perform the initial query to get the size */
>   	igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query), 0);
> -	igt_assert_neq(query.size, 0);
> +	if (!query.size)
> +		return NULL;
>   
>   	hwconfig = malloc(query.size);
>   	igt_assert(hwconfig);
> @@ -858,7 +859,8 @@ uint32_t *xe_hwconfig_lookup_value(int fd, enum intel_hwconfig attribute, uint32
>   	igt_assert(xe_dev);
>   
>   	hwconfig = xe_dev->hwconfig;
> -	igt_assert(hwconfig);
> +	if (!hwconfig)
> +		return NULL;
>   
>   	/* Extract the value from the hwconfig */
>   	pos = 0;


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [i-g-t] lib/xe: set hwconfig NULL for unsupported platforms
  2024-12-09 19:20 ` John Harrison
@ 2024-12-09 20:38   ` Matt Roper
  2024-12-10  8:39     ` Upadhyay, Tejas
  2024-12-11  9:13     ` Upadhyay, Tejas
  0 siblings, 2 replies; 9+ messages in thread
From: Matt Roper @ 2024-12-09 20:38 UTC (permalink / raw)
  To: John Harrison; +Cc: Tejas Upadhyay, igt-dev, intel-xe, Jan Maslak

On Mon, Dec 09, 2024 at 11:20:07AM -0800, John Harrison wrote:
> On 12/9/2024 00:57, Tejas Upadhyay wrote:
> > There are hardware platforms which are not supporting
> > hwconfig table, for example ADLS. Querying hwconfig
> > on unsupported platforms leads to assert and failure
> > for some of tests like,
> > ./build/tests/xe_module_load --r load
> The module reload test should not be querying the hwconfig table?!

See the Fixes: line; that patch made it so that anything calling
xe_device_get() will fetch and cache the hwconfig (and assert if it's
missing).  xe_module load calls __drm_open_driver ->
__drm_open_driver_another -> xe_device_get so it does indeed fetch the
hwconfig table now and hit the assertion added by that patch.

I don't know if always grabbing and caching the hwconfig was actually a
good idea or not.  As we've seen here, the hwconfig may legitimately be
NULL, so as we remove the assertion here, we still need to do
NULL-checks at the point the hwconfig gets used to truly be safe.

> 
> But the hwconfig test itself should be failing on platforms which do not
> have a table. That is intentional. There is no platform which is POR for the
> Xe driver which does not support hwconfig tables. So the test is supposed to
> fail if it ever does not get a valid table.

The Xe driver is only officially supported on the Xe2 platforms and
beyond, but it can still load and run on some pre-Xe2 platforms that
people were using as development platforms; some of those don't support
hwconfig.  And even though our CI setup is only trying to test a subset
of those platforms that do contain hwconfig, note that our DG2 setups
also have an ADL-S igpu that also gets probed during module load, and
that's what leads to the CI failures that were reported here.


Maybe XeKMD is mature enough now (and we have enough real platforms for
testing now) that we could prune the list of platforms that Xe is
willing to load on with unofficial support.  E.g., drop the unofficial
support for TGL, RKL, and ADL-S since those platforms don't have
hwconfig and I don't think any of our developers are still using them as
development vehicles at this point.  That's probably a separate
discussion from the patch at hand though.


Matt

> 
> John.
> 
> > 
> > https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3683
> > 
> > Cc: Matt Roper <matthew.d.roper@intel.com>
> > Cc: Jan Maslak <jan.maslak@intel.com>
> > Fixes: 37a230e50 ("lib/xe_query: add hwconfig to xe_device")
> > Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
> > ---
> >   lib/xe/xe_query.c | 6 ++++--
> >   1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/lib/xe/xe_query.c b/lib/xe/xe_query.c
> > index f3731d9d3..8f0a5392c 100644
> > --- a/lib/xe/xe_query.c
> > +++ b/lib/xe/xe_query.c
> > @@ -51,7 +51,8 @@ static uint32_t *xe_query_hwconfig_new(int fd, uint32_t *hwconfig_size)
> >   	/* Perform the initial query to get the size */
> >   	igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query), 0);
> > -	igt_assert_neq(query.size, 0);
> > +	if (!query.size)
> > +		return NULL;
> >   	hwconfig = malloc(query.size);
> >   	igt_assert(hwconfig);
> > @@ -858,7 +859,8 @@ uint32_t *xe_hwconfig_lookup_value(int fd, enum intel_hwconfig attribute, uint32
> >   	igt_assert(xe_dev);
> >   	hwconfig = xe_dev->hwconfig;
> > -	igt_assert(hwconfig);
> > +	if (!hwconfig)
> > +		return NULL;
> >   	/* Extract the value from the hwconfig */
> >   	pos = 0;
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [i-g-t] lib/xe: set hwconfig NULL for unsupported platforms
  2024-12-09 20:38   ` Matt Roper
@ 2024-12-10  8:39     ` Upadhyay, Tejas
  2024-12-11  9:13     ` Upadhyay, Tejas
  1 sibling, 0 replies; 9+ messages in thread
From: Upadhyay, Tejas @ 2024-12-10  8:39 UTC (permalink / raw)
  To: Roper, Matthew D, Harrison, John C
  Cc: igt-dev@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	Maslak, Jan



> -----Original Message-----
> From: Roper, Matthew D <matthew.d.roper@intel.com>
> Sent: Tuesday, December 10, 2024 2:08 AM
> To: Harrison, John C <john.c.harrison@intel.com>
> Cc: Upadhyay, Tejas <tejas.upadhyay@intel.com>; igt-
> dev@lists.freedesktop.org; intel-xe@lists.freedesktop.org; Maslak, Jan
> <jan.maslak@intel.com>
> Subject: Re: [i-g-t] lib/xe: set hwconfig NULL for unsupported platforms
> 
> On Mon, Dec 09, 2024 at 11:20:07AM -0800, John Harrison wrote:
> > On 12/9/2024 00:57, Tejas Upadhyay wrote:
> > > There are hardware platforms which are not supporting hwconfig
> > > table, for example ADLS. Querying hwconfig on unsupported platforms
> > > leads to assert and failure for some of tests like,
> > > ./build/tests/xe_module_load --r load
> > The module reload test should not be querying the hwconfig table?!
> 
> See the Fixes: line; that patch made it so that anything calling
> xe_device_get() will fetch and cache the hwconfig (and assert if it's
> missing).  xe_module load calls __drm_open_driver ->
> __drm_open_driver_another -> xe_device_get so it does indeed fetch the
> hwconfig table now and hit the assertion added by that patch.
> 
> I don't know if always grabbing and caching the hwconfig was actually a
> good idea or not.  As we've seen here, the hwconfig may legitimately be
> NULL, so as we remove the assertion here, we still need to do
> NULL-checks at the point the hwconfig gets used to truly be safe.
> 
> >
> > But the hwconfig test itself should be failing on platforms which do not
> > have a table. That is intentional. There is no platform which is POR for the
> > Xe driver which does not support hwconfig tables. So the test is supposed
> to
> > fail if it ever does not get a valid table.
> 
> The Xe driver is only officially supported on the Xe2 platforms and
> beyond, but it can still load and run on some pre-Xe2 platforms that
> people were using as development platforms; some of those don't support
> hwconfig.  And even though our CI setup is only trying to test a subset
> of those platforms that do contain hwconfig, note that our DG2 setups
> also have an ADL-S igpu that also gets probed during module load, and
> that's what leads to the CI failures that were reported here.
> 
> 
> Maybe XeKMD is mature enough now (and we have enough real platforms for
> testing now) that we could prune the list of platforms that Xe is
> willing to load on with unofficial support.  E.g., drop the unofficial
> support for TGL, RKL, and ADL-S since those platforms don't have
> hwconfig and I don't think any of our developers are still using them as
> development vehicles at this point.  That's probably a separate
> discussion from the patch at hand though.
> 

We can go with this patch or skip hwconfig considerations for unsupported platforms like KMD is doing, but in the end it will also need same kind of changes.

Or if we plan to remove pre-xe2 platforms completely from CI then no change needed. 

Tejas

> 
> Matt
> 
> >
> > John.
> >
> > >
> > > https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3683
> > >
> > > Cc: Matt Roper <matthew.d.roper@intel.com>
> > > Cc: Jan Maslak <jan.maslak@intel.com>
> > > Fixes: 37a230e50 ("lib/xe_query: add hwconfig to xe_device")
> > > Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
> > > ---
> > >   lib/xe/xe_query.c | 6 ++++--
> > >   1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/lib/xe/xe_query.c b/lib/xe/xe_query.c
> > > index f3731d9d3..8f0a5392c 100644
> > > --- a/lib/xe/xe_query.c
> > > +++ b/lib/xe/xe_query.c
> > > @@ -51,7 +51,8 @@ static uint32_t *xe_query_hwconfig_new(int fd,
> uint32_t *hwconfig_size)
> > >   	/* Perform the initial query to get the size */
> > >   	igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query),
> 0);
> > > -	igt_assert_neq(query.size, 0);
> > > +	if (!query.size)
> > > +		return NULL;
> > >   	hwconfig = malloc(query.size);
> > >   	igt_assert(hwconfig);
> > > @@ -858,7 +859,8 @@ uint32_t *xe_hwconfig_lookup_value(int fd, enum
> intel_hwconfig attribute, uint32
> > >   	igt_assert(xe_dev);
> > >   	hwconfig = xe_dev->hwconfig;
> > > -	igt_assert(hwconfig);
> > > +	if (!hwconfig)
> > > +		return NULL;
> > >   	/* Extract the value from the hwconfig */
> > >   	pos = 0;
> >
> 
> --
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [i-g-t] lib/xe: set hwconfig NULL for unsupported platforms
  2024-12-09 20:38   ` Matt Roper
  2024-12-10  8:39     ` Upadhyay, Tejas
@ 2024-12-11  9:13     ` Upadhyay, Tejas
  2024-12-12 18:54       ` Kamil Konieczny
  1 sibling, 1 reply; 9+ messages in thread
From: Upadhyay, Tejas @ 2024-12-11  9:13 UTC (permalink / raw)
  To: Roper, Matthew D, Harrison, John C
  Cc: igt-dev@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	Maslak, Jan

Going ahead with merge of this patch. 

Tejas

> -----Original Message-----
> From: Roper, Matthew D <matthew.d.roper@intel.com>
> Sent: Tuesday, December 10, 2024 2:08 AM
> To: Harrison, John C <john.c.harrison@intel.com>
> Cc: Upadhyay, Tejas <tejas.upadhyay@intel.com>; igt-
> dev@lists.freedesktop.org; intel-xe@lists.freedesktop.org; Maslak, Jan
> <jan.maslak@intel.com>
> Subject: Re: [i-g-t] lib/xe: set hwconfig NULL for unsupported platforms
> 
> On Mon, Dec 09, 2024 at 11:20:07AM -0800, John Harrison wrote:
> > On 12/9/2024 00:57, Tejas Upadhyay wrote:
> > > There are hardware platforms which are not supporting hwconfig
> > > table, for example ADLS. Querying hwconfig on unsupported platforms
> > > leads to assert and failure for some of tests like,
> > > ./build/tests/xe_module_load --r load
> > The module reload test should not be querying the hwconfig table?!
> 
> See the Fixes: line; that patch made it so that anything calling
> xe_device_get() will fetch and cache the hwconfig (and assert if it's
> missing).  xe_module load calls __drm_open_driver ->
> __drm_open_driver_another -> xe_device_get so it does indeed fetch the
> hwconfig table now and hit the assertion added by that patch.
> 
> I don't know if always grabbing and caching the hwconfig was actually a
> good idea or not.  As we've seen here, the hwconfig may legitimately be
> NULL, so as we remove the assertion here, we still need to do
> NULL-checks at the point the hwconfig gets used to truly be safe.
> 
> >
> > But the hwconfig test itself should be failing on platforms which do not
> > have a table. That is intentional. There is no platform which is POR for the
> > Xe driver which does not support hwconfig tables. So the test is supposed to
> > fail if it ever does not get a valid table.
> 
> The Xe driver is only officially supported on the Xe2 platforms and
> beyond, but it can still load and run on some pre-Xe2 platforms that
> people were using as development platforms; some of those don't support
> hwconfig.  And even though our CI setup is only trying to test a subset
> of those platforms that do contain hwconfig, note that our DG2 setups
> also have an ADL-S igpu that also gets probed during module load, and
> that's what leads to the CI failures that were reported here.
> 
> 
> Maybe XeKMD is mature enough now (and we have enough real platforms for
> testing now) that we could prune the list of platforms that Xe is
> willing to load on with unofficial support.  E.g., drop the unofficial
> support for TGL, RKL, and ADL-S since those platforms don't have
> hwconfig and I don't think any of our developers are still using them as
> development vehicles at this point.  That's probably a separate
> discussion from the patch at hand though.
> 
> 
> Matt
> 
> >
> > John.
> >
> > >
> > > https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3683
> > >
> > > Cc: Matt Roper <matthew.d.roper@intel.com>
> > > Cc: Jan Maslak <jan.maslak@intel.com>
> > > Fixes: 37a230e50 ("lib/xe_query: add hwconfig to xe_device")
> > > Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
> > > ---
> > >   lib/xe/xe_query.c | 6 ++++--
> > >   1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/lib/xe/xe_query.c b/lib/xe/xe_query.c
> > > index f3731d9d3..8f0a5392c 100644
> > > --- a/lib/xe/xe_query.c
> > > +++ b/lib/xe/xe_query.c
> > > @@ -51,7 +51,8 @@ static uint32_t *xe_query_hwconfig_new(int fd,
> uint32_t *hwconfig_size)
> > >   	/* Perform the initial query to get the size */
> > >   	igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query),
> 0);
> > > -	igt_assert_neq(query.size, 0);
> > > +	if (!query.size)
> > > +		return NULL;
> > >   	hwconfig = malloc(query.size);
> > >   	igt_assert(hwconfig);
> > > @@ -858,7 +859,8 @@ uint32_t *xe_hwconfig_lookup_value(int fd,
> enum intel_hwconfig attribute, uint32
> > >   	igt_assert(xe_dev);
> > >   	hwconfig = xe_dev->hwconfig;
> > > -	igt_assert(hwconfig);
> > > +	if (!hwconfig)
> > > +		return NULL;
> > >   	/* Extract the value from the hwconfig */
> > >   	pos = 0;
> >
> 
> --
> Matt Roper
> Graphics Software Engineer
> Linux GPU Platform Enablement
> Intel Corporation

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [i-g-t] lib/xe: set hwconfig NULL for unsupported platforms
  2024-12-11  9:13     ` Upadhyay, Tejas
@ 2024-12-12 18:54       ` Kamil Konieczny
  0 siblings, 0 replies; 9+ messages in thread
From: Kamil Konieczny @ 2024-12-12 18:54 UTC (permalink / raw)
  To: Upadhyay, Tejas
  Cc: Roper, Matthew D, Harrison, John C, igt-dev@lists.freedesktop.org,
	intel-xe@lists.freedesktop.org, Maslak, Jan

Hi Upadhyay,,
On 2024-12-11 at 09:13:50 +0000, Upadhyay, Tejas wrote:
> Going ahead with merge of this patch. 
> 
> Tejas
> 

Merged,

Regards,
Kamil

> > -----Original Message-----
> > From: Roper, Matthew D <matthew.d.roper@intel.com>
> > Sent: Tuesday, December 10, 2024 2:08 AM
> > To: Harrison, John C <john.c.harrison@intel.com>
> > Cc: Upadhyay, Tejas <tejas.upadhyay@intel.com>; igt-
> > dev@lists.freedesktop.org; intel-xe@lists.freedesktop.org; Maslak, Jan
> > <jan.maslak@intel.com>
> > Subject: Re: [i-g-t] lib/xe: set hwconfig NULL for unsupported platforms
> > 
> > On Mon, Dec 09, 2024 at 11:20:07AM -0800, John Harrison wrote:
> > > On 12/9/2024 00:57, Tejas Upadhyay wrote:
> > > > There are hardware platforms which are not supporting hwconfig
> > > > table, for example ADLS. Querying hwconfig on unsupported platforms
> > > > leads to assert and failure for some of tests like,
> > > > ./build/tests/xe_module_load --r load
> > > The module reload test should not be querying the hwconfig table?!
> > 
> > See the Fixes: line; that patch made it so that anything calling
> > xe_device_get() will fetch and cache the hwconfig (and assert if it's
> > missing).  xe_module load calls __drm_open_driver ->
> > __drm_open_driver_another -> xe_device_get so it does indeed fetch the
> > hwconfig table now and hit the assertion added by that patch.
> > 
> > I don't know if always grabbing and caching the hwconfig was actually a
> > good idea or not.  As we've seen here, the hwconfig may legitimately be
> > NULL, so as we remove the assertion here, we still need to do
> > NULL-checks at the point the hwconfig gets used to truly be safe.
> > 
> > >
> > > But the hwconfig test itself should be failing on platforms which do not
> > > have a table. That is intentional. There is no platform which is POR for the
> > > Xe driver which does not support hwconfig tables. So the test is supposed to
> > > fail if it ever does not get a valid table.
> > 
> > The Xe driver is only officially supported on the Xe2 platforms and
> > beyond, but it can still load and run on some pre-Xe2 platforms that
> > people were using as development platforms; some of those don't support
> > hwconfig.  And even though our CI setup is only trying to test a subset
> > of those platforms that do contain hwconfig, note that our DG2 setups
> > also have an ADL-S igpu that also gets probed during module load, and
> > that's what leads to the CI failures that were reported here.
> > 
> > 
> > Maybe XeKMD is mature enough now (and we have enough real platforms for
> > testing now) that we could prune the list of platforms that Xe is
> > willing to load on with unofficial support.  E.g., drop the unofficial
> > support for TGL, RKL, and ADL-S since those platforms don't have
> > hwconfig and I don't think any of our developers are still using them as
> > development vehicles at this point.  That's probably a separate
> > discussion from the patch at hand though.
> > 
> > 
> > Matt
> > 
> > >
> > > John.
> > >
> > > >
> > > > https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3683
> > > >
> > > > Cc: Matt Roper <matthew.d.roper@intel.com>
> > > > Cc: Jan Maslak <jan.maslak@intel.com>
> > > > Fixes: 37a230e50 ("lib/xe_query: add hwconfig to xe_device")
> > > > Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
> > > > ---
> > > >   lib/xe/xe_query.c | 6 ++++--
> > > >   1 file changed, 4 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/lib/xe/xe_query.c b/lib/xe/xe_query.c
> > > > index f3731d9d3..8f0a5392c 100644
> > > > --- a/lib/xe/xe_query.c
> > > > +++ b/lib/xe/xe_query.c
> > > > @@ -51,7 +51,8 @@ static uint32_t *xe_query_hwconfig_new(int fd,
> > uint32_t *hwconfig_size)
> > > >   	/* Perform the initial query to get the size */
> > > >   	igt_assert_eq(igt_ioctl(fd, DRM_IOCTL_XE_DEVICE_QUERY, &query),
> > 0);
> > > > -	igt_assert_neq(query.size, 0);
> > > > +	if (!query.size)
> > > > +		return NULL;
> > > >   	hwconfig = malloc(query.size);
> > > >   	igt_assert(hwconfig);
> > > > @@ -858,7 +859,8 @@ uint32_t *xe_hwconfig_lookup_value(int fd,
> > enum intel_hwconfig attribute, uint32
> > > >   	igt_assert(xe_dev);
> > > >   	hwconfig = xe_dev->hwconfig;
> > > > -	igt_assert(hwconfig);
> > > > +	if (!hwconfig)
> > > > +		return NULL;
> > > >   	/* Extract the value from the hwconfig */
> > > >   	pos = 0;
> > >
> > 
> > --
> > Matt Roper
> > Graphics Software Engineer
> > Linux GPU Platform Enablement
> > Intel Corporation

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-12-12 18:54 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-09  8:57 [i-g-t] lib/xe: set hwconfig NULL for unsupported platforms Tejas Upadhyay
2024-12-09 10:39 ` Hajda, Andrzej
2024-12-09 14:28 ` ✗ CI.Patch_applied: failure for " Patchwork
2024-12-09 16:04 ` [i-g-t] " Matt Roper
2024-12-09 19:20 ` John Harrison
2024-12-09 20:38   ` Matt Roper
2024-12-10  8:39     ` Upadhyay, Tejas
2024-12-11  9:13     ` Upadhyay, Tejas
2024-12-12 18:54       ` Kamil Konieczny

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox