Intel-XE Archive on lore.kernel.org
* [PATCH V3 i-g-t] tests/xe_exec_threads: Make hang tests reset domain aware
@ 2024-04-10  9:40 Tejas Upadhyay
  2024-04-10 10:24 ` ✗ CI.Patch_applied: failure for tests/xe_exec_threads: Make hang tests reset domain aware (rev2) Patchwork
  2024-04-29 22:37 ` [PATCH V3 i-g-t] tests/xe_exec_threads: Make hang tests reset domain aware John Harrison
  0 siblings, 2 replies; 5+ messages in thread
From: Tejas Upadhyay @ 2024-04-10  9:40 UTC (permalink / raw)
  To: igt-dev; +Cc: intel-xe, Matthew Brost, Lucas De Marchi, Tejas Upadhyay

RCS/CCS are dependent engines as they share a reset
domain. Whenever a reset is triggered from CCS, all the exec
queues running on RCS are victimised, mainly on Lunarlake.

Let's skip parallel execution on CCS with RCS.

It helps in fixing the following errors:
1. Test assertion failure function test_legacy_mode, file, Failed assertion: data[i].data == 0xc0ffee

2. Test assertion failure function xe_exec, file ../lib/xe/xe_ioctl.c, Failed assertion: __xe_exec(fd, exec) == 0, error: -125 != 0

V3:
  - Check victimization reset domain wise irrespective of platform - Lucas/MattR
V2:
  - Add error details

Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
---
 tests/intel/xe_exec_threads.c | 43 ++++++++++++++++++++++++++++++++++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/tests/intel/xe_exec_threads.c b/tests/intel/xe_exec_threads.c
index 8083980f9..24eac39de 100644
--- a/tests/intel/xe_exec_threads.c
+++ b/tests/intel/xe_exec_threads.c
@@ -710,6 +710,24 @@ static void *thread(void *data)
 	return NULL;
 }
 
+static bool is_engine_contexts_victimized(__u16 eclass, unsigned int flags,
+					  bool has_rcs, bool multi_ccs,
+					  bool *ccs0_created)
+{
+	if (!(eclass == DRM_XE_ENGINE_CLASS_COMPUTE && flags & HANG))
+		return false;
+	if (has_rcs) {
+		return true;
+	} else if (multi_ccs) {
+		/* In case of multi ccs, allow only 1 ccs to run HANG test*/
+		if (*ccs0_created)
+			return true;
+		*ccs0_created = true;
+	}
+
+	return false;
+}
+
 /**
  * SUBTEST: threads-%s
  * Description: Run threads %arg[1] test with multi threads
@@ -955,9 +973,20 @@ static void threads(int fd, int flags)
 	bool go = false;
 	int n_threads = 0;
 	int gt;
+	bool has_rcs = false;
+	bool multi_ccs = false, has_ccs = false, ccs0_created = false;
 
-	xe_for_each_engine(fd, hwe)
+	xe_for_each_engine(fd, hwe) {
+		if (hwe->engine_class == DRM_XE_ENGINE_CLASS_RENDER) {
+			has_rcs = true;
+		} else if (hwe->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE) {
+			if (has_ccs)
+				multi_ccs = true;
+			else
+				has_ccs = true;
+		}
 		++n_engines;
+	}
 
 	if (flags & BALANCER) {
 		xe_for_each_gt(fd, gt)
@@ -990,6 +1019,18 @@ static void threads(int fd, int flags)
 	}
 
 	xe_for_each_engine(fd, hwe) {
+		/* RCS/CCS sharing reset domain hence dependent engines.
+		 * When CCS is doing reset, all the contexts of RCS are
+		 * victimized. Also in case of multiple CCS instances
+		 * contexts running on other CCS engine are victimized.
+		 * so skip the compute engine avoiding parallel execution
+		 * with RCS and other CCS.
+		 */
+		if (is_engine_contexts_victimized(hwe->engine_class, flags,
+						  has_rcs, multi_ccs,
+						  &ccs0_created))
+			continue;
+
 		threads_data[i].mutex = &mutex;
 		threads_data[i].cond = &cond;
 #define ADDRESS_SHIFT	39
-- 
2.25.1



* ✗ CI.Patch_applied: failure for tests/xe_exec_threads: Make hang tests reset domain aware (rev2)
  2024-04-10  9:40 [PATCH V3 i-g-t] tests/xe_exec_threads: Make hang tests reset domain aware Tejas Upadhyay
@ 2024-04-10 10:24 ` Patchwork
  2024-04-29 22:37 ` [PATCH V3 i-g-t] tests/xe_exec_threads: Make hang tests reset domain aware John Harrison
  1 sibling, 0 replies; 5+ messages in thread
From: Patchwork @ 2024-04-10 10:24 UTC (permalink / raw)
  To: Tejas Upadhyay; +Cc: intel-xe

== Series Details ==

Series: tests/xe_exec_threads: Make hang tests reset domain aware (rev2)
URL   : https://patchwork.freedesktop.org/series/131937/
State : failure

== Summary ==

=== Applying kernel patches on branch 'drm-tip' with base: ===
Base commit: c111f7a5945a drm-tip: 2024y-04m-10d-09h-28m-42s UTC integration manifest
=== git am output follows ===
error: tests/intel/xe_exec_threads.c: does not exist in index
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Applying: tests/xe_exec_threads: Make hang tests reset domain aware
Patch failed at 0001 tests/xe_exec_threads: Make hang tests reset domain aware
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".




* Re: [PATCH V3 i-g-t] tests/xe_exec_threads: Make hang tests reset domain aware
  2024-04-10  9:40 [PATCH V3 i-g-t] tests/xe_exec_threads: Make hang tests reset domain aware Tejas Upadhyay
  2024-04-10 10:24 ` ✗ CI.Patch_applied: failure for tests/xe_exec_threads: Make hang tests reset domain aware (rev2) Patchwork
@ 2024-04-29 22:37 ` John Harrison
  2024-05-08 17:25   ` Upadhyay, Tejas
  1 sibling, 1 reply; 5+ messages in thread
From: John Harrison @ 2024-04-29 22:37 UTC (permalink / raw)
  To: Tejas Upadhyay, igt-dev; +Cc: intel-xe, Matthew Brost, Lucas De Marchi

On 4/10/2024 02:40, Tejas Upadhyay wrote:
> RCS/CCS are dependent engines as they are sharing reset
> domain. Whenever there is reset from CCS, all the exec queues
> running on RCS are victimised mainly on Lunarlake.
>
> Lets skip parallel execution on CCS with RCS.
But why? In a real-world scenario, this should not happen.

As per earlier comments, before changing the test we need to understand 
more about what is actually happening and why. The previous descriptions 
of what is failing and on what platforms do not make sense.

Also, the request for more explanation was not just for the email 
discussion but to actually add that description into the test. I even 
explicitly asked for that in one of the review emails. The 
xe_exec_threads executable reports 114 separate subtests, none of which 
has any documentation at all. The comments within the code basically 
come down to saying 'wait for completion' immediately before waiting.

At one point Matthew Brost stated:
 > If the HANG flag is set, 1 of the exec queue per thread will insert a
 > non-preemptable spinner. It is expected the GuC will reset this exec
 > queue only. Cross CCS / RCS resets will break this as one of the
 > 'good' exec queues from another thread could also be reset.

To which my reply was that if only one spinner is non-pre-emptible then 
all the other spinners should get automatically switched out prior to 
the reset being triggered and should therefore not be killed. The 'good' 
queues should survive and be totally oblivious to the reset occurring.

So either the above description is inaccurate and all the spinners are 
actually non-pre-emptible. In which case, if that is not the intention 
of the test then the correct fix is to make them pre-emptible and it 
should then pass. Or if the description is accurate, then the test is 
correctly failing because it has found a bug which needs to be fixed. 
Either way, the correct solution is not to simply skip the failing case 
and hide it under the rug.

John.


>
> It helps in fixing following errors:
> 1. Test assertion failure function test_legacy_mode, file, Failed assertion: data[i].data == 0xc0ffee
>
> 2.Test assertion failure function xe_exec, file ../lib/xe/xe_ioctl.c, Failed assertion: __xe_exec(fd, exec) == 0, error: -125 != 0
>
> V3:
>    - Check victimization reset domain wise irrespective of platform - Lucas/MattR
> V2:
>    - Add error details
>
> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
> ---
>   tests/intel/xe_exec_threads.c | 43 ++++++++++++++++++++++++++++++++++-
>   1 file changed, 42 insertions(+), 1 deletion(-)
>
> diff --git a/tests/intel/xe_exec_threads.c b/tests/intel/xe_exec_threads.c
> index 8083980f9..24eac39de 100644
> --- a/tests/intel/xe_exec_threads.c
> +++ b/tests/intel/xe_exec_threads.c
> @@ -710,6 +710,24 @@ static void *thread(void *data)
>   	return NULL;
>   }
>   
> +static bool is_engine_contexts_victimized(__u16 eclass, unsigned int flags,
> +					  bool has_rcs, bool multi_ccs,
> +					  bool *ccs0_created)
> +{
> +	if (!(eclass == DRM_XE_ENGINE_CLASS_COMPUTE && flags & HANG))
> +		return false;
> +	if (has_rcs) {
> +		return true;
> +	} else if (multi_ccs) {
> +		/* In case of multi ccs, allow only 1 ccs to run HANG test*/
> +		if (*ccs0_created)
> +			return true;
> +		*ccs0_created = true;
> +	}
> +
> +	return false;
> +}
> +
>   /**
>    * SUBTEST: threads-%s
>    * Description: Run threads %arg[1] test with multi threads
> @@ -955,9 +973,20 @@ static void threads(int fd, int flags)
>   	bool go = false;
>   	int n_threads = 0;
>   	int gt;
> +	bool has_rcs = false;
> +	bool multi_ccs = false, has_ccs = false, ccs0_created = false;
>   
> -	xe_for_each_engine(fd, hwe)
> +	xe_for_each_engine(fd, hwe) {
> +		if (hwe->engine_class == DRM_XE_ENGINE_CLASS_RENDER) {
> +			has_rcs = true;
> +		} else if (hwe->engine_class == DRM_XE_ENGINE_CLASS_COMPUTE) {
> +			if (has_ccs)
> +				multi_ccs = true;
> +			else
> +				has_ccs = true;
> +		}
>   		++n_engines;
> +	}
>   
>   	if (flags & BALANCER) {
>   		xe_for_each_gt(fd, gt)
> @@ -990,6 +1019,18 @@ static void threads(int fd, int flags)
>   	}
>   
>   	xe_for_each_engine(fd, hwe) {
> +		/* RCS/CCS sharing reset domain hence dependent engines.
> +		 * When CCS is doing reset, all the contexts of RCS are
> +		 * victimized. Also in case of multiple CCS instances
> +		 * contexts running on other CCS engine are victimized.
> +		 * so skip the compute engine avoiding parallel execution
> +		 * with RCS and other CCS.
> +		 */
> +		if (is_engine_contexts_victimized(hwe->engine_class, flags,
> +						  has_rcs, multi_ccs,
> +						  &ccs0_created))
> +			continue;
> +
>   		threads_data[i].mutex = &mutex;
>   		threads_data[i].cond = &cond;
>   #define ADDRESS_SHIFT	39



* RE: [PATCH V3 i-g-t] tests/xe_exec_threads: Make hang tests reset domain aware
  2024-04-29 22:37 ` [PATCH V3 i-g-t] tests/xe_exec_threads: Make hang tests reset domain aware John Harrison
@ 2024-05-08 17:25   ` Upadhyay, Tejas
  2024-06-06 14:16     ` Upadhyay, Tejas
  0 siblings, 1 reply; 5+ messages in thread
From: Upadhyay, Tejas @ 2024-05-08 17:25 UTC (permalink / raw)
  To: Harrison, John C, igt-dev@lists.freedesktop.org, Brost, Matthew
  Cc: intel-xe@lists.freedesktop.org, De Marchi, Lucas



> -----Original Message-----
> From: Harrison, John C <john.c.harrison@intel.com>
> Sent: Tuesday, April 30, 2024 4:08 AM
> To: Upadhyay, Tejas <tejas.upadhyay@intel.com>; igt-
> dev@lists.freedesktop.org
> Cc: intel-xe@lists.freedesktop.org; Brost, Matthew
> <matthew.brost@intel.com>; De Marchi, Lucas <lucas.demarchi@intel.com>
> Subject: Re: [PATCH V3 i-g-t] tests/xe_exec_threads: Make hang tests reset
> domain aware
> 
> On 4/10/2024 02:40, Tejas Upadhyay wrote:
> > RCS/CCS are dependent engines as they are sharing reset domain.
> > Whenever there is reset from CCS, all the exec queues running on RCS
> > are victimised mainly on Lunarlake.
> >
> > Lets skip parallel execution on CCS with RCS.
> But why? In a real-world scenario, this should not happen.
> 
> As per earlier comments, before changing the test we need to understand
> more about what is actually happening and why. The previous descriptions of
> what is failing and on what platforms do not make sense.
> 
> Also, the request for more explanation was not just for the email discussion
> but to actually add that description into the test. I even explicitly asked for
> that in one of the review emails. The xe_exec_threads executable reports 114
> separate subtests, none of which has any documentation at all. The
> comments within the code basically come down to saying 'wait for
> completion' immediately before waiting.
> 
> At one point Matthew Brost stated:
>  > If the HANG flag is set, 1 of the exec queue per thread will insert a  > non-
> preemptable spinner. It is expected the GuC will reset this exec  > queue only.
> Cross CCS / RCS resets will break this as one of the  > 'good' exec queues from
> another thread could also be reset.
> 
> To which my reply was that if only one spinner is non-pre-emptible then all
> the other spinners should get automatically switched out prior to the reset
> being triggered and should therefore not be killed. The 'good'
> queues should survive and be totally oblivious to the reset occurring.

As far as I can see from the test code, each spinner created in this test is non-preemptible. So the spinners will not get automatically switched out prior to reset, and hence we see the test failure: the 'good' queues will also not survive the reset. @Brost, Matthew please help to confirm this behaviour. We need this to move further towards a final solution.

Thanks,
Tejas  
> 
> So either the above description is inaccurate and all the spinners are actually
> non-pre-emptible. In which case, if that is not the intention of the test then
> the correct fix is to make them pre-emptible and it should then pass. Or if the
> description is accurate, then the test is correctly failing because it has found a
> bug which needs to be fixed.
> Either way, the correct solution is not to simply skip the failing case and hide it
> under the rug.
> 
> John.
> 
> 
> >
> > It helps in fixing following errors:
> > 1. Test assertion failure function test_legacy_mode, file, Failed
> > assertion: data[i].data == 0xc0ffee
> >
> > 2.Test assertion failure function xe_exec, file ../lib/xe/xe_ioctl.c,
> > Failed assertion: __xe_exec(fd, exec) == 0, error: -125 != 0
> >
> > V3:
> >    - Check victimization reset domain wise irrespective of platform -
> > Lucas/MattR
> > V2:
> >    - Add error details
> >
> > Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
> > ---
> >   tests/intel/xe_exec_threads.c | 43
> ++++++++++++++++++++++++++++++++++-
> >   1 file changed, 42 insertions(+), 1 deletion(-)
> >
> > diff --git a/tests/intel/xe_exec_threads.c
> > b/tests/intel/xe_exec_threads.c index 8083980f9..24eac39de 100644
> > --- a/tests/intel/xe_exec_threads.c
> > +++ b/tests/intel/xe_exec_threads.c
> > @@ -710,6 +710,24 @@ static void *thread(void *data)
> >   	return NULL;
> >   }
> >
> > +static bool is_engine_contexts_victimized(__u16 eclass, unsigned int flags,
> > +					  bool has_rcs, bool multi_ccs,
> > +					  bool *ccs0_created)
> > +{
> > +	if (!(eclass == DRM_XE_ENGINE_CLASS_COMPUTE && flags & HANG))
> > +		return false;
> > +	if (has_rcs) {
> > +		return true;
> > +	} else if (multi_ccs) {
> > +		/* In case of multi ccs, allow only 1 ccs to run HANG test*/
> > +		if (*ccs0_created)
> > +			return true;
> > +		*ccs0_created = true;
> > +	}
> > +
> > +	return false;
> > +}
> > +
> >   /**
> >    * SUBTEST: threads-%s
> >    * Description: Run threads %arg[1] test with multi threads @@
> > -955,9 +973,20 @@ static void threads(int fd, int flags)
> >   	bool go = false;
> >   	int n_threads = 0;
> >   	int gt;
> > +	bool has_rcs = false;
> > +	bool multi_ccs = false, has_ccs = false, ccs0_created = false;
> >
> > -	xe_for_each_engine(fd, hwe)
> > +	xe_for_each_engine(fd, hwe) {
> > +		if (hwe->engine_class == DRM_XE_ENGINE_CLASS_RENDER) {
> > +			has_rcs = true;
> > +		} else if (hwe->engine_class ==
> DRM_XE_ENGINE_CLASS_COMPUTE) {
> > +			if (has_ccs)
> > +				multi_ccs = true;
> > +			else
> > +				has_ccs = true;
> > +		}
> >   		++n_engines;
> > +	}
> >
> >   	if (flags & BALANCER) {
> >   		xe_for_each_gt(fd, gt)
> > @@ -990,6 +1019,18 @@ static void threads(int fd, int flags)
> >   	}
> >
> >   	xe_for_each_engine(fd, hwe) {
> > +		/* RCS/CCS sharing reset domain hence dependent engines.
> > +		 * When CCS is doing reset, all the contexts of RCS are
> > +		 * victimized. Also in case of multiple CCS instances
> > +		 * contexts running on other CCS engine are victimized.
> > +		 * so skip the compute engine avoiding parallel execution
> > +		 * with RCS and other CCS.
> > +		 */
> > +		if (is_engine_contexts_victimized(hwe->engine_class, flags,
> > +						  has_rcs, multi_ccs,
> > +						  &ccs0_created))
> > +			continue;
> > +
> >   		threads_data[i].mutex = &mutex;
> >   		threads_data[i].cond = &cond;
> >   #define ADDRESS_SHIFT	39



* RE: [PATCH V3 i-g-t] tests/xe_exec_threads: Make hang tests reset domain aware
  2024-05-08 17:25   ` Upadhyay, Tejas
@ 2024-06-06 14:16     ` Upadhyay, Tejas
  0 siblings, 0 replies; 5+ messages in thread
From: Upadhyay, Tejas @ 2024-06-06 14:16 UTC (permalink / raw)
  To: Upadhyay, Tejas, Harrison, John C, igt-dev@lists.freedesktop.org,
	Brost, Matthew, Summers, Stuart
  Cc: intel-xe@lists.freedesktop.org, De Marchi, Lucas



> -----Original Message-----
> From: Intel-xe <intel-xe-bounces@lists.freedesktop.org> On Behalf Of
> Upadhyay, Tejas
> Sent: Wednesday, May 8, 2024 10:55 PM
> To: Harrison, John C <john.c.harrison@intel.com>; igt-
> dev@lists.freedesktop.org; Brost, Matthew <matthew.brost@intel.com>
> Cc: intel-xe@lists.freedesktop.org; De Marchi, Lucas
> <lucas.demarchi@intel.com>
> Subject: RE: [PATCH V3 i-g-t] tests/xe_exec_threads: Make hang tests reset
> domain aware
> 
> 
> 
> > -----Original Message-----
> > From: Harrison, John C <john.c.harrison@intel.com>
> > Sent: Tuesday, April 30, 2024 4:08 AM
> > To: Upadhyay, Tejas <tejas.upadhyay@intel.com>; igt-
> > dev@lists.freedesktop.org
> > Cc: intel-xe@lists.freedesktop.org; Brost, Matthew
> > <matthew.brost@intel.com>; De Marchi, Lucas <lucas.demarchi@intel.com>
> > Subject: Re: [PATCH V3 i-g-t] tests/xe_exec_threads: Make hang tests
> > reset domain aware
> >
> > On 4/10/2024 02:40, Tejas Upadhyay wrote:
> > > RCS/CCS are dependent engines as they are sharing reset domain.
> > > Whenever there is reset from CCS, all the exec queues running on RCS
> > > are victimised mainly on Lunarlake.
> > >
> > > Lets skip parallel execution on CCS with RCS.
> > But why? In a real-world scenario, this should not happen.
> >
> > As per earlier comments, before changing the test we need to
> > understand more about what is actually happening and why. The previous
> > descriptions of what is failing and on what platforms do not make sense.
> >
> > Also, the request for more explanation was not just for the email
> > discussion but to actually add that description into the test. I even
> > explicitly asked for that in one of the review emails. The
> > xe_exec_threads executable reports 114 separate subtests, none of
> > which has any documentation at all. The comments within the code
> > basically come down to saying 'wait for completion' immediately before
> waiting.
> >
> > At one point Matthew Brost stated:
> >  > If the HANG flag is set, 1 of the exec queue per thread will insert
> > a  > non- preemptable spinner. It is expected the GuC will reset this exec  >
> queue only.
> > Cross CCS / RCS resets will break this as one of the  > 'good' exec
> > queues from another thread could also be reset.
> >
> > To which my reply was that if only one spinner is non-pre-emptible
> > then all the other spinners should get automatically switched out
> > prior to the reset being triggered and should therefore not be killed. The
> 'good'
> > queues should survive and be totally oblivious to the reset occurring.
> 
> As far as I am seeing test code, each spinner that is created is non-
> preemptible in this test. So all spinners will not get automatically switched out
> prior to reset and hence we will see test failure as 'good' queues will also not
> survive on reset. @Brost, Matthew please help to confirm behaviour. We
> need this to move further towards final solution.

Any update here please? @Summers, Stuart FYI

Thanks,
Tejas

> 
> Thanks,
> Tejas
> >
> > So either the above description is inaccurate and all the spinners are
> > actually non-pre-emptible. In which case, if that is not the intention
> > of the test then the correct fix is to make them pre-emptible and it
> > should then pass. Or if the description is accurate, then the test is
> > correctly failing because it has found a bug which needs to be fixed.
> > Either way, the correct solution is not to simply skip the failing
> > case and hide it under the rug.
> >
> > John.
> >
> >
> > >
> > > It helps in fixing following errors:
> > > 1. Test assertion failure function test_legacy_mode, file, Failed
> > > assertion: data[i].data == 0xc0ffee
> > >
> > > 2.Test assertion failure function xe_exec, file
> > > ../lib/xe/xe_ioctl.c, Failed assertion: __xe_exec(fd, exec) == 0,
> > > error: -125 != 0
> > >
> > > V3:
> > >    - Check victimization reset domain wise irrespective of platform
> > > - Lucas/MattR
> > > V2:
> > >    - Add error details
> > >
> > > Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
> > > ---
> > >   tests/intel/xe_exec_threads.c | 43
> > ++++++++++++++++++++++++++++++++++-
> > >   1 file changed, 42 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/tests/intel/xe_exec_threads.c
> > > b/tests/intel/xe_exec_threads.c index 8083980f9..24eac39de 100644
> > > --- a/tests/intel/xe_exec_threads.c
> > > +++ b/tests/intel/xe_exec_threads.c
> > > @@ -710,6 +710,24 @@ static void *thread(void *data)
> > >   	return NULL;
> > >   }
> > >
> > > +static bool is_engine_contexts_victimized(__u16 eclass, unsigned int
> flags,
> > > +					  bool has_rcs, bool multi_ccs,
> > > +					  bool *ccs0_created)
> > > +{
> > > +	if (!(eclass == DRM_XE_ENGINE_CLASS_COMPUTE && flags & HANG))
> > > +		return false;
> > > +	if (has_rcs) {
> > > +		return true;
> > > +	} else if (multi_ccs) {
> > > +		/* In case of multi ccs, allow only 1 ccs to run HANG test*/
> > > +		if (*ccs0_created)
> > > +			return true;
> > > +		*ccs0_created = true;
> > > +	}
> > > +
> > > +	return false;
> > > +}
> > > +
> > >   /**
> > >    * SUBTEST: threads-%s
> > >    * Description: Run threads %arg[1] test with multi threads @@
> > > -955,9 +973,20 @@ static void threads(int fd, int flags)
> > >   	bool go = false;
> > >   	int n_threads = 0;
> > >   	int gt;
> > > +	bool has_rcs = false;
> > > +	bool multi_ccs = false, has_ccs = false, ccs0_created = false;
> > >
> > > -	xe_for_each_engine(fd, hwe)
> > > +	xe_for_each_engine(fd, hwe) {
> > > +		if (hwe->engine_class == DRM_XE_ENGINE_CLASS_RENDER) {
> > > +			has_rcs = true;
> > > +		} else if (hwe->engine_class ==
> > DRM_XE_ENGINE_CLASS_COMPUTE) {
> > > +			if (has_ccs)
> > > +				multi_ccs = true;
> > > +			else
> > > +				has_ccs = true;
> > > +		}
> > >   		++n_engines;
> > > +	}
> > >
> > >   	if (flags & BALANCER) {
> > >   		xe_for_each_gt(fd, gt)
> > > @@ -990,6 +1019,18 @@ static void threads(int fd, int flags)
> > >   	}
> > >
> > >   	xe_for_each_engine(fd, hwe) {
> > > +		/* RCS/CCS sharing reset domain hence dependent engines.
> > > +		 * When CCS is doing reset, all the contexts of RCS are
> > > +		 * victimized. Also in case of multiple CCS instances
> > > +		 * contexts running on other CCS engine are victimized.
> > > +		 * so skip the compute engine avoiding parallel execution
> > > +		 * with RCS and other CCS.
> > > +		 */
> > > +		if (is_engine_contexts_victimized(hwe->engine_class, flags,
> > > +						  has_rcs, multi_ccs,
> > > +						  &ccs0_created))
> > > +			continue;
> > > +
> > >   		threads_data[i].mutex = &mutex;
> > >   		threads_data[i].cond = &cond;
> > >   #define ADDRESS_SHIFT	39



